
Guide for data sources and getting data in

Important Data Sources

The OT environment combines traditional and legacy IT technologies (e.g., firewalls, servers, workstations) with OT-specific technologies (e.g., PLCs, RTUs). The following table outlines common data sources that should be integrated to provide full functionality with the OT Security Add-on, along with their related data models.

The following data sources often produce value and are recommended for collection:

| Data Source | Criticality to App | Data Models |
| --- | --- | --- |
| Windows Security Events | Critical | Authentication, Change |
| LDAP (e.g. Active Directory) | Critical | Authentication, Change |
| Firewall Traffic & System Logs | Critical | Network Traffic, Network Session, Change, Authentication |
| OT Security Solutions | High | Authentication, Intrusion Detection, OT Asset, Vulnerability |
| Endpoint Protection | High | OT Asset, Malware |
| Network Traffic & System Logs | Medium | Network Traffic, Network Session, Change, Authentication |
| Patching Logs | Medium | Updates, OT Asset |
| Host Information (Application, Services, OS) | Low | Inventory, Change, OT Asset, OT Software |

Other Important Information

A number of additional data sources provide contextual value around the OT environment. They feed the macros and lookups that the OT Add-on leverages and help ensure that the dashboards report accurately. Not all of these data sources are required, but having them enhances reporting.

| Data Source | Value |
| --- | --- |
| OT VLANs and Subnets | In many organizations, the use of particular subnets is common practice for OT environments. This information can be used to identify sites, to define prohibited and allowed network traffic, and to locate devices based on subnet. |
| Identification of Perimeter Assets and Characteristics | Several macros in the OT Add-on attempt to identify devices classified as perimeter or boundary devices. Identifying these assets by asset type, IP address, or other characteristics is essential for the perimeter monitoring dashboards. |
| List of prohibited traffic types and flow direction | The OT Prohibited Traffic dashboards rely on a simplified lookup that identifies all traffic that should be explicitly prohibited (for example, http or https), including direction (inbound vs. outbound). This lookup is crucial for identifying suspicious traffic. |
| List of internal IP ranges (IT and OT) | This list is used to identify the environment of traffic sources and destinations, and to identify traffic that might come from or go to addresses outside the organization. A list of known IP ranges helps in configuring the macros that identify OT, IT, or External traffic. |
| Normal operating hours per site_id | Several reports attempt to identify activity outside normal working hours for sites. Having the normal operating hours, including time offsets from GMT, helps those reports accurately report activity after normal working hours. |
| List of permitted External Media devices | Information on the external media devices permitted within the OT environment, including parameters such as host restrictions, helps in identifying allowed and prohibited activity. |
| National Vulnerability Database - CVEs | CVE definitions from the National Vulnerability Database (NVD) are used to correlate against detected vulnerabilities and to identify potential vulnerabilities. Various plugins exist to pull this data into Splunk. |
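As an illustration of how the internal IP range list and the prohibited traffic list fit together, the following Python sketch classifies addresses as OT, IT, or External and then checks a flow against a simplified prohibited-traffic lookup. It is not the add-on's implementation (the add-on uses Splunk macros and lookups). The ranges, the lookup entries, the function names, and the convention that direction is measured relative to the OT environment are all assumptions made for the example.

```python
import ipaddress

# Hypothetical example ranges; replace with the organization's own lists.
OT_RANGES = [ipaddress.ip_network("10.20.0.0/16")]
IT_RANGES = [
    ipaddress.ip_network("10.10.0.0/16"),
    ipaddress.ip_network("192.168.0.0/16"),
]

# Hypothetical prohibited-traffic lookup: (application, direction) pairs.
PROHIBITED = {("http", "outbound"), ("https", "outbound"), ("rdp", "inbound")}


def classify_ip(address):
    """Return 'OT', 'IT', or 'External' for an IP address string."""
    ip = ipaddress.ip_address(address)
    if any(ip in net for net in OT_RANGES):
        return "OT"
    if any(ip in net for net in IT_RANGES):
        return "IT"
    return "External"


def direction(src, dest):
    """Direction relative to the OT environment (an assumed convention)."""
    src_ot = classify_ip(src) == "OT"
    dest_ot = classify_ip(dest) == "OT"
    if src_ot and dest_ot:
        return "internal"
    if src_ot:
        return "outbound"
    if dest_ot:
        return "inbound"
    return "not_ot"


def is_prohibited(app, src, dest):
    """True if the application and flow direction appear in the lookup."""
    return (app.lower(), direction(src, dest)) in PROHIBITED
```

With these sample values, `is_prohibited("https", "10.20.5.1", "8.8.8.8")` flags HTTPS leaving the OT range, while the same application between two OT hosts is not flagged.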

Getting Data In (OT Specific Considerations)

In OT environments, the use of agents and similar mechanisms may not be approved by the system vendor. The following table outlines various mechanisms for pulling in data from OT environments, based on customer implementations.

| Method | Data Source | Notes | Access Type |
| --- | --- | --- | --- |
| Universal Forwarder | Host-based logs including Windows events, applications installed, service configuration, ICS logs, performance monitoring | More versatility and control over the data types collected from hosts | Agent based |
| Existing agents | Depends on the agent, but could include malware, security events, and asset information | Examples include (but are not limited to) Snare, Endpoint Protection, WhatsUp Gold, SCCM, and SCOM | Agent based |
| SFTP/FTP | Typically text-based logs | SFTP (preferred) and FTP can be used to export logs periodically to systems that use a Universal Forwarder to forward the logs | Agent based |
| Windows Event Forwarding | Windows Events | Can be used to collect Security, System, Application, or other specific Windows events | Configuration |
| Syslog | Network and firewall logs, netflow, security alerts from other products | Best practice is to leverage a syslog server rather than sending directly to Splunk | Configuration |
| Zeek | Networking information | Zeek can collect information on network activity such as network traffic, and in some cases may support industrial protocols | Configuration |
| OT Security Solutions | Asset information, alerts, vulnerabilities | Most OT Security Solutions can send asset, alert, and vulnerability information to Splunk, but this may change as capabilities mature. In most cases they provide this information via syslog and REST APIs | Remote |
| REST APIs | Depends on the application, but could include malware, security events, vulnerability information, and asset information | Common mechanism leveraged by OT Security Products to collect alerts, asset info, and vulnerabilities | Remote |
| DBConnect | ICS logs, alarm information, configuration, patching info, host-based information | Leveraged by data historians, patching solutions, ICS systems, and other systems | Remote |
| WMI Collector | OS components, process & service information, applications, user accounts, security settings | Consideration should be given to scaling of WMI for large environments | Remote |
| HTTP Event Collector (HEC) | System health and state information | Newer products are providing methods to collect via HEC, but this depends on the application | Remote |

For more general information about indexing data in Splunk Enterprise, please refer to the following documentation: Getting Data In.
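To make the HTTP Event Collector row more concrete, the following Python sketch builds a request that posts a single health event to HEC. It uses the standard HEC event endpoint (`/services/collector/event`) and the `Authorization: Splunk <token>` header. The host name, port, token, sourcetype, index, and event fields are placeholders assumed for illustration, and `build_hec_request` is a hypothetical helper rather than part of any Splunk library.

```python
import json
import urllib.request


def build_hec_request(base_url, token, event, sourcetype, index):
    """Build (but do not send) a POST request for the HEC event endpoint."""
    payload = {"event": event, "sourcetype": sourcetype, "index": index}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/services/collector/event",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Splunk " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values: substitute the real HEC URL, token, sourcetype, and index.
request = build_hec_request(
    base_url="https://splunk.example.com:8088",
    token="00000000-0000-0000-0000-000000000000",
    event={"host": "hmi-01", "status": "healthy", "cpu_pct": 12.5},
    sourcetype="ot:health",
    index="ot",
)

# Sending is left commented out so the sketch runs without a Splunk instance:
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read())
```

Building the request separately from sending it keeps the payload easy to inspect before anything reaches a production OT network.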

Integrating with specific OT Data Models

Splunk for OT Security includes several data models that can be leveraged to automatically generate asset lookups. In addition, OT partners of Splunk should populate these data models with any hardware and software data captured or created by their add-ons.

Two data models have been created to facilitate populating assets into Splunk for Enterprise Security. The most critical model for asset information in the Splunk OT Security Solution is the OT Asset data model contained in the Splunk for OT Security app. This data model is designed to be used with hardware assets such as servers, PLCs, and workstations, and contains all fields in the OT Asset Framework. An additional data model, OT Software Asset, is used to populate additional information regarding the firmware, operating system, and software present on each OT asset. Data from the two models can be combined to provide additional context around an asset as well as the components installed on it.

The OT Add-on for Splunk has specific requirements for the values and formats of some ES Asset Framework fields. These fields are used to tag and identify assets as belonging to OT systems or to specific classifications. The following table outlines these requirements:

| Field | Restriction/Format | Sample |
| --- | --- | --- |
| asset_system | Asset systems are often collections of sites and may refer to a grouping of assets. While not required, this field is suggested for filtering purposes. | Western Operations |
| asset_type | Asset types classify the purpose or function of the asset. | Historian |
| category | The static text "ot" (without quotes) is used broadly to denote which assets are part of the OT environment. | ot \| windows \| nerc |
| classification | Classifications related to specific frameworks should follow the format <framework>:<value> | cip:high \| cip:BCA |
| site_id | Ideally this should be populated with the name of the facility or site where the asset resides. It is used on multiple dashboards as a filter. | Johnson Refinery |
| zone | Purdue zone mappings should follow the format purdue:level<level #> | purdue:level3 |
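The format requirements above can be checked before assets are loaded. The following Python sketch is one possible pre-flight check and is not part of the add-on. It assumes that multi-valued fields are pipe-delimited, as the samples suggest, and that a Purdue level is a whole number. `validate_asset` and its messages are hypothetical.

```python
import re

# Assumed patterns derived from the documented formats.
ZONE_RE = re.compile(r"^purdue:level\d+$")
CLASSIFICATION_RE = re.compile(r"^[^:|\s]+:[^:|\s]+$")


def split_multivalue(value):
    """Split a pipe-delimited field into trimmed, non-empty values."""
    return [part.strip() for part in value.split("|") if part.strip()]


def validate_asset(asset):
    """Return a list of problems found in one asset record (a dict)."""
    problems = []
    if "ot" not in split_multivalue(asset.get("category", "")):
        problems.append("category should include the static text 'ot'")
    for item in split_multivalue(asset.get("classification", "")):
        if not CLASSIFICATION_RE.match(item):
            problems.append(
                "classification %r is not <framework>:<value>" % item
            )
    zone = asset.get("zone", "")
    if zone and not ZONE_RE.match(zone):
        problems.append("zone %r is not purdue:level<level #>" % zone)
    if not asset.get("site_id"):
        problems.append("site_id is empty (used as a dashboard filter)")
    return problems


sample = {
    "asset_system": "Western Operations",
    "asset_type": "Historian",
    "category": "ot | windows | nerc",
    "classification": "cip:high | cip:BCA",
    "site_id": "Johnson Refinery",
    "zone": "purdue:level3",
}
print(validate_asset(sample))
```

The sample record uses the values from the table, so it produces an empty list of problems. A record with `category` set to `windows` or `zone` set to `level3` would be reported.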