Guide for data sources and getting data in¶

Important Data Sources¶

The OT environment combines traditional and legacy IT technologies (e.g. firewalls, servers, workstations) with OT-specific technologies (e.g. PLCs, RTUs). The following table outlines common data sources that should be integrated to provide full functionality with the existing OT Security Add-on, along with their related data models.

The following data sources often provide value, and collecting them is recommended:

Data Source	Criticality to App	Data Models
Windows Security Events	Critical	Authentication, Change
LDAP (e.g. Active Directory)	Critical	Authentication, Change
Firewall Traffic & System Logs	Critical	Network Traffic, Network Session, Change, Authentication
OT Security Solutions	High	Authentication, Intrusion Detection, OT Asset, Vulnerability
Endpoint Protection	High	OT Asset, Malware
Network Traffic & System Logs	Medium	Network Traffic, Network Session, Change, Authentication
Patching Logs	Medium	Updates, OT Asset
Host Information (Application, Services, OS)	Low	Inventory, Change, OT Asset, OT Software

Other Important Information¶

Several other data sources can provide contextual value about the OT environment. They can feed the macros and lookups that the OT Add-on leverages and help ensure that the dashboards report accurately. Not all of these data sources are required, but having them can enhance reporting.

Data Sources	Value
OT VLANs and Subnets	In many organizations, dedicating particular subnets to OT environments is common practice. This information can be used to identify sites, to define prohibited and allowed network traffic, and to locate devices based on subnet.
Identification of Perimeter Assets and Characteristics	Several macros in the OT Add-on attempt to identify devices classified as perimeter or boundary devices. Identifying these assets by asset type, IP, or other characteristics is essential for the perimeter monitoring dashboards.
List of prohibited traffic types and flow direction	The OT Prohibited Traffic dashboards rely on a simplified lookup that identifies all traffic that should be explicitly prohibited (for example, http or https), including direction (inbound vs outbound). This lookup is crucial for identifying suspicious traffic.
List of internal IP ranges (IT and OT)	This list is used to identify the environment of traffic sources and destinations, and to identify traffic that might come from or go to addresses outside the organization. A list of known IP ranges helps in configuring the macros that identify OT, IT, or External traffic.
Normal operating hours per site_id	Several reports attempt to identify activity outside normal working hours for sites. Having the normal operating hours, including time offsets from GMT, helps those reports stay accurate when reporting activity after normal working hours.
List of permitted External Media devices	To help identify external media devices that are allowed within the OT environment, information on the permitted devices, including parameters such as host restrictions, can help in identifying allowed and prohibited activity.
National Vulnerability Database - CVEs	CVE definitions from the National Vulnerability Database (NVD) are used to correlate against detected vulnerabilities and to identify potential vulnerabilities. Various plugins exist to pull this data into Splunk.
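
As an illustration of how a list of internal IP ranges can drive OT, IT, or External classification, the following Python sketch labels an address by checking it against two lists of subnets. The subnets and the function name `classify_ip` are hypothetical placeholders, not values or macros shipped with the OT Add-on; in a deployment this logic would normally live in the add-on's macros and lookups.

```python
import ipaddress

# Hypothetical example ranges; replace with the organization's real subnets.
OT_RANGES = ["10.20.0.0/16", "192.168.50.0/24"]
IT_RANGES = ["10.0.0.0/8", "172.16.0.0/12"]


def classify_ip(address, ot_ranges=OT_RANGES, it_ranges=IT_RANGES):
    """Return "OT", "IT", or "External" for an IP address string."""
    ip = ipaddress.ip_address(address)
    # OT ranges are checked first because they may be carved out of a larger IT range.
    if any(ip in ipaddress.ip_network(cidr) for cidr in ot_ranges):
        return "OT"
    if any(ip in ipaddress.ip_network(cidr) for cidr in it_ranges):
        return "IT"
    return "External"


if __name__ == "__main__":
    for addr in ("10.20.4.7", "10.1.2.3", "8.8.8.8"):
        print(addr, classify_ip(addr))
```

Checking the OT list before the IT list matters when OT subnets are allocated from inside a broader corporate range, as in the placeholder values here.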

Getting Data In (OT Specific Considerations)¶

In OT environments, the use of agents and similar mechanisms may not be approved by the system vendor. The following table outlines various mechanisms for pulling in data from OT environments based on customer implementations. For more details

Method	Data Source	Notes	Access Type
Universal Forwarder	Host-based logs including Windows events, applications installed, service configuration, ICS logs, performance monitoring	Offers more versatility and control over the data types collected from hosts	Agent based
Existing agents	Depends on the agent, but could include malware, security events, and asset information	Examples include (but are not limited to) Snare, Endpoint Protection, WhatsUp Gold, SCCM, and SCOM	Agent based
SFTP/FTP	Typically text-based logs	SFTP (preferred) or FTP can be used to export logs periodically to systems that use a Universal Forwarder to forward the logs	Agent based
Windows Event Forwarding	Windows Events	Can be used to collect Security, System, Application, or other specific Windows events	Configuration
Syslog	Network and firewall logs, NetFlow, security alerts from other products	Best practice is to use a syslog server rather than sending directly to Splunk	Configuration
Zeek	Networking information	Zeek can collect information on network activity such as network traffic and in some cases may support industrial protocols	Configuration
OT Security Solutions	Asset information, alerts, vulnerabilities	Most OT Security Solutions can send asset, alert, and vulnerability information to Splunk, though this may change as capabilities mature. In most cases they provide this information via syslog and REST APIs	Remote
REST APIs	Depends on the application, but could include malware, security events, vulnerability information, and asset information	Common mechanism leveraged by OT Security Products to collect alerts, asset info, and vulnerabilities	Remote
DBConnect	ICS Logs, Alarm Information, Configuration, Patching Info, Host Based Information	Leveraged by data historians, patching solutions, ICS systems, and other systems	Remote
WMI Collector	OS components, process & service information, applications, user accounts, security settings	Consideration should be given to scaling of WMI for large environments	Remote
HTTP Event Collector (HEC)	System health and state information	Newer products are providing methods to send data via HEC, but availability depends on the application	Remote
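
To make the HEC row more concrete, the following Python sketch builds a request for the HEC event endpoint (`/services/collector/event`, authenticated with an `Authorization: Splunk <token>` header). The host name, token, event fields, the `ot:health` sourcetype, and the helper name `build_hec_request` are all hypothetical, and port 8088 is assumed as the HEC port; the request is built but not sent, so the example runs without a Splunk instance.

```python
import json
import urllib.request


def build_hec_request(host, token, event, sourcetype="ot:health", port=8088):
    """Build (but do not send) an HTTP POST request for the HEC event endpoint."""
    url = f"https://{host}:{port}/services/collector/event"
    # HEC expects a JSON object with the event body under the "event" key.
    payload = {"event": event, "sourcetype": sourcetype}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_hec_request(
        "splunk.example.com",                       # placeholder host
        "00000000-0000-0000-0000-000000000000",     # placeholder token
        {"device": "plc-01", "status": "healthy"},  # placeholder event
    )
    print(req.full_url)
    # To actually send the event, uncomment the lines below:
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.read())
```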

For more general information about indexing data in Splunk Enterprise, please refer to the following documentation: Getting Data In.

Integrating with specific OT Data Models¶

Splunk for OT Security includes several data models that can be leveraged to automatically generate asset lookups. In addition, OT partners of Splunk should populate any hardware and software data captured or created by their add-ons to these data models.

Two data models have been created to facilitate populating assets into Splunk for Enterprise Security. The most critical model for asset information in the Splunk OT Security Solution is the OT Asset data model contained in the Splunk for OT Security app. This data model is designed to be used with hardware assets such as servers, PLCs, and workstations, and contains all fields in the OT Asset Framework. A second data model, OT Software Asset, is used to populate additional information about the firmware, operating system, and software present on each OT asset. Together, data from the two models can be combined to provide additional context around an asset as well as the components installed on it.

The OT Add-on for Splunk has specific requirements for the values and formats of some ES Asset Framework fields. These fields are used to tag and identify assets as belonging to OT systems or specific classifications. The following table outlines these requirements:

Field	Restriction/Format	Sample
`asset_system`	Asset systems are often collections of sites or other groupings of assets. While not required, this field is suggested for filtering purposes.	Western Operations
`asset_type`	The asset type classifies the purpose or function of the asset.	Historian
`category`	The static text "ot" (without quotes) is used broadly to denote which assets are part of the OT environment.	ot | windows | nerc
`classification`	Classifications related to specific frameworks should follow the format <framework>:<value>	cip:high | cip:BCA
`site_id`	Ideally, this should be populated with the name of the facility or site where the asset resides. It is used as a filter on multiple dashboards.	Johnson Refinery
`zone`	Purdue zone mappings should follow the format purdue:level<level #>	purdue:level3
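
The formats in the table above can be checked before asset records are loaded. The following Python sketch is one possible validator, not part of the OT Add-on. It assumes that multi-value fields are pipe-delimited (as the samples suggest) and that a Purdue level is a number, optionally with a decimal part; the function name `check_asset` is hypothetical.

```python
import re

# Patterns derived from the formats in the table above.
CLASSIFICATION_RE = re.compile(r"^[^:|\s]+:[^:|\s]+$")   # <framework>:<value>
ZONE_RE = re.compile(r"^purdue:level\d+(\.\d+)?$")       # purdue:level<level #>


def check_asset(asset):
    """Return a list of problems found in one asset record (dict of field -> str)."""
    problems = []

    # Assumption: multiple categories are separated by "|".
    categories = [c.strip() for c in asset.get("category", "").split("|")]
    if "ot" not in categories:
        problems.append('category does not include "ot"')

    # Each non-empty classification entry must look like <framework>:<value>.
    for value in asset.get("classification", "").split("|"):
        value = value.strip()
        if value and not CLASSIFICATION_RE.match(value):
            problems.append(f"classification {value!r} is not <framework>:<value>")

    # zone is only checked when it is populated.
    zone = asset.get("zone", "").strip()
    if zone and not ZONE_RE.match(zone):
        problems.append(f"zone {zone!r} is not purdue:level<level #>")

    return problems


if __name__ == "__main__":
    sample = {
        "category": "ot | windows | nerc",
        "classification": "cip:high | cip:BCA",
        "zone": "purdue:level3",
    }
    print(check_asset(sample))
```

An empty list means the record satisfies the three format checks; each string in a non-empty list names one field that needs attention.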