5. Transform Data

10 minutes  

The Transform Processor lets you modify telemetry data—logs, metrics, and traces—as it flows through the pipeline. Using the OpenTelemetry Transformation Language (OTTL), you can filter, enrich, and transform data on the fly without touching your application code.

In this exercise we’ll update gateway.yaml to include a Transform Processor that will:

  • Filter log resource attributes.
  • Parse JSON structured log data into attributes.
  • Set log severity levels based on the log message body.

You may have noticed that in the logs from previous exercises, fields like SeverityText and SeverityNumber were undefined. This is typical of the filelog receiver, which by default does not parse severity from the log line. However, the severity is embedded within the log body, e.g.:

SeverityText: 
SeverityNumber: Unspecified(0)
Body: Str(2025-01-31 15:49:29 [WARN] - Do or do not, there is no try.)

Logs often contain structured data encoded as JSON within the log body. Extracting these fields into attributes allows for better indexing, filtering, and querying. Instead of manually parsing JSON in downstream systems, OTTL enables automatic transformation at the telemetry pipeline level.
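For example, the load generator used later in this exercise emits JSON-formatted log lines where the severity is carried in the level field:

{"level":"WARN","message":"Your focus determines your reality.","movie":"SW","timestamp":"2025-03-07 11:17:26"}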

Exercise
Important

In ALL terminal windows, change to the 5-transform-data directory and run the clear command.

Copy *.yaml from the 4-sensitive-data directory into 5-transform-data. Your updated directory structure will now look like this:

.
├── agent.yaml
└── gateway.yaml


5.1 Configuration

Exercise

Add a transform processor: Switch to your Gateway terminal window, edit gateway.yaml, and add the following transform processor:

  transform/logs:                      # Processor Type/Name
    log_statements:                    # Log Processing Statements
      - context: resource              # Resource Context
        statements:                    # Transform Statements Array
          - keep_keys(attributes, ["com.splunk.sourcetype", "host.name", "otelcol.service.mode"])  # Resource attribute keys to keep

By setting context: resource we are targeting the resource-level attributes of the logs (the resourceLogs scope) rather than individual log records.

This configuration ensures that only the relevant resource attributes (com.splunk.sourcetype, host.name, otelcol.service.mode) are retained, reducing unnecessary metadata and keeping the exported logs lean.
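To illustrate, using resource attributes like those seen in earlier exercises (you will verify this in the test step), keep_keys trims a resource from this:

Resource attributes:
   -> com.splunk.source: Str(./quotes.log)
   -> com.splunk.sourcetype: Str(quotes)
   -> host.name: Str(workshop-instance)
   -> os.type: Str(linux)
   -> otelcol.service.mode: Str(agent)

down to this:

Resource attributes:
   -> com.splunk.sourcetype: Str(quotes)
   -> host.name: Str(workshop-instance)
   -> otelcol.service.mode: Str(agent)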

Adding a Context Block for Log Severity Mapping: To properly set the severity_text and severity_number fields of a log record, we add a log context block within log_statements. This configuration extracts the level value from the log body, maps it to severity_text, and assigns the corresponding severity_number based on the log level:

      - context: log                   # Log Context
        statements:                    # Transform Statements Array
          - set(cache, ParseJSON(body)) where IsMatch(body, "^\\{")  # Parse JSON log body into a cache object
          - flatten(cache, "")                                        # Flatten nested JSON structure
          - merge_maps(attributes, cache, "upsert")                   # Merge cache into attributes, updating existing keys
          - set(severity_text, attributes["level"])                   # Set severity_text from the "level" attribute
          - set(severity_number, 1) where severity_text == "TRACE"    # Map severity_text to severity_number
          - set(severity_number, 5) where severity_text == "DEBUG"
          - set(severity_number, 9) where severity_text == "INFO"
          - set(severity_number, 13) where severity_text == "WARN"
          - set(severity_number, 17) where severity_text == "ERROR"
          - set(severity_number, 21) where severity_text == "FATAL"

The merge_maps function is used to combine two maps (dictionaries) into one. In this case, it merges the cache object (containing parsed JSON data from the log body) into the attributes map.

  • Parameters:
    • attributes: The target map where the data will be merged.
    • cache: The source map containing the parsed JSON data.
    • "upsert": This mode ensures that if a key already exists in the attributes map, its value will be updated with the value from cache. If the key does not exist, it will be inserted.

This step is crucial because it ensures that all relevant fields from the log body (e.g., level, message, etc.) are added to the attributes map, making them available for further processing or exporting.
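As a concrete example based on the debug output later in this section: if attributes initially holds only the log.file.path added by the filelog receiver, and cache holds the fields parsed from the body, the upsert merge yields:

attributes (before merge):
     -> log.file.path: Str(quotes.log)
cache (parsed from the JSON body):
     -> level: Str(WARN)
     -> message: Str(Your focus determines your reality.)
     -> movie: Str(SW)
     -> timestamp: Str(2025-03-07 11:17:26)
attributes (after merge_maps with "upsert"):
     -> log.file.path: Str(quotes.log)
     -> level: Str(WARN)
     -> message: Str(Your focus determines your reality.)
     -> movie: Str(SW)
     -> timestamp: Str(2025-03-07 11:17:26)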

Summary of Key Transformations:

  • Parse JSON: Extracts structured data from the log body.
  • Flatten JSON: Converts nested JSON objects into a flat structure.
  • Merge Attributes: Integrates extracted data into log attributes.
  • Map Severity Text: Assigns severity_text from the log’s level attribute.
  • Assign Severity Numbers: Converts severity levels into standardized numerical values.
Important

You should have a single transform processor containing two context blocks: one with context: resource and one with context: log.

This configuration ensures that log severity is correctly extracted, standardized, and structured for efficient processing.
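For reference, the complete transform/logs processor, combining the two context blocks shown above, should now look similar to this:

  transform/logs:
    log_statements:
      - context: resource
        statements:
          - keep_keys(attributes, ["com.splunk.sourcetype", "host.name", "otelcol.service.mode"])
      - context: log
        statements:
          - set(cache, ParseJSON(body)) where IsMatch(body, "^\\{")
          - flatten(cache, "")
          - merge_maps(attributes, cache, "upsert")
          - set(severity_text, attributes["level"])
          - set(severity_number, 1) where severity_text == "TRACE"
          - set(severity_number, 5) where severity_text == "DEBUG"
          - set(severity_number, 9) where severity_text == "INFO"
          - set(severity_number, 13) where severity_text == "WARN"
          - set(severity_number, 17) where severity_text == "ERROR"
          - set(severity_number, 21) where severity_text == "FATAL"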

Tip

This method of mapping all JSON fields to top-level attributes should only be used for testing and debugging OTTL. It will result in high cardinality in a production scenario.
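If you need something similar in production, one option (illustrative only, not part of this exercise) is to promote just the specific fields you care about instead of merging the whole map, for example:

          - set(attributes["message"], cache["message"])   # Promote only the fields you actually need
          - set(severity_text, cache["level"])             # Severity can still be mapped without merging everything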

Update the logs pipeline: Add the transform/logs: processor into the logs: pipeline so your configuration looks like this:

    logs:                         # Logs pipeline
      receivers:
      - otlp                      # OTLP receiver
      processors:                 # Processors for logs
      - memory_limiter
      - resource/add_mode
      - transform/logs
      - batch
      exporters:
      - debug                     # Debug exporter
      - file/logs

5.2 Setup Environment

Exercise

Start the Gateway: In the Gateway terminal run:

../otelcol --config=gateway.yaml

Start the Agent: In the Agent terminal run:

../otelcol --config=agent.yaml

Start the Load Generator: In the Loadgen terminal window, execute the following command to start the load generator with JSON enabled:

../loadgen -logs -json -count 5

The loadgen will write 5 log lines to ./quotes.log in JSON format.
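If you want to confirm what was written before it is collected, you can inspect the file directly:

cat ./quotes.log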


5.3 Test Transform Processor

This test verifies that the com.splunk.source and os.type metadata have been removed from the log resource attributes before the logs are exported by the Gateway. Additionally, the test ensures that:

  1. The log body is parsed to extract severity information.
    • SeverityText and SeverityNumber are set on the LogRecord.
  2. JSON fields from the log body are promoted to log attributes.

This ensures proper metadata filtering, severity mapping, and structured log enrichment before exporting.

Exercise

Check the debug output: In the Gateway debug output, confirm that com.splunk.source and os.type have been removed from the resource attributes. The Agent debug output still includes them, since the transform processor runs only in the Gateway:

Gateway:

Resource attributes:
   -> com.splunk.sourcetype: Str(quotes)
   -> host.name: Str(workshop-instance)
   -> otelcol.service.mode: Str(agent)

Agent:

Resource attributes:
   -> com.splunk.source: Str(./quotes.log)
   -> com.splunk.sourcetype: Str(quotes)
   -> host.name: Str(workshop-instance)
   -> os.type: Str(linux)
   -> otelcol.service.mode: Str(agent)

In the Gateway debug output, confirm that SeverityText and SeverityNumber in the LogRecord are now set from the severity level embedded in the log body, and that the JSON fields from the body can be accessed as top-level log Attributes. The Agent output is unchanged, since the transform runs only in the Gateway:

Gateway:

<snip>
SeverityText: WARN
SeverityNumber: Warn(13)
Body: Str({"level":"WARN","message":"Your focus determines your reality.","movie":"SW","timestamp":"2025-03-07 11:17:26"})
Attributes:
     -> log.file.path: Str(quotes.log)
     -> level: Str(WARN)
     -> message: Str(Your focus determines your reality.)
     -> movie: Str(SW)
     -> timestamp: Str(2025-03-07 11:17:26)
</snip>

Agent:

<snip>
SeverityText:
SeverityNumber: Unspecified(0)
Body: Str({"level":"WARN","message":"Your focus determines your reality.","movie":"SW","timestamp":"2025-03-07 11:17:26"})
Attributes:
     -> log.file.path: Str(quotes.log)
</snip>

Check file output: In the new gateway-logs.out file verify the data has been transformed:

jq '[.resourceLogs[].scopeLogs[].logRecords[] | {severityText, severityNumber, body: .body.stringValue}]' gateway-logs.out
[
  {
    "severityText": "DEBUG",
    "severityNumber": 5,
    "body": "{\"level\":\"DEBUG\",\"message\":\"All we have to decide is what to do with the time that is given us.\",\"movie\":\"LOTR\",\"timestamp\":\"2025-03-07 11:56:29\"}"
  },
  {
    "severityText": "WARN",
    "severityNumber": 13,
    "body": "{\"level\":\"WARN\",\"message\":\"The Force will be with you. Always.\",\"movie\":\"SW\",\"timestamp\":\"2025-03-07 11:56:29\"}"
  },
  {
    "severityText": "ERROR",
    "severityNumber": 17,
    "body": "{\"level\":\"ERROR\",\"message\":\"One does not simply walk into Mordor.\",\"movie\":\"LOTR\",\"timestamp\":\"2025-03-07 11:56:29\"}"
  },
  {
    "severityText": "DEBUG",
    "severityNumber": 5,
    "body": "{\"level\":\"DEBUG\",\"message\":\"Do or do not, there is no try.\",\"movie\":\"SW\",\"timestamp\":\"2025-03-07 11:56:29\"}"
  }
]
[
  {
    "severityText": "ERROR",
    "severityNumber": 17,
    "body": "{\"level\":\"ERROR\",\"message\":\"There is some good in this world, and it's worth fighting for.\",\"movie\":\"LOTR\",\"timestamp\":\"2025-03-07 11:56:29\"}"
  }
]
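To also confirm that the parsed JSON fields were promoted to log attributes, you can run a similar query. This assumes the file exporter's standard OTLP/JSON layout, where attributes are encoded as an array of key/value pairs:

jq '[.resourceLogs[].scopeLogs[].logRecords[] | {severityText, attributes: (.attributes | map({(.key): .value.stringValue}) | add)}]' gateway-logs.out

Each record should now show level, message, movie, and timestamp alongside log.file.path.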
Important

Stop the Agent and the Gateway processes by pressing Ctrl-C in their respective terminals.