The goal of this workshop is to help you become comfortable creating and modifying OpenTelemetry Collector configuration files. You’ll start with a minimal agent.yaml file and gradually configure several common advanced scenarios.
The workshop also explores how to configure the OpenTelemetry Collector to store telemetry data locally instead of transmitting it to a third-party vendor backend. Keeping the data local makes debugging and troubleshooting easier, and is useful for testing and development environments where you don’t want to send data to a production system.
To get the most out of this workshop, you should have a basic understanding of the OpenTelemetry Collector and its configuration file format. Additionally, proficiency in editing YAML files is required. The entire workshop is designed to run locally.
Workshop Overview
During this workshop, we will cover the following topics:
Setting up the agent locally: Add metadata, and introduce the debug and file exporters.
Configuring a gateway: Route traffic from the agent to the gateway.
Configuring the Filelog receiver: Collect log data from various log files.
Enhancing agent resilience: Basic configurations for fault tolerance.
Configuring processors:
Filter out noise by dropping specific spans (e.g., health checks).
Remove unnecessary tags, and handle sensitive data.
Transform data using OTTL in the pipeline before exporting.
Configuring Connectors: Route data to different endpoints based on the values received.
By the end of this workshop, you’ll be familiar with configuring the OpenTelemetry Collector for a variety of real-world use cases.
Prerequisites
Create a directory on your machine for the workshop (e.g., advanced-otel). We will refer to this directory as [WORKSHOP] in the instructions.
Download the latest OpenTelemetry Collector release for your platform and place it in the [WORKSHOP] directory.
Mac users must trust the executable when running otelcol for the first time. For more details, refer to Apple’s support page.
Optional Tools
For this workshop, using a good YAML editor like Visual Studio Code will be beneficial.
Additionally, having access to jq is recommended. This lightweight command-line tool helps process and format JSON data, making it easier to inspect traces, metrics, and logs from the OpenTelemetry Collector.
1. Agent Configuration
10 minutes
Tip
During this workshop, you will be using up to four terminal windows simultaneously. To stay organized, consider customizing each terminal or shell with unique names and colors. This will help you quickly identify and switch between them as needed.
We will refer to these terminals as: Agent, Gateway, Tests and Log-gen.
Exercise
In your [WORKSHOP] directory, create a subdirectory called 1-agent and change into that directory.
cd [WORKSHOP]
mkdir 1-agent
cd 1-agent
In the 1-agent directory, create a file named agent.yaml. This file will define the basic structure of an OpenTelemetry Collector configuration.
Copy and paste the following initial configuration into agent.yaml:
###########################         This section holds all the
## Configuration section ##         configurations that can be
###########################         used in this OpenTelemetry Collector
extensions:                       # Array of Extensions
  health_check:                   # Configures the health check extension
    endpoint: 0.0.0.0:13133       # Endpoint to collect health check data

receivers:                        # Array of Receivers
  hostmetrics:                    # Receiver Type
    collection_interval: 3600s    # Scrape metrics every hour
    scrapers:                     # Array of hostmetric scrapers
      cpu:                        # Scraper for cpu metrics

exporters:                        # Array of Exporters

processors:                       # Array of Processors
  memory_limiter:                 # Limits memory usage by Collectors pipeline
    check_interval: 2s            # Interval to check memory usage
    limit_mib: 512                # Memory limit in MiB

###########################         This section controls what
### Activation Section  ###         configurations will be used
###########################         by this OpenTelemetry Collector
service:                          # Services configured for this Collector
  extensions:                     # Enabled extensions
  - health_check
  pipelines:                      # Array of configured pipelines
    traces:
      receivers:
      processors:
      - memory_limiter            # Memory Limiter processor
      exporters:
    metrics:
      receivers:
      processors:
      - memory_limiter            # Memory Limiter processor
      exporters:
    logs:
      receivers:
      processors:
      - memory_limiter            # Memory Limiter processor
      exporters:
Let’s walk through a few modifications to our agent configuration to get things started:
Exercise
Add an otlp receiver: The OTLP receiver will listen for incoming telemetry data over HTTP (or gRPC).
  otlp:                           # Receiver Type
    protocols:                    # List of protocols used
      http:                       # This will enable the HTTP protocol
        endpoint: "0.0.0.0:4318"  # Endpoint for incoming telemetry data
Add a debug exporter: The Debug exporter will output detailed debug information for every telemetry record.
Update Pipelines: Ensure that the otlp receiver, memory_limiter processor, and debug exporter are added to the pipelines for traces, metrics, and logs. You can choose to use the format below or use array brackets [memory_limiter]:
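As a reference for the two steps above, here is a minimal sketch of the debug exporter and the updated pipelines. It is a fragment to merge into agent.yaml, not a complete configuration. The verbosity: detailed setting is an assumption borrowed from the gateway configuration used later in this workshop. The traces pipeline uses the list format, the metrics pipeline shows the equivalent array-bracket format, and the logs pipeline follows the same pattern.

```yaml
exporters:
  debug:                          # Exporter Type
    verbosity: detailed           # Assumed setting: print full detail for every record

service:
  pipelines:
    traces:                       # List format
      receivers:
      - otlp                      # OTLP Receiver
      processors:
      - memory_limiter            # Memory Limiter processor
      exporters:
      - debug                     # Debug Exporter
    metrics:                      # Equivalent array-bracket format
      receivers: [otlp]
      processors: [memory_limiter]
      exporters: [debug]
```

Both formats are plain YAML sequences, so the choice is purely a matter of readability.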
During this workshop, we will use otelbin.io to quickly validate YAML syntax and ensure OpenTelemetry configurations are correct. This helps prevent errors before running tests during this workshop.
To validate your configuration:
Open otelbin.io and replace the existing configuration by pasting your own YAML into the left pane.
At the top of the page, ensure that Splunk OpenTelemetry Collector is selected as the validation target.
Once validated, refer to the image representation below to verify if your pipelines are correctly set up.
In most cases, we will display only the key pipeline. However, if all three pipelines (Traces, Metrics, and Logs) share the same structure, we will indicate this instead of displaying each one separately.
%%{init:{"fontFamily":"monospace"}}%%
graph LR
%% Nodes
REC1( otlp <br>fa:fa-download):::receiver
PRO1(memory_limiter<br>fa:fa-microchip):::processor
EXP1( debug <br>fa:fa-upload):::exporter
%% Links
subID1:::sub-traces
subgraph " "
subgraph subID1[**Traces/Metrics/Logs**]
direction LR
REC1 --> PRO1
PRO1 --> EXP1
end
end
classDef receiver,exporter fill:#8b5cf6,stroke:#333,stroke-width:1px,color:#fff;
classDef processor fill:#6366f1,stroke:#333,stroke-width:1px,color:#fff;
classDef con-receive,con-export fill:#45c175,stroke:#333,stroke-width:1px,color:#fff;
classDef sub-traces stroke:#fff,stroke-width:1px, color:#fff,stroke-dasharray: 3 3;
1.2 Test Agent Configuration
Once you’ve updated the configuration, you’re ready to run the OpenTelemetry Collector with your new setup. This exercise sets the foundation for understanding how data flows through the OpenTelemetry Collector.
Exercise
Find your Agent terminal window:
Change into the [WORKSHOP]/1-agent folder
Run the following command:
../otelcol --config=agent.yaml
In this workshop, we use macOS/Linux commands by default. If you’re using Windows, adjust the commands as needed, e.g., use otelcol.exe instead of otelcol.
Note
On Windows, a dialog box may appear asking if you want to grant public and private network access to otelcol.exe. Click “Allow” to proceed.
Exercise
Verify debug output: If everything is set up correctly, the first and last lines of the output should display:
2025/01/13T12:43:51 settings.go:478: Set config to [agent.yaml]
<snip to the end>
2025-01-13T12:43:51.747+0100 info service@v0.117.0/service.go:261 Everything is ready. Begin running and processing data.
Create a test span file:
Instead of instrumenting an application, we will simulate sending trace data to the OpenTelemetry Collector using cURL. The trace data, formatted in JSON, represents what an instrumentation library would typically generate and send.
Find your Tests Terminal window and change into the [WORKSHOP]/1-agent directory.
Copy and paste the following span data into a new file named trace.json:
This file will allow us to test how the OpenTelemetry Collector processes and sends spans that are part of a trace, without requiring actual application instrumentation.
{"resourceSpans":[{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"my.service"}},{"key":"deployment.environment","value":{"stringValue":"my.environment"}}]},"scopeSpans":[{"scope":{"name":"my.library","version":"1.0.0","attributes":[{"key":"my.scope.attribute","value":{"stringValue":"some scope attribute"}}]},"spans":[{"traceId":"5B8EFFF798038103D269B633813FC60C","spanId":"EEE19B7EC3C1B174","parentSpanId":"EEE19B7EC3C1B173","name":"I'm a server span","startTimeUnixNano":"1544712660000000000","endTimeUnixNano":"1544712661000000000","kind":2,"attributes":[{"key":"user.name","value":{"stringValue":"George Lucas"}},{"key":"user.phone_number","value":{"stringValue":"+1555-867-5309"}},{"key":"user.email","value":{"stringValue":"george@deathstar.email"}},{"key":"user.account_password","value":{"stringValue":"LOTR>StarWars1-2-3"}},{"key":"user.visa","value":{"stringValue":"4111 1111 1111 1111"}},{"key":"user.amex","value":{"stringValue":"3782 822463 10005"}},{"key":"user.mastercard","value":{"stringValue":"5555 5555 5555 4444"}}]}]}]}]}
Send a test span: Run the following command to send a span to the agent:
curl -X POST -i http://localhost:4318/v1/traces -H "Content-Type: application/json" -d "@trace.json"
HTTP/1.1 200 OK
Content-Type: application/json
Date: Mon, 27 Jan 2025 09:51:02 GMT
Content-Length: 21
{"partialSuccess":{}}%
Info
HTTP/1.1 200 OK: Confirms the request was processed successfully.
{"partialSuccess":{}}: Indicates 100% success, as the field is empty. In case of a partial failure, this field will include details about any failed parts.
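Note
On Windows, the curl command above may fail with an error. In Windows PowerShell, curl is an alias for Invoke-WebRequest, which does not accept these arguments; if that happens, run curl.exe instead of curl.
Verify the debug output: In the Agent terminal window, the debug output for the span you just sent should resemble the following: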
2025-02-03T12:46:25.675+0100 info ResourceSpans #0
Resource SchemaURL:
Resource attributes:
-> service.name: Str(my.service)
-> deployment.environment: Str(my.environment)
ScopeSpans #0
ScopeSpans SchemaURL:
InstrumentationScope my.library 1.0.0
InstrumentationScope attributes:
-> my.scope.attribute: Str(some scope attribute)
Span #0
Trace ID : 5b8efff798038103d269b633813fc60c
Parent ID : eee19b7ec3c1b173
ID : eee19b7ec3c1b174
Name : I'm a server span
Kind : Server
Start time : 2018-12-13 14:51:00 +0000 UTC
End time : 2018-12-13 14:51:01 +0000 UTC
Status code : Unset
Status message :
Attributes:
-> user.name: Str(George Lucas)
-> user.phone_number: Str(+1555-867-5309)
-> user.email: Str(george@deathstar.email)
-> user.account_password: Str(LOTR>StarWars1-2-3)
-> user.visa: Str(4111 1111 1111 1111)
-> user.amex: Str(3782 822463 10005)
-> user.mastercard: Str(5555 5555 5555 4444)
{"kind": "exporter", "data_type": "traces", "name": "debug"}
1.3 File Exporter
To capture more than just debug output on the screen, we also want to generate output during the export phase of the pipeline. For this, we’ll add a File Exporter to write OTLP data to files for comparison. The difference between the OpenTelemetry debug exporter and the file exporter lies in their purpose and output destination:
| Feature | Debug Exporter | File Exporter |
| --- | --- | --- |
| Output Location | Console/Log | File on disk |
| Purpose | Real-time debugging | Persistent offline analysis |
| Best for | Quick inspection during testing | Temporary storage and sharing |
| Production Use | No | Rare, but possible |
| Persistence | No | Yes |
In summary, the Debug Exporter is great for real-time, in-development troubleshooting, while the File Exporter is better suited for storing telemetry data locally for later use.
Exercise
Find your Agent terminal window, and stop the running collector by pressing Ctrl-C. Once the Agent has stopped, open the agent.yaml and configure the File Exporter:
Configuring a file exporter: The File Exporter writes telemetry data to files on disk.
  file:                           # Exporter Type
    path: "./agent.out"           # Save path (OTLP JSON)
    append: false                 # Overwrite the file each time
Update the Pipelines Section: Add the file exporter to the metrics, traces and logs pipelines (leave debug as the first in the array).
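The pipeline update described above could be sketched as follows for the traces pipeline. This is a fragment of agent.yaml under the assumption that the otlp receiver, memory_limiter processor, and debug exporter from the previous exercise are already in place; the metrics and logs pipelines get the same exporters entry.

```yaml
service:
  pipelines:
    traces:
      receivers:
      - otlp                      # OTLP Receiver
      processors:
      - memory_limiter            # Memory Limiter processor
      exporters:
      - debug                     # Debug Exporter stays first in the array
      - file                      # File Exporter, writes to ./agent.out
```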
On Windows, an open file may appear empty or cause issues when attempting to read it. To prevent this, make sure to stop the Agent or the Gateway before inspecting the file, as instructed.
Verify the span format:
Check the format that the File Exporter used to write the span to agent.out.
It should be a single line in OTLP/JSON format.
Since no modifications have been made to the pipeline yet, this file should be identical to trace.json.
{"resourceSpans":[{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"my.service"}},{"key":"deployment.environment","value":{"stringValue":"my.environment"}}]},"scopeSpans":[{"scope":{"name":"my.library","version":"1.0.0","attributes":[{"key":"my.scope.attribute","value":{"stringValue":"some scope attribute"}}]},"spans":[{"traceId":"5B8EFFF798038103D269B633813FC60C","spanId":"EEE19B7EC3C1B174","parentSpanId":"EEE19B7EC3C1B173","name":"I'm a server span","startTimeUnixNano":"1544712660000000000","endTimeUnixNano":"1544712661000000000","kind":2,"attributes":[{"key":"user.name","value":{"stringValue":"George Lucas"}},{"key":"user.phone_number","value":{"stringValue":"+1555-867-5309"}},{"key":"user.email","value":{"stringValue":"george@deathstar.email"}},{"key":"user.account_password","value":{"stringValue":"LOTR>StarWars1-2-3"}},{"key":"user.visa","value":{"stringValue":"4111 1111 1111 1111"}},{"key":"user.amex","value":{"stringValue":"3782 822463 10005"}},{"key":"user.mastercard","value":{"stringValue":"5555 5555 5555 4444"}}]}]}]}]}
Tip
If you want to view the file’s content, simply run:
cat agent.out
For a formatted JSON output, you can use the same command but pipe it through jq (if installed):
cat ./agent.out | jq
1.4 Resource Metadata
So far, we’ve simply exported an exact copy of the span sent through the OpenTelemetry Collector.
Now, let’s improve the base span by adding metadata with processors. This extra information can be helpful for troubleshooting and correlation.
Find your Agent terminal window, and stop the running collector by pressing Ctrl-C. Once the Agent has stopped, open the agent.yaml and configure the resourcedetection and resource processors:
Exercise
Add the resourcedetection Processor: The Resource Detection Processor can be used to detect resource information from the host and append or override the resource value in telemetry data with this information.
Add a resource processor and name it add_mode: The Resource Processor can be used to apply changes to resource attributes.
  resource/add_mode:              # Processor Type/Name
    attributes:                   # Array of attributes and modifications
    - action: insert              # Action is to insert a key
      key: otelcol.service.mode   # Key name
      value: "agent"              # Key value
Update All Pipelines: Add both processors (resourcedetection and resource/add_mode) to the processors array in all pipelines (traces, metrics, and logs). Ensure memory_limiter remains the first processor.
By adding these processors, we enrich the data with system metadata and the agent’s operational mode, which aids troubleshooting and makes it easier to correlate related telemetry.
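For orientation, a sketch of the resourcedetection processor and the resulting traces pipeline is shown below. It is a fragment of agent.yaml. The detectors: [system] and override: true values are assumptions chosen for illustration; the system detector is the one that reports host.name and os.type. The metrics and logs pipelines would list the same processors.

```yaml
processors:
  resourcedetection:              # Processor Type
    detectors: [system]           # Assumed detector: reads host.name and os.type from the host
    override: true                # Assumed setting: overwrite existing resource attributes

service:
  pipelines:
    traces:
      receivers:
      - otlp                      # OTLP Receiver
      processors:
      - memory_limiter            # Memory Limiter processor remains first
      - resourcedetection         # Adds system attributes to the resource
      - resource/add_mode         # Adds otelcol.service.mode
      exporters:
      - debug                     # Debug Exporter
      - file                      # File Exporter
```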
Validate the agent configuration using otelbin.io:
Verify that metadata is added to spans in the new agent.out file:
Check for the existence of the otelcol.service.mode attribute in the resourceSpans section and that it has a value of agent.
Verify that the resourcedetection attributes (host.name and os.type) exist too.
These values are automatically added based on your device by the processors configured in the pipeline.
{"resourceSpans":[{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"my.service"}},{"key":"deployment.environment","value":{"stringValue":"my.environment"}},{"key":"host.name","value":{"stringValue":"[YOUR_HOST_NAME]"}},{"key":"os.type","value":{"stringValue":"[YOUR_OS]"}},{"key":"otelcol.service.mode","value":{"stringValue":"agent"}}]},"scopeSpans":[{"scope":{"name":"my.library","version":"1.0.0","attributes":[{"key":"my.scope.attribute","value":{"stringValue":"some scope attribute"}}]},"spans":[{"traceId":"5b8efff798038103d269b633813fc60c","spanId":"eee19b7ec3c1b174","parentSpanId":"eee19b7ec3c1b173","name":"I'm a server span","kind":2,"startTimeUnixNano":"1544712660000000000","endTimeUnixNano":"1544712661000000000","attributes":[{"key":"user.name","value":{"stringValue":"George Lucas"}},{"key":"user.phone_number","value":{"stringValue":"+1555-867-5309"}},{"key":"user.email","value":{"stringValue":"george@deathstar.email"}},{"key":"user.account_password","value":{"stringValue":"LOTR\u003eStarWars1-2-3"}},{"key":"user.visa","value":{"stringValue":"4111 1111 1111 1111"}},{"key":"user.amex","value":{"stringValue":"3782 822463 10005"}},{"key":"user.mastercard","value":{"stringValue":"5555 5555 5555 4444"}}],"status":{}}]}],"schemaUrl":"https://opentelemetry.io/schemas/1.6.1"}]}
Stop the Agent process by pressing Ctrl-C in the terminal window.
2. Gateway Configuration
10 minutes
Exercise
Inside the [WORKSHOP] directory, create a new subdirectory named 2-gateway.
Next, copy all contents from the 1-agent directory into 2-gateway.
After copying, remove agent.out.
Change all terminal windows to the [WORKSHOP]/2-gateway directory.
Create a file called gateway.yaml and add the following initial configuration:
###########################         This section holds all the
## Configuration section ##         configurations that can be
###########################         used in this OpenTelemetry Collector
extensions:                       # Array of Extensions
  health_check:                   # Configures the health check extension
    endpoint: 0.0.0.0:14133       # Port changed to prevent conflict with agent!!!

receivers:
  otlp:                           # Receiver Type
    protocols:                    # List of protocols used
      http:                       # This will enable the HTTP protocol
        endpoint: "0.0.0.0:5318"  # Port changed to prevent conflict with agent!!!
        include_metadata: true    # Needed for token pass through mode

exporters:                        # Array of Exporters
  debug:                          # Exporter Type
    verbosity: detailed           # Enable detailed debug output

processors:                       # Array of Processors
  memory_limiter:                 # Limits memory usage by Collectors pipeline
    check_interval: 2s            # Interval to check memory usage
    limit_mib: 512                # Memory limit in MiB
  batch:                          # Processor to batch data before sending
    metadata_keys:                # Include token in batches
    - X-SF-Token                  # Batch data grouped by token
  resource/add_mode:              # Processor Type/Name
    attributes:                   # Array of attributes and modifications
    - action: upsert              # Action taken is to 'insert' or 'update' a key
      key: otelcol.service.mode   # Key name
      value: "gateway"            # Key value

###########################         This section controls what
### Activation Section  ###         configuration will be used
###########################         by the OpenTelemetry Collector
service:                          # Services configured for this Collector
  extensions: [health_check]      # Enabled extensions for this collector
  pipelines:                      # Array of configured pipelines
    traces:
      receivers:
      - otlp                      # OTLP Receiver
      processors:
      - memory_limiter            # Memory Limiter processor
      - resource/add_mode         # Add metadata about collector mode
      - batch                     # Batch Processor, groups data before send
      exporters:
      - debug                     # Debug Exporter
    metrics:
      receivers:
      - otlp                      # OTLP Receiver
      processors:
      - memory_limiter            # Memory Limiter processor
      - resource/add_mode         # Add metadata about collector mode
      - batch                     # Batch Processor, groups data before send
      exporters:
      - debug                     # Debug Exporter
    logs:
      receivers:
      - otlp                      # OTLP Receiver
      processors:
      - memory_limiter            # Memory Limiter processor
      - resource/add_mode         # Add metadata about collector mode
      - batch                     # Batch Processor, groups data before send
      exporters:
      - debug                     # Debug Exporter
In this section, we will extend the gateway.yaml configuration you just created to separate metrics, traces, and logs into different files.
Create a file exporter and name it traces: Separate exporters need to be configured for traces, metrics, and logs. Below is the YAML configuration for traces:
  file/traces:                    # Exporter Type/Name
    path: "./gateway-traces.out"  # Path where data will be saved in OTLP JSON format
    append: false                 # Overwrite the file each time
Create additional exporters for metrics and logs: Follow the example above, and set appropriate exporter names. Update the file paths to ./gateway-metrics.out for metrics and ./gateway-logs.out for logs.
Add exporters to each pipeline: Ensure that each pipeline includes its corresponding file exporter, placing it after the debug exporter.
    logs:
      receivers:
      - otlp                      # OTLP Receiver
      processors:
      - memory_limiter            # Memory Limiter processor
      - resource/add_mode         # Adds collector mode metadata
      - batch                     # Groups data before send
      exporters:
      - debug                     # Debug Exporter
      - file/logs                 # File Exporter for logs
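Following the file/traces example, the two additional exporters could be sketched as below. The name file/logs matches the one used in the logs pipeline above; file/metrics is an assumed name that follows the same pattern. This is a fragment to place under the exporters: section of gateway.yaml.

```yaml
exporters:
  file/metrics:                    # Exporter Type/Name (assumed name)
    path: "./gateway-metrics.out"  # Metrics are saved here in OTLP JSON format
    append: false                  # Overwrite the file each time
  file/logs:                       # Exporter Type/Name
    path: "./gateway-logs.out"     # Logs are saved here in OTLP JSON format
    append: false                  # Overwrite the file each time
```

Each name must then be listed in the exporters array of its own pipeline, otherwise the exporter is defined but never used.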
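Validate the gateway configuration using otelbin.io. For reference, the logs: section of your pipelines should contain the otlp receiver, the memory_limiter, resource/add_mode, and batch processors, and the debug and file/logs exporters.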
Run the following command to test the gateway configuration:
../otelcol --config=gateway.yaml
If everything is set up correctly, the first and last lines of the output should look like:
2025/01/15 15:33:53 settings.go:478: Set config to [gateway.yaml]
<snip to the end>
2025-01-13T12:43:51.747+0100 info service@v0.116.0/service.go:261 Everything is ready. Begin running and processing data.
Next, we will configure the Agent to send data to the newly created Gateway.
2.2 Configure Agent
Exercise
Update agent.yaml:
Switch to your Agent terminal window.
Make sure you are in the [WORKSHOP]/2-gateway directory.
Open the agent.yaml file that you copied earlier in your editor.
Add the otlphttp exporter:
The OTLP/HTTP Exporter is used to send data from the agent to the gateway using the OTLP/HTTP protocol. This is now the preferred method for exporting data to Splunk Observability Cloud (more details in Section 2.4 Addendum).
Ensure the endpoint is set to the gateway endpoint and port number.
Add the X-SF-Token header with a random value. During this workshop, you can use any value for X-SF-Token. However, if you are connecting to Splunk Observability Cloud, this is where you will need to enter your Splunk Access Token (more details in Section 2.4 Addendum).
  otlphttp:                       # Exporter Type
    endpoint: "http://localhost:5318" # Gateway OTLP endpoint
    headers:                      # Headers to add to the HTTP call
      X-SF-Token: "ACCESS_TOKEN"  # Splunk ACCESS_TOKEN header
Add a Batch Processor configuration: Use the Batch Processor. It accepts spans, metrics, or logs and places them into batches. Batching helps compress the data better and reduces the number of outgoing connections required to transmit it. Configuring the batch processor on every collector is highly recommended.
  batch:                          # Processor Type
    metadata_keys: [X-SF-Token]   # Array of metadata keys to batch
Update the pipelines:
Add hostmetrics to the metrics pipeline. The HostMetrics Receiver will generate host metrics.
Add the batch processor after the resource/add_mode processor in the traces, metrics, and logs pipelines.
Replace the file exporter with the otlphttp exporter in the traces, metrics, and logs pipelines.
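The three pipeline changes above could be sketched as follows. This fragment of agent.yaml assumes the receivers, processors, and exporters configured in the earlier exercises; the logs pipeline mirrors the traces pipeline.

```yaml
service:
  pipelines:
    traces:
      receivers:
      - otlp                      # OTLP Receiver
      processors:
      - memory_limiter            # Memory Limiter processor
      - resourcedetection         # Adds system attributes to the resource
      - resource/add_mode         # Adds otelcol.service.mode
      - batch                     # Batch Processor, placed after resource/add_mode
      exporters:
      - debug                     # Debug Exporter
      - otlphttp                  # OTLP/HTTP Exporter, replaces the file exporter
    metrics:
      receivers:
      - otlp                      # OTLP Receiver
      - hostmetrics               # Host Metrics Receiver
      processors:
      - memory_limiter
      - resourcedetection
      - resource/add_mode
      - batch
      exporters:
      - debug
      - otlphttp
```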
Verify the gateway is still running: Check your Gateway terminal window and make sure the Gateway collector is running.
Start the Agent: In the Agent terminal window start the agent with the updated configuration:
../otelcol --config=agent.yaml
Verify CPU Metrics:
Check that when the Agent starts, it immediately starts sending CPU metrics.
Both the Agent and the Gateway will display this activity in their debug output. The output should resemble the following snippet:
<snip>
NumberDataPoints #37
Data point attributes:
-> cpu: Str(cpu9)
-> state: Str(system)
StartTimestamp: 2024-12-09 14:18:28 +0000 UTC
Timestamp: 2025-01-15 15:27:51.319526 +0000 UTC
Value: 9637.660000
At this stage, the Agent continues to collect CPU metrics once per hour, or upon each restart, and sends them to the gateway. The OpenTelemetry Collector running in Gateway mode processes these metrics and, through the file exporter in its metrics pipeline, writes them to a file named ./gateway-metrics.out.
Verify Data arrived at Gateway:
Open the newly created gateway-metrics.out file.
Check that it contains CPU metrics.
The metrics should include details similar to those shown below (we’re only displaying the resourceMetrics section and the first set of CPU metrics; you will likely see more):
{"resourceMetrics":[{"resource":{"attributes":[{"key":"host.name","value":{"stringValue":"YOUR_HOST_NAME"}},{"key":"os.type","value":{"stringValue":"YOUR_OS"}},{"key":"otelcol.service.mode","value":{"stringValue":"gateway"}}]},"scopeMetrics":[{"scope":{"name":"github.com/open-telemetry/opentelemetry-collector-contrib/receiver/hostmetricsreceiver/internal/scraper/cpuscraper","version":"v0.116.0"},"metrics":[{"name":"system.cpu.time","description":"Total seconds each logical CPU spent on each mode.","unit":"s","sum":{"dataPoints":[{"attributes":[{"key":"cpu","value":{"stringValue":"cpu0"}},{"key":"state","value":{"stringValue":"user"}}],"startTimeUnixNano":"1733753908000000000","timeUnixNano":"1737133726158376000","asDouble":1168005.59}]}}]}]}]}
Validate both collectors are running:
Find the Agent terminal window. If the Agent is stopped, restart it.
Find the Gateway terminal window. Check if the Gateway is running, otherwise restart it.
Send a Test Trace:
Find your Tests terminal window
Navigate it to the [WORKSHOP]/2-gateway directory.
Ensure that you have copied trace.json to the 2-gateway directory.
Run the cURL command to send the span.
Below is an excerpt of the debug output. Verify that both the Agent and Gateway produced similar debug output.
{"kind": "exporter", "data_type": "metrics", "name": "debug"}
2025-02-05T15:55:18.966+0100 info Traces {"kind": "exporter", "data_type": "traces", "name": "debug", "resource spans": 1, "spans": 1}
2025-02-05T15:55:18.966+0100 info ResourceSpans #0
Resource SchemaURL: https://opentelemetry.io/schemas/1.6.1
Resource attributes:
-> service.name: Str(my.service)
-> deployment.environment: Str(my.environment)
-> host.name: Str(PH-Windows-Box.hagen-ict.nl)
-> os.type: Str(windows)
-> otelcol.service.mode: Str(agent)
ScopeSpans #0
ScopeSpans SchemaURL:
InstrumentationScope my.library 1.0.0
InstrumentationScope attributes:
-> my.scope.attribute: Str(some scope attribute)
Span #0
Trace ID : 5b8efff798038103d269b633813fc60c
Parent ID : eee19b7ec3c1b173
ID : eee19b7ec3c1b174
Name : I'm a server span
Kind : Server
Start time : 2018-12-13 14:51:00 +0000 UTC
End time : 2018-12-13 14:51:01 +0000 UTC
Status code : Unset
Status message :
Attributes:
-> user.name: Str(George Lucas)
-> user.phone_number: Str(+1555-867-5309)
-> user.email: Str(george@deathstar.email)
-> user.account_password: Str(LOTR>StarWars1-2-3)
Gateway has handled the span: Verify that the gateway has generated a new file named ./gateway-traces.out.
{"resourceSpans":[{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"my.service"}},{"key":"deployment.environment","value":{"stringValue":"my.environment"}},{"key":"host.name","value":{"stringValue":"[YOUR_HOST_NAME]"}},{"key":"os.type","value":{"stringValue":"[YOUR_OS]"}},{"key":"otelcol.service.mode","value":{"stringValue":"gateway"}}]},"scopeSpans":[{"scope":{"name":"my.library","version":"1.0.0","attributes":[{"key":"my.scope.attribute","value":{"stringValue":"some scope attribute"}}]},"spans":[{"traceId":"5b8efff798038103d269b633813fc60c","spanId":"eee19b7ec3c1b174","parentSpanId":"eee19b7ec3c1b173","name":"I'm a server span","kind":2,"startTimeUnixNano":"1544712660000000000","endTimeUnixNano":"1544712661000000000","attributes":[{"key":"user.name","value":{"stringValue":"George Lucas"}},{"key":"user.phone_number","value":{"stringValue":"+1555-867-5309"}},{"key":"user.email","value":{"stringValue":"george@deathstar.email"}},{"key":"user.account_password","value":{"stringValue":"LOTR\u003eStarWars1-2-3"}},{"key":"user.visa","value":{"stringValue":"4111 1111 1111 1111"}},{"key":"user.amex","value":{"stringValue":"3782 822463 10005"}},{"key":"user.mastercard","value":{"stringValue":"5555 5555 5555 4444"}}],"status":{}}]}],"schemaUrl":"https://opentelemetry.io/schemas/1.6.1"}]}
Ensure that both gateway-metrics.out and gateway-traces.out include a resource attribute key-value pair for otelcol.service.mode with the value gateway.
Note
In the provided gateway.yaml configuration, we modified the resource/add_mode processor to use the upsert action instead of insert.
The upsert action updates the value of the resource attribute key if it already exists, setting it to gateway. If the key is not present, the upsert action will add it.
Stop the Agent and Gateway processes by pressing Ctrl-C in their respective terminals.
2.4 Addendum - Info on Access Tokens and Batch Processing
Tip
Introduction to the otlphttp Exporter
The otlphttp exporter is now the default method for sending metrics and traces to Splunk Observability Cloud. This exporter provides a standardized and efficient way to transmit telemetry data using the OpenTelemetry Protocol (OTLP) over HTTP.
When deploying the Splunk Distribution of the OpenTelemetry Collector in host monitoring (agent) mode, the otlphttp exporter is included by default. This replaces older exporters such as sapm and signalfx, which are gradually being phased out.
Configuring Splunk Access Tokens
To authenticate and send data to Splunk Observability Cloud, you need to configure access tokens properly.
In OpenTelemetry, authentication is handled via HTTP headers. To pass an access token, use the headers: key with the sub-key X-SF-Token:. This configuration works in both agent and gateway mode.
If you need to forward headers through the pipeline, enable pass-through mode by setting include_metadata: to true in the OTLP receiver configuration. This ensures that any authentication headers received by the collector are retained and forwarded along with the data.
This is particularly useful in gateway mode, where data from multiple agents may pass through a centralized gateway before being sent to Splunk.
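As a minimal sketch of the two settings described above, the fragments below show an agent-side exporter that attaches the token and a gateway-side receiver that retains incoming headers. The listen address 0.0.0.0:5318 is an assumption, and ACCESS_TOKEN is a placeholder rather than a real token. Note that include_metadata only makes the headers available to later components in the pipeline; how they are forwarded depends on the rest of your configuration.

```yaml
# Agent side (agent.yaml): attach the access token to every OTLP/HTTP request.
exporters:
  otlphttp:
    endpoint: "http://localhost:5318"   # Gateway OTLP endpoint used in this workshop
    headers:
      X-SF-Token: "ACCESS_TOKEN"        # Placeholder, replace with a real token

# Gateway side (gateway.yaml): keep incoming request headers as client metadata.
receivers:
  otlp:
    protocols:
      http:
        endpoint: "0.0.0.0:5318"        # Assumed listen address for the gateway
        include_metadata: true          # Retain headers such as X-SF-Token
```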
Understanding Batch Processing
The Batch Processor is a key component in optimizing data transmission efficiency. It groups traces, metrics, and logs into batches before sending them to the backend. Batching improves performance by:
Reducing the number of outgoing requests.
Improving compression efficiency.
Lowering network overhead.
Configuring the Batch Processor
To enable batching, configure the batch: section and list X-SF-Token under metadata_keys:. This ensures that data is batched separately for each access token before being sent to Splunk Observability Cloud.
Example:
processors:
  batch:
    metadata_keys: [X-SF-Token]   # Array of metadata keys to batch
    send_batch_size: 100
    timeout: 5s
Best Practices for Batch Processing
For optimal performance, it is recommended to use the Batch Processor in every collector deployment. The best placement for the Batch Processor is after the memory limiter and sampling processors. This ensures that only necessary data is batched, avoiding unnecessary processing of dropped data.
Gateway Configuration with Batch Processor
When deploying a gateway, ensure that the Batch Processor is included in the pipeline:
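A sketch of such a pipeline, assuming the component names used elsewhere in this workshop (resource/add_mode, file/traces), might look like the following. The batch processor is placed last among the processors, after the memory limiter:

```yaml
service:
  pipelines:
    traces:
      receivers:
        - otlp                 # OTLP Receiver
      processors:
        - memory_limiter       # Runs first to protect the collector
        - resource/add_mode    # Assumed name, taken from the workshop configs
        - batch                # Groups data just before export
      exporters:
        - debug                # Debug Exporter
        - file/traces          # Assumed name, taken from the workshop configs
```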
The otlphttp exporter is now the preferred method for sending telemetry data to Splunk Observability Cloud. Properly configuring Splunk Access Tokens ensures secure data transmission, while the Batch Processor helps optimize performance by reducing network overhead. By implementing these best practices, you can efficiently collect and transmit observability data at scale.
3. Filelog Setup
10 minutes
The FileLog Receiver in the OpenTelemetry Collector is used to ingest logs from files.
It monitors specified files for new log entries and streams those logs into the Collector for further processing or exporting. It is useful for testing and development purposes.
For this part of the workshop, there is a script that will generate log lines in a file. The Filelog receiver will read these log lines and send them to the OpenTelemetry Collector.
Exercise
Move to the log-gen terminal window.
Navigate to the [WORKSHOP] directory and create a new subdirectory named 3-filelog.
Next, copy all contents from the 2-gateway directory into 3-filelog.
After copying, remove any *.out and *.log files.
Change all terminal windows to the [WORKSHOP]/3-filelog directory.
Your updated directory structure will now look like this:
Create the log-gen script: In the 3-filelog directory create the script log-gen.sh (macOS/Linux), or log-gen.ps1 (Windows) using the appropriate script below for your operating system:
#!/bin/bash

# Define the log file
LOG_FILE="quotes.log"

# Define quotes
LOTR_QUOTES=(
    "One does not simply walk into Mordor."
    "Even the smallest person can change the course of the future."
    "All we have to decide is what to do with the time that is given us."
    "There is some good in this world, and it's worth fighting for."
)

STAR_WARS_QUOTES=(
    "Do or do not, there is no try."
    "The Force will be with you. Always."
    "I find your lack of faith disturbing."
    "In my experience, there is no such thing as luck."
)

# Function to get a random quote
get_random_quote() {
    if (( RANDOM % 2 == 0 )); then
        echo "${LOTR_QUOTES[RANDOM % ${#LOTR_QUOTES[@]}]}"
    else
        echo "${STAR_WARS_QUOTES[RANDOM % ${#STAR_WARS_QUOTES[@]}]}"
    fi
}

# Function to get a random log level
get_random_log_level() {
    LOG_LEVELS=("INFO" "WARN" "ERROR" "DEBUG")
    echo "${LOG_LEVELS[RANDOM % ${#LOG_LEVELS[@]}]}"
}

# Function to generate log entry
generate_log_entry() {
    TIMESTAMP=$(date "+%Y-%m-%d %H:%M:%S")
    LEVEL=$(get_random_log_level)
    MESSAGE=$(get_random_quote)

    if [ "$JSON_OUTPUT" = true ]; then
        echo "{\"timestamp\": \"$TIMESTAMP\", \"level\": \"$LEVEL\", \"message\": \"$MESSAGE\"}"
    else
        echo "$TIMESTAMP [$LEVEL] - $MESSAGE"
    fi
}

# Parse command line arguments
JSON_OUTPUT=false
while [[ "$#" -gt 0 ]]; do
    case $1 in
        -json)
            JSON_OUTPUT=true
            ;;
    esac
    shift
done

# Main loop to write logs
echo "Writing logs to $LOG_FILE. Press Ctrl+C to stop."
while true; do
    generate_log_entry >> "$LOG_FILE"
    sleep 1 # Adjust this value for log frequency
done

# Define the log file
$LOG_FILE = "quotes.log"

# Define quotes
$LOTR_QUOTES = @(
    "One does not simply walk into Mordor."
    "Even the smallest person can change the course of the future."
    "All we have to decide is what to do with the time that is given us."
    "There is some good in this world, and it's worth fighting for."
)

$STAR_WARS_QUOTES = @(
    "Do or do not, there is no try."
    "The Force will be with you. Always."
    "I find your lack of faith disturbing."
    "In my experience, there is no such thing as luck."
)

# Function to get a random quote
function Get-RandomQuote {
    if ((Get-Random -Minimum 0 -Maximum 2) -eq 0) {
        return $LOTR_QUOTES[(Get-Random -Minimum 0 -Maximum $LOTR_QUOTES.Length)]
    } else {
        return $STAR_WARS_QUOTES[(Get-Random -Minimum 0 -Maximum $STAR_WARS_QUOTES.Length)]
    }
}

# Function to get a random log level
function Get-RandomLogLevel {
    $LOG_LEVELS = @("INFO", "WARN", "ERROR", "DEBUG")
    return $LOG_LEVELS[(Get-Random -Minimum 0 -Maximum $LOG_LEVELS.Length)]
}

# Function to generate log entry
function Generate-LogEntry {
    $TIMESTAMP = Get-Date -Format "yyyy-MM-dd HH:mm:ss"
    $LEVEL = Get-RandomLogLevel
    $MESSAGE = Get-RandomQuote

    if ($JSON_OUTPUT) {
        $logEntry = @{ timestamp = $TIMESTAMP; level = $LEVEL; message = $MESSAGE } | ConvertTo-Json -Compress
    } else {
        $logEntry = "$TIMESTAMP [$LEVEL] - $MESSAGE"
    }

    return $logEntry
}

# Parse command line arguments
$JSON_OUTPUT = $false
if ($args -contains "-json") {
    $JSON_OUTPUT = $true
}

# Main loop to write logs
Write-Host "Writing logs to $LOG_FILE. Press Ctrl+C to stop."
while ($true) {
    $logEntry = Generate-LogEntry

    # Ensure UTF-8 encoding is used (without BOM) to avoid unwanted characters
    $logEntry | Out-File -Append -FilePath $LOG_FILE -Encoding utf8
    Start-Sleep -Seconds 1 # Adjust log frequency
}
For macOS/Linux make sure the script is executable:
chmod +x log-gen.sh
3.2 Start Log-Gen
Exercise
Start the appropriate script for your system. The script will begin writing lines to a file named quotes.log:
./log-gen.sh
Writing logs to quotes.log. Press Ctrl+C to stop.
Note
On Windows, you may encounter the following error:
.\log-gen.ps1 : File .\log-gen.ps1 cannot be loaded because running scripts is disabled on this system …
To resolve this run:
powershell -ExecutionPolicy Bypass -File log-gen.ps1
3.3 Filelog Configuration
Exercise
Move to the Agent terminal window and change into the [WORKSHOP]/3-filelog directory. Open the agent.yaml you copied earlier in your editor and add the filelog receiver to it.
Create the filelog receiver and name it quotes: The FileLog receiver reads log data from a file and includes custom resource attributes in the log data:
  filelog/quotes:                      # Receiver Type/Name
    include: ./quotes.log              # The file to read log data from
    include_file_path: true            # Include file path in the log data
    include_file_name: false           # Exclude file name from the log data
    resource:                          # Add custom resource attributes
      com.splunk.source: ./quotes.log  # Source of the log data
      com.splunk.sourcetype: quotes    # Source type of the log data
Add filelog/quotes receiver: In the logs: pipeline add the filelog/quotes: receiver.
    logs:
      receivers:
      - otlp                      # OTLP Receiver
      - filelog/quotes            # Filelog Receiver reading quotes.log
      processors:
      - memory_limiter            # Memory Limiter Processor
      - resourcedetection         # Adds system attributes to the data
      - resource/add_mode         # Adds collector mode metadata
      - batch                     # Batch Processor, groups data before send
      exporters:
      - debug                     # Debug Exporter
      - otlphttp                  # OTLP/HTTP Exporter
Validate the agent configuration using otelbin.io. For reference, the logs: section of your pipelines will look similar to this:
Check the log-gen script is running: Find the log-gen terminal window and check that the script is still running and that the last line still shows the message below. If it does not, restart the script in the [WORKSHOP]/3-filelog directory:
Writing logs to quotes.log. Press Ctrl+C to stop.
Start the Gateway:
Find your Gateway terminal window.
Navigate to the [WORKSHOP]/3-filelog directory.
Start the Gateway.
Start the Agent:
Switch to your Agent terminal window.
Navigate to the [WORKSHOP]/3-filelog directory.
Start the Agent.
Ignore the initial CPU metrics in the debug output and wait until the continuous stream of log data from the quotes.log appears. The debug output should look similar to the following (use the Check Full Debug Log to see all data):
<snip>
Body: Str(2025-02-05 18:05:16 [INFO] - All we have to decide is what to do with the time that is given us.)
Attributes:
-> log.file.path: Str(quotes.log)
</snip>
Check Full Debug Log
2025-02-05T18:05:17.050+0100 info Logs {"kind": "exporter", "data_type": "logs", "name": "debug", "resource logs": 1, "log records": 1}
2025-02-05T18:05:17.050+0100 info ResourceLog #0
Resource SchemaURL: https://opentelemetry.io/schemas/1.6.1
Resource attributes:
-> com.splunk.source: Str(./quotes.log)
-> com.splunk.sourcetype: Str(quotes)
-> host.name: Str(PH-Windows-Box.hagen-ict.nl)
-> os.type: Str(windows)
-> otelcol.service.mode: Str(gateway)
ScopeLogs #0
ScopeLogs SchemaURL:
InstrumentationScope
LogRecord #0
ObservedTimestamp: 2025-02-05 17:05:16.6926816 +0000 UTC
Timestamp: 1970-01-01 00:00:00 +0000 UTC
SeverityText:
SeverityNumber: Unspecified(0)
Body: Str(2025-02-05 18:05:16 [INFO] - All we have to decide is what to do with the time that is given us.)
Attributes:
-> log.file.path: Str(quotes.log)
Trace ID:
Span ID:
Flags: 0
{"kind": "exporter", "data_type": "logs", "name": "debug"}
Verify the gateway has handled the logs:
Windows only: Stop the Agent and Gateway to flush the files.
Check if the Gateway has written a ./gateway-logs.out file.
At this point, your directory structure will appear as follows:
WORKSHOP
├── 1-agent
├── 2-gateway
├── 3-filelog
│   ├── agent.yaml          # Agent Collector configuration file
│   ├── gateway-logs.out    # Output from the gateway logs pipeline
│   ├── gateway-metrics.out # Output from the gateway metrics pipeline
│   ├── gateway.yaml        # Gateway Collector configuration file
│   ├── log-gen.(sh or ps1) # Script to write a file with logs lines
│   ├── quotes.log          # File containing Random log lines
│   └── trace.json          # Example trace file
└── otelcol                 # OpenTelemetry Collector binary
Examine a log line in gateway-logs.out: Compare a log line with the snippet below. It is a preview showing the beginning and a single log line; your actual output will contain many, many more:
{"resourceLogs":[{"resource":{"attributes":[{"key":"com.splunk.sourcetype","value":{"stringValue":"quotes"}},{"key":"com.splunk/source","value":{"stringValue":"./quotes.log"}},{"key":"host.name","value":{"stringValue":"[YOUR_HOST_NAME]"}},{"key":"os.type","value":{"stringValue":"[YOUR_OS]"}},{"key":"otelcol.service.mode","value":{"stringValue":"agent"}}]},"scopeLogs":[{"scope":{},"logRecords":[{"observedTimeUnixNano":"1737231901720160600","body":{"stringValue":"2025-01-18 21:25:01 [WARN] - Do or do not, there is no try."},"attributes":[{"key":"log.file.path","value":{"stringValue":"quotes.log"}}],"traceId":"","spanId":""}]}],"schemaUrl":"https://opentelemetry.io/schemas/1.6.1"}]}{"resourceLogs":[{"resource":{"attributes":[{"key":"com.splunk/source","value":{"stringValue":"./quotes.log"}},{"key":"com.splunk.sourcetype","value":{"stringValue":"quotes"}},{"key":"host.name","value":{"stringValue":"PH-Windows-Box.hagen-ict.nl"}},{"key":"os.type","value":{"stringValue":"windows"}},{"key":"otelcol.service.mode","value":{"stringValue":"agent"}}]},"scopeLogs":[{"scope":{},"logRecords":[{"observedTimeUnixNano":"1737231902719133000","body":{"stringValue":"2025-01-18 21:25:02 [DEBUG] - One does not simply walk into Mordor."},"attributes":[{"key":"log.file.path","value":{"stringValue":"quotes.log"}}],"traceId":"","spanId":""}]}],"schemaUrl":"https://opentelemetry.io/schemas/1.6.1"}]}
{"resourceLogs":[{"resource":{"attributes":[{"key":"com.splunk/source","value":{"stringValue":"./quotes.log"}},{"key":"com.splunk.sourcetype","value":{"stringValue":"quotes"}},{"key":"host.name","value":{"stringValue":"[YOUR_HOST_NAME]"}},{"key":"os.type","value":{"stringValue":"[YOUR_OS]"}},{"key":"otelcol.service.mode","value":{"stringValue":"agent"}}]},"scopeLogs":[{"scope":{},"logRecords":[{"observedTimeUnixNano":"1737231902719133000","body":{"stringValue":"2025-01-18 21:25:02 [DEBUG] - One does not simply walk into Mordor."},"attributes":[{"key":"log.file.path","value":{"stringValue":"quotes.log"}}],"traceId":"","spanId":""}]}],"schemaUrl":"https://opentelemetry.io/schemas/1.6.1"}]}
Examine the resourceLogs section: Verify that the files include the same attributes we observed in the traces and metrics sections.
You may also have noticed that every log line contains empty placeholders for "traceId":"" and "spanId":"". The FileLog receiver will populate these fields only if they are not already present in the log line.
For example, if the log line is generated by an application instrumented with an OpenTelemetry instrumentation library, these fields will already be included and will not be overwritten.
Stop the Agent, the Gateway, and the quotes-generating script using Ctrl-C.
4. Building In Resilience
10 minutes
The OpenTelemetry Collector’s FileStorage Extension enhances the resilience of your telemetry pipeline by providing reliable checkpointing, managing retries, and handling temporary failures effectively.
With this extension enabled, the OpenTelemetry Collector can store intermediate states on disk, preventing data loss during network disruptions and allowing it to resume operations seamlessly.
Note
This solution will work for metrics as long as the connection downtime is brief (up to 15 minutes). If the downtime exceeds this, Splunk Observability Cloud will drop data due to datapoints being out of order.
For logs, there are plans to implement a more enterprise-ready solution in one of the upcoming Splunk OpenTelemetry Collector releases.
Exercise
Inside the [WORKSHOP] directory, create a new subdirectory named 4-resilience.
Next, copy all contents from the 3-filelog directory into 4-resilience.
After copying, remove any *.out and *.log files.
Change all terminal windows to the [WORKSHOP]/4-resilience directory.
Your updated directory structure will now look like this:
In this exercise, we will update the extensions: section of the agent.yaml file. This section is part of the OpenTelemetry configuration YAML and defines optional components that enhance or modify the OpenTelemetry Collector’s behavior.
While these components do not process telemetry data directly, they provide valuable capabilities and services to improve the Collector’s functionality.
Exercise
Update the agent.yaml: Add the file_storage extension and name it checkpoint:
  file_storage/checkpoint:             # Extension Type/Name
    directory: "./checkpoint-dir"      # Define directory
    create_directory: true             # Create directory
    timeout: 1s                        # Timeout for file operations
    compaction:                        # Compaction settings
      on_start: true                   # Start compaction at Collector startup
      # Define compaction directory
      directory: "./checkpoint-dir/tmp"
      # Max. size limit before compaction occurs
      max_transaction_size: 65536
Add file_storage to existing otlphttp exporter: Modify the otlphttp: exporter to configure retry and queuing mechanisms, ensuring data is retained and resent if failures occur:
  otlphttp:                            # Exporter Type
    endpoint: "http://localhost:5318"  # Gateway OTLP endpoint
    headers:                           # Headers to add to the HTTP call
      X-SF-Token: "ACCESS_TOKEN"       # Splunk ACCESS_TOKEN header
    retry_on_failure:                  # Retry on failure settings
      enabled: true                    # Enables retrying
    sending_queue:                     # Sending queue settings
      enabled: true                    # Enables Sending queue
      num_consumers: 10                # Number of consumers
      queue_size: 10000                # Maximum queue size
      # File storage extension
      storage: file_storage/checkpoint
Update the services section: Add the file_storage/checkpoint extension to the existing extensions: section. This will cause the extension to be enabled:
service:
  extensions:
  - health_check
  - file_storage/checkpoint            # Enabled extensions for this collector
Update the metrics pipeline: For this exercise we are going to remove the hostmetrics receiver from the metrics pipeline to reduce debug and log noise:
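As a sketch, and assuming the metrics pipeline in your agent.yaml uses the same processors and exporters as the logs pipeline shown earlier, the change could look like this. Commenting the receiver out rather than deleting it makes it easy to restore later:

```yaml
service:
  pipelines:
    metrics:
      receivers:
      - otlp                      # OTLP Receiver
      #- hostmetrics              # Disabled to reduce debug and log noise
      processors:                 # Assumed to mirror the logs pipeline
      - memory_limiter
      - resourcedetection
      - resource/add_mode
      - batch
      exporters:
      - debug
      - otlphttp
```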
Next, we will configure our environment to be ready for testing the File Storage configuration.
Exercise
Start the Gateway: In the Gateway terminal window navigate to the [WORKSHOP]/4-resilience directory and run:
../otelcol --config=gateway.yaml
Start the Agent: In the Agent terminal window navigate to the [WORKSHOP]/4-resilience directory and run:
../otelcol --config=agent.yaml
Send a test trace: In the Test terminal window navigate to the [WORKSHOP]/4-resilience directory and run:
curl -X POST -i http://localhost:4318/v1/traces -H "Content-Type: application/json" -d "@trace.json"
Both the Agent and Gateway should display debug logs, and the Gateway should create a ./gateway-traces.out file.
If everything functions correctly, we can proceed with testing system resilience.
4.3 Simulate Failure
To assess the Agent’s resilience, we’ll simulate a temporary Gateway outage and observe how the Agent handles it:
Summary:
Send Traces to the Agent: Generate traffic by sending traces to the Agent.
Stop the Gateway: This will trigger the Agent to enter retry mode.
Restart the Gateway: The Agent will recover traces from its persistent queue and forward them successfully. Without the persistent queue, these traces would have been lost permanently.
Exercise
Simulate a network failure: In the Gateway terminal stop the Gateway with Ctrl-C and wait until the gateway console shows that it has stopped:
2025-01-28T13:24:32.785+0100 info service@v0.116.0/service.go:309 Shutdown complete.
Send traces: In the Test terminal window send two traces using the curl command we used earlier.
Notice that the agent’s retry mechanism is activated as it continuously attempts to resend the data. In the agent’s console output, you will see repeated messages similar to the following:
2025-01-28T14:22:47.020+0100 info internal/retry_sender.go:126 Exporting failed. Will retry the request after interval. {"kind": "exporter", "data_type": "traces", "name": "otlphttp", "error": "failed to make an HTTP request: Post \"http://localhost:5318/v1/traces\": dial tcp 127.0.0.1:5318: connect: connection refused", "interval": "9.471474933s"}
Stop the Agent: Use Ctrl-C to stop the agent. Wait until the agent’s console confirms it has stopped:
2025-01-28T14:40:28.702+0100 info extensions/extensions.go:66 Stopping extensions...
2025-01-28T14:40:28.702+0100 info service@v0.116.0/service.go:309 Shutdown complete.
Tip
Stopping the agent will halt its retry attempts and prevent any future retry activity.
If the agent runs for too long without successfully delivering data, it may begin dropping traces, depending on the retry configuration, to conserve memory. Stopping the agent halts the retries before that happens, so the data already written to the persistent queue on disk remains available for recovery.
This step is essential for clearly observing the recovery process when the agent is restarted.
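How long the agent keeps retrying before it gives up on a batch is controlled by the exporter's retry_on_failure settings. The fragment below is a sketch that spells out the timing keys explicitly. The values are illustrative, not tuned recommendations, and the workshop configuration itself only sets enabled: true:

```yaml
exporters:
  otlphttp:
    endpoint: "http://localhost:5318"   # Gateway OTLP endpoint
    retry_on_failure:
      enabled: true
      initial_interval: 5s              # Wait before the first retry
      max_interval: 30s                 # Upper bound for the backoff between retries
      max_elapsed_time: 300s            # Stop retrying a batch after this long; 0 retries indefinitely
    sending_queue:
      enabled: true
      storage: file_storage/checkpoint  # Persist queued data on disk
```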
4.4 Simulate Recovery
In this exercise, we’ll test how the OpenTelemetry Collector recovers from a network outage by restarting the Gateway. When the Gateway becomes available again, the Agent will resume sending data from its last checkpointed state, ensuring no data loss.
Exercise
Restart the Gateway: In the Gateway terminal window run:
../otelcol --config=gateway.yaml
Restart the Agent: In the Agent terminal window run:
../otelcol --config=agent.yaml
After the Agent is up and running, the file_storage extension will detect the buffered data in the checkpoint directory. The Agent will then start to dequeue the stored spans from the last checkpoint, ensuring no data is lost.
Exercise
Verify the Agent Debug output: Note that the Agent debug screen does NOT change and still shows the following line, indicating that no new data is being exported.
2025-02-07T13:40:12.195+0100 info service@v0.117.0/service.go:253 Everything is ready. Begin running and processing data.
Watch the Gateway Debug output: The Gateway debug screen should show that it has started receiving the previously missed traces without requiring any additional action on your part.
2025-02-07T12:44:32.651+0100 info service@v0.117.0/service.go:253 Everything is ready. Begin running and processing data.
2025-02-07T12:47:46.721+0100 info Traces {"kind": "exporter", "data_type": "traces", "name": "debug", "resource spans": 4, "spans": 4}
2025-02-07T12:47:46.721+0100 info ResourceSpans #0
Resource SchemaURL: https://opentelemetry.io/schemas/1.6.1
Resource attributes:
Check the gateway-traces.out file: Count the number of traces in the recreated ./gateway-traces.out. It should match the number you sent while the Gateway was down.
Conclusion
This exercise demonstrated how to enhance the resilience of the OpenTelemetry Collector by configuring the file_storage extension, enabling retry mechanisms for the otlphttp exporter, and using a file-backed queue for temporary data storage.
By implementing file-based checkpointing and queue persistence, you ensure the telemetry pipeline can gracefully recover from temporary interruptions, making it more robust and reliable for production environments.
Stop the Agent and Gateway using Ctrl-C.
5. Dropping Spans
10 minutes
In this section, we will explore how to use the Filter Processor to selectively drop spans based on certain conditions.
Specifically, we will drop spans based on the span name, a common way to filter out unwanted spans such as health checks or internal communication traces. In this case, we will filter out spans named "/_healthz", which are typically associated with health check requests and tend to be quite “noisy”.
Exercise
Inside the [WORKSHOP] directory, create a new subdirectory named 5-dropping-spans.
Next, copy all contents from the 4-resilience directory into 5-dropping-spans.
After copying, remove any *.out and *.log files.
Change all terminal windows to the [WORKSHOP]/5-dropping-spans directory.
Your updated directory structure will now look like this:
Next, we will configure the filter processor and the respective pipelines.
5.1 Configuration
Exercise
Switch to your Gateway terminal window. Navigate to the [WORKSHOP]/5-dropping-spans directory and open the gateway.yaml and add the following configuration to the processors section:
Add a filter processor: Configure the OpenTelemetry Collector to drop spans with the name "/_healthz":
  filter/health:                       # Defines a filter processor
    error_mode: ignore                 # Ignore errors
    traces:                            # Filtering rules for traces
      span:                            # Exclude spans named "/_healthz"
       - 'name == "/_healthz"'
Add the filter processor: Make sure you add the filter to the traces pipeline. Filtering should be applied as early as possible, ideally right after the memory_limiter and before the batch processor:
    traces:
      receivers:
        - otlp                       # OTLP Receiver
      processors:
        - memory_limiter             # Manage memory usage
        - filter/health              # Filter Processor. Filters out data based on rules
        - resource/add_mode          # Add metadata about collector mode
        - batch                      # Groups Data before send
      exporters:
        - debug                      # Debug Exporter
        - file/traces                # File Exporter for Trace
Validate the gateway configuration using otelbin.io. For reference, the traces: section of your pipelines will look similar to this:
Start the Gateway: In the Gateway terminal window navigate to the [WORKSHOP]/5-dropping-spans directory and run:
../otelcol --config=gateway.yaml
Start the Agent: In the Agent terminal window navigate to the [WORKSHOP]/5-dropping-spans directory and run:
../otelcol --config=agent.yaml
Send the new health.json payload: In the Test terminal window navigate to the [WORKSHOP]/5-dropping-spans directory and run the curl command below. (Windows use curl.exe).
curl -X POST -i http://localhost:4318/v1/traces -H "Content-Type: application/json" -d "@health.json"
Verify Agent Debug output shows the healthz span: Confirm that the span payload is sent by checking the agent’s debug output for span data like the snippet below:
<snip>
Span #0
Trace ID : 5b8efff798038103d269b633813fc60c
Parent ID : eee19b7ec3c1b173
ID : eee19b7ec3c1b174
Name : /_healthz
Kind : Server
</snip>
The Agent has forwarded the span to the Gateway.
Check the Gateway Debug output:
The Gateway should NOT show any span data received. This is because the Gateway is configured with a filter to drop spans named "/_healthz", so the span will be discarded/dropped and not processed further.
Confirm normal spans are processed by running the curl command with the trace.json file again. This time, you should see both the agent and gateway process the spans successfully.
Tip
When using the Filter processor make sure you understand the look of your incoming data and test the configuration thoroughly. In general, use as specific a configuration as possible to lower the risk of the wrong data being dropped.
You can further extend this configuration to filter out spans based on different attributes, tags, or other criteria, making the OpenTelemetry Collector more customizable and efficient for your observability needs.
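As a sketch of such an extension, the fragment below adds two more conditions to the filter/health processor. The http.route attribute and the /ready and /live span names are hypothetical examples and are not part of the workshop payloads. A span is dropped when any one of the listed conditions evaluates to true:

```yaml
processors:
  filter/health:
    error_mode: ignore
    traces:
      span:
        # A span is dropped if ANY condition below matches.
        - 'name == "/_healthz"'                          # Exact span name (from the workshop)
        - 'attributes["http.route"] == "/_healthz"'      # Hypothetical span attribute
        - 'IsMatch(name, "^/(ready|live)z?$")'           # Hypothetical span names, matched by regex
```

Keeping each condition narrow, as here, lowers the risk of dropping spans you actually want to keep.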
Stop the Agent and Gateway using Ctrl-C.
6. Redacting Sensitive Data
10 minutes
In this section, you’ll learn how to configure the OpenTelemetry Collector to remove specific tags and redact sensitive data from telemetry spans. This is crucial for protecting sensitive information such as credit card numbers, personal data, or other security-related details that must be anonymized before being processed or exported.
We’ll walk through configuring two key processors in the OpenTelemetry Collector: the attributes processor and the redaction processor.
In this step, we’ll modify agent.yaml to include the attributes and redaction processors. These processors will help ensure that sensitive data within span attributes is properly handled before being logged or exported.
Previously, you may have noticed that some span attributes displayed in the console contained personal and sensitive data. We’ll now configure the necessary processors to filter out and redact this information effectively.
Switch to your Agent terminal window. Navigate to the [WORKSHOP]/6-sensitive-data directory and open the agent.yaml file in your editor.
Add an attributes Processor: This processor allows you to update, delete, or hash specific attributes (tags) within spans. We’ll update the user.phone_number, hash the user.email, and delete the user.account_password:
  attributes/update:                   # Processor Type/Name
    actions:                           # List of actions
      - key: user.phone_number         # Target key
        action: update                 # Replace value with "UNKNOWN NUMBER"
        value: "UNKNOWN NUMBER"
      - key: user.email                # Hash the email value
        action: hash
      - key: user.account_password     # Remove the password
        action: delete
Add a redaction Processor: This processor will detect and redact sensitive data values based on predefined patterns. We’ll block credit card numbers using regular expressions.
  redaction/redact:                    # Processor Type/Name
    allow_all_keys: true               # If false, only allowed keys will be retained
    blocked_values:                    # List of regex patterns to hash
      - '\b4[0-9]{3}[\s-]?[0-9]{4}[\s-]?[0-9]{4}[\s-]?[0-9]{4}\b'       # Visa card
      - '\b5[1-5][0-9]{2}[\s-]?[0-9]{4}[\s-]?[0-9]{4}[\s-]?[0-9]{4}\b'  # MasterCard
    summary: debug                     # Show debug details about redaction
Update the traces Pipeline: Integrate both processors into the traces pipeline. Make sure that you comment out the redaction processor at first: (We will enable it later)
    traces:
      receivers:
      - otlp                      # OTLP Receiver
      processors:
      - memory_limiter            # Manage memory usage
      - attributes/update         # Update, hash, and remove attributes
      #- redaction/redact         # Redact sensitive fields using regex
      - resourcedetection         # Add system attributes
      - resource/add_mode         # Add metadata about collector mode
      - batch                     # Batch Processor, groups data before send
      exporters:
      - debug                     # Debug Exporter
      - otlphttp                  # OTLP/HTTP Exporter used by Splunk O11Y
Validate the agent configuration using otelbin.io. For reference, the traces: section of your pipelines will look similar to this:
In this exercise, we will delete the user.account_password, update the user.phone_number attribute, and hash the user.email in the span data before it is exported by the Agent.
Exercise
Start the Gateway: In the Gateway terminal window navigate to the [WORKSHOP]/6-sensitive-data directory and run:
../otelcol --config=gateway.yaml
Start the Agent: In the Agent terminal window navigate to the [WORKSHOP]/6-sensitive-data directory and run:
../otelcol --config=agent.yaml
Send a span:
In the Test terminal window change into the 6-sensitive-data directory.
Send the span containing sensitive data by running the curl command to send trace.json.
curl -X POST -i http://localhost:4318/v1/traces -H "Content-Type: application/json" -d "@trace.json"
Check the debug output: For both the Agent and Gateway debug output, confirm that user.account_password has been removed, and both user.phone_number & user.email have been updated.
Check file output: In the new gateway-traces.out file confirm that user.account_password has been removed, and user.phone_number & user.email have been updated:
The redaction processor gives precise control over which attributes and values are permitted or removed from telemetry data.
Earlier we configured the agent collector to:
Block sensitive data: Any values (in this case Credit card numbers) matching the provided regex patterns (Visa and MasterCard) are automatically detected and redacted.
This is achieved using the redaction processor you added earlier, where we define regex patterns to filter out unwanted data:
  redaction/redact:                    # Processor Type/Name
    allow_all_keys: true               # False removes all key unless in allow list
    blocked_values:                    # List of regex to check and hash
      # Visa card regex. - Please note the '' around the regex
      - '\b4[0-9]{3}[\s-]?[0-9]{4}[\s-]?[0-9]{4}[\s-]?[0-9]{4}\b'
      # MasterCard card regex - Please note the '' around the regex
      - '\b5[1-5][0-9]{2}[\s-]?[0-9]{4}[\s-]?[0-9]{4}[\s-]?[0-9]{4}\b'
    summary: debug                     # Show detailed debug information about the redaction
Test the Redaction Processor
In this exercise, we will redact the user.visa & user.mastercard values in the span data before it is exported by the Agent.
Exercise
Prepare the terminals: Delete the *.out files and clear the screen.
Enable the redaction/redact processor: Edit agent.yaml and remove the # we inserted in the previous exercise.
Start the Gateway: In the Gateway terminal window navigate to the [WORKSHOP]/6-sensitive-data directory and run:
../otelcol --config=gateway.yaml
Start the Agent: In the Agent terminal window navigate to the [WORKSHOP]/6-sensitive-data directory and run:
../otelcol --config=agent.yaml
Send a span: In the Test terminal window, run the curl command to send trace.json.
curl -X POST -i http://localhost:4318/v1/traces -H "Content-Type: application/json" -d "@trace.json"
Check the debug output: For both the Agent and Gateway, confirm that the values for user.visa and user.mastercard have been updated. Notice that the user.amex attribute value was NOT redacted, because no matching regex pattern was added to blocked_values.
By including summary: debug in the redaction processor, the debug output will include summary information about which matching key values were redacted, along with the count of values that were masked.
These are just a few examples of how attributes and redaction processors can be configured to protect sensitive data.
Stop the Agent and Gateway using Ctrl-C.
7. Transform Data
10 minutes
The Transform Processor lets you modify telemetry data (logs, metrics, and traces) as it flows through the pipeline. Using the OpenTelemetry Transformation Language (OTTL), you can filter, enrich, and transform data on the fly without touching your application code.
In this exercise we'll update agent.yaml to include a Transform Processor that will:
Filter log resource attributes.
Parse JSON structured log data into attributes.
Set log severity levels based on the log message body.
You may have noticed that in previous logs, fields like SeverityText and SeverityNumber were undefined (this is typical of the filelog receiver). However, the severity is embedded within the log body:
<snip>
LogRecord #0
ObservedTimestamp: 2025-01-31 21:49:29.924017 +0000 UTC
Timestamp: 1970-01-01 00:00:00 +0000 UTC
SeverityText:
SeverityNumber: Unspecified(0)
Body: Str(2025-01-31 15:49:29 [WARN] - Do or do not, there is no try.)
</snip>
Logs often contain structured data encoded as JSON within the log body. Extracting these fields into attributes allows for better indexing, filtering, and querying. Instead of manually parsing JSON in downstream systems, OTTL enables automatic transformation at the telemetry pipeline level.
Exercise
Inside the [WORKSHOP] directory, create a new subdirectory named 7-transform-data.
Next, copy all contents from the 6-sensitive-data directory into 7-transform-data.
After copying, remove any *.out and *.log files.
Change all terminal windows to the [WORKSHOP]/7-transform-data directory.
Your updated directory structure will now look like this:
Switch to your Agent terminal window. Navigate to the [WORKSHOP]/7-transform-data directory and open the agent.yaml file in your editor.
Configure the transform processor: Add a transform processor named transform/logs. The context: resource key targets the resource attributes (resourceLogs) of the logs.
This configuration ensures that only the relevant resource attributes (com.splunk.sourcetype, host.name, otelcol.service.mode) are retained, improving log efficiency and reducing unnecessary metadata.
transform/logs:                       # Processor Type/Name
  log_statements:                     # Log Processing Statements
    - context: resource               # Log Context
      statements:                     # List of attribute keys to keep
        - keep_keys(attributes, ["com.splunk.sourcetype", "host.name", "otelcol.service.mode"])
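The effect of the keep_keys statement can be pictured as a simple dictionary filter. This Python sketch only illustrates the idea and is not how the collector implements it: OTTL modifies the attribute map in place, while this hypothetical helper returns a copy, and the attribute values are placeholders.

```python
# Keys listed in the keep_keys statement above.
KEEP = ["com.splunk.sourcetype", "host.name", "otelcol.service.mode"]


def keep_keys(attributes: dict, keys: list) -> dict:
    """Return a copy of attributes containing only the listed keys."""
    return {key: value for key, value in attributes.items() if key in keys}


# Resource attributes as they might look before the transform
# (values are placeholders).
before = {
    "com.splunk.sourcetype": "quotes",
    "com.splunk.source": "./quotes.log",
    "host.name": "YOUR_HOST_NAME",
    "os.type": "YOUR_OS",
    "otelcol.service.mode": "agent",
}

after = keep_keys(before, KEEP)
print(sorted(after))
# prints ['com.splunk.sourcetype', 'host.name', 'otelcol.service.mode']
```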
Adding a Context Block for Log Severity Mapping: To properly set the severity_text and severity_number fields of a log record, we add another log context block within log_statements.
This configuration extracts the level value from the log body, maps it to severity_text, and assigns the appropriate severity_number:
    - context: log                    # Log Context
      statements:                     # Transform Statements Array
        - set(cache, ParseJSON(body)) where IsMatch(body, "^\\{")
        - flatten(cache, "")
        - merge_maps(attributes, cache, "upsert")
        - set(severity_text, attributes["level"])
        - set(severity_number, 1) where severity_text == "TRACE"
        - set(severity_number, 5) where severity_text == "DEBUG"
        - set(severity_number, 9) where severity_text == "INFO"
        - set(severity_number, 13) where severity_text == "WARN"
        - set(severity_number, 17) where severity_text == "ERROR"
        - set(severity_number, 21) where severity_text == "FATAL"
Summary of Key Transformations:
Parse JSON: Extracts structured data from the log body.
Flatten JSON: Converts nested JSON objects into a flat structure.
Merge Attributes: Integrates extracted data into log attributes.
Map Severity Text: Assigns severity_text from the log's level attribute.
Assign Severity Numbers: Converts severity levels into standardized numerical values.
You should have a single transform processor containing two context blocks: one for resource and one for log.
This configuration ensures that log severity is correctly extracted, standardized, and structured for efficient processing.
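The five transformations in the log context can be walked through in plain Python. This is a simplified sketch under stated assumptions, not the OTTL implementation: the record is a hypothetical dictionary, the JSON body is modelled on the timestamp, level, and message fields, and this flatten helper only handles nested objects (lists are left untouched).

```python
import json

# Severity numbers used in the statements above.
SEVERITY_NUMBERS = {
    "TRACE": 1, "DEBUG": 5, "INFO": 9,
    "WARN": 13, "ERROR": 17, "FATAL": 21,
}


def flatten(obj: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into dot-separated keys (lists are left as-is)."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


def transform_log(record: dict) -> dict:
    """Apply the five steps to one log record, modifying it in place."""
    body = record["body"]
    if body.startswith("{"):                    # IsMatch(body, "^\\{")
        cache = flatten(json.loads(body))       # ParseJSON, then flatten
        record["attributes"].update(cache)      # merge_maps(..., "upsert")
    level = record["attributes"].get("level")
    if level is not None:
        record["severity_text"] = level         # set(severity_text, ...)
        # Levels outside the table leave severity_number unchanged.
        record["severity_number"] = SEVERITY_NUMBERS.get(
            level, record["severity_number"])
    return record


# Hypothetical record with a JSON body, for illustration only.
record = {
    "body": '{"timestamp": "2025-01-31 15:49:29", "level": "WARN", '
            '"message": "Do or do not, there is no try."}',
    "attributes": {"log.file.path": "quotes.log"},
    "severity_text": "",
    "severity_number": 0,
}

transform_log(record)
print(record["severity_text"], record["severity_number"])  # prints WARN 13
```

A body that does not start with { is left alone, which mirrors the where clause guarding ParseJSON.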
Tip
This method of mapping all JSON fields to top-level attributes should only be used for testing and debugging OTTL. It will result in high cardinality in a production scenario.
Update the logs pipeline: Add the transform/logs: processor into the logs: pipeline:
    logs:
      receivers:
        - otlp                        # OTLP Receiver
        - filelog/quotes              # Filelog Receiver reading quotes.log
      processors:
        - memory_limiter              # Memory Limiter Processor
        - resourcedetection           # Adds system attributes to the data
        - resource/add_mode           # Adds collector mode metadata
        - transform/logs              # Transform Processor to update log lines
        - batch                       # Batch Processor, groups data before send
Validate the agent configuration using otelbin.io. For reference, the logs: section of your pipelines will look similar to this:
Start the Log Generator: In the Test terminal window, navigate to the [WORKSHOP]/7-transform-data directory and start the appropriate log-gen script for your system. We want to work with structured JSON logs, so add the -json flag.
./log-gen.sh -json
The script will begin writing lines to a file named ./quotes.log, while displaying a single line of output in the console.
Writing logs to quotes.log. Press Ctrl+C to stop.
Start the Gateway: In the Gateway terminal window navigate to the [WORKSHOP]/7-transform-data directory and run:
../otelcol --config=gateway.yaml
Start the Agent: In the Agent terminal window navigate to the [WORKSHOP]/7-transform-data directory and run:
../otelcol --config=agent.yaml
7.3 Test Transform Processor
This test verifies that the com.splunk.source and os.type metadata have been removed from the log resource attributes before being exported by the Agent. Additionally, the test ensures that:
The log body is parsed to extract severity information.
SeverityText and SeverityNumber are set on the LogRecord.
JSON fields from the log body are promoted to log attributes.
This ensures proper metadata filtering, severity mapping, and structured log enrichment before export.
Exercise
Check the debug output: For both the Agent and Gateway, confirm that com.splunk.source and os.type have been removed.
Check the debug output: For both the Agent and Gateway, confirm that SeverityText and SeverityNumber in the LogRecord are now defined with the severity level from the log body. Confirm that the JSON fields from the body can be accessed as top-level log Attributes:
LogRecord #0
ObservedTimestamp: 2025-01-31 21:49:29.924017 +0000 UTC
Timestamp: 1970-01-01 00:00:00 +0000 UTC
SeverityText: WARN
SeverityNumber: Warn(13)
Body: Str(2025-01-31 15:49:29 [WARN] - Do or do not, there is no try.)
Attributes:
-> log.file.path: Str(quotes.log)
-> timestamp: Str(2025-01-31 15:49:29)
-> level: Str(WARN)
-> message: Str(Do or do not, there is no try.)
Trace ID:
Span ID:
Flags: 0
{"kind": "exporter", "data_type": "logs", "name": "debug"}
For comparison, the original debug output before the transform processor was added:
LogRecord #0
ObservedTimestamp: 2025-01-31 21:49:29.924017 +0000 UTC
Timestamp: 1970-01-01 00:00:00 +0000 UTC
SeverityText:
SeverityNumber: Unspecified(0)
Body: Str(2025-01-31 15:49:29 [WARN] - Do or do not, there is no try.)
Attributes:
-> log.file.path: Str(quotes.log)
Trace ID:
Span ID:
Flags: 0
{"kind": "exporter", "data_type": "logs", "name": "debug"}
Check file output: In the new gateway-logs.out file verify the data has been transformed:
"resource":{"attributes":[{"key":"com.splunk.sourcetype","value":{"stringValue":"quotes"}},{"key":"host.name","value":{"stringValue":"YOUR_HOST_NAME"}},{"key":"otelcol.service.mode","value":{"stringValue":"agent"}}]},"scopeLogs":[{"scope":{},"logRecords":[{"observedTimeUnixNano":"1738360169924017000","severityText":"WARN","body":{"stringValue":"2025-01-31 15:49:29 [WARN] - Do or do not, there is no try."},"attributes":[{"key":"log.file.path","value":{"stringValue":"quotes.log"}},{"key":"timestamp","value":{"stringValue":"2025-01-31 15:49:29"}},{"key":"level","value":{"stringValue":"WARN"}},{"key":"message","value":{"stringValue":"Do or do not, there is no try."}}],"traceId":"","spanId":""}]}]
For comparison, the original file output before the transform processor was added:
"resource":{"attributes":[{"key":"com.splunk.sourcetype","value":{"stringValue":"quotes"}},{"key":"com.splunk.source","value":{"stringValue":"./quotes.log"}},{"key":"host.name","value":{"stringValue":"YOUR_HOST_NAME"}},{"key":"os.type","value":{"stringValue":"YOUR_OS"}},{"key":"otelcol.service.mode","value":{"stringValue":"agent"}}]},"scopeLogs":[{"scope":{},"logRecords":[{"observedTimeUnixNano":"1738349801265812000","body":{"stringValue":"2025-01-31 12:56:41 [INFO] - There is some good in this world, and it's worth fighting for."},"attributes":[{"key":"log.file.path","value":{"stringValue":"quotes.log"}}],"traceId":"","spanId":""}]}]
8. Routing Data
10 minutes
The Routing Connector in OpenTelemetry is a powerful feature that allows you to direct data (traces, metrics, or logs) to different pipelines based on specific criteria. This is especially useful in scenarios where you want to apply different processing or exporting logic to subsets of your telemetry data.
For example, you might want to send production data to one exporter while directing test or development data to another. Similarly, you could route certain spans based on their attributes, such as service name, environment, or span name, to apply custom processing or storage logic.
Exercise
Inside the [WORKSHOP] directory, create a new subdirectory named 8-routing.
Next, copy all contents from the 7-transform-data directory into 8-routing.
After copying, remove any *.out and *.log files.
Change all terminal windows to the [WORKSHOP]/8-routing directory.
Your updated directory structure will now look like this:
Next, we will configure the routing connector and the respective pipelines.
8.1 Configure the Routing Connector
In this exercise, you will configure the routing connector in the gateway.yaml file. This setup enables the Gateway to route traces based on the deployment.environment attribute in the spans you send. By implementing this, you can process and handle traces differently depending on their attributes.
Exercise
Add the routing connector: In the Gateway terminal window edit gateway.yaml and add the following below the receivers: and processors: stanzas and above the exporters: stanza:
connectors:
  routing:
    default_pipelines: [traces/standard]  # Default pipeline if no rule matches
    error_mode: ignore                    # Ignore errors in routing
    table:                                # Define routing rules
      # Routes spans to a target pipeline if the resourceSpan attribute matches the rule
      - statement: route() where attributes["deployment.environment"] == "security_applications"
        pipelines: [traces/security]      # Target pipeline
In OpenTelemetry configuration files, connectors have their own dedicated section, similar to receivers and processors. This approach also applies to metrics and logs, allowing them to be routed based on attributes in resourceMetrics or resourceLogs.
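The decision the routing table describes can be sketched in a few lines of Python. This is an illustration of the rule above, not the connector's code: the route function and the tuple layout are hypothetical, and with a single rule it makes no difference whether routing stops at the first match or evaluates every rule.

```python
# Mirrors the routing table above: one rule plus a default pipeline.
DEFAULT_PIPELINES = ["traces/standard"]
ROUTING_TABLE = [
    # (resource attribute, expected value, target pipelines)
    ("deployment.environment", "security_applications", ["traces/security"]),
]


def route(resource_attributes: dict) -> list:
    """Return the target pipelines for one resource (first match wins here)."""
    for key, expected, pipelines in ROUTING_TABLE:
        if resource_attributes.get(key) == expected:
            return pipelines
    return DEFAULT_PIPELINES


security = {
    "service.name": "password_check",
    "deployment.environment": "security_applications",
}
regular = {"service.name": "some-service"}  # placeholder service name

print(route(security))  # prints ['traces/security']
print(route(regular))   # prints ['traces/standard']
```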
Configure the file exporters: The routing connector requires separate targets for routing. Add two file exporters, file/traces/security and file/traces/standard, to ensure data is directed correctly:
  file/traces/standard:                    # Exporter for regular traces
    path: "./gateway-traces-standard.out"  # Path for saving trace data
    append: false                          # Overwrite the file each time
  file/traces/security:                    # Exporter for security traces
    path: "./gateway-traces-security.out"  # Path for saving trace data
    append: false                          # Overwrite the file each time
With the routing configuration complete, the next step is to configure the pipelines to apply these routing rules.
8.2 Configuring the Pipelines
Exercise
Add both the standard and security traces pipelines:
Standard pipeline: This pipeline processes all spans that do not match the routing rule. Add it below the existing traces: pipeline, keeping the configuration unchanged for now:
    traces/standard:                  # Default pipeline for unmatched spans
      receivers:
        - routing                     # Receive data from the routing connector
      processors:
        - memory_limiter              # Limits memory usage
        - resource/add_mode           # Adds collector mode metadata
      exporters:
        - debug                       # Debug exporter
        - file/traces/standard        # File exporter for unmatched spans
Security pipeline: This pipeline will handle all spans that match the routing rule:
    traces/security:                  # New Security Traces/Spans Pipeline
      receivers:
        - routing                     # Routing Connector, Only receives data from Connector
      processors:
        - memory_limiter              # Memory Limiter Processor
        - resource/add_mode           # Adds collector mode metadata
      exporters:
        - debug                       # Debug Exporter
        - file/traces/security        # File Exporter for spans matching rule
Update the traces pipeline to use routing:
To enable routing, update the original traces: pipeline by adding routing as an exporter. This ensures all span data is sent through the routing connector for evaluation.
Remove all processors as these are now defined in the traces/standard and traces/security pipelines.
By excluding the batch processor, spans are written immediately instead of waiting for multiple spans to accumulate before processing. This improves responsiveness, making the workshop run faster and allowing you to see results sooner.
Validate the gateway configuration using otelbin.io. For reference, the traces: section of your pipelines will look similar to this:
In this section, we will test the routing rule configured for the Gateway. The expected result is that the span from the security.json file will be sent to the gateway-traces-security.out file.
Exercise
Start the Gateway: In the Gateway terminal window navigate to the [WORKSHOP]/8-routing directory and run:
../otelcol --config=gateway.yaml
Start the Agent: In the Agent terminal window navigate to the [WORKSHOP]/8-routing directory and run:
../otelcol --config=agent.yaml
Create a new security trace: In the Test terminal window, navigate to the [WORKSHOP]/8-routing directory.
The following JSON contains attributes that will trigger the routing rule. Copy the content below and save it into a file named security.json.
{"resourceSpans":[{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"password_check"}},{"key":"deployment.environment","value":{"stringValue":"security_applications"}}]},"scopeSpans":[{"scope":{"name":"my.library","version":"1.0.0","attributes":[{"key":"my.scope.attribute","value":{"stringValue":"some scope attribute"}}]},"spans":[{"traceId":"5B8EFFF798038103D269B633813FC60C","spanId":"EEE19B7EC3C1B174","parentSpanId":"EEE19B7EC3C1B173","name":"I'm a server span","startTimeUnixNano":"1544712660000000000","endTimeUnixNano":"1544712661000000000","kind":2,"attributes":[{"key":"my.span.attr","value":{"stringValue":"some value"}}]}]}]}]}
8.4 Test Routing Connector
Exercise
Send a Regular Span:
Locate the Test terminal and navigate to the [WORKSHOP]/8-routing directory.
Send a regular span using the trace.json file to confirm proper communication.
Both the Agent and Gateway should display debug information, including the span you just sent. The gateway will also generate a new gateway-traces-standard.out file, as this is now the designated destination for regular spans.
Tip
If you check gateway-traces-standard.out, it should contain the span sent using the cURL command. You will also see an empty gateway-traces-security.out file, as the routing configuration creates output files immediately, even if no matching spans have been processed yet.
Send a Security Span:
Ensure both the Agent and Gateway are running.
Send a security span using the security.json file to test the gateway's routing rule.
Again, both the Agent and Gateway should display debug information, including the span you just sent. This time, the Gateway will write a line to the gateway-traces-security.out file, which is designated for spans where the deployment.environment resource attribute matches "security_applications".
The gateway-traces-standard.out should be unchanged.
Tip
If you check ./gateway-traces-security.out, it should contain only spans whose deployment.environment resource attribute is "security_applications".
You can repeat this scenario multiple times, and each trace will be written to its corresponding output file.
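Checking the output files by eye gets tedious after a few runs. The following Python sketch lists the deployment.environment values found in each file. It assumes the file exporter writes one OTLP JSON object per line with a top-level resourceSpans list; the helper name environments_in is hypothetical.

```python
import json
import os


def environments_in(path: str) -> set:
    """Collect deployment.environment values from a file exporter output file.

    Assumes one OTLP JSON object per line, each with a resourceSpans list.
    Resources without the attribute are reported as None.
    """
    found = set()
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            for resource_span in json.loads(line).get("resourceSpans", []):
                attrs = resource_span.get("resource", {}).get("attributes", [])
                values = {
                    attr.get("key"): attr.get("value", {}).get("stringValue")
                    for attr in attrs
                }
                found.add(values.get("deployment.environment"))
    return found


if __name__ == "__main__":
    # Run from the [WORKSHOP]/8-routing directory; missing files are skipped.
    for name in ("gateway-traces-standard.out", "gateway-traces-security.out"):
        if os.path.exists(name):
            print(name, "->", environments_in(name))
```

If routing works as configured, the security file should report only security_applications, and the standard file should report anything else, including None for spans whose resource carries no deployment.environment attribute.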
Conclusion
In this section, we successfully tested the routing connector in the gateway by sending different spans and verifying their destinations.
Regular spans were correctly routed to gateway-traces-standard.out, confirming that spans without a matching deployment.environment attribute follow the default pipeline.
Security-related spans from security.json were routed to gateway-traces-security.out, demonstrating that the routing rule based on "deployment.environment": "security_applications" works as expected.
By inspecting the output files, we confirmed that the OpenTelemetry Collector correctly evaluates span attributes and routes them to the appropriate destinations. This validates that routing rules can effectively separate and direct telemetry data for different use cases.
You can now extend this approach by defining additional routing rules to further categorize spans, metrics, and logs based on different attributes.
Stop the Agent, Gateway and the log-gen script in their respective terminals.