Monitoring Agentic AI Applications with Splunk Observability Cloud
Instrument the Agentic AI Application
Note: this section of the workshop requires changes to multiple files. If you’re not sure where to make the changes, or your application is no longer working, please refer to the expected solution for this section which is in the
~/workshop/agentic-ai/app-with-instrumentationfolder.
There are a few steps required to instrument our Agentic AI application with OpenTelemetry and deploy it to Kubernetes:
- Add the instrumentation packages to the
requirements.txtfile - Update the Dockerfile that invokes the application using
opentelemetry-instrument - Build a new Docker image with the instrumentation packages
- Update the Kubernetes manifest with environment variables
- Deploy the Kubernetes manifest
Add Instrumentation Packages #
Next, we need to install several instrumentation packages. We can achieve this by
opening the ~/workshop/agentic-ai/base-app/requirements.txt for editing and adding
the following packages to the bottom of the file:
splunk-opentelemetry==2.8.0
splunk-otel-instrumentation-langchain==0.1.7
splunk-otel-genai-emitters-splunk==0.1.7
splunk-otel-util-genai==0.1.9
opentelemetry-instrumentation-flask==0.59b0These packages can be described as follows:
splunk-opentelemetry: this is the Splunk distribution of OpenTelemetry Python, which instruments a Python application to capture and report distributed traces to Splunk APM.splunk-otel-instrumentation-langchain: this package provides OpenTelemetry instrumentation for LangChain LLM/chat workflows.splunk-otel-genai-emitters-splunk: this package provides emitters for Splunk schema for Evaluation Results logs to optimize storage and filtering in Splunk Platform.splunk-otel-util-genai: this package includes utility functions to provide APIs and data types to ease instrumentation of Generative AI workloads using OpenTelemetry semantic conventions.opentelemetry-instrumentation-flask: this library builds on the OpenTelemetry WSGI middleware to track web requests in Flask applications.
Check your work before proceeding
Run the following command to compare your changes with the expected solution:
diff ~/workshop/agentic-ai/base-app/requirements.txt ~/workshop/agentic-ai/app-with-instrumentation/requirements.txtUpdate the Dockerfile #
Then, we need to enable OpenTelemetry instrumentation. This is done by updating the Dockerfile to
ensure the application is started with opentelemetry-instrument. Open the ~/workshop/agentic-ai/base-app/Dockerfile
file for editing and update the last line as follows:
# Run the server with instrumentation
CMD ["opentelemetry-instrument", "python", "main.py"]Check your work before proceeding
Run the following command to compare your changes with the expected solution:
diff ~/workshop/agentic-ai/base-app/Dockerfile ~/workshop/agentic-ai/app-with-instrumentation/DockerfileBuild an Updated Docker Image #
Build an updated Docker image with a new tag:
cd ~/workshop/agentic-ai/base-app
docker build --platform linux/amd64 -t localhost:9999/agentic-ai-app:app-with-instrumentation .
docker push localhost:9999/agentic-ai-app:app-with-instrumentationTip: if the image is taking too long to build, consider using the pre-built image instead. To do so, update the image name in the
~/workshop/agentic-ai/base-app/k8s.yamlfile toghcr.io/splunk/agentic-ai-app:app-with-instrumentationinstead oflocalhost:9999/agentic-ai-app:app-with-instrumentation.
Define the Config Map #
When we deploy our application to Kubernetes, we want telemetry (metrics, traces, and logs) to be sent to Splunk Observability Cloud with a clear and unique environment identifier. This makes it easier to filter, compare, and troubleshoot data across different deployments.
To do this, we’ll set the OpenTelemetry resource attribute named deployment.environment. Rather
than hard-coding the value, we’ll derive it from the INSTANCE environment variable that
already exists on our EC2 instance. This ensures each deployment is automatically tagged
with the correct environment name.
We’ll store this configuration in a Kubernetes ConfigMap, which can later be injected into our application pods as an environment variable.
Create the ConfigMap with the following command:
kubectl create configmap instance-config \
--from-literal=OTEL_RESOURCE_ATTRIBUTES=deployment.environment=agentic-ai-$INSTANCE \
-n travel-agentWhat this does:
- Defines the
OTEL_RESOURCE_ATTRIBUTESenvironment variable expected by OpenTelemetry. - Sets
deployment.environmentto a value likeagentic-ai-shw-1c43, depending on the value of$INSTANCE. - Creates the ConfigMap in the
travel-agentnamespace.
We’ll reference this ConfigMap in the next step when we configure our Kubernetes deployment.
Update the Kubernetes Manifest #
OpenTelemetry instrumentation, and AI Agent Monitoring in particular, require a number of environment variables to be set that define how instrumentation data is collected, processed, and exported.
Open the ~/workshop/agentic-ai/base-app/k8s.yaml file for editing. Update the image
tag to ensure we’re using the image with the instrumentation:
image: localhost:9999/agentic-ai-app:app-with-instrumentationIn the same file, add the following environment variables between the comments that say
Begin: Add Environment Variables and End: Add Environment Variables:
Hint: Type
:set pastebefore pasting the contents, to preventvifrom auto-indenting the pasted code.
# Begin: Add Environment Variables
# Service Name
- name: OTEL_SERVICE_NAME
value: "travel-planner"
# Additional OTEL configuration
- name: OTEL_RESOURCE_ATTRIBUTES
valueFrom:
configMapKeyRef:
name: instance-config
key: OTEL_RESOURCE_ATTRIBUTES
- name: SPLUNK_OTEL_AGENT
valueFrom:
fieldRef:
fieldPath: status.hostIP
- name: OTEL_EXPORTER_OTLP_ENDPOINT
value: "http://$(SPLUNK_OTEL_AGENT):4317"
- name: OTEL_EXPORTER_OTLP_PROTOCOL
value: "grpc"
- name: OTEL_EXPORTER_OTLP_METRICS_TEMPORALITY_PREFERENCE
value: "DELTA"
- name: OTEL_PYTHON_EXCLUDED_URLS
value: "^(https?://)?[^/]+(/health)?$"
- name: OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT
value: "true"
- name: OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT_MODE
value: "SPAN"
- name: OTEL_INSTRUMENTATION_GENAI_EMITTERS
value: "span_metric,splunk"
- name: SPLUNK_PROFILER_ENABLED
value: "true"
# End: Add Environment VariablesNote: some of the text may not be visible without scrolling. Use the
Copy text to clipboardbutton on the top right-hand corner to ensure you’ve copied all of the text.
Note: indentation is critical with yaml; ensure the new environment variables align with the existing environment variables.
The following environment variables are specific to Agentic AI monitoring and can be described as follows:
OTEL_EXPORTER_OTLP_METRICS_TEMPORALITY_PREFERENCE: this determines if the OTLP metric exporter reports cumulative totals, deltas, or low-memory-friendly temporality for emitted metrics. Setting this toDELTAis recommended for Agentic AI monitoring.OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT: this is used to enable/disable message capture from Agentic AI applications. We’ve set it totruefor this workshop.OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT_MODE: this defines how messages should be captured. We’ve set it toSPANfor this workshop, which ensures messages are captured using the span event store.OTEL_INSTRUMENTATION_GENAI_EMITTERS: we’ve set this tospan_metric,splunkfor the workshop, which ensures that both span and metric data are captured, as well as Splunk-specific features.
Check your work before proceeding
Run the following command to compare your changes with the expected solution:
diff ~/workshop/agentic-ai/base-app/k8s.yaml ~/workshop/agentic-ai/app-with-instrumentation/k8s.yamlDeploy the Updated Application #
We can deploy the updated application using the manifest file as follows:
kubectl apply -f ~/workshop/agentic-ai/base-app/k8s.yamlTest the Application in Kubernetes #
Ensure the new application pod has started successfully and the old pod is no longer present:
kubectl get pods -n travel-agentNAME READY STATUS RESTARTS AGE
travel-planner-langchain-68977dc5c4-4w7p9 1/1 Running 0 41sThen, run the following command to test the application:
curl http://travel-planner.localhost/travel/plan \
-H "Content-Type: application/json" \
-d '{
"origin": "Seattle",
"destination": "Tokyo",
"user_request": "We are planning a week-long trip to Seattle from Tokyo. Looking for boutique hotel, business-class flights and unique experiences.",
"travelers": 2
}'Troubleshooting #
If you need to troubleshoot, use the following command to view the application logs:
kubectl logs -l app=travel-planner-langchain -n travel-agent -f
