Monitoring Cisco AI Pods with Splunk Observability Cloud

2 minutes | Author: Derek Mitchell

Cisco’s AI-ready PODs combine the best of hardware and software technologies to create a robust, scalable, and efficient AI-ready infrastructure tailored to diverse needs.

Splunk Observability Cloud provides comprehensive visibility into all of this infrastructure along with all the application components that are running on this stack.

The steps to configure Splunk Observability Cloud for a Cisco AI POD environment are fully documented (see here for details).

However, it’s not always possible to get access to a Cisco AI POD environment to practice the installation steps.

This workshop provides hands-on experience deploying and working with several of the technologies that are used to monitor Cisco AI PODs with Splunk Observability Cloud, without requiring access to an actual Cisco AI POD. This includes:

  • Practice deploying the OpenTelemetry Collector in the Red Hat OpenShift cluster.
  • Practice adding Prometheus receivers to the collector to ingest infrastructure metrics.
  • Practice deploying the Weaviate vector database to the cluster.
  • Practice instrumenting Python services that interact with Large Language Models (LLMs) with OpenTelemetry.
  • Understand which details OpenTelemetry captures in traces from applications that interact with LLMs.

Note: the workshop setup section only needs to be executed by the workshop organizer.

Tip

The easiest way to navigate through this workshop is by using:

  • the left/right arrows (< | >) on the top right of this page
  • the left (◀️) and right (▶️) cursor keys on your keyboard
Last Modified Jan 19, 2026


Workshop Setup

This section includes the steps that the workshop organizer should follow to set up the workshop:

  • AWS account setup
  • OpenShift prerequisites
  • Deploy a Red Hat OpenShift cluster with GPU-based worker nodes using AWS ROSA.
  • Deploy the NVIDIA NIM Operator and NVIDIA GPU Operator.
  • Deploy a Large Language Model (LLM) using NVIDIA NIM to the cluster.
  • Create OpenShift logins and namespaces for each workshop user, with appropriate permissions.
  • Install the Cluster Receiver component of the Splunk OpenTelemetry Collector.
  • Deploy a Weaviate vector database to the cluster.
  • Deploy a service that mimics the Portworx Prometheus exporter.

AWS Setup

10 minutes  

Enable the Red Hat OpenShift Service in AWS

To deploy OpenShift in your AWS account, we’ll need to first enable the Red Hat OpenShift service using the AWS console.

Next, follow the instructions to connect your AWS account with your Red Hat account.

Provision an EC2 Instance

Let’s provision an EC2 instance that we’ll use to deploy the Red Hat cluster. This avoids the limitations of running the ROSA command-line interface on macOS.

We used a t3.xlarge instance running Ubuntu 24.04 LTS while creating this workshop, but a smaller instance type can also be used.

SSH into the instance once it’s up and running.

Clone the GitHub Repository

Clone the GitHub repository to your EC2 instance:

git clone https://github.com/splunk/observability-workshop.git

cd observability-workshop/workshop/cisco-ai-pods 

OpenShift Prerequisites

15 minutes  

The steps below are required before deploying the OpenShift cluster in AWS.

Create a Red Hat Login

The first thing we’ll need to do is create an account with Red Hat, which we can do by filling out the form here.

Install the AWS CLI

To install the AWS CLI on the EC2 instance provisioned previously, run the following commands:

curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
sudo apt install unzip
unzip awscliv2.zip
sudo ./aws/install

Use the following command to ensure it was installed successfully:

aws --version

It should return something like the following:

aws-cli/2.30.5 Python/3.13.7 Linux/6.14.0-1011-aws exe/x86_64.ubuntu.24

Login to your AWS account using your preferred method. Refer to the documentation for guidance. For example, you can login by running the aws configure command.

Confirm you’re logged in successfully by running a command such as aws ec2 describe-instances.

Then, verify your account identity with:

aws sts get-caller-identity

Check whether the service role for ELB (Elastic Load Balancing) exists:

aws iam get-role --role-name "AWSServiceRoleForElasticLoadBalancing"

If the role does not exist, create it by running the following command:

aws iam create-service-linked-role --aws-service-name "elasticloadbalancing.amazonaws.com"

Install the ROSA CLI

We’ll use the ROSA command-line interface (CLI) for the deployment. The instructions are based on Red Hat documentation.

You can download the latest release of the ROSA CLI for your operating system here.

Alternatively, we can use the following command to download the CLI binary directly to our EC2 instance:

curl -L -O https://mirror.openshift.com/pub/cgw/rosa/latest/rosa-linux.tar.gz

Extract the contents:

tar -xvzf rosa-linux.tar.gz

Move the resulting file (rosa) to a location that’s included in your PATH. For example:

sudo mv rosa /usr/local/bin/rosa

Log in to your Red Hat account by running the command below, then follow the instructions in the command output:

rosa login --use-device-code

Install the OpenShift CLI (oc)

We can use the following command to download the OpenShift CLI binary directly to our EC2 instance:

curl -L -O https://mirror.openshift.com/pub/openshift-v4/x86_64/clients/ocp/stable/openshift-client-linux.tar.gz

Extract the contents:

tar -xvzf openshift-client-linux.tar.gz

Move the resulting files (oc and kubectl) to a location that’s included in your PATH. For example:

sudo mv oc /usr/local/bin/oc
sudo mv kubectl /usr/local/bin/kubectl

Create Account-Wide Roles and Policies

Use the following command to create the necessary account-wide roles and policies:

rosa create account-roles --mode auto

Create an AWS VPC for ROSA HCP

We’re going to use the Hosted Control Plane (HCP) deployment option to deploy our OpenShift cluster. To do this, we’ll need to create a new VPC in our AWS account using the following command:

Note: update the region as appropriate for your environment.

rosa create network network-template --param Region=us-east-2 --param Name=rosa-network-stack --template-dir='.'

Important: make note of the subnet IDs created by this command, as you’ll need them when creating the cluster. Also make note of the CloudFormation stack name, which you’ll need later if you want to delete the network.

Note: by default, each AWS region is limited to 5 Elastic IP addresses.
If you receive the error “The maximum number of addresses has been reached.”, you’ll need to contact AWS to request an increase to this limit, or choose another AWS region in which to create the VPC for ROSA.

Create an OpenID Connect configuration

Before creating a Red Hat OpenShift Service on AWS cluster, let’s create the OpenID Connect (OIDC) configuration with the following command:

rosa create oidc-config --mode=auto --yes

Important: make note of the oidc-provider id that is created.


Deploy OpenShift Cluster in AWS

25 minutes  

Deploy an OpenShift Cluster

We’ll use the ROSA CLI to deploy an OpenShift Cluster.

First, we’ll need to set a few environment variables:

Note: be sure to fill in the Subnet IDs and OIDC ID before running the export commands below.

export CLUSTER_NAME=rosa-test
export AWS_REGION=us-east-2
export AWS_INSTANCE_TYPE=g5.4xlarge
export SUBNET_IDS=<comma separated list of subnet IDs from earlier rosa create network command>
export OIDC_ID=<the oidc-provider id returned from the rosa create oidc-config command> 
export OPERATOR_ROLES_PREFIX=rosa-test-a6x9

Create operator roles for the OIDC configuration using the following command:

Note: just accept the default values when prompted.

rosa create operator-roles --hosted-cp --prefix $OPERATOR_ROLES_PREFIX --oidc-config-id $OIDC_ID

Then we can create the cluster as follows:

rosa create cluster \
    --cluster-name $CLUSTER_NAME \
    --mode auto \
    --hosted-cp \
    --sts \
    --create-admin-user \
    --operator-roles-prefix $OPERATOR_ROLES_PREFIX \
    --oidc-config-id $OIDC_ID \
    --subnet-ids $SUBNET_IDS \
    --compute-machine-type $AWS_INSTANCE_TYPE \
    --replicas 2 \
    --region $AWS_REGION \
    --tags "splunkit_environment_type:non-prd,splunkit_data_classification:private"

Note that we’ve specified the g5.4xlarge instance type, which includes NVIDIA GPUs that we’ll use later in the workshop. This instance type is relatively expensive (about $1.64 per hour at the time of writing), and we’ve requested 2 replicas, so be mindful of how long your cluster runs: costs accumulate quickly.
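To see how quickly those costs add up, here’s a quick back-of-the-envelope calculation using the hourly rate and replica count quoted above (the rate is the value observed at the time of writing and will vary):

```python
# Rough worker-node cost estimate for the cluster created above.
# $1.64/hour per g5.4xlarge instance is the rate quoted at the time of writing.
HOURLY_RATE_USD = 1.64
REPLICAS = 2

def cluster_cost(hours: float) -> float:
    """Return the approximate worker-node cost in USD for running `hours` hours."""
    return HOURLY_RATE_USD * REPLICAS * hours

print(f"8-hour workshop day: ${cluster_cost(8):.2f}")        # → 8-hour workshop day: $26.24
print(f"Left running for a week: ${cluster_cost(24 * 7):.2f}")  # → Left running for a week: $551.04
```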

To determine when your cluster is Ready, run:

rosa describe cluster -c $CLUSTER_NAME

To watch your cluster installation logs, run:

rosa logs install -c $CLUSTER_NAME --watch

Connect to the OpenShift Cluster

Use the command below to connect the oc CLI to your OpenShift cluster:

Note: Run the rosa describe cluster -c $CLUSTER_NAME command and substitute the resulting API Server URL into the command below before running it. For example, the server name might be something like https://api.rosa-test.aaa.bb.openshiftapps.com:443.

oc login <API Server URL> -u cluster-admin

Once connected to your cluster, confirm that the nodes are up and running:

oc get nodes

NAME                                       STATUS   ROLES    AGE   VERSION
ip-10-0-1-184.us-east-2.compute.internal   Ready    worker   14m   v1.31.11
ip-10-0-1-50.us-east-2.compute.internal    Ready    worker   20m   v1.31.11

Deploy the NVIDIA NIM Operator

20 minutes  

The NVIDIA GPU Operator is a Kubernetes Operator that automates the deployment, configuration, and management of all necessary NVIDIA software components to provision GPUs within a Kubernetes cluster.

The NVIDIA NIM Operator is used to deploy LLMs in Kubernetes environments, such as the OpenShift cluster we created earlier in this workshop.

This section of the workshop walks through the steps necessary to deploy both the NVIDIA GPU and NIM operators in our OpenShift cluster.

Create a NVIDIA NGC Account

An NVIDIA GPU CLOUD (NGC) account is required to download LLMs and deploy them using the NVIDIA NIM operator. You can register here to create an account.

Register with the NVIDIA Developer Program

Registering with the NVIDIA Developer Program allows us to get access to NVIDIA NIM, which we’ll use later in the workshop to deploy LLMs.

Ensure that NVIDIA Developer Program appears on your list of NVIDIA subscriptions in NGC:

(Screenshot: NVIDIA Subscriptions)

Generate an NGC API Key

Once you’re logged in to the NGC website, click on your user account icon on the top-right corner of the screen and select Setup.

Then click Generate API Key and follow the instructions. Ensure the key is associated with the NGC Catalog and Secrets Manager services.

Save the generated key in a safe place as we’ll use it later in the workshop.

Refer to NVIDIA Documentation for further details on generating an NGC API key.

Install the Node Feature Discovery Operator

The steps in this section are based on Installing the NFD Operator using the CLI.

Run the following script to install the Node Feature Discovery Operator:

cd nvidia
./install-nfd-operator.sh

To verify that the Operator deployment is successful, run:

oc get pods
NAME                                      READY   STATUS    RESTARTS   AGE
nfd-controller-manager-7f86ccfb58-vgr4x   2/2     Running   0          10m

Create a NodeFeatureDiscovery CR

The steps in this section are based on Creating a NodeFeatureDiscovery CR by using the CLI.

Run the following script to create the Node Feature Discovery CR:

./create-nfd-cr.sh

Install the NVIDIA GPU Operator

The steps in this section are based on Installing the NVIDIA GPU Operator on OpenShift.

Run the following script to install the NVIDIA GPU Operator:

./install-nvidia-gpu-operator.sh

Wait until the install plan has been created:

oc get installplan -n nvidia-gpu-operator
NAME            CSV                              APPROVAL   APPROVED
install-mmlxq   gpu-operator-certified.v25.3.4   Manual     false

Approve the install plan with the following commands:

INSTALL_PLAN=$(oc get installplan -n nvidia-gpu-operator -oname)
oc patch $INSTALL_PLAN -n nvidia-gpu-operator --type merge --patch '{"spec":{"approved":true }}'

You should see output similar to:

installplan.operators.coreos.com/install-rc9xq patched

Create the Cluster Policy

The steps in this section are based on Create the cluster policy using the CLI.

./create-cluster-policy.sh

Verify the NVIDIA GPU Operator Installation

Verify the successful installation of the NVIDIA GPU Operator using the following command:

oc get pods,daemonset -n nvidia-gpu-operator
NAME                                                      READY   STATUS      RESTARTS      AGE
pod/gpu-feature-discovery-sblkn                           1/1     Running     0             5m5s
pod/gpu-feature-discovery-zpt94                           1/1     Running     0             4m58s
pod/gpu-operator-6579bc6fdc-cp28l                         1/1     Running     0             23m
pod/nvidia-container-toolkit-daemonset-qfcl9              1/1     Running     0             5m5s
pod/nvidia-container-toolkit-daemonset-zbwb6              1/1     Running     0             4m59s
pod/nvidia-cuda-validator-f7tl2                           0/1     Completed   0             78s
pod/nvidia-cuda-validator-t7n9g                           0/1     Completed   0             71s
pod/nvidia-dcgm-exporter-gk66x                            1/1     Running     0             4m59s
pod/nvidia-dcgm-exporter-w8kr8                            1/1     Running     2 (52s ago)   5m5s
pod/nvidia-dcgm-lrnzr                                     1/1     Running     0             4m58s
pod/nvidia-dcgm-tvrdm                                     1/1     Running     0             5m5s
pod/nvidia-device-plugin-daemonset-d62nk                  1/1     Running     0             5m5s
pod/nvidia-device-plugin-daemonset-fnv4j                  1/1     Running     0             4m59s
pod/nvidia-driver-daemonset-418.94.202509100653-0-5xbvq   2/2     Running     0             5m48s
pod/nvidia-driver-daemonset-418.94.202509100653-0-hmkdl   2/2     Running     0             5m48s
pod/nvidia-node-status-exporter-2kqwr                     1/1     Running     0             5m44s
pod/nvidia-node-status-exporter-n8d9s                     1/1     Running     0             5m44s
pod/nvidia-operator-validator-r2nm2                       1/1     Running     0             5m5s
pod/nvidia-operator-validator-w2fpn                       1/1     Running     0             4m59s

NAME                                                           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                                                                                                         AGE
daemonset.apps/gpu-feature-discovery                           2         2         2       2            2           nvidia.com/gpu.deploy.gpu-feature-discovery=true                                                                      5m45s
daemonset.apps/nvidia-container-toolkit-daemonset              2         2         2       2            2           nvidia.com/gpu.deploy.container-toolkit=true                                                                          5m48s
daemonset.apps/nvidia-dcgm                                     2         2         2       2            2           nvidia.com/gpu.deploy.dcgm=true                                                                                       5m46s
daemonset.apps/nvidia-dcgm-exporter                            2         2         2       2            2           nvidia.com/gpu.deploy.dcgm-exporter=true                                                                              5m46s
daemonset.apps/nvidia-device-plugin-daemonset                  2         2         2       2            2           nvidia.com/gpu.deploy.device-plugin=true                                                                              5m47s
daemonset.apps/nvidia-device-plugin-mps-control-daemon         0         0         0       0            0           nvidia.com/gpu.deploy.device-plugin=true,nvidia.com/mps.capable=true                                                  5m47s
daemonset.apps/nvidia-driver-daemonset-418.94.202509100653-0   2         2         2       2            2           feature.node.kubernetes.io/system-os_release.OSTREE_VERSION=418.94.202509100653-0,nvidia.com/gpu.deploy.driver=true   5m48s
daemonset.apps/nvidia-mig-manager                              0         0         0       0            0           nvidia.com/gpu.deploy.mig-manager=true                                                                                5m45s
daemonset.apps/nvidia-node-status-exporter                     2         2         2       2            2           nvidia.com/gpu.deploy.node-status-exporter=true                                                                       5m44s
daemonset.apps/nvidia-operator-validator                       2         2         2       2            2           nvidia.com/gpu.deploy.operator-validator=true                                                                         5m48s

Install the Operator SDK

The steps in this section are based on Install from GitHub release.

Download the release binary

Set platform information:

export ARCH=$(case $(uname -m) in x86_64) echo -n amd64 ;; aarch64) echo -n arm64 ;; *) echo -n $(uname -m) ;; esac)
export OS=$(uname | awk '{print tolower($0)}')
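The case/awk mapping above can be sketched in Python; the fallback mirrors the `*)` branch, which passes unrecognized machine types through unchanged:

```python
def release_arch(machine: str) -> str:
    """Map `uname -m` output to the architecture label used in release artifact names."""
    return {"x86_64": "amd64", "aarch64": "arm64"}.get(machine, machine)

def release_os(kernel: str) -> str:
    """Map `uname` output (e.g. 'Linux', 'Darwin') to lowercase, as the awk step does."""
    return kernel.lower()

print(release_arch("x86_64"), release_os("Linux"))  # → amd64 linux
```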

Download the binary for your platform:

export OPERATOR_SDK_DL_URL=https://github.com/operator-framework/operator-sdk/releases/download/v1.41.1
curl -LO ${OPERATOR_SDK_DL_URL}/operator-sdk_${OS}_${ARCH}

Verify the downloaded binary

Import the operator-sdk release GPG key from keyserver.ubuntu.com:

gpg --keyserver keyserver.ubuntu.com --recv-keys 052996E2A20B5C7E

Download the checksums file and its signature, then verify the signature:

curl -LO ${OPERATOR_SDK_DL_URL}/checksums.txt
curl -LO ${OPERATOR_SDK_DL_URL}/checksums.txt.asc
gpg -u "Operator SDK (release) <cncf-operator-sdk@cncf.io>" --verify checksums.txt.asc

You should see something similar to the following:

gpg: assuming signed data in 'checksums.txt'
gpg: Signature made Fri 30 Oct 2020 12:15:15 PM PDT
gpg:                using RSA key ADE83605E945FA5A1BD8639C59E5B47624962185
gpg: Good signature from "Operator SDK (release) <cncf-operator-sdk@cncf.io>" [ultimate]

Make sure the checksums match:

grep operator-sdk_${OS}_${ARCH} checksums.txt | sha256sum -c -

You should see something similar to the following:

operator-sdk_linux_amd64: OK
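The `grep … | sha256sum -c -` pipeline above looks up the binary’s entry in checksums.txt and compares digests. A minimal Python sketch of the same verification logic; the file created here is a throwaway stand-in, not the real binary:

```python
import hashlib
import os
import tempfile

def sha256_of(path: str) -> str:
    """Compute the hex SHA-256 digest of a file, reading in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def verify(path: str, checksums: dict[str, str]) -> bool:
    """Return True if the file's digest matches its entry in the checksums map."""
    expected = checksums.get(os.path.basename(path))
    return expected is not None and expected == sha256_of(path)

# Demo with a temporary file standing in for the downloaded binary.
with tempfile.TemporaryDirectory() as d:
    binary = os.path.join(d, "operator-sdk_linux_amd64")
    with open(binary, "wb") as f:
        f.write(b"example binary contents")
    checksums = {"operator-sdk_linux_amd64": sha256_of(binary)}
    ok = verify(binary, checksums)
    tampered = verify(binary, {"operator-sdk_linux_amd64": "0" * 64})

print(ok, tampered)  # → True False
```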

Install the release binary in your PATH

chmod +x operator-sdk_${OS}_${ARCH} && sudo mv operator-sdk_${OS}_${ARCH} /usr/local/bin/operator-sdk

Install the NGC CLI

The steps in this section are based on NGC CLI Install.

Click Download CLI to download the zip file that contains the binary, transfer it to a directory where you have permissions, and then unzip and run the binary. Alternatively, you can download, unzip, and install from the command line: move to a directory where you have execute permissions and run the following command:

wget --content-disposition https://api.ngc.nvidia.com/v2/resources/nvidia/ngc-apps/ngc_cli/versions/4.3.0/files/ngccli_linux.zip -O ngccli_linux.zip && unzip ngccli_linux.zip

Check the binary’s md5 hash to ensure the file wasn’t corrupted during download:

find ngc-cli/ -type f -exec md5sum {} + | LC_ALL=C sort | md5sum -c ngc-cli.md5

Next, check the zip file’s SHA256 hash to ensure it wasn’t corrupted during download. Run the following command:

sha256sum ngccli_linux.zip

Compare with the following value, which can also be found in the Release Notes of the Resource:

5f01eff85a66c895002f3c87db2933c462f3b86e461e60d515370f647b4ffc21

After verifying the value, make the NGC CLI binary executable and add its directory to your PATH:

chmod u+x ngc-cli/ngc
echo "export PATH=\"\$PATH:$(pwd)/ngc-cli\"" >> ~/.bash_profile && source ~/.bash_profile

Next, configure the NGC CLI for your use. Enter the following command, providing your API key when prompted:

ngc config set

Define an environment variable with your NGC API key:

export NGC_API_KEY=<your NGC API key> 

Install the NVIDIA NIM Operator

The steps in this section are based on Installing NIM Operator on Red Hat OpenShift Using operator-sdk (for Development-Only).

Run the following script to install the NIM operator:

./install-nim-operator.sh

Confirm the controller pod is running:

oc get pods -n nvidia-nim-operator
NAME                                                              READY   STATUS      RESTARTS   AGE
ec60a4439c710b89fc2582f5384382b4241f9aee62bb3182b8d128e69dx54dc   0/1     Completed   0          61s
ghcr-io-nvidia-k8s-nim-operator-bundle-latest-main                1/1     Running     0          71s
k8s-nim-operator-86d478b55c-w5cf5                                 1/1     Running     0          50s

Deploy an LLM

20 minutes  

In this section, we’ll use the NVIDIA NIM Operator to deploy two Large Language Models to our OpenShift Cluster.

Create a Namespace

oc create namespace nim-service

Add Secrets with NGC API Key

Add a Docker registry secret for downloading container images from NVIDIA NGC:

oc create secret -n nim-service docker-registry ngc-secret \
    --docker-server=nvcr.io \
    --docker-username='$oauthtoken' \
    --docker-password=$NGC_API_KEY

Add a generic secret that model puller containers use to download the model from NVIDIA NGC:

oc create secret -n nim-service generic ngc-api-secret \
    --from-literal=NGC_API_KEY=$NGC_API_KEY

Deploy an LLM

Run the following command to create the NIMCache and NIMService:

oc apply -n nim-service -f nvidia-llm.yaml

Confirm that the Persistent Volume was created and that the Persistent Volume Claim was bound to it successfully:

Note: this can take several minutes to occur

oc get pv,pvc -n nim-service
NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                                   STORAGECLASS   VOLUMEATTRIBUTESCLASS   REASON   AGE
persistentvolume/pvc-1af12c04-29ad-497f-b018-7d9a3aea3019   100Gi      RWO            Delete           Bound    openshift-monitoring/prometheus-data-prometheus-k8s-1   gp3-csi        <unset>                          4h15m
persistentvolume/pvc-9c389d79-13fb-4169-9d99-a77efd6e7919   100Gi      RWO            Delete           Bound    openshift-monitoring/prometheus-data-prometheus-k8s-0   gp3-csi        <unset>                          4h15m
persistentvolume/pvc-a603b8a7-1445-4b03-945a-3ed68338834c   50Gi       RWO            Delete           Bound    nim-service/meta-llama-3-2-1b-instruct-pvc              gp3-csi        <unset>                          114s

NAME                                                   STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
persistentvolumeclaim/meta-llama-3-2-1b-instruct-pvc   Bound    pvc-a603b8a7-1445-4b03-945a-3ed68338834c   50Gi       RWO            gp3-csi        <unset>                 7m8s

Confirm that the NIMCache is Ready:

oc get nimcache.apps.nvidia.com -n nim-service
NAME                         STATUS   PVC                              AGE
meta-llama-3-2-1b-instruct   Ready    meta-llama-3-2-1b-instruct-pvc   9m50s

Confirm that the NIMService is Ready:

oc get nimservices.apps.nvidia.com -n nim-service
NAME                         STATUS   AGE
meta-llama-3-2-1b-instruct   Ready    11m

Test the LLM

Let’s ensure the LLM is working as expected.

Start a pod that has access to the curl command:

oc run --rm -it -n default curl --image=curlimages/curl:latest -- sh

Then run the following command to send a prompt to the LLM:

curl -X "POST" \
 'http://meta-llama-3-2-1b-instruct.nim-service:8000/v1/chat/completions' \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
        "model": "meta/llama-3.2-1b-instruct",
        "messages": [
        {
          "content":"What is the capital of Canada?",
          "role": "user"
        }],
        "top_p": 1,
        "n": 1,
        "max_tokens": 1024,
        "stream": false,
        "frequency_penalty": 0.0,
        "stop": ["STOP"]
      }'
{
  "id": "chatcmpl-2ccfcd75a0214518aab0ef0375f8ca21",
  "object": "chat.completion",
  "created": 1758919002,
  "model": "meta/llama-3.2-1b-instruct",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "reasoning_content": null,
        "content": "The capital of Canada is Ottawa.",
        "tool_calls": []
      },
      "logprobs": null,
      "finish_reason": "stop",
      "stop_reason": null
    }
  ],
  "usage": {
    "prompt_tokens": 42,
    "total_tokens": 50,
    "completion_tokens": 8,
    "prompt_tokens_details": null
  },
  "prompt_logprobs": null
}
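The curl request above can also be issued from Python. Here’s a sketch that builds the same request body using only the standard library; the commented-out send step assumes you’re running from a pod inside the cluster, where the service DNS name resolves:

```python
import json

# In-cluster service URL from the curl example above.
NIM_URL = "http://meta-llama-3-2-1b-instruct.nim-service:8000/v1/chat/completions"

def build_chat_request(prompt: str, model: str = "meta/llama-3.2-1b-instruct") -> dict:
    """Build the same JSON body the curl example sends to the NIM chat endpoint."""
    return {
        "model": model,
        "messages": [{"content": prompt, "role": "user"}],
        "top_p": 1,
        "n": 1,
        "max_tokens": 1024,
        "stream": False,
        "frequency_penalty": 0.0,
        "stop": ["STOP"],
    }

body = json.dumps(build_chat_request("What is the capital of Canada?"))
print(body[:60])

# To actually send the request from inside the cluster (sketch):
# import urllib.request
# req = urllib.request.Request(NIM_URL, data=body.encode(),
#                              headers={"Content-Type": "application/json"})
# resp = json.load(urllib.request.urlopen(req))
# print(resp["choices"][0]["message"]["content"])
```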

Deploy an Embeddings Model

We’re also going to deploy an embeddings model in our cluster, which will be used later in the workshop to implement Retrieval Augmented Generation (RAG).

Run the following command to deploy the embeddings model:

oc apply -n nim-service -f nvidia-embeddings.yaml

Confirm that the NIMService is Ready:

oc get nimservices.apps.nvidia.com llama-32-nv-embedqa-1b-v2 -n nim-service
NAME                        STATUS   AGE
llama-32-nv-embedqa-1b-v2   Ready    82s

Test the Embeddings Model

Let’s ensure the embeddings model is working as expected.

Start a pod that has access to the curl command:

oc run --rm -it -n default curl --image=curlimages/curl:latest -- sh

Then run the following command to send a request to the embeddings model:

  curl -X POST http://llama-32-nv-embedqa-1b-v2.nim-service:8000/v1/embeddings \
  -H 'Accept: application/json' \
  -H "Content-Type: application/json" \
  -d '{
    "input": ["What is the capital of France?"],
    "model": "nvidia/llama-3.2-nv-embedqa-1b-v2",
    "input_type": "query",
    "encoding_format": "float",
    "truncate": "NONE"
  }'
{"object":"list","data":[{"index":0,"embedding":[-0.016632080078125,0.041259765625,-0.0156707763671875,0.032379150390625,0.045074462890625,0.0169830322265625,-0.03546142578125,-0.0003402233123779297,-0.038909912109375,-0.0023651123046875, ... (embedding output truncated for readability)
0.01529693603515625,-0.007190704345703125,0.00637054443359375,-0.004749298095703125,-0.0217132568359375,-0.0093841552734375,-0.0335693359375,-0.0017490386962890625,0.0081939697265625,0.0247802734375,0.0148468017578125,0.026763916015625,0.002079010009765625,0.0292816162109375,0.04705810546875,0.02166748046875,-0.0120697021484375,0.01050567626953125,0.0131988525390625,0.0169525146484375,0.0291595458984375,-0.00270843505859375,-0.0095062255859375,-0.0211944580078125,-0.035980224609375,0.006805419921875,0.002735137939453125,0.043731689453125,-0.01515960693359375,0.0010576248168945312,-0.00913238525390625,0.001293182373046875,-0.00027489662170410156,-0.00868988037109375,0.007389068603515625,0.0023212432861328125,-0.01528167724609375,0.017852783203125,-0.03643798828125,0.045623779296875,-0.0030364990234375,-0.0271453857421875,0.0268402099609375,-0.0033473968505859375,0.0186920166015625,-0.0225067138671875,0.0125732421875,-0.01386260986328125,-0.0218658447265625,0.01248931884765625,0.025848388671875,0.021453857421875,0.008056640625,0.025421142578125,0.01224517822265625,0.0208740234375,-0.003856658935546875,-0.021209716796875,-0.00545501708984375,-0.0254058837890625,0.04388427734375,0.0204315185546875,-0.0072174072265625,-0.0110626220703125,0.0007481575012207031,-0.0022411346435546875,-0.046905517578125,-0.028472900390625,0.0196533203125,0.014129638671875,0.0130615234375,-0.01288604736328125,-0.03607177734375,-0.01568603515625,-0.00814056396484375,-0.01499176025390625,0.0112152099609375,-0.00360870361328125,0.024688720703125,-0.0189361572265625,-0.007122039794921875,0.00634002685546875,-0.00626373291015625,-0.000766754150390625,0.0193939208984375,-0.002841949462890625,0.041717529296875,-0.00016701221466064453,-0.043365478515625,-0.023773193359375,0.0283660888671875,0.0245208740234375,-0.055450439453125,0.01096343994140625,-0.0180511474609375,0.0189056396484375,0.0164947509765625,-0.033111572265625,0.0262603759765625,0.0294189453125,0.00084686279296875,0.0279388427734375,-0.
003910064697265625,0.002910614013671875,0.00890350341796875,-0.033843994140625,0.004856109619140625,0.00033974647521972656,-0.056549072265625,-0.0110626220703125,-0.0178375244140625,0.006381988525390625,0.018798828125,0.0205230712890625,-0.05609130859375,-0.01023101806640625,-0.001201629638671875,-0.02227783203125,0.01910400390625,0.006931304931640625,0.0017032623291015625,-0.01849365234375,-0.0249786376953125,-0.0176849365234375,0.007389068603515625,-0.01025390625,0.036407470703125,-0.0275421142578125,0.021514892578125,-0.0198822021484375,-0.0189056396484375,-0.0156402587890625,0.01025390625,0.02197265625,-0.007740020751953125,-0.034515380859375,0.0011262893676757812,0.024566650390625,0.0229339599609375,0.004810333251953125,-0.01171875,-0.0238189697265625,0.021392822265625,0.0008301734924316406,0.019378662109375,-0.00894927978515625,-0.01496124267578125,0.01558685302734375,-0.0229339599609375,0.00020587444305419922,-0.0202178955078125,0.0298919677734375,0.00969696044921875,-0.0011949539184570312,-0.007144927978515625,-0.0198211669921875,0.0030422210693359375,-0.037811279296875,-0.039306640625,-0.027587890625,-0.0274810791015625,0.025390625,-0.0333251953125,-0.0062103271484375,-0.016876220703125,0.002651214599609375,-0.0020275115966796875,0.042144775390625,0.013092041015625,0.01690673828125,0.0268707275390625,0.0082244873046875,0.066650390625,0.0053253173828125,0.08526611328125,-0.0146331787109375,-0.0261688232421875,-0.04266357421875,0.004474639892578125,-0.005229949951171875,-0.01806640625,0.00479888916015625,0.00183868408203125,-0.01030731201171875,0.0028285980224609375,-0.0239410400390625,0.0166778564453125,0.0006723403930664062,-0.00923919677734375,-0.00504302978515625,0.0159759521484375,-0.0248260498046875,0.03179931640625,-0.01517486572265625,-0.0006771087646484375,-0.0117645263671875,0.016510009765625,0.00168609619140625,-0.016387939453125,0.0421142578125,-0.00951385498046875,-0.00388336181640625,-0.04559326171875,-0.0194091796875,0.043853759765625,-0.007541
656494140625,0.0275421142578125,-0.005645751953125,0.003803253173828125,-0.01438140869140625,0.018218994140625,-0.006381988525390625,-0.012664794921875,-0.011962890625,0.035186767578125,0.0225067138671875,-0.005321502685546875,-0.007659912109375,0.0022792816162109375,-0.00830078125,-0.0092926025390625,-0.0278778076171875,-0.00011402368545532227,0.0027523040771484375,0.0082855224609375,0.0175933837890625,0.0029430389404296875,0.0721435546875,0.01525115966796875,-0.059967041015625,-0.0626220703125,0.0222625732421875,-0.05810546875,-0.01192474365234375,-0.0056610107421875,0.0173492431640625,-0.0008497238159179688,-0.01050567626953125,-0.01558685302734375,0.0032196044921875,0.00745391845703125,-0.05029296875,0.00310516357421875,0.0333251953125,-0.01166534423828125,-0.0347900390625,-0.00830078125,0.01305389404296875,0.01030731201171875,0.017730712890625,-0.007415771484375,-0.00287628173828125,0.01197052001953125,-0.004016876220703125,-0.038421630859375,0.000743865966796875,-0.006237030029296875,0.0511474609375,-0.003826141357421875,-0.00838470458984375,-0.007572174072265625,0.00522613525390625,0.01514434814453125,0.00557708740234375,-0.035186767578125,0.0077056884765625,-0.0330810546875,-0.0043487548828125,-0.0307464599609375,-0.00670623779296875,0.01395416259765625,-0.0247039794921875,-0.03399658203125,0.0176849365234375,-0.00827789306640625,-0.0132293701171875,0.011016845703125,0.00740814208984375,-0.022735595703125,0.01110076904296875,-0.0127105712890625,-0.01074981689453125,-0.04150390625,-0.05438232421875,-0.0014743804931640625,-0.00507354736328125,-0.05291748046875,-0.0126800537109375,0.032135009765625,0.0266571044921875,-0.0240020751953125,-0.0033702850341796875,0.0021076202392578125,0.0206756591796875,0.01454925537109375,-0.00954437255859375,0.0178680419921875,0.004734039306640625,-0.0014028549194335938,0.0109710693359375,-0.0200042724609375,-0.030029296875,0.04022216796875,-0.0190887451171875,0.028594970703125,0.0205841064453125,-0.0028095245361328125,0.00242424
01123046875,-0.0151214599609375,0.0025386810302734375,-0.006633758544921875,0.01265716552734375,-0.019073486328125,0.0030384063720703125,-0.024871826171875,-0.01148223876953125,0.00914764404296875,-0.004367828369140625,-0.0186920166015625,0.021514892578125,-0.027435302734375,0.00736236572265625,0.037872314453125,-0.00222015380859375,0.0041351318359375,-0.0224151611328125,-0.0255279541015625,0.03271484375,-0.0242919921875,0.0097198486328125,-0.02008056640625,-0.01003265380859375,-0.0215606689453125,-0.00974273681640625,-0.0428466796875,-0.0343017578125,-0.0006017684936523438,-0.0230865478515625,0.020782470703125,0.01134490966796875,0.0107421875,-0.0165863037109375,-0.0043487548828125,0.0165252685546875,0.0276947021484375,0.0051116943359375,0.03497314453125,0.0288848876953125,0.0205230712890625,-0.0099029541015625,0.0014505386352539062,-0.045074462890625,-0.0226898193359375,0.002422332763671875,0.0013151168823242188,-0.0031642913818359375,-0.0247344970703125,0.013885498046875,-0.002410888671875,0.046051025390625,0.0328369140625,0.04193115234375,0.006710052490234375,-0.004138946533203125,-0.031768798828125,0.024658203125,0.00417327880859375,-0.01116943359375,0.0097198486328125,-0.021270751953125,0.0285491943359375,0.02581787109375,0.0167083740234375,0.0206298828125,0.009185791015625,0.00794219970703125,-0.0022792816162109375,0.004337310791015625,-0.01166534423828125,-0.01227569580078125,0.00905609130859375,0.0156707763671875,-0.04217529296875,0.025054931640625,-0.01058197021484375,0.0171356201171875,0.001369476318359375,0.003917694091796875,-0.00817108154296875,0.026123046875,0.0200042724609375,-0.0294189453125,0.032440185546875,-0.0297393798828125,-0.0109100341796875,-0.00856781005859375,0.0034465789794921875,0.0186920166015625,0.0199737548828125,-0.03558349609375,-0.025146484375,-0.009307861328125,0.0081024169921875,0.0131378173828125,0.0117340087890625,0.0063018798828125,0.0000546574592590332,0.01898193359375,-0.0167694091796875,0.01666259765625,0.0374755859375,0.02
374267578125,-0.0103912353515625,0.01207733154296875,-0.032989501953125,-0.004108428955078125,-0.0026798248291015625,0.01166534423828125,0.0257568359375,-0.056732177734375,0.0282745361328125,-0.0034351348876953125,-0.007415771484375,0.0081634521484375,0.029998779296875,0.0019369125366210938,-0.0014734268188476562,0.004573822021484375,0.04296875,0.025665283203125,-0.0121307373046875,0.029266357421875,0.016815185546875,-0.002536773681640625,-0.015045166015625,-0.0211334228515625,0.0020351409912109375,0.008087158203125,-0.004528045654296875,-0.0172882080078125,0.023712158203125,0.0305633544921875,0.0213470458984375,-0.0154266357421875,-0.035675048828125,0.0002543926239013672,0.01149749755859375,0.00833892822265625,0.01506805419921875,0.019500732421875,-0.01265716552734375,0.01947021484375,0.0242767333984375,-0.017486572265625,-0.01294708251953125,-0.012603759765625,-0.0093994140625,-0.00226593017578125,0.020355224609375,-0.0369873046875,0.0166168212890625,0.034332275390625,-0.0240631103515625,-0.03558349609375,0.036376953125,-0.009246826171875,0.0041656494140625,0.0439453125,-0.023284912109375,0.004749298095703125,-0.0232391357421875,-0.0105743408203125,-0.01030731201171875,-0.01318359375,0.0220184326171875,0.005840301513671875,0.0217437744140625,-0.01007080078125,0.01398468017578125,0.0019063949584960938,-0.011383056640625,-0.00424957275390625,-0.0208282470703125,0.012237548828125,0.01526641845703125,0.00959014892578125,0.027191162109375,0.001735687255859375,0.0177154541015625,-0.01139068603515625,0.0218963623046875,0.03814697265625,-0.018951416015625,0.011016845703125,-0.01287078857421875,0.046875,-0.007415771484375,0.01198577880859375,-0.02532958984375,0.00311279296875,0.018524169921875,0.005390167236328125,-0.01435089111328125,0.0018949508666992188,0.0421142578125,0.0045928955078125,-0.006099700927734375,0.007049560546875,0.00502777099609375,-0.00963592529296875,0.00894927978515625,-0.034515380859375,-0.0035114288330078125,-0.0142974853515625,-0.034515380859375,-0.
02142333984375,0.017608642578125,-0.014892578125,-0.01244354248046875,-0.017486572265625,0.00013899803161621094,0.00011283159255981445,-0.00756072998046875,-0.0132293701171875,0.0108489990234375,0.0305328369140625,-0.001163482666015625,-0.002880096435546875,-0.0007386207580566406,0.00370025634765625,0.00797271728515625,-0.010528564453125,-0.0073089599609375,-0.0279693603515625,-0.01343536376953125,-0.005908966064453125,-0.0003764629364013672,0.053955078125,0.0237884521484375,-0.053497314453125,-0.01165771484375,-0.037628173828125,0.0099639892578125,-0.02386474609375,0.032958984375,0.0239715576171875,0.0016231536865234375,-0.033111572265625,0.0007448196411132812,0.0245819091796875,-0.0094757080078125,-0.03131103515625,-0.02459716796875,0.021453857421875,0.01398468017578125,-0.0017442703247070312,0.054107666015625,0.0193328857421875,0.0057373046875,0.03485107421875,0.0258636474609375,0.004131317138671875,-0.02239990234375,-0.002368927001953125,0.01102447509765625,-0.017181396484375,0.01454925537109375,-0.0119781494140625,-0.0017871856689453125,-0.0166778564453125,0.008544921875,-0.0135345458984375,-0.03192138671875,0.0030956268310546875,-0.0279083251953125,0.0235595703125,-0.017974853515625,0.0108184814453125,0.0031032562255859375,-0.003093719482421875,-0.014129638671875,0.01361083984375,-0.03619384765625,-0.00826263427734375,0.033477783203125,-0.004150390625,0.0157012939453125,0.0011501312255859375,0.059844970703125,-0.01555633544921875,0.031219482421875,0.0177001953125,-0.0307464599609375,0.01264190673828125,0.0291290283203125,0.01045989990234375,-0.0097503662109375,0.01226806640625,0.00598907470703125,0.01849365234375,-0.02801513671875,-0.0112152099609375,-0.006011962890625,-0.006664276123046875,0.00928497314453125,0.0002186298370361328,-0.0012874603271484375,-0.0233001708984375,-0.0065155029296875,-0.0220947265625,-0.00310516357421875,0.049041748046875,-0.04925537109375,0.0262451171875,-0.0028095245361328125,-0.0091400146484375,0.0240631103515625,-0.00286483764648
4375,0.0120391845703125,-0.021942138671875,0.0347900390625,0.023834228515625,-0.0134429931640625,0.00028228759765625,0.0277557373046875,0.03082275390625,0.006237030029296875,-0.015350341796875,-0.005039215087890625,0.0145416259765625,0.01226806640625,-0.01474761962890625,-0.004917144775390625,-0.005733489990234375,-0.010986328125,0.0223236083984375,0.0224609375,-0.035736083984375,-0.008544921875,-0.0009150505065917969,-0.0119476318359375,0.0178070068359375,-0.005352020263671875,-0.01558685302734375,-0.0208740234375,-0.0160675048828125,0.0069122314453125,-0.0357666015625,0.01319122314453125,-0.00457000732421875,0.00502777099609375,-0.0006170272827148438,0.0032196044921875,-0.008209228515625,0.0026721954345703125,-0.022705078125,0.01666259765625,-0.0217132568359375,-0.024017333984375,-0.00527191162109375,0.0005908012390136719,0.0028228759765625,-0.0205841064453125,-0.05108642578125,0.02947998046875,-0.00861358642578125,-0.035552978515625,-0.0090484619140625,-0.044464111328125,-0.0284881591796875,0.004901885986328125,0.00669097900390625,0.020538330078125,0.01218414306640625,0.01477813720703125,0.0011930465698242188,0.027587890625,-0.037811279296875,0.0273284912109375,-0.0006680488586425781,0.0179901123046875,0.047393798828125,0.033355712890625,-0.018646240234375,-0.031585693359375,-0.0190887451171875,0.0059051513671875,-0.005916595458984375,0.0247802734375,0.00881195068359375,-0.004108428955078125,-0.0091552734375,0.021697998046875,-0.0207061767578125,0.0207977294921875,-0.048095703125,-0.01544189453125,0.015533447265625,0.0228424072265625,0.0255126953125,-0.0172119140625,-0.0450439453125,0.0005936622619628906,0.0027103424072265625,0.03704833984375,-0.018218994140625,-0.00972747802734375,0.0067901611328125,-0.000598907470703125,-0.00482940673828125,-0.00786590576171875,0.0011510848999023438,0.0364990234375,-0.0128631591796875,-0.0198822021484375,0.0000896453857421875,-0.022735595703125,0.01479339599609375,-0.0034351348876953125,0.0120086669921875,0.0070037841796875,-0.
01971435546875,0.04010009765625,0.0034389495849609375,-0.0109100341796875,0.01395416259765625,0.03509521484375,0.01096343994140625,-0.0209808349609375,-0.0009293556213378906,-0.00043487548828125,0.005519866943359375,-0.016448974609375,0.032470703125,0.0284881591796875,0.0144195556640625,-0.0307464599609375,0.0217437744140625,-0.0303497314453125,-0.05926513671875,0.01444244384765625,-0.01264190673828125,0.040313720703125,-0.012603759765625,-0.0178375244140625,-0.04339599609375,0.01222991943359375,-0.0025005340576171875,-0.010406494140625,-0.003086090087890625,-0.0214385986328125,0.01045989990234375,0.005886077880859375,-0.0175933837890625,0.04840087890625,-0.0168914794921875,0.01800537109375,-0.01354217529296875,-0.01383209228515625,0.04083251953125,0.034271240234375,0.021514892578125,0.04022216796875,0.0231781005859375,-0.01110076904296875,-0.0224151611328125,0.0021991729736328125,-0.01206207275390625,-0.01557159423828125,0.0548095703125,0.02618408203125,0.023956298828125,-0.00994110107421875,-0.004299163818359375,0.007030487060546875,-0.0113372802734375,0.0140228271484375,-0.01084136962890625,0.010711669921875,-0.0236358642578125,0.01776123046875,0.04461669921875,-0.0460205078125,-0.012969970703125,0.0078277587890625,-0.040313720703125,-0.004344940185546875,-0.00681304931640625,-0.00937652587890625,0.00601959228515625,-0.0086669921875,0.038238525390625,-0.00726318359375,-0.00667572021484375,-0.0282745361328125,-0.01448822021484375,-0.004566192626953125,0.002193450927734375,0.0408935546875,-0.018951416015625,-0.0347900390625,-0.0038661956787109375,0.0011167526245117188,0.00603485107421875,0.004985809326171875,0.004299163818359375,0.009552001953125,-0.04736328125,0.018310546875,0.004238128662109375,0.028839111328125,-0.02349853515625,0.00798797607421875,0.021270751953125,-0.01384735107421875,-0.02392578125,0.03662109375,0.0032825469970703125,0.056182861328125,-0.007129669189453125,-0.0014019012451171875,0.030426025390625,-0.017974853515625,-0.0118560791015625,0.01048
27880859375,-0.0132293701171875,0.01959228515625,-0.0006871223449707031,-0.038055419921875,0.03125,0.01332855224609375,0.0675048828125,0.0005002021789550781,0.0117950439453125,0.0179901123046875,-0.0034618377685546875,-0.029205322265625,0.0136871337890625,-0.01409149169921875,-0.020111083984375,-0.06976318359375,-0.03985595703125,-0.020965576171875,0.002532958984375,-0.000797271728515625,0.00029206275939941406,-0.04278564453125,0.01293182373046875,-0.0178375244140625,-0.01496124267578125,-0.0289154052734375,-0.00551605224609375,-0.0135498046875,-0.0019350051879882812,-0.0008111000061035156,0.032958984375,0.005794525146484375,-0.00988006591796875,0.0147247314453125,0.0008878707885742188,-0.0347900390625,0.04827880859375,0.03656005859375,0.0005245208740234375,0.0078887939453125,0.0218048095703125,0.0177764892578125,0.02093505859375,-0.028656005859375,0.0273284912109375,-0.038818359375,0.01300811767578125,0.0174102783203125,0.01216888427734375,-0.0258941650390625,0.028778076171875,-0.024658203125,0.00337982177734375,-0.00594329833984375,-0.00948333740234375,0.036773681640625,-0.006595611572265625,-0.01033782958984375,0.001506805419921875,-0.03656005859375,-0.0239105224609375,0.041229248046875,-0.04071044921875,-0.0152435302734375,0.0151214599609375,0.037994384765625,-0.01058197021484375,-0.01062774658203125,0.002964019775390625,0.0294189453125,0.01041412353515625,0.038299560546875,-0.036163330078125,-0.036346435546875,-0.00850677490234375,-0.0098876953125,-0.051788330078125,0.02398681640625,-0.0219268798828125,0.023406982421875,0.008941650390625,0.010772705078125,-0.0265960693359375,-0.0099639892578125,-0.00727081298828125,0.0234222412109375,0.0023441314697265625,-0.01409912109375,0.01169586181640625,0.0023250579833984375,-0.0189208984375,-0.01013946533203125,-0.01739501953125,-0.0309295654296875,-0.00823974609375,0.029205322265625,0.01111602783203125,-0.01509857177734375,-0.01160430908203125,0.0173187255859375,0.0169830322265625,-0.00464630126953125,0.0253448486328125
,0.0095062255859375,-0.0179443359375,0.0223846435546875,-0.0219879150390625,-0.0004260540008544922,-0.025421142578125,-0.007659912109375,-0.01485443115234375,-0.0166168212890625,0.011444091796875,0.0185394287109375,-0.02984619140625,0.061767578125,0.0189971923828125,-0.016693115234375,0.002613067626953125,-0.01242828369140625,0.0262298583984375,0.029388427734375,-0.0711669921875,-0.0263519287109375,0.01184844970703125,0.00977325439453125,-0.0232696533203125,-0.0131072998046875,0.00910186767578125,0.0251617431640625,0.04644775390625,-0.00033926963806152344,0.00894927978515625,0.01216888427734375,-0.00942230224609375,0.01220703125,0.002918243408203125,0.0167694091796875,0.0286865234375,0.01436614990234375,-0.02581787109375,-0.0123138427734375,-0.0143890380859375,0.0200042724609375,-0.020660400390625,-0.017791748046875,-0.006740570068359375,0.02484130859375,-0.028472900390625,-0.0142364501953125,-0.007534027099609375,0.021697998046875,-0.013580322265625,-0.003910064697265625,0.01214599609375,-0.01267242431640625,-0.005466461181640625,0.0239410400390625,0.01348876953125,0.0171661376953125,-0.00982666015625,-0.009613037109375,0.0189208984375,-0.01146697998046875,-0.01364898681640625,-0.021820068359375,-0.017181396484375,0.0097503662109375,-0.0240478515625,0.031829833984375,0.0172271728515625,0.01308441162109375,0.006938934326171875,0.0212249755859375,-0.007843017578125,-0.041839599609375,0.003757476806640625,-0.01332855224609375,-0.0081024169921875,-0.0252227783203125,0.0125732421875,0.00164794921875,-0.009490966796875,-0.0182647705078125,-0.03497314453125,-0.0187225341796875,-0.001026153564453125,-0.06793212890625,-0.05291748046875,-0.0297393798828125,-0.005031585693359375,-0.026519775390625,-0.00891876220703125,0.0096893310546875,-0.0189056396484375,0.01444244384765625,-0.0270233154296875,-0.0010528564453125,0.006771087646484375,-0.00942230224609375,0.03399658203125,-0.0203094482421875,-0.004795074462890625,0.0025959014892578125,0.01538848876953125,-0.00620269775390625
,-0.035675048828125,-0.01142120361328125,0.0011234283447265625,-0.0278778076171875,0.00807952880859375,-0.017547607421875,0.0211639404296875,0.037139892578125,-0.0108642578125,-0.0287017822265625,-0.0008664131164550781,-0.00862884521484375,-0.006320953369140625,-0.00901031494140625,-0.012451171875,0.017913818359375,0.005092620849609375,-0.04345703125,-0.027801513671875,0.023040771484375,0.007328033447265625,-0.013916015625,-0.007678985595703125,-0.0031185150146484375,0.01546478271484375,0.02020263671875,-0.01259613037109375,0.0040130615234375,0.005023956298828125,0.00421142578125,-0.0018835067749023438,0.0369873046875,-0.0006284713745117188,0.007049560546875,-0.0213165283203125,-0.02215576171875,-0.05023193359375,-0.006420135498046875,0.001811981201171875,0.01995849609375,0.007694244384765625,-0.0081329345703125,-0.0347900390625,0.01042938232421875,-0.03131103515625,0.0312042236328125,-0.00971221923828125,-0.0352783203125,0.021209716796875,-0.009490966796875,0.00710296630859375,-0.004848480224609375,-0.01030731201171875,0.0037136077880859375,0.0234222412109375,0.004337310791015625,-0.03436279296875,0.0008835792541503906,-0.036712646484375,0.007740020751953125,0.003978729248046875,-0.0178985595703125,-0.0027065277099609375,0.035491943359375,0.01148223876953125,0.01496124267578125,-0.0025386810302734375,0.014404296875,0.007572174072265625,0.016876220703125,-0.0023212432861328125,0.002727508544921875,-0.005374908447265625,0.01690673828125,-0.020599365234375,-0.00002384185791015625,0.0305328369140625,-0.052734375,0.01496124267578125,0.0039215087890625,-0.00762176513671875,0.031585693359375,-0.01617431640625,-0.01222991943359375,0.00873565673828125,-0.033966064453125,0.01061248779296875,-0.0209197998046875,-0.0198516845703125,0.035247802734375,0.0244598388671875,0.0082550048828125,-0.00787353515625,-0.01544952392578125,0.01302337646484375,-0.0166168212890625,-0.0147247314453125,0.02618408203125,-0.0158233642578125,-0.0394287109375,0.0151214599609375,-0.004146575927734375
,-0.035369873046875,0.045928955078125,0.04241943359375,0.01354217529296875,0.0343017578125,-0.007183074951171875,0.0129241943359375,-0.004955291748046875,0.025299072265625,0.01538848876953125,-0.0054779052734375,-0.00630950927734375,-0.010711669921875,0.043914794921875,-0.004856109619140625,0.05169677734375,-0.020111083984375,0.023406982421875,-0.0021114349365234375,-0.039215087890625,-0.01314544677734375,-0.0036773681640625,0.01031494140625,-0.00981903076171875,0.01366424560546875,0.0101776123046875,0.0274658203125,-0.0386962890625,0.0194244384765625,-0.04803466796875,0.033172607421875,0.0269775390625,-0.0176849365234375,-0.0016927719116210938,-0.02783203125,0.0015516281127929688,0.01325225830078125,-0.028472900390625,0.01470947265625,0.036773681640625,-0.038482666015625,-0.0009303092956542969,0.0236053466796875,-0.00498199462890625,0.0165557861328125,0.00003445148468017578,-0.03741455078125,-0.0517578125,-0.0090179443359375,-0.033966064453125,-0.0170440673828125,0.0013637542724609375,-0.04473876953125,-0.059478759765625,-0.0165557861328125,-0.047119140625,-0.033721923828125,0.018890380859375,0.00160980224609375,0.050811767578125,-0.0221099853515625,0.0306396484375,-0.01096343994140625,-0.007175445556640625,0.01580810546875,-0.00650787353515625,-0.00467681884765625,0.0256500244140625,0.006931304931640625,0.00316619873046875,-0.0170745849609375,-0.003265380859375,0.00554656982421875,-0.0166473388671875,0.0006661415100097656,0.0297393798828125,-0.00568389892578125,0.01043701171875,-0.03863525390625,0.01531982421875,0.021087646484375,0.002185821533203125,0.00977325439453125,-0.028594970703125,-0.0166473388671875,-0.00018537044525146484,-0.0014066696166992188,0.014312744140625,0.025299072265625,-0.0149383544921875,0.001495361328125,0.03692626953125,0.00438690185546875,0.05572509765625,-0.00350189208984375,0.0156402587890625,0.005992889404296875,-0.005748748779296875,-0.01739501953125,0.017059326171875,0.0006203651428222656,-0.0163726806640625,-0.0203704833984375,-0.005
962371826171875,0.006130218505859375,-0.00022983551025390625,-0.014007568359375,-0.0025844573974609375,-0.0171356201171875,0.0130157470703125,-0.005809783935546875,0.0174560546875,-0.0196075439453125,-0.017486572265625,-0.035369873046875,0.0016012191772460938,-0.02008056640625,-0.0213775634765625,0.04119873046875,-0.0125732421875,-0.00983428955078125,0.01010894775390625,-0.01099395751953125,-0.009613037109375,-0.01091766357421875,0.0032520294189453125,-0.004924774169921875,-0.041656494140625,0.01227569580078125,0.011077880859375,-0.040740966796875,0.002017974853515625,-0.0193023681640625,0.014739990234375,-0.0018491744995117188,0.008636474609375,0.017791748046875,-0.0012598037719726562,-0.004123687744140625,-0.006511688232421875,-0.0179443359375,-0.03619384765625,-0.0009822845458984375,0.0066680908203125,-0.0012950897216796875,0.0031185150146484375,-0.05401611328125,0.0266876220703125,-0.035308837890625,-0.0234375,0.0234222412109375,-0.037384033203125,0.002349853515625,0.01290130615234375,-0.0321044921875,0.019622802734375,-0.052337646484375,-0.00556182861328125,0.005496978759765625,0.0078125,0.010101318359375,-0.0055084228515625,0.021087646484375,0.016754150390625,0.0192413330078125,-0.024261474609375,0.0457763671875,-0.0185394287109375,0.0007729530334472656,0.0173187255859375,0.0224456787109375,0.0283355712890625,0.00576019287109375,0.04150390625,-0.005279541015625,0.01000213623046875,0.01496124267578125,0.003604888916015625,-0.033447265625,0.013824462890625,-0.0014410018920898438,-0.0225067138671875,-0.0017547607421875,0.0235443115234375,0.0171966552734375,0.0234375,-0.00482177734375,-0.0062103271484375,0.01885986328125,-0.003917694091796875,0.0172119140625,0.0240478515625,-0.006069183349609375,-0.0166015625,-0.00955963134765625,-0.01861572265625,0.0198822021484375,-0.046875,-0.0011920928955078125,-0.00972747802734375,0.01349639892578125,-0.00629425048828125,-0.0087738037109375,0.01393890380859375,0.0006022453308105469,-0.007038116455078125,-0.017181396484375,-0.
00965118408203125,0.0133514404296875,-0.0025787353515625,0.017547607421875,-0.0276641845703125,0.018890380859375,0.01517486572265625,-0.0311737060546875,-0.016815185546875,0.00264739990234375,-0.0214080810546875,0.0181884765625,-0.01145172119140625,-0.0011072158813476562,0.02880859375,0.00782012939453125,-0.0238037109375,0.039031982421875,-0.00690460205078125,0.0018301010131835938,0.0305023193359375,0.005344390869140625,-0.003803253173828125,-0.033782958984375,0.01241302490234375,0.0206146240234375,0.00766754150390625,0.0177459716796875,-0.002201080322265625,-0.01444244384765625,0.031402587890625,-0.04498291015625,-0.02203369140625,-0.017486572265625,0.031341552734375,0.032562255859375,-0.031951904296875,0.0182037353515625,-0.01207733154296875,0.0235748291015625,0.0391845703125,0.00971221923828125,0.029388427734375,-0.038360595703125,0.025726318359375,-0.0040435791015625,0.020233154296875,0.0009427070617675781,0.0347900390625,-0.0226287841796875,0.01318359375,0.01505279541015625,0.01042938232421875,-0.011749267578125,-0.022705078125,-0.006938934326171875,0.008087158203125,-0.00205230712890625,-0.018463134765625,0.02960205078125,-0.0309600830078125,-0.024749755859375,0.004817962646484375,-0.01258087158203125,0.00850677490234375,-0.00560760498046875,-0.021881103515625,-0.004638671875,0.0244903564453125,-0.020416259765625,0.02655029296875,-0.0226287841796875,0.030029296875,0.024139404296875,0.03497314453125,0.0161285400390625,0.0206756591796875,-0.040924072265625,0.01042938232421875,0.048126220703125,0.006565093994140625,-0.00260162353515625,0.037139892578125,0.0006361007690429688,-0.01007843017578125,0.0282745361328125,-0.013702392578125,0.044525146484375,-0.006237030029296875,0.034637451171875,0.0285186767578125,0.0124053955078125,-0.034423828125,0.0007100105285644531,-0.045501708984375,-0.0219268798828125,-0.00836181640625,-0.03704833984375,-0.07000732421875,0.006748199462890625,-0.0036602020263671875,0.00751495361328125,-0.0162353515625,-0.0137176513671875,0.022720
3369140625,0.001644134521484375,-0.028656005859375,-0.00397491455078125,-0.0088043212890625,-0.0007772445678710938,0.035797119140625,0.0389404296875,0.0380859375,-0.005031585693359375,0.0059967041015625,-0.016815185546875,-0.0027980804443359375,0.0127410888671875,0.03399658203125,-0.0003867149353027344,0.00679779052734375,-0.0079193115234375,-0.02294921875,0.023101806640625,0.0009560585021972656,0.042694091796875,-0.031768798828125,-0.00247955322265625,0.0197296142578125,0.0196075439453125,-0.0229339599609375,-0.0250396728515625,-0.0006723403930664062,0.011871337890625,0.0308990478515625,0.002803802490234375,0.003803253173828125,-0.0112762451171875,0.0016689300537109375,-0.040985107421875,0.0175933837890625,0.029083251953125,-0.00962066650390625,-0.0384521484375,-0.006683349609375,0.00439453125,0.0269012451171875,0.02252197265625,-0.027587890625,0.003749847412109375,-0.004119873046875,-0.015228271484375,-0.031036376953125,-0.0042724609375,-0.043853759765625,-0.0016918182373046875,-0.015411376953125,0.03643798828125,-0.03814697265625,0.020599365234375,-0.007030487060546875,-0.02532958984375,-0.0216522216796875,0.0016412734985351562,0.00982666015625,0.0205230712890625,0.02484130859375,0.0078887939453125,-0.0261077880859375,0.0247039794921875,-0.01251983642578125,0.0090789794921875,0.013092041015625,0.0082550048828125,0.006603240966796875,-0.00423431396484375,0.01424407958984375,0.01349639892578125,-0.02264404296875,0.0236358642578125,-0.001506805419921875,0.007030487060546875,-0.01727294921875,-0.0249481201171875,-0.00611114501953125,0.0177459716796875,-0.0077056884765625,0.023773193359375,0.01357269287109375,0.012237548828125,0.0338134765625,-0.029022216796875,0.02880859375,-0.0018472671508789062,-0.024139404296875,-0.032989501953125,0.055084228515625,0.02984619140625,0.040618896484375,0.0006160736083984375,0.03814697265625,0.022552490234375,-0.01071929931640625,0.0250091552734375,0.033782958984375,0.00806427001953125,-0.005443572998046875,-0.00899505615234375,-0.009
69696044921875,0.01045989990234375,0.037384033203125,0.01308441162109375,-0.01435089111328125,-0.0032367706298828125,0.0186004638671875,-0.0330810546875,-0.014617919921875,0.01088714599609375,-0.00847625732421875,0.02984619140625,-0.0283355712890625,0.023162841796875,0.019134521484375,-0.01218414306640625,-0.033966064453125,-0.028839111328125,-0.022552490234375,-0.02001953125,0.005214691162109375,-0.01418304443359375,0.0035915374755859375,-0.011993408203125,0.0076751708984375,-0.0098876953125,-0.002002716064453125,-0.0008831024169921875,-0.01294708251953125,-0.05120849609375,0.0008082389831542969,0.0205535888671875,-0.0017843246459960938,0.006366729736328125,0.0137939453125,0.060699462890625,-0.0177459716796875,-0.005641937255859375,0.0170440673828125,0.0026397705078125,0.009857177734375,-0.024658203125,0.006175994873046875,0.04205322265625,0.0253143310546875,0.00972747802734375,0.0031375885009765625,-0.022064208984375,0.0006480216979980469,-0.004180908203125,-0.00794219970703125,-0.015106201171875,-0.00901031494140625,-0.00812530517578125,-0.01406097412109375,-0.0247039794921875,-0.0221405029296875,0.025543212890625,0.037353515625,-0.01702880859375,-0.0021762847900390625,0.0237274169921875,0.016632080078125,-0.0335693359375,0.002178192138671875,-0.022705078125,-0.011810302734375,0.01666259765625,0.0287628173828125,-0.02313232421875,-0.011199951171875,0.026702880859375,-0.0195770263671875,0.0278778076171875,0.0106658935546875,-0.0199432373046875,-0.035919189453125,0.028656005859375,0.0009784698486328125,-0.004291534423828125,-0.0309906005859375,0.03277587890625,0.011260986328125,0.0112457275390625,-0.034698486328125,-0.01111602783203125,0.0309906005859375,0.042236328125],"object":"embedding"}],"model":"nvidia/llama-3.2-nv-embedqa-1b-v2","usage":{"prompt_tokens":10,"total_tokens":10}}
Last Modified Jan 19, 2026

Setup Users

5 minutes  

In this section, we’ll create users for each workshop participant, with a namespace and resource quota for each.

Create User Namespaces and Resource Quotas

cd user-setup
./create-namespaces.sh
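The script creates a namespace and a resource quota for each participant. Conceptually, each participant gets objects along these lines (the names and limits below are illustrative assumptions — see the script itself for the actual values):

```yaml
# Hypothetical sketch of what create-namespaces.sh applies per participant;
# the quota name and limits are assumptions, not the script's real values.
apiVersion: v1
kind: Namespace
metadata:
  name: workshop-participant-1
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: participant-quota
  namespace: workshop-participant-1
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 4Gi
    limits.cpu: "4"
    limits.memory: 8Gi
```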

Create Users

Create an HTPasswd file with participant credentials, then replace the ROSA-managed HTPasswd IdP with a custom one:

./create-users.sh
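Under the hood, an HTPasswd file is just one `user:hash` line per participant. A minimal sketch of generating such entries (using the HTPasswd `{SHA}` scheme for illustration — the real script may use the `htpasswd` utility with bcrypt instead):

```python
# Hypothetical sketch of the HTPasswd entries create-users.sh might produce;
# the {SHA} scheme is used here only because it's easy to reproduce inline.
import base64
import hashlib

def htpasswd_sha_entry(user: str, password: str) -> str:
    # {SHA} format: base64-encoded SHA-1 digest of the password
    digest = base64.b64encode(hashlib.sha1(password.encode()).digest()).decode()
    return f"{user}:{{SHA}}{digest}"

lines = [htpasswd_sha_entry(f"participant{i}", "TempPass123!") for i in range(1, 31)]
print(lines[0])
```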

Re-create the cluster-admin User

Re-create the cluster-admin user, then log in again:

rosa create admin -c rosa-test
oc login <Cluster API URL> --username cluster-admin --password <cluster admin password>

Add Role to Users

Grant each user access to their namespace only:

./add-role-to-users.sh

Note: warnings such as the following can be safely ignored:

Warning: User 'participant1' not found
clusterrole.rbac.authorization.k8s.io/admin added: "participant1"

Test Login

Install the OpenShift CLI

To test the logins from our local machine, we’ll need to install the OpenShift CLI.

For MacOS, we can install the OpenShift CLI using the Homebrew package manager:

brew install openshift-cli

For other installation options, please refer to the OpenShift documentation.

Login as Workshop User

Try logging in as one of the workshop users from your local machine:

oc login https://api.<cluster-domain>:443 -u participant1 -p 'TempPass123!'

It should say something like:

Login successful.

You have one project on this server: "workshop-participant-1"

Confirm Access to the LLM

Let’s ensure we can access the LLM from the workshop user account.

Start a pod that has access to the curl command:

oc run curl --rm -it --image=curlimages/curl:latest \
  --overrides='{
    "spec": {
      "containers": [{
        "name": "curl",
        "image": "curlimages/curl:latest",
        "stdin": true,
        "tty": true,
        "command": ["sh"],
        "resources": {
          "limits": {
            "cpu": "50m",
            "memory": "100Mi"
          },
          "requests": {
            "cpu": "50m",
            "memory": "100Mi"
          }
        }
      }]
    }
  }'

Then run the following command to send a prompt to the LLM:

curl -X "POST" \
 'http://meta-llama-3-2-1b-instruct.nim-service:8000/v1/chat/completions' \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
        "model": "meta/llama-3.2-1b-instruct",
        "messages": [
        {
          "content":"What is the capital of Canada?",
          "role": "user"
        }],
        "top_p": 1,
        "n": 1,
        "max_tokens": 1024,
        "stream": false,
        "frequency_penalty": 0.0,
        "stop": ["STOP"]
      }'
{
  "id": "chatcmpl-2ccfcd75a0214518aab0ef0375f8ca21",
  "object": "chat.completion",
  "created": 1758919002,
  "model": "meta/llama-3.2-1b-instruct",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "reasoning_content": null,
        "content": "The capital of Canada is Ottawa.",
        "tool_calls": []
      },
      "logprobs": null,
      "finish_reason": "stop",
      "stop_reason": null
    }
  ],
  "usage": {
    "prompt_tokens": 42,
    "total_tokens": 50,
    "completion_tokens": 8,
    "prompt_tokens_details": null
  },
  "prompt_logprobs": null
}
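The same request can be made from Python. The sketch below builds the identical payload and shows how to pull the answer out of the response; the network call is commented out so the snippet runs anywhere (the endpoint URL and model name come from the curl example above):

```python
# Sketch of the chat-completion call from Python; the actual HTTP request is
# commented out since it requires in-cluster access to the NIM service.
import json
# import urllib.request  # uncomment to actually send the request

payload = {
    "model": "meta/llama-3.2-1b-instruct",
    "messages": [{"role": "user", "content": "What is the capital of Canada?"}],
    "top_p": 1, "n": 1, "max_tokens": 1024, "stream": False,
    "frequency_penalty": 0.0, "stop": ["STOP"],
}
body = json.dumps(payload).encode()
# req = urllib.request.Request(
#     "http://meta-llama-3-2-1b-instruct.nim-service:8000/v1/chat/completions",
#     data=body, headers={"Content-Type": "application/json"})
# response = json.load(urllib.request.urlopen(req))

# Extracting the answer from a response shaped like the one above:
sample = {"choices": [{"message": {"content": "The capital of Canada is Ottawa."}}]}
answer = sample["choices"][0]["message"]["content"]
print(answer)  # → The capital of Canada is Ottawa.
```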
Last Modified Feb 2, 2026

Install the OpenTelemetry Collector

5 minutes  

In this section, we’ll install the OpenTelemetry collector with only the clusterReceiver enabled (as the workshop participants will install their own agent in their namespace). We’ll then take the ClusterRole created by this collector installation and bind it to each of the workshop participant namespaces.

Install the OpenTelemetry Collector

First, we’ll create a new project for the collector and switch to that project:

oc new-project admin-otel 

Add the Splunk OpenTelemetry Collector for Kubernetes’ Helm chart repository:

helm repo add splunk-otel-collector-chart https://signalfx.github.io/splunk-otel-collector-chart

Ensure the repository is up-to-date:

helm repo update

Review the file named ./admin-otel-collector/admin-otel-collector-values.yaml as we’ll be using it to install the OpenTelemetry collector.

Set environment variables to configure the Splunk environment you’d like the collector to send data to:

export CLUSTER_NAME=ai-pod-workshop-admin
export ENVIRONMENT_NAME=ai-pod-workshop-admin
export SPLUNK_ACCESS_TOKEN=<your access token for Splunk Observability Cloud> 
export SPLUNK_REALM=<your realm for Splunk Observability Cloud i.e. us0, us1, eu0, etc.>
export SPLUNK_HEC_URL=<HEC endpoint to send logs to Splunk platform i.e. https://<hostname>:443/services/collector/event> 
export SPLUNK_HEC_TOKEN=<HEC token to send logs to Splunk platform> 
export SPLUNK_INDEX=splunk4rookies-workshop

Then install the collector using the following command:

helm install splunk-otel-collector \
  --set="clusterName=$CLUSTER_NAME" \
  --set="environment=$ENVIRONMENT_NAME" \
  --set="splunkObservability.accessToken=$SPLUNK_ACCESS_TOKEN" \
  --set="splunkObservability.realm=$SPLUNK_REALM" \
  --set="splunkPlatform.endpoint=$SPLUNK_HEC_URL" \
  --set="splunkPlatform.token=$SPLUNK_HEC_TOKEN" \
  --set="splunkPlatform.index=$SPLUNK_INDEX" \
  -f ./admin-otel-collector/admin-otel-collector-values.yaml \
  -n admin-otel \
  splunk-otel-collector-chart/splunk-otel-collector

Run the following command to confirm that all of the collector pods are running:

oc get pods -n admin-otel

NAME                                                          READY   STATUS    RESTARTS   AGE
splunk-otel-collector-k8s-cluster-receiver-7b7f5cdc5b-rhxsj   1/1     Running   0          6m40s

Create Service Account for each Workshop Participant and Bind to Cluster Role

for i in {1..30}; do
  ns="workshop-participant-$i"

  oc get ns "$ns" >/dev/null 2>&1 || continue
  oc -n "$ns" create sa splunk-otel-collector 2>/dev/null || true

  oc apply -f - <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: splunk-otel-collector-${ns}
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: splunk-otel-collector
subjects:
- kind: ServiceAccount
  name: splunk-otel-collector
  namespace: ${ns}
EOF
done

We also need to grant the SecurityContextConstraint (SCC) to each namespace ServiceAccount:

for i in {1..30}; do
  ns="workshop-participant-$i"
  oc get ns "$ns" >/dev/null 2>&1 || continue
  oc -n "$ns" adm policy add-scc-to-user splunk-otel-collector -z splunk-otel-collector
done
Last Modified Jan 30, 2026

Deploy the Vector Database

10 minutes  

In this step, we’ll deploy a vector database to the OpenShift cluster and populate it with test data that will be used by workshop participants.

Deploy a Vector Database

For the workshop, we’ll deploy an open-source vector database named Weaviate.

First, add the Weaviate helm repo that contains the Weaviate helm chart:

helm repo add weaviate https://weaviate.github.io/weaviate-helm
helm repo update

The weaviate/weaviate-values.yaml file includes the configuration we’ll use to deploy the Weaviate vector database.

We’ve set the following environment variables to TRUE, to ensure Weaviate exposes metrics that we can scrape later with the Prometheus receiver:

  PROMETHEUS_MONITORING_ENABLED: true
  PROMETHEUS_MONITORING_GROUP: true

Review Weaviate documentation to explore additional customization options available.

Let’s create a new namespace:

oc create namespace weaviate

Run the following command to allow Weaviate to run a privileged container:

Note: this approach is not recommended for production environments

oc adm policy add-scc-to-user privileged -z default -n weaviate

Then deploy Weaviate:

helm upgrade --install \
  "weaviate" \
  weaviate/weaviate \
  --namespace "weaviate" \
  --values ./weaviate/weaviate-values.yaml

Populate the Vector Database

Now that Weaviate is up and running, let’s add some data to it that we’ll use in the workshop with a custom application.

The application used to do this is based on LangChain Playbook for NeMo Retriever Text Embedding NIM.

Per the configuration in ./load-embeddings/k8s-job.yaml, we’re going to load a datasheet for the NVIDIA H200 Tensor Core GPU into our vector database.

This document includes information about NVIDIA’s H200 GPUs that our large language model wasn’t trained on. In the next part of the workshop, we’ll build an application that uses an LLM to answer questions using context retrieved from this document in the vector database.

We’ll deploy a Kubernetes Job to our OpenShift cluster to load the embeddings. A Kubernetes Job is used rather than a Pod to ensure that this process runs only once:

oc create namespace llm-app
oc apply -f ./load-embeddings/k8s-job.yaml

Note: to build a Docker image for the Python application that loads the embeddings into Weaviate, we executed the following commands:

cd workshop/cisco-ai-pods/load-embeddings
docker build --platform linux/amd64 -t derekmitchell399/load-embeddings:1.0 .
docker push derekmitchell399/load-embeddings:1.0
Last Modified Jan 19, 2026

Deploy the Portworx Metrics Endpoint

10 minutes  

In this step, we’ll deploy a Python service that mimics the Portworx metrics endpoint. This will be used in the workshop to configure monitoring for Pure Storage.

Deploy the Portworx Metrics Endpoint

Run the following command to deploy the Portworx metrics endpoint service:

oc new-project portworx
oc apply -f ./portworx/k8s.yaml -n portworx

Test the Portworx Metrics Endpoint

Let’s ensure the Portworx metrics endpoint is working as expected.

Start a pod that has access to the curl command:

oc run --rm -it -n default curl --image=curlimages/curl:latest -- sh

Then run the following command to request metrics from the endpoint:

curl http://portworx-metrics-sim.portworx:17001/metrics
# HELP px_cluster_cpu_percent Percentage of CPU Used
# TYPE px_cluster_cpu_percent gauge
px_cluster_cpu_percent{cluster="ocp-pxclus-32430549-ad99-4839-bf9b-d6beb8ddc2d6",clusterUUID="e870909b-6150-4d72-87cb-a012630e42ae",node="worker2.flashstack.local",nodeID="f63312a2-0884-4878-be4e-51935613aa80"} 1.91
...
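Each line of this output is a metric in the Prometheus exposition format: a metric name, a set of `label="value"` pairs, and a numeric sample. The OpenTelemetry collector’s Prometheus receiver parses these for us; a purely illustrative sketch of that parsing (using a shortened version of the line above):

```python
# Illustrative only: parse one Prometheus exposition-format line into a
# metric name, a dict of labels, and a float value.
import re

line = 'px_cluster_cpu_percent{cluster="ocp-pxclus",node="worker2.flashstack.local"} 1.91'

m = re.match(r'(?P<name>\w+)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)', line)
name = m.group("name")
labels = dict(re.findall(r'(\w+)="([^"]*)"', m.group("labels")))
value = float(m.group("value"))
print(name, labels["node"], value)
```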
Last Modified Jan 30, 2026

Clean Up

5 minutes  

Clean Up Steps

Once the workshop is complete, follow the steps in this section to delete the OpenShift cluster.

Get the cluster ID, the Amazon Resource Names (ARNs) for the cluster-specific Operator roles, and the endpoint URL for the OIDC provider by running the following command:

rosa describe cluster --cluster=$CLUSTER_NAME

Delete the cluster using the following command:

rosa delete cluster --cluster=$CLUSTER_NAME --watch

Delete the cluster-specific Operator IAM roles:

Note: just accept the default values when prompted.

rosa delete operator-roles --prefix $OPERATOR_ROLES_PREFIX

Delete the OIDC provider:

Note: just accept the default values when prompted.

rosa delete oidc-provider --oidc-config-id $OIDC_ID

Delete the network:

Note: add the name of the CloudFormation stack used to create the network before running the following command:

aws cloudformation delete-stack --region $AWS_REGION --stack-name <stack name i.e. rosa-network-stack-nnnnnnnnnnn>

Refer to OpenShift documentation if you’d like to completely remove the Red Hat OpenShift Service from your AWS account.

Last Modified Jan 21, 2026

Workshop

This section includes the steps that workshop attendees will follow:

  • Practice deploying the OpenTelemetry Collector in the Red Hat OpenShift cluster.
  • Practice adding Prometheus receivers to the collector to ingest infrastructure metrics.
  • Practice monitoring the Weaviate vector database in the cluster.
  • Practice gathering the Pure Storage metrics using Prometheus.
  • Practice instrumenting Python services that interact with Large Language Models (LLMs) with OpenTelemetry.
  • Understand which details OpenTelemetry captures in traces from applications that interact with LLMs.
Last Modified Jan 30, 2026

Subsections of 2. Workshop

Overview of the Workshop Environment

5 minutes  

Cisco’s AI-ready PODs combine cutting-edge hardware and software to deliver a robust, scalable, and efficient AI infrastructure. Splunk Observability Cloud provides comprehensive visibility into this entire stack: from infrastructure to application components.

This hands-on workshop teaches you how to monitor AI infrastructure using OpenTelemetry and Prometheus, without requiring access to an actual Cisco AI POD. You’ll gain practical experience deploying and configuring monitoring technologies in a realistic environment.

Lab Environment

The workshop uses a shared OpenShift Cluster running in AWS, equipped with NVIDIA GPUs and NVIDIA AI Enterprise software.

Pre-Deployed Infrastructure

The workshop instructor has deployed the following shared components to the workshop environment:

  • NVIDIA NIM models:
    • meta/llama-3.2-1b-instruct - Processes user prompts
    • nvidia/llama-3.2-nv-embedqa-1b-v2 - Generates embeddings
  • Weaviate - A vector database for semantic search and retrieval
  • Prometheus exporter - Simulates Pure Storage metrics typical of production AI PODs

Your Workspace

Each participant receives a dedicated namespace within the shared cluster, ensuring isolated environments for independent work.

Workshop Activities

During the workshop, each participant will execute the following tasks:

  1. Deploy and configure an OpenTelemetry collector in your namespace
  2. Integrate observability data collection with the cluster infrastructure
  3. Deploy a Python application that leverages the NVIDIA NIM models
  4. Monitor application performance and infrastructure metrics using Splunk Observability Cloud

What is Prometheus?

While Prometheus typically refers to a full-stack monitoring system handling metrics collection, storage, and alerting, this workshop focuses on the Prometheus ecosystem’s data standards.

We will be leveraging Prometheus Exporters, which are small utilities that translate a component’s internal health into a standardized metrics endpoint (e.g., http://localhost:9100/metrics).

Instead of using a full Prometheus server to collect this data, we will use the OpenTelemetry Collector. By using its Prometheus receiver, the collector can scrape these endpoints, allowing us to gather rich telemetry data using a widely-supported industry format.
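An exporter is ultimately just an HTTP endpoint that returns plain text in the exposition format. A toy exporter, to make that concrete (the metric name and port here are arbitrary; real exporters like DCGM serve much richer output):

```python
# A toy Prometheus "exporter": serves exposition-format text over HTTP,
# just like the DCGM and NIM endpoints scraped later in this workshop.
import http.server
import threading
import urllib.request

METRICS = "# TYPE demo_up gauge\ndemo_up 1\n"

class MetricsHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        body = METRICS.encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

# Bind to an ephemeral port and serve in the background
server = http.server.HTTPServer(("127.0.0.1", 0), MetricsHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# "Scrape" the endpoint the way a Prometheus receiver would
scraped = urllib.request.urlopen(f"http://127.0.0.1:{port}/metrics").read().decode()
print(scraped)
server.shutdown()
```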

Last Modified Feb 6, 2026

Connect to the OpenShift Cluster

5 minutes  

Connect to your EC2 Instance

We’ve prepared an Ubuntu Linux instance in AWS/EC2 for each attendee.

Using the IP address and password provided by your instructor, connect to your EC2 instance using one of the methods below:

  • Mac OS / Linux
    • ssh splunk@IP address
  • Windows 10+
    • Use the OpenSSH client
  • Earlier versions of Windows
    • Use Putty

Set the Workshop Participant Number

The instructor will provide each participant with a number from 1 to 30. Store this in an environment variable, and remember what it is, as it will be used throughout the workshop:

export PARTICIPANT_NUMBER=<your participant number>

Install the OpenShift CLI

To access the OpenShift cluster, we’ll need to install the OpenShift CLI.

We can use the following command to download the OpenShift CLI binary directly to our EC2 instance:

curl -L -O https://mirror.openshift.com/pub/openshift-v4/x86_64/clients/ocp/stable/openshift-client-linux.tar.gz

Extract the contents:

tar -xvzf openshift-client-linux.tar.gz

Move the resulting files (oc and kubectl) to a location that’s included as part of your path. For example:

sudo mv oc /usr/local/bin/oc
sudo mv kubectl /usr/local/bin/kubectl

Connect to the OpenShift Cluster

Ensure the Kube config file is modifiable by the splunk user:

chmod 600 /home/splunk/.kube/config

Use the cluster API URL and password provided by the workshop organizer to log in to the OpenShift cluster:

oc login https://api.<cluster-domain>:443 -u participant$PARTICIPANT_NUMBER -p '<password>'

Ensure you’re connected to the OpenShift cluster:

oc whoami --show-server 
https://api.***.openshiftapps.com:443
Last Modified Feb 6, 2026

Deploy the OpenTelemetry Collector

10 minutes  

In this section we’ll deploy the OpenTelemetry Collector in our OpenShift namespace, which gathers metrics, logs, and traces from the infrastructure and applications running in the cluster, and sends the resulting data to Splunk Observability Cloud.

Deploy the OpenTelemetry Collector

Ensure Helm is installed

Run the following command to confirm that Helm is installed:

helm version
version.BuildInfo{Version:"v3.19.4", GitCommit:"7cfb6e486dac026202556836bb910c37d847793e", GitTreeState:"clean", GoVersion:"go1.24.11"}

If it’s not installed, execute the following commands:

sudo apt-get install curl gpg apt-transport-https --yes
curl -fsSL https://packages.buildkite.com/helm-linux/helm-debian/gpgkey | gpg --dearmor | sudo tee /usr/share/keyrings/helm.gpg > /dev/null
echo "deb [signed-by=/usr/share/keyrings/helm.gpg] https://packages.buildkite.com/helm-linux/helm-debian/any/ any main" | sudo tee /etc/apt/sources.list.d/helm-stable-debian.list
sudo apt-get update
sudo apt-get install helm

Add the Splunk OpenTelemetry Collector Helm Chart

Add the Splunk OpenTelemetry Collector for Kubernetes’ Helm chart repository:

helm repo add splunk-otel-collector-chart https://signalfx.github.io/splunk-otel-collector-chart

Ensure the repository is up-to-date:

helm repo update

Configure Environment Variables

Set environment variables to configure the Splunk environment you’d like the collector to send data to:

export USER_NAME=workshop-participant-$PARTICIPANT_NUMBER
export CLUSTER_NAME=ai-pod-$USER_NAME
export ENVIRONMENT_NAME=ai-pod-$USER_NAME
export SPLUNK_INDEX=splunk4rookies-workshop

Confirm that the environment name is set:

echo $ENVIRONMENT_NAME
ai-pod-workshop-participant-1

Deploy the Collector

Navigate to the workshop directory:

cd ~/workshop/cisco-ai-pods

Then install the collector in your namespace using the following command:

{ [ -z "$CLUSTER_NAME" ] || \
  [ -z "$ENVIRONMENT_NAME" ] || \
  [ -z "$USER_NAME" ]; } && \
  echo "Error: Missing variables" || \
  helm upgrade --install splunk-otel-collector \
  --set="clusterName=$CLUSTER_NAME" \
  --set="environment=$ENVIRONMENT_NAME" \
  --set="splunkObservability.accessToken=$ACCESS_TOKEN" \
  --set="splunkObservability.realm=$REALM" \
  --set="splunkPlatform.endpoint=$HEC_URL" \
  --set="splunkPlatform.token=$HEC_TOKEN" \
  --set="splunkPlatform.index=$SPLUNK_INDEX" \
  -f ./otel-collector/otel-collector-values.yaml \
  -n $USER_NAME \
  splunk-otel-collector-chart/splunk-otel-collector

Note: if you get an error that says Missing variables, you’ll need to define your environment variables again. Add your participant number before running the following commands:

export PARTICIPANT_NUMBER=<your participant number>
export USER_NAME=workshop-participant-$PARTICIPANT_NUMBER
export CLUSTER_NAME=ai-pod-$USER_NAME
export ENVIRONMENT_NAME=ai-pod-$USER_NAME
export SPLUNK_INDEX=splunk4rookies-workshop

Run the following command to confirm that the collector pods are running:

watch -n 1 oc get pods

NAME                                                          READY   STATUS    RESTARTS   AGE
splunk-otel-collector-agent-58rwm                             1/1     Running   0          6m40s
splunk-otel-collector-agent-8dndr                             1/1     Running   0          6m40s

Note: in OpenShift environments, the collector takes about three minutes to start and transition to the Running state.

Review Collector Data in Splunk Observability Cloud

Confirm that you can see your cluster in Splunk Observability Cloud by navigating to Infrastructure Monitoring -> Kubernetes -> Kubernetes Clusters and then adding a filter on k8s.cluster.name with your cluster name (i.e. ai-pod-workshop-participant-1):

Kubernetes Pods

Last Modified Feb 18, 2026

Monitor NVIDIA Components

10 minutes  

In this section, we’ll use the Prometheus receiver with the OpenTelemetry collector to monitor the NVIDIA components running in the OpenShift cluster. We’ll start by navigating to the directory where the collector configuration file is stored:

cd otel-collector

Capture the NVIDIA DCGM Exporter metrics

The NVIDIA DCGM exporter is running in our OpenShift cluster. It exposes GPU metrics that we can send to Splunk.

To do this, let’s customize the configuration of the collector by editing the otel-collector-values.yaml file that we used earlier when deploying the collector.

Add the following content, just below the kubeletstats receiver:

      receiver_creator/nvidia:
        # Name of the extensions to watch for endpoints to start and stop.
        watch_observers: [ k8s_observer ]
        receivers:
          prometheus/dcgm:
            config:
              config:
                scrape_configs:
                  - job_name: gpu-metrics
                    scrape_interval: 60s
                    static_configs:
                      - targets:
                          - '`endpoint`:9400'
            rule: type == "pod" && labels["app"] == "nvidia-dcgm-exporter"

This tells the collector to look for pods with a label of app=nvidia-dcgm-exporter. When it finds such a pod, it connects to port 9400 and scrapes the exporter’s default metrics endpoint (/metrics).

Why are we using the receiver_creator receiver instead of just the Prometheus receiver?

  • The Prometheus receiver uses a static configuration that scrapes metrics from predefined endpoints.
  • The receiver_creator receiver enables dynamic creation of receivers (including Prometheus receivers) based on runtime information, allowing for scalable and flexible scraping setups.
  • Using receiver_creator can simplify configurations in dynamic environments by automating the management of multiple Prometheus scraping targets.

To ensure this new receiver is used, we’ll need to add a new pipeline to the otel-collector-values.yaml file as well.

Add the following code to the bottom of the file:

    service:
      pipelines:
        metrics/nvidia-metrics:
          exporters:
            - signalfx
          processors:
            - memory_limiter
            - batch
            - resourcedetection
            - resource
          receivers:
            - receiver_creator/nvidia

We’ll add one more Prometheus receiver related to NVIDIA in the next section.

Capture the NVIDIA NIM metrics

The meta-llama-3-2-1b-instruct large language model was deployed to the OpenShift cluster using NVIDIA NIM. It includes a Prometheus endpoint that we can scrape with the collector. Let’s add the following to the otel-collector-values.yaml file, just below the prometheus/dcgm receiver we added earlier:

          prometheus/nim-llm:
            config:
              config:
                scrape_configs:
                  - job_name: nim-for-llm-metrics
                    scrape_interval: 60s
                    metrics_path: /v1/metrics
                    static_configs:
                      - targets:
                          - '`endpoint`:8000'
            rule: type == "pod" && labels["app"] == "meta-llama-3-2-1b-instruct"

This tells the collector to look for pods with a label of app=meta-llama-3-2-1b-instruct. When it finds such a pod, it connects to port 8000 and scrapes the /v1/metrics endpoint.

There’s no need to make changes to the pipeline, as this receiver will already be picked up as part of the receiver_creator/nvidia receiver.

Add a Filter Processor

Scraping Prometheus endpoints can result in a large number of metrics, sometimes with high cardinality.

Let’s add a filter processor that defines exactly what metrics we want to send to Splunk. Specifically, we’ll send only the metrics that are utilized by a dashboard chart or an alert detector.

Add the following code to the otel-collector-values.yaml file, after the exporters section but before the receivers section:

    processors:
      filter/metrics_to_be_included:
        metrics:
          # Include only metrics used in charts and detectors
          include:
            match_type: strict
            metric_names:
              - DCGM_FI_DEV_FB_FREE
              - DCGM_FI_DEV_FB_USED
              - DCGM_FI_DEV_GPU_TEMP
              - DCGM_FI_DEV_GPU_UTIL
              - DCGM_FI_DEV_MEM_CLOCK
              - DCGM_FI_DEV_MEM_COPY_UTIL
              - DCGM_FI_DEV_MEMORY_TEMP
              - DCGM_FI_DEV_POWER_USAGE
              - DCGM_FI_DEV_SM_CLOCK
              - DCGM_FI_DEV_TOTAL_ENERGY_CONSUMPTION
              - DCGM_FI_PROF_DRAM_ACTIVE
              - DCGM_FI_PROF_GR_ENGINE_ACTIVE
              - DCGM_FI_PROF_PCIE_RX_BYTES
              - DCGM_FI_PROF_PCIE_TX_BYTES
              - DCGM_FI_PROF_PIPE_TENSOR_ACTIVE
              - generation_tokens_total
              - go_info
              - go_memstats_alloc_bytes
              - go_memstats_alloc_bytes_total
              - go_memstats_buck_hash_sys_bytes
              - go_memstats_frees_total
              - go_memstats_gc_sys_bytes
              - go_memstats_heap_alloc_bytes
              - go_memstats_heap_idle_bytes
              - go_memstats_heap_inuse_bytes
              - go_memstats_heap_objects
              - go_memstats_heap_released_bytes
              - go_memstats_heap_sys_bytes
              - go_memstats_last_gc_time_seconds
              - go_memstats_lookups_total
              - go_memstats_mallocs_total
              - go_memstats_mcache_inuse_bytes
              - go_memstats_mcache_sys_bytes
              - go_memstats_mspan_inuse_bytes
              - go_memstats_mspan_sys_bytes
              - go_memstats_next_gc_bytes
              - go_memstats_other_sys_bytes
              - go_memstats_stack_inuse_bytes
              - go_memstats_stack_sys_bytes
              - go_memstats_sys_bytes
              - go_sched_gomaxprocs_threads
              - gpu_cache_usage_perc
              - gpu_total_energy_consumption_joules
              - http.server.active_requests
              - num_request_max
              - num_requests_running
              - num_requests_waiting
              - process_cpu_seconds_total
              - process_max_fds
              - process_open_fds
              - process_resident_memory_bytes
              - process_start_time_seconds
              - process_virtual_memory_bytes
              - process_virtual_memory_max_bytes
              - promhttp_metric_handler_requests_in_flight
              - promhttp_metric_handler_requests_total
              - prompt_tokens_total
              - python_gc_collections_total
              - python_gc_objects_collected_total
              - python_gc_objects_uncollectable_total
              - python_info
              - request_finish_total
              - request_success_total
              - system.cpu.time
              - e2e_request_latency_seconds
              - time_to_first_token_seconds
              - time_per_output_token_seconds
              - request_prompt_tokens
              - request_generation_tokens

Ensure the filter/metrics_to_be_included processor is included in the metrics/nvidia-metrics pipeline we added earlier:

    service:
      pipelines:
        metrics/nvidia-metrics:
          exporters:
            - signalfx
          processors:
            - memory_limiter
            - filter/metrics_to_be_included
            - batch
            - resourcedetection
            - resource
          receivers:
            - receiver_creator/nvidia

Verify Changes

Take a moment to compare the contents of your modified otel-collector-values.yaml file with the otel-collector-values-with-nvidia.yaml file. Remember that indentation is important for yaml files, and needs to be precise:

diff otel-collector-values.yaml otel-collector-values-with-nvidia.yaml

Update your file if needed to ensure the contents match.

Don’t restart the collector yet

Because restarting the collector in an OpenShift environment takes 3 minutes per node, we’ll wait until we’ve completed all configuration changes before initiating a restart.

Last Modified Feb 18, 2026

Monitor the Vector Database

5 minutes  

In this step, we’ll configure the Prometheus receiver to monitor the Weaviate vector database.

What is a Vector Database?

A vector database stores and indexes data as numerical “vector embeddings,” which capture the semantic meaning of information like text or images. Unlike traditional databases, they excel at similarity searches, finding conceptually related data points rather than exact matches.
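“Similarity” between embeddings is commonly measured with cosine similarity: vectors pointing in nearly the same direction score close to 1. A toy sketch with made-up three-dimensional “embeddings” (real embeddings have hundreds or thousands of dimensions, and databases like Weaviate use approximate-nearest-neighbor indexes to search them at scale):

```python
# Toy similarity search: rank tiny made-up "embeddings" against a query
# vector by cosine similarity.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

doc_a = [0.9, 0.2, 0.0]   # toy embedding of a closely related document
doc_b = [0.8, 0.2, 0.1]   # toy embedding of a somewhat related document
doc_c = [0.0, 0.1, 0.9]   # toy embedding of an unrelated document
query = [0.9, 0.2, 0.0]

scores = sorted(
    [("a", cosine(query, doc_a)), ("b", cosine(query, doc_b)), ("c", cosine(query, doc_c))],
    key=lambda s: s[1], reverse=True)
print(scores[0][0])  # → a
```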

How is a Vector Database Used?

Vector databases play a key role in a pattern called Retrieval Augmented Generation (RAG), which is widely used by applications that leverage Large Language Models (LLMs).

The pattern is as follows:

  • The end-user asks a question to the application
  • The application takes the question and calculates a vector embedding for it
  • The app then performs a similarity search, looking for related documents in the vector database
  • The app then takes the original question and the related documents, and sends it to the LLM as context
  • The LLM reviews the context and returns a response to the application
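The steps above can be sketched as a short pipeline. Here `embed()`, `search()`, and `ask_llm()` are stand-in functions for the NIM embedding model, Weaviate, and the NIM LLM used in this workshop — the wiring is what matters, not the stubs:

```python
# RAG flow sketch; all three functions are placeholders, not real API calls.
def embed(text):
    # stand-in: a real call would hit the nv-embedqa NIM embedding endpoint
    return [float(len(w)) for w in text.split()][:3]

DOCS = {"h200": "The NVIDIA H200 GPU has 141 GB of HBM3e memory."}

def search(vector):
    # stand-in: a real call would run a Weaviate similarity search
    return DOCS["h200"]

def ask_llm(question, context):
    # stand-in: a real call would POST to the LLM's /v1/chat/completions
    return f"Based on: {context}"

question = "How much memory does the H200 have?"
context = search(embed(question))    # steps 2-3: embed the question, search
answer = ask_llm(question, context)  # steps 4-5: send question + context to the LLM
print(answer)
```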

Capture Weaviate Metrics with Prometheus

Let’s modify the OpenTelemetry collector configuration to scrape Weaviate’s Prometheus metrics.

To do so, let’s add an additional Prometheus receiver creator section to the otel-collector-values.yaml file. Add it after the receiver_creator/nvidia section but before the pipelines section:

      receiver_creator/weaviate:
        # Name of the extensions to watch for endpoints to start and stop.
        watch_observers: [ k8s_observer ]
        receivers:
          prometheus/weaviate:
            config:
              config:
                scrape_configs:
                  - job_name: weaviate-metrics
                    scrape_interval: 60s
                    static_configs:
                      - targets:
                          - '`endpoint`:2112'
            rule: type == "pod" && labels["app"] == "weaviate"

We’ll need to ensure that Weaviate’s metrics are added to the filter/metrics_to_be_included filter processor configuration as well:

    processors:
      filter/metrics_to_be_included:
        metrics:
          # Include only metrics used in charts and detectors
          include:
            match_type: strict
            metric_names:
              - DCGM_FI_DEV_FB_FREE
              - ...
              - object_count
              - vector_index_size
              - vector_index_operations
              - vector_index_tombstones
              - vector_index_tombstone_cleanup_threads
              - requests_total
              - objects_durations_ms_sum
              - objects_durations_ms_count
              - batch_delete_durations_ms_sum
              - batch_delete_durations_ms_count

Note: add just the new metrics starting with object_count

We also want to add a Resource processor to the configuration file with the following configuration. Add it after the filter/metrics_to_be_included processor but before the receivers section:

      resource/weaviate:
        attributes:
          - key: weaviate.instance.id
            from_attribute: service.instance.id
            action: insert

This processor takes the service.instance.id attribute on the Weaviate metrics and copies it into a new attribute called weaviate.instance.id. This makes it easier to distinguish Weaviate metrics from other metrics that use service.instance.id, which is a standard OpenTelemetry attribute used in Splunk Observability Cloud.
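
The `insert` action only adds the attribute when it doesn't already exist (and the source attribute does). A rough sketch of those semantics:

```python
def insert_from_attribute(attributes, key, from_attribute):
    # "insert": add the new key only if it's absent and the source exists.
    if key not in attributes and from_attribute in attributes:
        attributes[key] = attributes[from_attribute]
    return attributes

# Weaviate metric attributes before the processor runs (hypothetical value).
attrs = {"service.instance.id": "10.128.0.12:2112"}
insert_from_attribute(attrs, "weaviate.instance.id", "service.instance.id")
print(attrs["weaviate.instance.id"])  # copied from service.instance.id
```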

We’ll need to add a new metrics pipeline for Weaviate metrics as well (a separate pipeline is required because we don’t want the weaviate.instance.id attribute added to non-Weaviate metrics). Add the following to the bottom of the file:

        metrics/weaviate:
          exporters:
            - signalfx
          processors:
            - memory_limiter
            - filter/metrics_to_be_included
            - resource/weaviate
            - batch
            - resourcedetection
            - resource
          receivers:
            - receiver_creator/weaviate

Take a moment to compare the contents of your modified otel-collector-values.yaml file with the otel-collector-values-with-weaviate.yaml file. Remember that indentation matters in YAML files and needs to be precise:

diff otel-collector-values.yaml otel-collector-values-with-weaviate.yaml

Update your file if needed to ensure the contents match.

Don’t restart the collector yet

Because restarting the collector in an OpenShift environment takes 3 minutes per node, we’ll wait until we’ve completed all configuration changes before initiating a restart.

Last Modified Feb 18, 2026

Monitor Storage

5 minutes  

In this step, we’ll configure the Prometheus receiver to monitor the storage.

What storage do Cisco AI PODs utilize?

Cisco AI PODs have a number of different storage options, including Pure Storage, VAST, and NetApp.

The workshop will focus on Pure Storage.

How do we capture Pure Storage metrics?

Cisco AI PODs that utilize Pure Storage also use a technology called Portworx, which provides persistent storage for Kubernetes.

Portworx includes a metrics endpoint that we can scrape using the Prometheus receiver.
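
The endpoint serves metrics in the Prometheus text exposition format: one `name{labels} value` sample per line, with `# HELP` and `# TYPE` comment lines. A minimal parser over a hypothetical Portworx-style excerpt (the metric values here are made up for illustration):

```python
# A hypothetical excerpt of what a Prometheus metrics endpoint returns.
exposition = """\
# HELP px_cluster_status_nodes_online Number of nodes online
# TYPE px_cluster_status_nodes_online gauge
px_cluster_status_nodes_online{cluster="px-cluster-1"} 3
px_cluster_disk_total_bytes{cluster="px-cluster-1"} 1099511627776
"""

metrics = {}
for line in exposition.splitlines():
    if line.startswith("#") or not line.strip():
        continue  # skip HELP/TYPE comments and blank lines
    name_and_labels, value = line.rsplit(" ", 1)
    name = name_and_labels.split("{")[0]
    metrics[name] = float(value)

print(metrics["px_cluster_status_nodes_online"])  # 3.0
```

The Prometheus receiver handles all of this parsing for us; the sketch just shows what the scraped payload looks like.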

Capture Storage Metrics with Prometheus

Let’s modify the OpenTelemetry collector configuration to scrape Portworx metrics with the Prometheus receiver.

To do so, let’s add an additional Prometheus receiver creator section to the otel-collector-values.yaml file. Add it after the receiver_creator/weaviate section but before the pipelines section:

      receiver_creator/storage:
        # Name of the extensions to watch for endpoints to start and stop.
        watch_observers: [ k8s_observer ]
        receivers:
          prometheus/portworx:
            config:
              config:
                scrape_configs:
                  - job_name: portworx-metrics
                    static_configs:
                      - targets:
                          - '`endpoint`:17001'
                          - '`endpoint`:17018'
            rule: type == "pod" && labels["app"] == "portworx-metrics-sim"

We’ll need to ensure that Portworx metrics are added to the filter/metrics_to_be_included filter processor configuration as well:

    processors:
      filter/metrics_to_be_included:
        metrics:
          # Include only metrics used in charts and detectors
          include:
            match_type: strict
            metric_names:
              - DCGM_FI_DEV_FB_FREE
              - ...
              - px_cluster_cpu_percent
              - px_cluster_disk_total_bytes
              - px_cluster_disk_utilized_bytes
              - px_cluster_status_nodes_offline
              - px_cluster_status_nodes_online
              - px_volume_read_latency_seconds
              - px_volume_reads_total
              - px_volume_readthroughput
              - px_volume_write_latency_seconds
              - px_volume_writes_total
              - px_volume_writethroughput

Note: add just the new metrics starting with px_cluster_cpu_percent

We’ll need to add a new metrics pipeline for Portworx metrics as well. Add the following to the bottom of the file:

        metrics/storage:
          exporters:
            - signalfx
          processors:
            - memory_limiter
            - filter/metrics_to_be_included
            - batch
            - resourcedetection
            - resource
          receivers:
            - receiver_creator/storage

Take a moment to compare the contents of your modified otel-collector-values.yaml file with the otel-collector-values-with-portworx.yaml file. Remember that indentation matters in YAML files and needs to be precise:

diff otel-collector-values.yaml otel-collector-values-with-portworx.yaml

Update your file if needed to ensure the contents match.

Don’t restart the collector yet

Because restarting the collector in an OpenShift environment takes 3 minutes per node, we’ll wait until we’ve completed all configuration changes before initiating a restart.

Last Modified Feb 18, 2026

Review AI POD Dashboards

10 minutes  

In this section, we’ll review the AI POD dashboards in Splunk Observability Cloud to confirm that the data from NVIDIA, Pure Storage, and Weaviate is captured as expected.

Update the OpenTelemetry Collector Config

We can apply the collector configuration changes by running the following Helm command:

{ [ -z "$CLUSTER_NAME" ] || \
  [ -z "$ENVIRONMENT_NAME" ] || \
  [ -z "$USER_NAME" ]; } && \
  echo "Error: Missing variables" || \
  helm upgrade splunk-otel-collector \
  --set="clusterName=$CLUSTER_NAME" \
  --set="environment=$ENVIRONMENT_NAME" \
  --set="splunkObservability.accessToken=$ACCESS_TOKEN" \
  --set="splunkObservability.realm=$REALM" \
  --set="splunkPlatform.endpoint=$HEC_URL" \
  --set="splunkPlatform.token=$HEC_TOKEN" \
  --set="splunkPlatform.index=$SPLUNK_INDEX" \
  -f ./otel-collector-values.yaml \
  -n $USER_NAME \
  splunk-otel-collector-chart/splunk-otel-collector

Note: if you get an error that says Missing variables, you’ll need to define your environment variables again. Add your participant number before running the following commands:

export PARTICIPANT_NUMBER=<your participant number>
export USER_NAME=workshop-participant-$PARTICIPANT_NUMBER
export CLUSTER_NAME=ai-pod-$USER_NAME
export ENVIRONMENT_NAME=ai-pod-$USER_NAME
export SPLUNK_INDEX=splunk4rookies-workshop

Review the AI POD Overview Dashboard Tab

Navigate to Dashboards in Splunk Observability Cloud, then search for the Cisco AI PODs Dashboard, which is included in the Built-in dashboard groups. Ensure the dashboard is filtered on your OpenShift cluster name. The charts should be populated as in the following example:

Kubernetes Pods

Review the Pure Storage Dashboard Tab

Navigate to the PURE STORAGE tab and ensure the dashboard is filtered on your OpenShift cluster name. The charts should be populated as in the following example:

Pure Storage Dashboard

Review the Weaviate Infrastructure Navigator

Since Weaviate isn’t included by default with an AI POD, it’s not included on the out-of-the-box AI POD dashboard. Instead, we can view Weaviate performance data using one of the infrastructure navigators.

In Splunk Observability Cloud, navigate to Infrastructure -> AI Frameworks -> Weaviate. Filter on the k8s.cluster.name of interest, and ensure the navigator is populated as in the following example:

Kubernetes Pods

Last Modified Feb 18, 2026

Review the LLM Application

15 minutes  

In the final step of the workshop, we’ll deploy an application to our OpenShift cluster that uses the instruct and embeddings models.

What is LangChain?

Like most applications that interact with LLMs, our application is written in Python. It also uses LangChain, which is an open-source orchestration framework that simplifies the development of applications powered by LLMs.

Application Overview

Connect to the LLMs

Our application starts by connecting to two LLMs that we’ll be using:

  • meta/llama-3.2-1b-instruct: used for responding to user prompts
  • nvidia/llama-3.2-nv-embedqa-1b-v2: used to calculate embeddings
# connect to an LLM NIM at the specified endpoint, using a specific model
llm = ChatNVIDIA(base_url=INSTRUCT_MODEL_URL, model="meta/llama-3.2-1b-instruct")

# Initialize and connect to a NeMo Retriever Text Embedding NIM (nvidia/llama-3.2-nv-embedqa-1b-v2)
embeddings_model = NVIDIAEmbeddings(model="nvidia/llama-3.2-nv-embedqa-1b-v2",
                                   base_url=EMBEDDINGS_MODEL_URL)

Why are there two models? Here’s a helpful analogy:

  • The Embedding model is the “Librarian” (it helps find the right books),
  • The Instruct model is the “Writer” (it reads the books and writes the answer).

Define the Prompt Template

The application then defines a prompt template that will be used in interactions with the meta/llama-3.2-1b-instruct LLM:

prompt = ChatPromptTemplate.from_messages([
    ("system",
        "You are a helpful and friendly AI! "
        "Your responses should be concise and no longer than two sentences. "
        "Do not hallucinate. Say you don't know if you don't have this information. "
        "Answer the question using only the context."
        "\n\nQuestion: {question}\n\nContext: {context}"
    ),
    ("user", "{question}")
])

Note how we’re explicitly instructing the LLM to just say it doesn’t know the answer if it doesn’t know, which helps minimize hallucinations. There’s also a placeholder for us to provide context that the LLM can use to answer the question.

Connect to the Vector Database

The application then connects to the vector database that was pre-populated with NVIDIA data sheet documents:

    weaviate_client = weaviate.connect_to_custom(
        http_host=os.getenv('WEAVIATE_HTTP_HOST'),
        http_port=os.getenv('WEAVIATE_HTTP_PORT'),
        http_secure=False,
        grpc_host=os.getenv('WEAVIATE_GRPC_HOST'),
        grpc_port=os.getenv('WEAVIATE_GRPC_PORT'),
        grpc_secure=False
    )
        
    vector_store = WeaviateVectorStore(
        client=weaviate_client,
        embedding=embeddings_model,
        index_name="CustomDocs",
        text_key="page_content"
    )

Define the Chain

The application uses LCEL (LangChain Expression Language) to define the chain. The | (pipe) symbol works like an assembly line; the output of one step becomes the input for the next.

    chain = (
        {
            "context": vector_store.as_retriever(),
            "question": RunnablePassthrough()
        }
        | prompt
        | llm
        | StrOutputParser()
    )

Let’s break this down step-by-step:

  • Step 1: The Input Map {…}: We are preparing the ingredients for our prompt.
    • context: We turn our vector store into a retriever. This acts like a search engine that finds the most relevant snippets from our NVIDIA data sheets based on the user’s question.
    • question: We use RunnablePassthrough() to ensure the user’s original question is passed directly into the prompt.
    • Note: These keys (context and question) map directly to the {context} and {question} placeholders we defined in our prompt template earlier.
  • Step 2: The prompt: This is the instruction manual. It takes the context and the question and formats them using the prompt template (e.g., “Answer the question using only the context…”).
  • Step 3: The llm: This is the “Engine” (like GPT-4). It reads the formatted prompt and generates a response.
  • Step 4: The StrOutputParser(): By default, AI models return complex objects. This “cleaner” ensures we get back a simple, readable string of text.
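
The pipe behavior can be emulated in a few lines of plain Python. This is a toy illustration of the pattern, not LangChain's implementation (LangChain overloads `__or__` on its Runnable classes), and the three steps are simplified stand-ins for the real chain components:

```python
class Step:
    def __init__(self, fn):
        self.fn = fn

    def __or__(self, other):
        # Compose: the output of this step becomes the input of the next.
        return Step(lambda x: other.fn(self.fn(x)))

    def invoke(self, x):
        return self.fn(x)

# Toy stand-ins for the real chain components.
prompt = Step(lambda q: f"Question: {q}\nContext: (retrieved docs)")
llm = Step(lambda p: {"content": f"Answer to [{p}]"})
parser = Step(lambda msg: msg["content"])  # like StrOutputParser

chain = prompt | llm | parser
result = chain.invoke("How much memory does the H200 have?")
print(result)
```

Each `|` just wires one step's output to the next step's input, which is why the chain reads top-to-bottom like an assembly line.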

Invoke the Chain

Finally, the application invokes the chain by passing the end user’s question in as input:

    response = chain.invoke(question)

This is the “Start” button. You drop the end user’s question into the beginning of the pipeline, and it flows through the retriever, the prompt, and the LLM until the answer comes out the other side.

Last Modified Feb 6, 2026

Instrument the LLM Application

10 minutes  

Instrument the Application with OpenTelemetry

Instrumentation Packages

To capture metrics, traces, and logs from our application, we’ve instrumented it with OpenTelemetry. This required adding the following package to the requirements.txt file (which ultimately gets installed with pip install):

splunk-opentelemetry==2.8.0

We also added the following to the Dockerfile used to build the container image for this application, to install additional OpenTelemetry instrumentation packages:

# Add additional OpenTelemetry instrumentation packages
RUN opentelemetry-bootstrap --action=install

Then we modified the ENTRYPOINT in the Dockerfile to call opentelemetry-instrument when running the application:

ENTRYPOINT ["opentelemetry-instrument", "flask", "run", "-p", "8080", "--host", "0.0.0.0"]

Finally, to enhance the traces and metrics collected with OpenTelemetry from this LangChain application, we added additional Splunk instrumentation packages:

splunk-otel-instrumentation-langchain==0.1.4
splunk-otel-util-genai==0.1.4

Environment Variables

To instrument the application with OpenTelemetry, we also included several environment variables in the Kubernetes manifest file used to deploy the application:

  env:
    - name: OTEL_SERVICE_NAME
      value: "llm-app"
    - name: OTEL_EXPORTER_OTLP_ENDPOINT
      value: "http://splunk-otel-collector-agent:4317"
    - name: OTEL_EXPORTER_OTLP_PROTOCOL
      value: "grpc"
      # filter out health check requests to the root URL
    - name: OTEL_PYTHON_EXCLUDED_URLS
      value: "^(https?://)?[^/]+(/)?$"
    - name: OTEL_PYTHON_DISABLED_INSTRUMENTATIONS
      value: "httpx,requests"
    - name: OTEL_INSTRUMENTATION_LANGCHAIN_CAPTURE_MESSAGE_CONTENT
      value: "true"
    - name: OTEL_LOGS_EXPORTER
      value: "otlp"
    - name: OTEL_PYTHON_LOG_CORRELATION
      value: "true"
    - name: OTEL_EXPORTER_OTLP_METRICS_TEMPORALITY_PREFERENCE
      value: "delta"
    - name: OTEL_PYTHON_LOGGING_AUTO_INSTRUMENTATION_ENABLED
      value: "true"
    - name: OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT
      value: "true"
    - name: OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT_MODE
      value: "SPAN_AND_EVENT"
    - name: OTEL_INSTRUMENTATION_GENAI_EMITTERS
      value: "span_metric_event,splunk"
    - name: OTEL_INSTRUMENTATION_GENAI_EMITTERS_EVALUATION
      value: "replace-category:SplunkEvaluationResults"
    - name: SPLUNK_PROFILER_ENABLED
      value: "true"

Note that the OTEL_INSTRUMENTATION_LANGCHAIN_CAPTURE_MESSAGE_CONTENT and OTEL_INSTRUMENTATION_GENAI_* environment variables are specific to the LangChain instrumentation we’ve used.
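
For example, the OTEL_PYTHON_EXCLUDED_URLS pattern above matches only requests to the root URL (with or without a scheme or trailing slash), so health checks are excluded while real endpoints like /askquestion are still traced. A quick check of the regex:

```python
import re

# The same pattern as OTEL_PYTHON_EXCLUDED_URLS in the manifest.
root_url = re.compile(r"^(https?://)?[^/]+(/)?$")

assert root_url.match("http://llm-app:8080/")                 # health check: excluded
assert root_url.match("llm-app:8080")                         # also excluded
assert not root_url.match("http://llm-app:8080/askquestion")  # still traced
```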

Last Modified Feb 6, 2026

Deploy the LLM Application

10 minutes  

Deploy the LLM Application

Use the following command to deploy this application to the OpenShift cluster:

cd ~/workshop/cisco-ai-pods
oc apply -f ./llm-app/k8s-manifest.yaml

Note: to build a Docker image for this Python application, we executed the following commands:

cd workshop/cisco-ai-pods/llm-app
docker build --platform linux/amd64 -t ghcr.io/splunk/cisco-ai-pod-workshop-app:1.0 .
docker push ghcr.io/splunk/cisco-ai-pod-workshop-app:1.0

Test the LLM Application

Let’s ensure the application is working as expected.

Start a pod that has access to the curl command:

oc run curl --rm -it --image=curlimages/curl:latest \
  --overrides='{
    "spec": {
      "containers": [{
        "name": "curl",
        "image": "curlimages/curl:latest",
        "stdin": true,
        "tty": true,
        "command": ["sh"],
        "resources": {
          "limits": {
            "cpu": "50m",
            "memory": "100Mi"
          },
          "requests": {
            "cpu": "50m",
            "memory": "100Mi"
          }
        }
      }]
    }
  }'

Then run the following command to send a question to the LLM:

curl -X "POST" \
 'http://llm-app:8080/askquestion' \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "question": "How much memory does the NVIDIA H200 have?"
  }'
You should receive a response like the following:

The NVIDIA H200 has 141GB of HBM3e memory, which is twice the capacity of the NVIDIA H100 Tensor Core GPU with 1.4X more memory bandwidth.
Last Modified Feb 18, 2026

Review Metrics, Traces, and Logs

10 minutes  

View Trace Data in Splunk Observability Cloud

In Splunk Observability Cloud, navigate to APM and then select Service Map. Ensure your environment name is selected (e.g. ai-pod-workshop-participant-1).
You should see a service map that looks like the following:

Service Map

Click on Traces on the right-hand side menu. Then select one of the slower running traces. It should look like the following example:

Trace

The trace shows all the interactions that our application executed to return an answer to the user’s question (i.e., “How much memory does the NVIDIA H200 have?”).

For example, we can see where our application performed a similarity search to look for documents related to the question at hand in the Weaviate vector database.

We can also see how the application created a prompt to send to the LLM, including the context that was retrieved from the vector database:

Prompt Template

Note: if you don’t see the chat and invoke_workflow AI interactions in the trace waterfall view, or you don’t see the AI details tab on the right-hand side, ask your instructor about the superpowers which need to be enabled.

Finally, we can see the response from the LLM, the time it took, and the number of input and output tokens utilized:

LLM Response

Confirm Metrics are Sent to Splunk

Navigate to Dashboards in Splunk Observability Cloud, then search for the Cisco AI PODs Dashboard, which is included in the Built-in dashboard groups. Navigate to the NIM FOR LLMS tab and ensure the dashboard is filtered on your OpenShift cluster name. The charts should be populated as in the following example:

NIM LLMS Dashboard

Last Modified Feb 18, 2026

Wrap-Up

5 minutes  

Wrap-Up

We hope you enjoyed this workshop, which provided hands-on experience deploying and working with several of the technologies that are used to monitor Cisco AI PODs with Splunk Observability Cloud. Specifically, you had the opportunity to:

  • Work with a RedHat OpenShift cluster with GPU-based worker nodes.
  • Work with the NVIDIA NIM Operator and NVIDIA GPU Operator.
  • Work with Large Language Models (LLMs) deployed using NVIDIA NIM to the cluster.
  • Deploy the OpenTelemetry Collector in the Red Hat OpenShift cluster.
  • Add Prometheus receivers to the collector to ingest infrastructure metrics.
  • Monitor the Weaviate vector database in the cluster.
  • Configure monitoring for Pure Storage metrics using Prometheus.
  • Instrument Python services that interact with Large Language Models (LLMs) with OpenTelemetry.
  • Understand which details OpenTelemetry captures in traces from applications that interact with LLMs.