Distributed Tracing for AWS Lambda Functions

Auto-Instrumentation

15 mins

The first part of our workshop will demonstrate how auto-instrumentation with OpenTelemetry allows the OpenTelemetry Collector to auto-detect what language your function is written in, and start capturing traces for those functions.

First, let us take a look at the workshop/lambda/auto directory, and some of its files. This is where all the content for the auto-instrumentation portion of our workshop resides.

bash
cd ~/workshop/lambda/auto
bash
ls

The output should include the following files and directories:

bash
handler             outputs.tf          terraform.tf        variables.tf
main.tf             send_message.py     terraform.tfvars

Take a closer look at the main.tf file:

bash
cat main.tf

Workshop Questions

  • Can you identify which AWS resources are being created by this template?
  • Can you identify where OpenTelemetry instrumentation is being set up?
    • Hint: study the lambda function definitions
  • Can you determine which instrumentation information is being provided by the environment variables we set earlier?

You should see a section where the environment variables for each lambda function are being set.

bash
environment {
  variables = {
    SPLUNK_ACCESS_TOKEN = var.o11y_access_token
    SPLUNK_REALM = var.o11y_realm
    OTEL_SERVICE_NAME = "producer-lambda"
    OTEL_RESOURCE_ATTRIBUTES = "deployment.environment=${var.prefix}-lambda-shop"
    AWS_LAMBDA_EXEC_WRAPPER = "/opt/nodejs-otel-handler"
    KINESIS_STREAM = aws_kinesis_stream.lambda_streamer.name
  }
}

By using these environment variables, we are configuring our auto-instrumentation in a few ways:

bash
SPLUNK_ACCESS_TOKEN = var.o11y_access_token
SPLUNK_REALM = var.o11y_realm
bash
OTEL_SERVICE_NAME = "producer-lambda" # consumer-lambda in the case of the consumer function
OTEL_RESOURCE_ATTRIBUTES = "deployment.environment=${var.prefix}-lambda-shop"
bash
AWS_LAMBDA_EXEC_WRAPPER = "/opt/nodejs-otel-handler"
bash
KINESIS_STREAM = aws_kinesis_stream.lambda_streamer.name
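
To make the format of OTEL_RESOURCE_ATTRIBUTES more concrete, here is a minimal Python sketch of how a comma-separated list of key=value pairs, like the one set above, could be split into individual attributes. The function name parse_resource_attributes and the sample prefix "jdoe" are hypothetical and used only for illustration; the real parsing happens inside the OpenTelemetry instrumentation, not in code like this.

```python
def parse_resource_attributes(raw: str) -> dict:
    """Split a 'key1=value1,key2=value2' string into a dictionary."""
    attributes = {}
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate empty segments such as a trailing comma
        key, _, value = pair.partition("=")
        attributes[key.strip()] = value.strip()
    return attributes


# "jdoe" stands in for var.prefix; it is a made-up value for illustration.
example = "deployment.environment=jdoe-lambda-shop"
attrs = parse_resource_attributes(example)
print(attrs["deployment.environment"])
```

Seen this way, the single attribute set in main.tf tags every trace from the function with a deployment.environment value derived from your prefix.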

You should also see an argument for setting the Splunk OpenTelemetry Lambda layer on each function:

bash
layers = var.otel_lambda_layer

Next, let’s take a look at the producer-lambda function code:

bash
cat ~/workshop/lambda/auto/handler/producer.mjs

Now that we are familiar with the contents of our auto directory, we can deploy the resources for our workshop, and generate some trace data from our Lambda functions.

Exercise Deploying the Lambda Function

In order to deploy the resources defined in the main.tf file, you first need to make sure that Terraform is initialized in the same folder as that file.

  • Change to the auto directory:
bash
cd ~/workshop/lambda/auto
  • Run the following command to initialize Terraform in this directory
bash
terraform init
  • This command will create a number of elements in the same folder:
    • .terraform.lock.hcl file: to record the providers it will use to provide resources
    • .terraform directory: to store the provider configurations
    • In addition to the above, when Terraform is run using the apply subcommand, the terraform.tfstate file will be created to track the state of your deployed resources.
    • These enable Terraform to manage the creation, state and destruction of resources, as defined within the main.tf file of the auto directory
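
As a quick sanity check, the elements listed above can also be looked for from Python. This is only a sketch, assuming it is run from the auto directory after terraform init; the helper name terraform_init_artifacts is hypothetical.

```python
from pathlib import Path


def terraform_init_artifacts(directory: str) -> dict:
    """Report which of the Terraform working files exist in a directory."""
    base = Path(directory)
    return {
        ".terraform.lock.hcl": (base / ".terraform.lock.hcl").is_file(),
        ".terraform": (base / ".terraform").is_dir(),
        # terraform.tfstate only appears after `terraform apply` has run.
        "terraform.tfstate": (base / "terraform.tfstate").is_file(),
    }


for name, present in terraform_init_artifacts(".").items():
    print(f"{name}: {'found' if present else 'missing'}")
```

Immediately after terraform init, the first two entries would be expected to show as found and terraform.tfstate as missing.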

Once we’ve initialized Terraform in this directory, we can go ahead and deploy our resources.

  • First, run the terraform plan command to ensure that Terraform will be able to create your resources without encountering any issues.
bash
terraform plan
  • This will result in a plan to deploy resources and output some data, which you can review to ensure everything will work as intended.

    • Do note that a number of the values shown in the plan will be known post-creation, or are masked for security purposes.
  • Next, run the terraform apply command to deploy the Lambda functions and other supporting resources from the main.tf file:

bash
terraform apply
  • Respond yes when you see the Enter a value: prompt

  • This will result in the following outputs:

bash
Outputs:

base_url = "https://______.amazonaws.com/serverless_stage/producer"
consumer_function_name = "_____-consumer"
consumer_log_group_arn = "arn:aws:logs:us-east-1:############:log-group:/aws/lambda/______-consumer"
consumer_log_group_name = "/aws/lambda/______-consumer"
environment = "______-lambda-shop"
lambda_bucket_name = "lambda-shop-______-______"
producer_function_name = "______-producer"
producer_log_group_arn = "arn:aws:logs:us-east-1:############:log-group:/aws/lambda/______-producer"
producer_log_group_name = "/aws/lambda/______-producer"
  • Terraform outputs are defined in the outputs.tf file.
  • These outputs will be used programmatically in other parts of our workshop, as well.
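
One way to read these outputs programmatically is to parse the JSON printed by terraform output -json, in which each output name maps to an object containing a value field. The sketch below is an illustration only: the function name output_values and the sample values (with the made-up prefix "jdoe") are hypothetical, and the workshop's own scripts may read the outputs differently.

```python
import json


def output_values(terraform_output_json: str) -> dict:
    """Map each Terraform output name to its plain value."""
    parsed = json.loads(terraform_output_json)
    return {name: entry["value"] for name, entry in parsed.items()}


# Trimmed, made-up sample in the shape printed by `terraform output -json`.
sample = """
{
  "environment": {"sensitive": false, "type": "string", "value": "jdoe-lambda-shop"},
  "producer_function_name": {"sensitive": false, "type": "string", "value": "jdoe-producer"}
}
"""
values = output_values(sample)
print(values["producer_function_name"])
```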

Exercise Send some traffic to the producer-lambda

To start getting some traces from our deployed Lambda functions, we need to generate some traffic. We will send a message to our producer-lambda function’s endpoint, which should be put as a record into our Kinesis Stream, and then pulled from the Stream by the consumer-lambda function.

  • Change to the auto directory:
bash
cd ~/workshop/lambda/auto

The send_message.py script is a Python script that will take input at the command line, add it to a JSON dictionary, and send it to your producer-lambda function’s endpoint repeatedly, as part of a while loop.
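
For readers who want a mental model of such a script, the following is a rough sketch and not the actual send_message.py. The JSON keys (name, superpower), the --url flag, the one-second delay, and all function names are assumptions made for illustration; only the --name and --superpower arguments and the repeated sending come from the description above.

```python
import argparse
import json
import time
import urllib.request


def build_payload(name: str, superpower: str) -> bytes:
    """Encode the command-line input as a JSON request body."""
    return json.dumps({"name": name, "superpower": superpower}).encode("utf-8")


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Send messages to an endpoint")
    parser.add_argument("--name", required=True)
    parser.add_argument("--superpower", required=True)
    # Hypothetical flag: the real script may locate the endpoint differently.
    parser.add_argument("--url", default="")
    return parser.parse_args(argv)


def send_forever(url: str, payload: bytes, delay_seconds: float = 1.0) -> None:
    """POST the payload repeatedly until the process is stopped."""
    while True:
        request = urllib.request.Request(
            url, data=payload, headers={"Content-Type": "application/json"}
        )
        with urllib.request.urlopen(request) as response:
            print(response.read().decode("utf-8"))
        time.sleep(delay_seconds)


# Parse a sample command line and build the body without sending anything.
args = parse_args(["--name", "CHANGEME", "--superpower", "CHANGEME"])
print(build_payload(args.name, args.superpower).decode("utf-8"))
```

The sketch deliberately does not call send_forever, so running it only prints the JSON body that would be sent.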

  • Run the send_message.py script as a background process
    • It requires the --name and --superpower arguments
bash
nohup ./send_message.py --name CHANGEME --superpower CHANGEME &
  • You should see an output similar to the following if your message is successful
bash
[1] 179789
nohup: ignoring input and appending output to 'nohup.out'
  • The two most important bits of information here are:

    • The process ID on the first line (179789 in the example above), and
    • The appending output to nohup.out message
      • The nohup command ensures the script will not hang up when sent to the background. It also captures the curl output from our command in a nohup.out file in the same folder as the one you’re currently in.
      • The & tells our shell process to run this process in the background, thus freeing our shell to run other commands.
  • Next, check the contents of the response.logs file, to ensure your output confirms your requests to your producer-lambda endpoint are successful:

bash
cat response.logs
  • You should see the following output among the lines printed to your screen if your message is successful:
bash
{"message": "Message placed in the Event Stream: {prefix}-lambda_stream"}
  • If unsuccessful, you will see:
bash
{"message": "Internal server error"}

Warning

If this occurs, ask one of the workshop facilitators for assistance.
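
If response.logs grows long, a small script can count how many responses succeeded or failed. The sketch below assumes each response is written on its own line as a JSON object with a message field, as in the two examples above; the function name summarize_responses and the sample prefix "jdoe" are hypothetical.

```python
import json


def summarize_responses(lines):
    """Count successful and failed responses in an iterable of log lines."""
    summary = {"success": 0, "error": 0, "other": 0}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            message = str(json.loads(line).get("message", ""))
        except (json.JSONDecodeError, AttributeError):
            summary["other"] += 1  # not a JSON object
            continue
        if message.startswith("Message placed in the Event Stream"):
            summary["success"] += 1
        elif message == "Internal server error":
            summary["error"] += 1
        else:
            summary["other"] += 1
    return summary


# Made-up sample lines; to check the real file, pass open("response.logs").
sample = [
    '{"message": "Message placed in the Event Stream: jdoe-lambda_stream"}',
    '{"message": "Internal server error"}',
]
print(summarize_responses(sample))
```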

Exercise View the Lambda Function logs

Next, let’s take a look at the logs for our Lambda functions.

  • To view your producer-lambda logs, check the producer.logs file:
bash
cat producer.logs
  • To view your consumer-lambda logs, check the consumer.logs file:
bash
cat consumer.logs

Examine the logs carefully.

Workshop Question

  • Do you see OpenTelemetry being loaded? Look out for the lines with splunk-extension-wrapper
    • Consider running head -n 50 producer.logs or head -n 50 consumer.logs to see the splunk-extension-wrapper being loaded.
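
The same check can be sketched in Python: scan only the first 50 lines of a log and report the ones that mention splunk-extension-wrapper. The function name find_wrapper_lines and the sample lines are made up for illustration; real log contents will differ.

```python
from itertools import islice


def find_wrapper_lines(lines, marker="splunk-extension-wrapper", limit=50):
    """Return (line_number, text) for early lines that mention the marker."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(islice(lines, limit), start=1)
        if marker in line
    ]


# Made-up log lines; to check a real file, pass open("producer.logs").
sample = [
    "first line without the marker\n",
    "[splunk-extension-wrapper] example startup message\n",
]
print(find_wrapper_lines(sample))
```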