AWS Health Check Dashboards

The Health Check Dashboards in the Splunk Add-on for AWS let you monitor deployment performance and make it easier to troubleshoot and mitigate issues. They provide the following insights into your AWS add-on configuration and deployment.

| Dashboard | Panels | Description |
| --- | --- | --- |
| Health Overview (provides information on all errors and warnings generated by the inputs configured in the AWS add-on) | Error count by categories | Displays the count of errors by category, such as configuration errors and network errors. Each error count panel has a drilldown: clicking the error count opens the Error Details dashboard, which lists possible reasons and resolutions for the errors so that you can identify and mitigate issues faster. |
| | Warning count | Displays the count of warning messages. The panel has a drilldown: clicking the warning count opens the Warning Details dashboard, which lists possible reasons and resolutions for the warnings so that you can identify and eliminate unnecessary warnings. |
| | Error count timechart | Displays the count of errors over time by host, input type, input name, and error category. |
| Resource Utilization (provides information on the resources used by the different types of inputs configured in the AWS add-on) | CPU and Memory utilization | Displays CPU and memory utilization over time for the single instance and multi instance inputs configured in the AWS add-on. For single instance inputs, Splunk spawns a single process for all inputs; for multi instance inputs, Splunk spawns an individual process for each input. Use this panel to identify resource over-utilization that may affect your Splunk platform environment. |
| | Inputs count (single instance and multi instance) | Displays the number of enabled and disabled inputs configured in the AWS add-on. The number of inputs helps explain resource utilization, and you can scale the inputs up or down based on your requirements. |
| | KV Store calls count | Displays the number of KV store (key-value store) calls over time. Use this panel to examine the load on the Splunk KV store, because some inputs in the AWS add-on use a KV store-based checkpointing mechanism. The panel has a drilldown: clicking the KV store calls count for a particular time range opens the KV Store Utilization dashboard for that time range, which lists the KV store calls count by collection name and call method (GET/POST/DELETE). There you can compare the load from collections used by the AWS add-on with the load from collections used by other apps and add-ons. Clicking a collection name under the AWS add-on displays a timechart of the average time taken by KV store calls on that collection. |
| S3 Inputs Health Details (focuses on the Generic S3, Incremental S3, and SQS-based S3 input types) | Time lapse (delay) and throughput | Displays the delay (time taken) in fetching the data and the throughput (size of data) over time. Use this panel to identify network latency or other delay-related issues. |
| | Error Message Details | Displays details of the errors encountered during input execution, along with possible reasons and resolutions. |

To open a dashboard, open the Splunk Add-on for AWS in Splunk Web and click the Health Check tab. Then select the dashboard that you want to monitor from the dropdown.