splunk-operator

Custom Resource Guide

The Splunk Operator provides a collection of custom resources you can use to manage Splunk Enterprise deployments in your Kubernetes cluster.

For examples on how to use these custom resources, please see Configuring Splunk Enterprise Deployments.

Metadata Parameters

All resources in Kubernetes include a metadata section. You can use this to define a name for a specific instance of the resource, and which namespace you would like the resource to reside within:

Key Type Description
name string Each instance of your resource is distinguished using this name.
namespace string Your instance will be created within this namespace. You must ensure that this namespace exists beforehand.

If you do not provide a namespace, you current context will be used.

apiVersion: enterprise.splunk.com/v4
kind: Standalone
metadata:
  name: s1
  namespace: splunk
  finalizers:
  - enterprise.splunk.com/delete-pvc

The enterprise.splunk.com/delete-pvc finalizer is optional, and may be used to tell the Splunk Operator that you would like it to remove all the Persistent Volumes associated with the instance when you delete it.

Common Spec Parameters for All Resources

apiVersion: enterprise.splunk.com/v4
kind: Standalone
metadata:
  annotations:
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"
  name: example
spec:
  imagePullPolicy: Always
  livenessInitialDelaySeconds: 400
  readinessInitialDelaySeconds: 390
  serviceTemplate:
    spec:
      type: LoadBalancer
  topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: zone
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        foo: bar
  extraEnv:
  - name: ADDITIONAL_ENV_VAR_1
    value: "test_value_1"
  - name: ADDITIONAL_ENV_VAR_2
    value: "test_value_2"
  resources:
    requests:
      memory: "512Mi"
      cpu: "0.1"
    limits:
      memory: "8Gi"
      cpu: "4"

The spec section is used to define the desired state for a resource. All custom resources provided by the Splunk Operator include the following configuration parameters:

Key Type Description
image string Container image to use for pod instances (overrides RELATED_IMAGE_SPLUNK_ENTERPRISE environment variable
imagePullPolicy string Sets pull policy for all images (either “Always” or the default: “IfNotPresent”)
livenessInitialDelaySeconds number Sets the initialDelaySeconds for Liveness probe (default: 300)
readinessInitialDelaySeconds number Sets the initialDelaySeconds for Readiness probe (default: 10)
extraEnv EnvVar Sets the extra environment variables to be passed to the Splunk instance containers. WARNING: Setting environment variables used by Splunk or Ansible will affect Splunk installation and operation
schedulerName string Name of Scheduler to use for pod placement (defaults to “default-scheduler”)
affinity Affinity Kubernetes Affinity rules that control how pods are assigned to particular nodes
resources ResourceRequirements The settings for allocating compute resource requirements to use for each pod instance. The default settings should be considered for demo/test purposes. Please see Hardware Resource Requirements for production values.
serviceTemplate Service Template used to create Kubernetes Services
topologySpreadConstraint TopologySpreadConstraint Template used to create Kubernetes TopologySpreadConstraint

Common Spec Parameters for Splunk Enterprise Resources

apiVersion: enterprise.splunk.com/v4
kind: Standalone
metadata:
  name: example
spec:
  etcVolumeStorageConfig:
    storageClassName: gp2
    storageCapacity: 15Gi
  varVolumeStorageConfig:
    storageClassName: customStorageClass
    storageCapacity: 25Gi
  volumes:
    - name: licenses
      configMap:
        name: splunk-licenses
  licenseManagerRef:
    name: example
  clusterManagerRef:
    name: example
  serviceAccount: custom-serviceaccount

The following additional configuration parameters may be used for all Splunk Enterprise resources, including: Standalone, LicenseManager, SearchHeadCluster, ClusterManager and IndexerCluster:

Key Type Description
etcVolumeStorageConfig StorageClassSpec Storage class spec for Splunk etc volume as described in StorageClass
varVolumeStorageConfig StorageClassSpec Storage class spec for Splunk var volume as described in StorageClass
volumes Volume List of one or more Kubernetes volumes. These will be mounted in all container pods as as /mnt/<name>
defaults string Inline map of default.yml overrides used to initialize the environment
defaultsUrl string Full path or URL for one or more default.yml files, separated by commas
licenseUrl string Full path or URL for a Splunk Enterprise license file
licenseManagerRef ObjectReference Reference to a Splunk Operator managed LicenseManager instance (via name and optionally namespace) to use for licensing
clusterManagerRef ObjectReference Reference to a Splunk Operator managed ClusterManager instance (via name and optionally namespace) to use for indexing
monitoringConsoleRef string Logical name assigned to the Monitoring Console pod. You can set the name before or after the MC pod creation.
serviceAccount ServiceAccount Represents the service account used by the pods deployed by the CRD
extraEnv Extra environment variables Extra environment variables to be passed to the Splunk instance containers
readinessInitialDelaySeconds readinessProbe initialDelaySeconds Defines initialDelaySeconds for Readiness probe
livenessInitialDelaySeconds livenessProbe initialDelaySeconds Defines initialDelaySeconds for the Liveness probe
imagePullSecrets imagePullSecrets Config to pull images from private registry. Use in conjunction with image config from common spec

LicenseManager Resource Spec Parameters

apiVersion: enterprise.splunk.com/v4
kind: LicenseManager
metadata:
  name: example
spec:
  volumes:
    - name: licenses
      configMap:
        name: splunk-licenses
  licenseUrl: /mnt/licenses/enterprise.lic

Please see Common Spec Parameters for All Resources and Common Spec Parameters for All Splunk Enterprise Resources. The LicenseManager resource does not provide any additional configuration parameters.

Standalone Resource Spec Parameters

apiVersion: enterprise.splunk.com/v4
kind: Standalone
metadata:
  name: standalone
  labels:
    app: SplunkStandAlone
    type: Splunk
  finalizers:
  - enterprise.splunk.com/delete-pvc

In addition to Common Spec Parameters for All Resources and Common Spec Parameters for All Splunk Enterprise Resources, the Standalone resource provides the following Spec configuration parameters:

Key Type Description
replicas integer The number of standalone replicas (defaults to 1)

SearchHeadCluster Resource Spec Parameters

apiVersion: enterprise.splunk.com/v4
kind: SearchHeadCluster
metadata:
  name: example
spec:
  replicas: 5

In addition to Common Spec Parameters for All Resources and Common Spec Parameters for All Splunk Enterprise Resources, the SearchHeadCluster resource provides the following Spec configuration parameters:

Key Type Description
replicas integer The number of search heads cluster members (minimum of 3, which is the default)

ClusterManager Resource Spec Parameters

ClusterManager resource does not have a required spec parameter, but to configure SmartStore, you can specify indexes and volume configuration as below -

apiVersion: enterprise.splunk.com/v4
kind: ClusterManager
metadata:
  name: example-cm
spec:
  smartstore:
    defaults:
        volumeName: msos_s2s3_vol
    indexes:
      - name: salesdata1
        remotePath: $_index_name
        volumeName: msos_s2s3_vol
      - name: salesdata2
        remotePath: $_index_name
        volumeName: msos_s2s3_vol
      - name: salesdata3
        remotePath: $_index_name
        volumeName: msos_s2s3_vol
    volumes:
      - name: msos_s2s3_vol
        path: <remote path>
        endpoint: <remote endpoint>
        secretRef: s3-secret

IndexerCluster Resource Spec Parameters

apiVersion: enterprise.splunk.com/v4
kind: IndexerCluster
metadata:
  name: example
spec:
  replicas: 3
  clusterManagerRef: 
    name: example-cm

Note: clusterManagerRef is required field in case of IndexerCluster resource since it will be used to connect the IndexerCluster to ClusterManager resource.

In addition to Common Spec Parameters for All Resources and Common Spec Parameters for All Splunk Enterprise Resources, the IndexerCluster resource provides the following Spec configuration parameters:

Key Type Description
replicas integer The number of indexer cluster members (defaults to 1)

MonitoringConsole Resource Spec Parameters

cat <<EOF | kubectl apply -n splunk-operator -f -
apiVersion: enterprise.splunk.com/v3
kind: MonitoringConsole
metadata:
  name: example-mc
  finalizers:
  - enterprise.splunk.com/delete-pvc
EOF

Use the Monitoring Console to view detailed topology and performance information about your Splunk Enterprise deployment. See What can the Monitoring Console do? in the Splunk Enterprise documentation.

The Splunk Operator now includes a CRD for the Monitoring Console (MC). This offers a number of advantages available to other CR’s, including: customizable resource allocation, app management, and license management.

The MC pod is referenced by using the monitoringConsoleRef parameter. There is no preferred order when running an MC pod; you can start the pod before or after the other CR’s in the namespace. When a pod that references the monitoringConsoleRef parameter is created or deleted, the MC pod will automatically update itself and create or remove connections to those pods.

Examples of Guaranteed and Burstable QoS

You can change the CPU and memory resources, and assign different Quality of Services (QoS) classes to your pods using the Kubernetes Quality of Service section. Here are some examples:

A Guaranteed QoS Class example:

Set equal requests and limits values for CPU and memory to establish a QoS class of Guaranteed.

Note: A pod will not start on a node that cannot meet the CPU and memory requests values.

Example: The minimum resource requirements for a Standalone Splunk Enterprise instance in production are 24 vCPU and 12GB RAM.

apiVersion: enterprise.splunk.com/v4
kind: Standalone
metadata:
  name: example
spec:
  imagePullPolicy: Always
  resources:
    requests:
      memory: "12Gi"
      cpu: "24"
    limits:
      memory: "12Gi"
      cpu: "24"  

A Burstable QoS Class example:

Set the requests value for CPU and memory lower than the limits value to establish a QoS class of Burstable.

Example: This Standalone Splunk Enterprise instance should start with minimal indexing and search capacity, but will be allowed to scale up if Kubernetes is able to allocate additional CPU and Memory up to the limits values.

apiVersion: enterprise.splunk.com/v4
kind: Standalone
metadata:
  name: example
spec:
  imagePullPolicy: Always
  resources:
    requests:
      memory: "2Gi"
      cpu: "4"
    limits:
      memory: "12Gi"
      cpu: "24"  

A BestEffort QoS Class example:

With no requests or limits values set for CPU and memory, the QoS class is set to BestEffort. The BestEffort QoS is not recommended for use with Splunk Operator.

Pod Resources Management

CPU Throttling

Kubernetes starts throttling CPUs if a pod’s demand for CPU exceeds the value set in the limits parameter. If your nodes have extra CPU resources available, leaving the limits value unset will allow the pods to utilize more CPUs.

Troubleshooting

CR Status Message

The Splunk Enterprise CRDs with the Splunk Operator have a field cr.Status.message which provides a detailed view of the CR’s current status.

Here is an example of a Standalone with a message indicating an invalid CR config:

bash% kubectl get stdaln
NAME   PHASE   DESIRED   READY   AGE   MESSAGE
ido    Error   0         0       26s   invalid Volume Name for App Source: custom. volume: csh, doesn't exist

bash# kubectl get stdaln -o yaml | grep -i message -A 5 -B 5
      appsStatusMaxConcurrentAppDownloads: 5
      bundlePushStatus: {}
      isDeploymentInProgress: false
      lastAppInfoCheckTime: 0
      version: 0
    message: 'invalid Volume Name for App Source: custom. volume: csh, doesn''t exist'
    phase: Error
    readyReplicas: 0
    replicas: 0
    resourceRevMap: {}
    selector: ""

Pause Annotations

The Splunk Operator controller reconciles every Splunk Enterprise CR. However, there might be circumstances wherein the influence of the Splunk Operator is not desired and needs to be paused. Every Splunk Enterprise CR has its own pause annotation associated with it, which when configured ensures that the Splunk Operator controller reconcile is paused for it. Below is a table listing the pause annotations:

Customer Resource Definition Annotation
clustermaster.enterprise.splunk.com “clustermaster.enterprise.splunk.com/paused”
clustermanager.enterprise.splunk.com “clustermanager.enterprise.splunk.com/paused”
indexercluster.enterprise.splunk.com “indexercluster.enterprise.splunk.com/paused”
licensemaster.enterprise.splunk.com “licensemaster.enterprise.splunk.com/paused”
monitoringconsole.enterprise.splunk.com “monitoringconsole.enterprise.splunk.com/paused”
searchheadcluster.enterprise.splunk.com “searchheadcluster.enterprise.splunk.com/paused”
standalone.enterprise.splunk.com “standalone.enterprise.splunk.com/paused”

Note: Removal of the annotation resets the default behavior

Here is an example of a standalone with the pause annotation set. In this state, the Splunk Operator requeues the reconcillation without performing any reconcile operations unless the annotatation is removed.

apiVersion: enterprise.splunk.com/v4
kind: Standalone
metadata:
  name: test-only-debug
  namespace: splunk-operator
  annotations:
    standalone.enterprise.splunk.com/paused: "true"
  finalizers:
  - enterprise.splunk.com/delete-pvc
spec:
  replicas: 1

Container Logs

The Splunk Enterprise CRDs deploy Splunkd in Kubernetes pods running docker-splunk container images. Adding a couple of environment variables to the CR spec as follows produces detailed container logs:

apiVersion: enterprise.splunk.com/v4
kind: Standalone
metadata:
  name: test-only
  namespace: splunk-operator
  finalizers:
  - enterprise.splunk.com/delete-pvc
spec:
  replicas: 1
  extraEnv:
  - name: DEBUG
    value: "true"
  - name: ANSIBLE_EXTRA_FLAGS
    value: "-vvvv

From the standalone above, here is a snippet from the detailed contianer log:

TASK [splunk_common : Ensure license path] *************************************
task path: /opt/ansible/roles/splunk_common/tasks/licenses/add_license.yml:15
ok: [localhost] => {
    "changed": false,
    "invocation": {
        "module_args": {
            "checksum_algorithm": "sha1",
            "follow": false,
            "get_attributes": true,
            "get_checksum": true,
            "get_md5": false,
            "get_mime": true,
            "path": "splunk.lic"
        }
    },
    "stat": {
        "exists": false
    }
}

POD Eviction - OOM

As oppose to throttling in case of CPU cycles starvation, Kubernetes will evict a pod from the node if the pod’s memory demands exceeds the value set in the limits parameter.