From a design perspective, the plays within
splunk-ansible are meant to be run locally on each instance of your intended Splunk deployment. The execution flow of the provisioning process is designed to gracefully handle this instance-to-instance interoperability, while also maintaining idempotency and reliability.
Particularly when bringing up distributed Splunk topologies, one Splunk instance often needs to make a request against another Splunk instance in order to construct the cluster. These networking requests are prone to failure: because Ansible is executed asynchronously on each host, there are no guarantees that the target instance is online and ready to receive the message.
While developing new playbooks that require remote Splunk-to-Splunk connectivity, we employ retry and
delay options for tasks. For instance, in the example below, we add indexers as search peers of an individual search head. To overcome error-prone networking, the task embeds retry counts with delays. There are also break-early conditions that maintain idempotency, so we can progress once the operation succeeds:
```yaml
- name: Set all indexers as search peers
  command: " add search-server ://: -auth : -remoteUsername  -remotePassword "
  become: yes
  become_user: ""
  with_items: ""
  register: set_indexer_as_peer
  until: set_indexer_as_peer.rc == 0 or set_indexer_as_peer.rc == 24
  retries: ""
  delay: ""
  changed_when: set_indexer_as_peer.rc == 0
  failed_when: set_indexer_as_peer.rc != 0 and 'already exists' not in set_indexer_as_peer.stderr
  notify:
    - Restart the splunkd service
  no_log: ""
  when: "'splunk_indexer' in groups"
```
Another utility you can add when creating new plays is an implicit wait. For more information on this, see the
roles/splunk_common/tasks/wait_for_splunk_instance.yml play, which waits for another Splunk instance to be online before making any connections against it.
```yaml
- name: Check Splunk instance is running
  uri:
    url: https://:/services/server/info?output_mode=json
    method: GET
    user: ""
    password: ""
    validate_certs: false
  register: task_response
  until:
    - task_response.status == 200
    - lookup('pipe', 'date +"%s"')|int - task_response.json.entry.content.startup_time > 10
  retries: ""
  delay: ""
  ignore_errors: true
  no_log: ""
```
This Ansible repository also contains a custom dynamic inventory script, located at
inventory/environ.py. Using a dynamic inventory when everything runs within the context of a local connection may seem counterintuitive, but the purpose of this script is to build out the Splunk topology from information provided through environment variables.
This is best demonstrated by the approach used in the Docker image. If the Docker image is run with certain environment variables (e.g.
SPLUNK_INDEXER_URLS=idx1), then each instance you're provisioning to assume a specific Splunk role will know how to add itself or other instances to build out the cluster.
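To illustrate the idea, here is a minimal sketch of how a dynamic inventory script can map role-specific environment variables to Ansible inventory groups and print the JSON that ansible-playbook expects. This is not the actual environ.py implementation; the variable names and group names below are illustrative assumptions.

```python
# Hypothetical sketch of a dynamic inventory driven by SPLUNK_-prefixed
# environment variables. The mapping of variable names to group names
# is an assumption for illustration, not the real environ.py logic.
import json
import os


def build_inventory(environ):
    """Derive Ansible inventory groups from SPLUNK_* environment variables."""
    # Each variable holds a comma-separated list of hostnames for one role.
    role_vars = {
        "SPLUNK_INDEXER_URLS": "splunk_indexer",
        "SPLUNK_SEARCH_HEAD_URL": "splunk_search_head",
        "SPLUNK_STANDALONE_URL": "splunk_standalone",
    }
    inventory = {"_meta": {"hostvars": {}}}
    for var, group in role_vars.items():
        hosts = [h.strip() for h in environ.get(var, "").split(",") if h.strip()]
        if hosts:
            inventory[group] = {"hosts": hosts}
    return inventory


if __name__ == "__main__":
    # Dynamic inventory scripts emit the full inventory as JSON on --list.
    print(json.dumps(build_inventory(os.environ)))
```

Because every host runs the same script against the same environment, each instance derives an identical view of the topology, which is what lets it add itself or its peers to the cluster.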
environ.py is responsible for choreographing everything together and giving a shared context to each separate, distinct host. It does all this by reading special
SPLUNK_-prefixed variables and passing them into the
ansible-playbook command. To use this feature, run your command as follows:
```
$ ansible-playbook --inventory inventory/environ.py ...
$ ansible-playbook -i inventory/environ.py ...
```
At the current time, this only supports Linux-based platforms (Debian, CentOS, etc.). We do have plans to incorporate Windows in the future.
See how the
splunk-ansible project is being used in the wild! You can use these projects as references for how to most effectively bring these playbooks into your cloud provider or technology stack.
The playbooks in this repository are already being used in the context of containers! For more information on how this works, please see the docker-splunk project and learn how
splunk-ansible is incorporated.