Elasticsearch with StorageOS
Elasticsearch is a distributed, RESTful search and analytics engine, most popularly used to aggregate logs, but also to serve as a search backend to a number of different applications.
Using StorageOS persistent volumes with Elasticsearch (ES) means that if a pod fails, the cluster is only in a degraded state for as long as it takes Kubernetes to restart the pod. When the pod comes back up, the pod data is immediately available. Should Kubernetes schedule the Elasticsearch pod on a new node, StorageOS makes the data available to the pod, irrespective of whether the original StorageOS master volume is located on the same node.
Elasticsearch can replicate data itself, so consider carefully whether StorageOS or Elasticsearch should handle replication.
Before you start, ensure you have StorageOS installed and ready on a Kubernetes cluster. See our guide on how to install StorageOS on Kubernetes for more information.
Deploying Elasticsearch on Kubernetes
Prerequisites
Some OS tuning is required, which is done automatically when using our example from the use cases repository.
Elasticsearch requires vm.max_map_count to be increased to a minimum of 262144, which is a system-wide setting. One way to achieve this is to run sysctl -w vm.max_map_count=262144 and update /etc/sysctl.conf to ensure it persists over a reboot. See the Elasticsearch reference documentation for details.
Administrators should be aware that this impacts the behaviour of nodes and that there may be collisions with other application settings. Administrators are advised to centrally collate sysctl settings using the tooling of their choice.
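As a minimal sketch of the check described above, the following script reads the live value of vm.max_map_count and prints the commands to run if it is too low. It only prints and changes nothing itself. The helper names map_count_too_low and sysctl_conf_line are hypothetical, introduced here for illustration.

```shell
#!/bin/sh
# Minimum value that Elasticsearch expects for vm.max_map_count.
REQUIRED_MAP_COUNT=262144

# Succeeds (exit status 0) when the given value is below the required minimum.
map_count_too_low() {
    [ "$1" -lt "$REQUIRED_MAP_COUNT" ]
}

# Prints the line that would persist the setting in /etc/sysctl.conf.
sysctl_conf_line() {
    printf 'vm.max_map_count=%s\n' "$REQUIRED_MAP_COUNT"
}

# Read the live value; fall back to 0 if /proc is unavailable (e.g. not Linux).
current=$(cat /proc/sys/vm/max_map_count 2>/dev/null || echo 0)

if map_count_too_low "$current"; then
    echo "vm.max_map_count is $current; run as root:"
    echo "  sysctl -w vm.max_map_count=$REQUIRED_MAP_COUNT"
    echo "  echo '$(sysctl_conf_line)' >> /etc/sysctl.conf"
else
    echo "vm.max_map_count is $current; no change needed"
fi
```

Because the script is read-only, it can be run on each node before deciding how to roll the setting out through your central sysctl tooling.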
Deployment of the application
StatefulSet definition
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: esdata
[...]
    spec:
      serviceAccountName: elasticsearch
      containers:
      - name: data
        image: elasticsearch:6.7.0
        imagePullPolicy: IfNotPresent
[...]
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data/data
[...]
  volumeClaimTemplates:
  - metadata:
      name: "data"
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: "fast" # <--- default StorageOS storage class name
      resources:
        requests:
          storage: 10Gi # <--- change this to the appropriate value
This excerpt is from the StatefulSet definition (/elasticsearch/10-es-data.yaml). The file contains the PersistentVolumeClaim template that will dynamically provision the necessary storage, using the StorageOS storage class.
Dynamic provisioning occurs because a volumeMount has been declared with the same name as a volumeClaimTemplate.
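To make the naming link concrete: Kubernetes names each claim created from a volumeClaimTemplate as the template name, the StatefulSet name, and the replica ordinal, joined with hyphens. The sketch below prints the claim names to expect for this example, assuming the three esdata replicas used in this guide. The helper pvc_name is hypothetical.

```shell
#!/bin/sh
# Prints the PVC name Kubernetes derives for one StatefulSet replica:
# <volumeClaimTemplate name>-<StatefulSet name>-<ordinal>
pvc_name() {
    printf '%s-%s-%s\n' "$1" "$2" "$3"
}

# List the claims expected for the three esdata replicas in this example.
for ordinal in 0 1 2; do
    pvc_name data esdata "$ordinal"
done
```

Once the StatefulSet has been deployed, these names can be compared with the output of kubectl get pvc.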
Installation
Clone the use cases repo
You can find the latest files in the StorageOS use cases repository in /elasticsearch/
git clone https://github.com/storageos/use-cases.git storageos-usecases
cd storageos-usecases
Create the Kubernetes objects
This will install an ES cluster with 3 master, 3 data and 3 coordinator nodes. Combined, they require roughly 14 GiB of available memory in your cluster, and memory use may grow as the application is used.
kubectl apply -f ./elasticsearch/
Once completed, an internal Service object will have been created, making the cluster available at http://elasticsearch:9200/, which is the default URL that Kibana (when installed via Helm) uses.
Confirm Elasticsearch is up and running
kubectl get pods -l component=elasticsearch
NAME                                    READY   STATUS    RESTARTS   AGE
elasticsearch-exporter-d86ffd94-zw45l   1/1     Running   0          5m44s
es-coordinator-b7b984dd4-7wlz5          1/1     Running   0          5m44s
es-coordinator-b7b984dd4-89w26          1/1     Running   0          5m44s
es-coordinator-b7b984dd4-b4t6j          1/1     Running   0          5m44s
es-master-78dfd5b49f-9gf5c              1/1     Running   0          5m44s
es-master-78dfd5b49f-smsbw              1/1     Running   0          5m44s
es-master-78dfd5b49f-z4qpj              1/1     Running   0          5m44s
esdata-0                                1/1     Running   0          5m44s
esdata-1                                1/1     Running   0          4m34s
esdata-2                                1/1     Running   0          3m22s
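With this many pods, it can help to filter the listing for anything that is not yet healthy. The following is a small sketch, assuming the default kubectl get pods column layout shown above (NAME, READY, STATUS, ...). The function name not_ready_pods is hypothetical.

```shell
#!/bin/sh
# Reads `kubectl get pods` output on stdin and prints the name of every pod
# whose READY column is not n/n or whose STATUS is not "Running".
not_ready_pods() {
    awk 'NR > 1 {
        split($2, ready, "/")
        if (ready[1] != ready[2] || $3 != "Running") print $1
    }'
}

# Usage against a live cluster:
#   kubectl get pods -l component=elasticsearch | not_ready_pods
```

An empty result means every listed pod reports all of its containers as ready and is in the Running state.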
Connect to Elasticsearch
To connect to ES directly, you can use the following port-forward command
kubectl port-forward svc/elasticsearch 9200
and then access it via http://localhost:9200
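With the port-forward running, one quick check is the Elasticsearch cluster health API at /_cluster/health, which returns a JSON document containing a status field (green, yellow or red). The sketch below extracts that field with sed. The helper name health_status is hypothetical, and the JSON in the demo line is a made-up sample, not output from a real cluster.

```shell
#!/bin/sh
# Reads a cluster health JSON document on stdin and prints the value of
# its "status" field (green, yellow or red).
health_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Demo on a made-up sample document:
echo '{"cluster_name":"example","status":"yellow","number_of_nodes":9}' | health_status

# Usage against the port-forwarded cluster:
#   curl -s http://localhost:9200/_cluster/health | health_status
```

A sed expression keeps the sketch dependency-free; a JSON-aware tool would be more robust if one is available.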
Kibana (optional)
One of the most popular uses of ES is log aggregation and indexing. Kibana helps visualize the data in these indices and is easy to use when installed via its Helm chart.
Install the Helm chart.
helm install stable/kibana
Once installed, use a port-forward to Kibana instead of directly to ES
kubectl port-forward --namespace default $(kubectl get pods --namespace default -l "app=kibana" -o jsonpath="{.items[0].metadata.name}") 5601
and then access it via http://localhost:5601
Monitoring (optional)
As part of the example deployment, ES metrics are exposed and can be scraped by Prometheus on port 9108 (see 77-es-exporter.yaml). This is enabled by default and should work with the default Prometheus install via Helm. If you’re using Prometheus service monitors, you can monitor this installation by creating a monitor for the es-exporter service. For an example of how this is done to monitor StorageOS, please see prometheus-setup.
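Before wiring up Prometheus, you may want to confirm by hand that the exporter is serving metrics. The sketch below filters Prometheus text-format output down to the samples whose names start with a given prefix. It assumes the exporter serves the conventional /metrics path and that the service is named es-exporter as mentioned above. The elasticsearch_ prefix in the usage comment is also an assumption, and metrics_with_prefix is a hypothetical helper.

```shell
#!/bin/sh
# Reads Prometheus text-format metrics on stdin and prints the sample lines
# whose metric name starts with the given prefix, skipping "#" comment lines
# (HELP and TYPE metadata).
metrics_with_prefix() {
    awk -v prefix="$1" '!/^#/ && index($0, prefix) == 1'
}

# Usage (assumed service name, path and metric prefix):
#   kubectl port-forward svc/es-exporter 9108
#   curl -s http://localhost:9108/metrics | metrics_with_prefix elasticsearch_
```

If the filter prints nothing, check the service name and port in 77-es-exporter.yaml before troubleshooting the Prometheus side.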