Contextual Code specializes in enterprise-level projects for state government agencies. We routinely tackle difficult web content management implementations, migrations, integrations, customizations, and operations. We know what it takes to get a project off the ground and onto the web.
We use Platform.sh as our primary hosting platform because it’s incredibly flexible and it provides a vast list of services that can be set up very easily.
There are a few scenarios, such as creating additional backups or syncing data to local development environments, in which you might need to extract data from those services. In many cases, extracting this data is simple on Platform.sh. For example, you can get a MariaDB/MySQL dump with a single Platform.sh CLI command (platform db:dump).
But in some cases, it’s not so simple to extract the data you need; more advanced tools are necessary. We covered one such case in our Backup Solr on Platform.sh blog post. Today we’ll cover another: how to use an AWS S3 snapshot repository for Elasticsearch on Platform.sh.
Getting started
First, let's make sure we have an Elasticsearch service in .platform/services.yaml:
elasticsearch:
    type: elasticsearch:7.2
    disk: 256
Then let’s inject the service into the application via the elasticsearch relationship in .platform.app.yaml:
relationships:
    elasticsearch: elasticsearch:elasticsearch
Also, in the AWS Management Console, we need to:
- Create a new AWS S3 bucket
- Use AWS IAM to create a new user with read and write permissions for the newly created bucket
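The read and write permissions for the bucket can be expressed as an IAM policy. The sketch below prints a policy document modeled on the permissions that the Elasticsearch repository-s3 plugin documentation recommends. The function name s3_snapshot_policy and the bucket name in the usage comment are illustrative assumptions, so review the action list against your own security requirements before attaching it.

```shell
# Print an IAM policy granting read/write access to a single S3 bucket.
# s3_snapshot_policy is a hypothetical helper name, not part of any CLI.
# Argument: bucket name.
s3_snapshot_policy() {
    cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation",
        "s3:ListBucketMultipartUploads",
        "s3:ListBucketVersions"
      ],
      "Resource": ["arn:aws:s3:::$1"]
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
      ],
      "Resource": ["arn:aws:s3:::$1/*"]
    }
  ]
}
EOF
}

# Example usage (the bucket name is a placeholder):
#   s3_snapshot_policy my-es-snapshots > policy.json
```

The resulting file can then be attached to the new user in the IAM console.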
Registering the Elasticsearch S3 snapshot repository
The Elasticsearch S3 plugin is extremely easy to enable on Platform.sh. We just need to add repository-s3 in configuration.plugins for the elasticsearch service in .platform/services.yaml:
elasticsearch:
    type: elasticsearch:7.2
    disk: 256
    configuration:
        plugins:
            - repository-s3
After we deploy this change, we need to SSH to the application container and register a new snapshot repository by running the following command:
# SSH to the Platform.sh app container
platform ssh
# Replace the value for these variables
AWS_BUCKET_NAME="<YOUR_AWS_BUCKET_NAME>"
AWS_ACCESS_KEY_ID="<YOUR_AWS_ACCESS_KEY_ID>"
AWS_SECRET_ACCESS_KEY="<YOUR_AWS_SECRET_ACCESS_KEY>"
# Extract Elasticsearch host and port from relationships
ES_HOST=$(echo "$PLATFORM_RELATIONSHIPS" | base64 --decode | jq -r '.elasticsearch[0].host')
ES_PORT=$(echo "$PLATFORM_RELATIONSHIPS" | base64 --decode | jq -r '.elasticsearch[0].port')
# Register the snapshot repository
curl -X PUT "http://${ES_HOST}:${ES_PORT}/_snapshot/aws-s3?pretty" -H 'Content-Type: application/json' -d'{
    "type": "s3",
    "settings": {
        "bucket": "'"${AWS_BUCKET_NAME}"'",
        "client": "default",
        "access_key": "'"${AWS_ACCESS_KEY_ID}"'",
        "secret_key": "'"${AWS_SECRET_ACCESS_KEY}"'"
    }
}'
Once that is done, all new Elasticsearch snapshots will be stored on the AWS S3 bucket.
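Before relying on the repository, it can be worth confirming that Elasticsearch can actually reach the bucket. A minimal sketch using the snapshot repository verification endpoint (POST _snapshot/<repository>/_verify) follows. The helper names repo_verify_url and verify_repository are illustrative, and ES_HOST and ES_PORT are assumed to be set as in the commands above.

```shell
# Build the URL of the repository verification endpoint.
# Arguments: host, port, repository name (the helper name is hypothetical).
repo_verify_url() {
    printf 'http://%s:%s/_snapshot/%s/_verify' "$1" "$2" "$3"
}

# Ask Elasticsearch to verify the repository; defaults to "aws-s3".
# Assumes ES_HOST and ES_PORT were extracted from PLATFORM_RELATIONSHIPS.
verify_repository() {
    curl -sS -X POST "$(repo_verify_url "$ES_HOST" "$ES_PORT" "${1:-aws-s3}")?pretty"
}

# Example usage inside the app container:
#   verify_repository aws-s3
```

A successful response lists the nodes that verified the repository; an error usually points to a wrong bucket name or insufficient IAM permissions.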
Creating the new Elasticsearch snapshots
We’ll use a simple bash script, make-elasticsearch-snapshot.sh, that will be executed in the app container. Create it in the root of your project:
# Extract snapshot parameters
SNAPSHOT_ID=$(date +"%Y%m%d-%H%M%S")
SNAPSHOT_NAME="${PLATFORM_PROJECT}-${PLATFORM_BRANCH}-${SNAPSHOT_ID}"
SNAPSHOT_DATE=$(date +"%Y-%m-%d %H:%M:%S")
# Extract Elasticsearch host and port from relationships
ES_HOST=$(echo "$PLATFORM_RELATIONSHIPS" | base64 --decode | jq -r '.elasticsearch[0].host')
ES_PORT=$(echo "$PLATFORM_RELATIONSHIPS" | base64 --decode | jq -r '.elasticsearch[0].port')
# Create a new snapshot and wait until it completes
curl -X PUT "http://${ES_HOST}:${ES_PORT}/_snapshot/aws-s3/${SNAPSHOT_NAME}?wait_for_completion=true&pretty" -H 'Content-Type: application/json' -d'{
    "ignore_unavailable": true,
    "include_global_state": false,
    "metadata": {
        "taken_by": "Platform.sh cron",
        "taken_on": "'"${SNAPSHOT_DATE}"'",
        "taken_because": "Daily backup"
    }
}'
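Elasticsearch snapshot names must be lowercase and cannot contain characters such as /, so a branch named feature/New-UI would make the snapshot request in this script fail. One possible way to guard against that, assuming it is acceptable to replace unsupported characters with hyphens, is a small helper like the following (snapshot_name is a hypothetical function, not part of the original script):

```shell
# Build a snapshot name from project, branch, and timestamp.
# Lowercases the result and replaces anything outside a-z, 0-9, '.', '_'
# and '-' with a hyphen. snapshot_name is a hypothetical helper.
snapshot_name() {
    printf '%s-%s-%s' "$1" "$2" "$3" \
        | tr '[:upper:]' '[:lower:]' \
        | sed 's/[^a-z0-9._-]/-/g'
}

# Example usage in make-elasticsearch-snapshot.sh:
#   SNAPSHOT_NAME=$(snapshot_name "$PLATFORM_PROJECT" "$PLATFORM_BRANCH" "$SNAPSHOT_ID")
```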
Add this script as elasticsearch_snapshot to the cron jobs in .platform.app.yaml:
crons:
    ....
    elasticsearch_snapshot:
        spec: '15 23 * * *'
        cmd: bash make-elasticsearch-snapshot.sh
And deploy it:
git add .platform.app.yaml make-elasticsearch-snapshot.sh
git commit -m "Added Elasticsearch snapshot cron job"
git push
After this is deployed, we can run the script in the app container:
# SSH to the Platform.sh app container
platform ssh
# Run the newly deployed script
bash make-elasticsearch-snapshot.sh
The new snapshot will be created and stored in our AWS S3 bucket.
Using Elasticsearch snapshots
We can get a list of available snapshots by running the following commands:
# SSH to the Platform.sh app container
platform ssh
# Extract Elasticsearch host and port from relationships
ES_HOST=$(echo "$PLATFORM_RELATIONSHIPS" | base64 --decode | jq -r '.elasticsearch[0].host')
ES_PORT=$(echo "$PLATFORM_RELATIONSHIPS" | base64 --decode | jq -r '.elasticsearch[0].port')
# Get the list of available snapshots
curl -X GET "http://${ES_HOST}:${ES_PORT}/_cat/snapshots/aws-s3?v"
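When only the most recent usable snapshot matters, the listing can be narrowed with the _cat API's h (columns) and s (sort) parameters. A rough sketch follows, assuming the id and status columns and sorting by end_epoch behave as in the Elasticsearch 7.x cat snapshots API. latest_successful_snapshot is an illustrative helper name.

```shell
# Read "id status" lines on stdin (oldest first) and print the id of the
# last snapshot whose status is SUCCESS. Prints nothing if none succeeded.
latest_successful_snapshot() {
    awk '$2 == "SUCCESS" { id = $1 } END { if (id != "") print id }'
}

# Example usage inside the app container (ES_HOST/ES_PORT set as above):
#   curl -sS "http://${ES_HOST}:${ES_PORT}/_cat/snapshots/aws-s3?h=id,status&s=end_epoch" \
#       | latest_successful_snapshot
```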
Our next steps would be:
- Register the same S3 snapshot repository for our local Elasticsearch
- Choose the snapshot we want to restore on our local installation
- Restore the snapshot on our local Elasticsearch:
curl -X POST "http://%LOCAL_ELASTICSEARCH%/_snapshot/aws-s3/%SNAPSHOT_NAME%/_restore"
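The first and third steps can be sketched as follows. This assumes the local Elasticsearch has the repository-s3 plugin installed and the AWS credentials stored in its keystore (s3.client.default.access_key and s3.client.default.secret_key). Registering the repository as read-only locally is a precaution so the local cluster cannot modify the snapshots; it is not something the steps above require. The helper names are hypothetical.

```shell
# Print a repository definition that registers the bucket as read-only.
# Argument: bucket name. readonly_repo_body is a hypothetical helper.
readonly_repo_body() {
    cat <<EOF
{
  "type": "s3",
  "settings": {
    "bucket": "$1",
    "client": "default",
    "readonly": true
  }
}
EOF
}

# Register the repository on a local Elasticsearch and restore one snapshot.
# Arguments: local host:port, bucket name, snapshot name.
restore_locally() {
    readonly_repo_body "$2" | curl -sS -X PUT "http://$1/_snapshot/aws-s3?pretty" \
        -H 'Content-Type: application/json' -d @-
    curl -sS -X POST "http://$1/_snapshot/aws-s3/$3/_restore?wait_for_completion=true&pretty"
}

# Example usage (all values are placeholders):
#   restore_locally localhost:9200 my-es-snapshots myproject-master-20200101-233000
```

Note that a restore fails if an open index with the same name already exists locally, so such indices need to be closed or deleted first.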
Once these steps are done, we have exported the data from Platform.sh Elasticsearch to our local installation. We can repeat these steps whenever we need to.
Now it’s your turn
I hope you found this post interesting and useful. Hopefully it illustrates how flexible and extensible the Platform.sh framework is. Feedback and comments are appreciated. Happy snapshotting!
(Reprinted with permission.)