Elasticsearch: Adjusting Merge Settings to Make Frequent Updates Less Painful

March 18th, 2018 Permalink

The character of an Elasticsearch cluster, and the amount of work and level of expertise needed to keep it running smoothly, is largely determined by the frequency of updates. Applying an update to a document actually takes the form of flagging the existing document as deleted and adding a new document. Every deleted document adds to the size of shards, and thus to the amount of memory and processing time required for Elasticsearch nodes to service requests. With enough deleted documents, garbage collection pauses and response time spikes become a serious issue. A fifty node Elasticsearch cluster in which the data is static is much easier to maintain than a ten node cluster in which 10% of the documents are updated every week.
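To see how far a cluster has drifted in this direction, the cat indices API reports deleted document counts per index. A quick sketch, assuming a cluster reachable at localhost:9200; the s sort parameter requires Elasticsearch 5.1 or later, so drop it on older versions:

```shell
# List indexes with live and deleted document counts and store size,
# sorted by deleted count. A high docs.deleted relative to docs.count
# means the merge policy is falling behind on reclaiming space.
curl 'localhost:9200/_cat/indices?v&h=index,docs.count,docs.deleted,store.size&s=docs.deleted:desc'
```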

Merging and Merge Policies

The only tool in Elasticsearch that can be used to deal with deleted documents is the merge. There are two options: (1) intermittent application of a force merge to the entire cluster, which walks through all shards one by one and removes the deleted documents in each; (2) ongoing merges, the operation of which is governed by various merge policy settings, set on an index by index basis. Both of these operations have a processing and memory cost, and as is usually the case for Elasticsearch, if you are in the situation in which you need this maintenance, then it is already too late to apply it while continuing to accept traffic. The cost of the fix (merging), added to the existing cost of the problem (too many deleted documents), will tip things over into outright failure. This is a common theme for Elasticsearch.

In particular, if running a cluster that would benefit from force merging every so often, it is necessary to offload traffic to a backup while the force merge runs - it will absolutely cause significant disruption to response times. This means that any adjustment to the merge policy settings that can reduce the rate at which deleted documents accumulate will help matters by postponing the need for the next force merge. This can go a long way towards reducing the total cost of ownership for a cluster. Sensible merge policy settings should be able to achieve this result without a noticeable increase in response times, but the default settings are almost never optimal for a given cluster, its data, and its traffic.
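When a maintenance window does arrive, a force merge limited to expunging deletes is usually the cheaper option. A sketch, assuming Elasticsearch 2.* or later, where the endpoint is _forcemerge (in 1.* the equivalent was _optimize); replace index-name-match-* with a suitable pattern:

```shell
# Rewrite only those segments containing a significant proportion of
# deleted documents, rather than merging everything down aggressively.
curl -XPOST 'localhost:9200/index-name-match-*/_forcemerge?only_expunge_deletes=true'

# For an index that will no longer be updated, merging down to a single
# segment per shard gives the best search performance.
curl -XPOST 'localhost:9200/index-name-match-*/_forcemerge?max_num_segments=1'
```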

Example Merge Policy Settings

What follows is an example set of optimizations, indicating the more interesting settings for a cluster experiencing a fair pace of updates. Optimal values for the specific numbers will vary depending on the processing power of the individual machines in the cluster, the nature of request traffic, and the size of the shards. In general smaller shards are better, up to a point; shard sizes of 10-20% of the RAM assigned to Elasticsearch on an individual machine are not unreasonable. Say 3G to 6G shards if working with 32G assigned to Elasticsearch on a 64G machine. Also note that merge policy settings have changed considerably between Elasticsearch 1.* and 6.* - some settings have been retired as the relevant functions became automatically managed, and the documentation has grown ever more sparse. To see the available options, examine the code rather than the documentation.
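To check current shard sizes against that guideline, the cat shards API is the simplest tool; a sketch, again assuming a cluster at localhost:9200:

```shell
# Show each shard's on-disk size alongside whether it is a primary (p)
# or replica (r). Compare store against 10-20% of the Elasticsearch heap.
curl 'localhost:9200/_cat/shards?v&h=index,shard,prirep,store'
```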

index.merge.policy.max_merged_segment

Firstly, limit the maximum size of segments produced during normal merging operations. The default is 5gb, but setting it lower results in more aggressive removal of deletions. This can be applied to a specific set of matching indexes via wildcards - replace index-name-match-* with a suitable pattern.

curl -XPUT -H 'Content-Type: application/json' 'localhost:9200/index-name-match-*/_settings' -d '{
  "index" : {
    "merge.policy.max_merged_segment": "1gb"
  }
}'

index.merge.policy.reclaim_deletes_weight

It is possible to adjust the degree to which merges are favored by how many deletions they will remove. Setting this to a higher value than the default of 2.0 is usually a good idea for those indexes that suffer the greatest rate of updates.

curl -XPUT -H 'Content-Type: application/json' 'localhost:9200/index-name-match-*/_settings' -d '{
  "index" : {
    "merge.policy.reclaim_deletes_weight": "12.0"
  }
}'

index.refresh_interval

Globally increasing the Elasticsearch refresh interval allows larger segments to be written to disk, and this is an excellent choice for any service that isn't real time. The example here results in a delay of up to three minutes before an update becomes visible to search. It helps to avoid the creation of many tiny segments, which in turn lowers the rate at which cluster performance decays due to updates.

curl -XPUT -H 'Content-Type: application/json' 'localhost:9200/_settings' -d '{
  "index" : {
    "refresh_interval": "180s"
  }
}'
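The same mechanism can be taken further during bulk loads: disable refresh entirely while indexing, then restore it afterwards. A sketch; index-name is a placeholder, and the Content-Type header is required from Elasticsearch 6.0 onward:

```shell
# Disable refresh entirely for the duration of a bulk load.
curl -XPUT -H 'Content-Type: application/json' 'localhost:9200/index-name/_settings' -d '{
  "index" : { "refresh_interval" : "-1" }
}'

# ... run the bulk load here ...

# Restore the three minute refresh interval afterwards.
curl -XPUT -H 'Content-Type: application/json' 'localhost:9200/index-name/_settings' -d '{
  "index" : { "refresh_interval" : "180s" }
}'
```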

Elasticsearch 1.* Throttle Settings

Lastly, if using Elasticsearch 1.*, in which the throttle settings below still exist, it is a good idea to set the throttle explicitly to prevent merges from consuming too much processing power and I/O. These settings were removed in later versions, where merge throttling is handled automatically.

curl -XPUT localhost:9200/_cluster/settings -d '{
  "transient" : {
    "indices.store.throttle.type" : "all",
    "indices.store.throttle.max_bytes_per_sec" : "100mb"
  }
}' 
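Whichever version is in use, merge activity and time spent throttled can be inspected via the nodes stats API; a sketch, assuming a cluster at localhost:9200:

```shell
# Report cumulative merge statistics per node: merges in progress, total
# time spent merging, and total time spent throttled.
curl 'localhost:9200/_nodes/stats/indices/merges?pretty'
```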

Experiment Rigorously, Don't Just Try at Random

The items above work well in conjunction, given sensible values. Some of the other merge policy settings may also be helpful in specific use cases - it is hard to say whether or not that is the case for any given Elasticsearch cluster without trying it. In all cases, it is a matter of finding the best trade-off between load, response time, and the decay of the cluster due to growth in shard size from deletions. Some rigorous experimentation is usually necessary.