Efficient Management of Time-Based Indices in Elasticsearch

Introduction

Managing time-based indices in Elasticsearch can be complex, especially when dealing with large volumes of data. This article introduces the rollover pattern and the rollover and shrink APIs that support it, which together make time-based indices easier to manage efficiently.

Rollover Pattern

The rollover pattern works as follows:

  • One alias is used for writing and points to the active index.
  • Another alias is used for reading (searching) and points to both the active index and the inactive indices.
  • The active index is spread across multiple nodes, with each node holding one shard of the index. This makes writing efficient and reduces the load on individual nodes.
  • When the active index becomes too full or too old, it is rolled over: a new index is created and the write alias is automatically switched to the new index.

Getting Started

Suppose we have a cluster with 10 hot nodes and a pool of cold nodes. Ideally, the active index should be spread evenly across the hot nodes, and each primary shard should have one replica so that a node can fail without data loss. This means the active index should have five primary shards, for a total of 10 shards including replicas (one for each hot node).
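The arithmetic behind that layout can be checked with a few lines of Python. This is only an illustrative sketch; the function name shard_layout is hypothetical and not part of any Elasticsearch client.

```python
def shard_layout(primaries: int, replicas: int, hot_nodes: int) -> dict:
    """Return the total number of shard copies and the average per hot node."""
    # Each primary shard has `replicas` additional copies.
    total = primaries * (1 + replicas)
    return {"total_shards": total, "shards_per_node": total / hot_nodes}


# Values from the example cluster: 5 primaries, 1 replica each, 10 hot nodes.
layout = shard_layout(primaries=5, replicas=1, hot_nodes=10)
print(layout)
```

With five primaries and one replica each, this gives 10 shard copies, or one per hot node on average.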

We can create a template for the active index using the following API call:

PUT _template/active-logs
{
  "template": "active-logs-*",
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 1,
    "routing.allocation.include.box_type": "hot",
    "routing.allocation.total_shards_per_node": 2
  },
  "aliases": {
    "active-logs": {},
    "search-logs": {}
  }
}

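Indexing Log Events

When the active index is created, the template also gives it the active-logs alias for writing and the search-logs alias for searching. From then on, we should write only through the active-logs alias, so that documents always go to the current active index.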

POST active-logs/log/_bulk
{ "create": {} }
{ "text": "Some log message", "@timestamp": "2016-07-01T01:00:00Z" }
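The bulk API expects newline-delimited JSON: one action line followed by one source line per document, with a trailing newline at the end of the body. A minimal Python sketch of building such a body with the standard library follows; the helper name bulk_create_body is an assumption made for illustration, and sending the body to the cluster is left out.

```python
import json


def bulk_create_body(docs):
    """Build a newline-delimited JSON body of 'create' actions for _bulk."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"create": {}}))  # action line
        lines.append(json.dumps(doc))             # source line
    # The bulk body must be terminated by a newline.
    return "\n".join(lines) + "\n"


body = bulk_create_body([
    {"text": "Some log message", "@timestamp": "2016-07-01T01:00:00Z"},
])
print(body, end="")
```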

Rolling Over the Index

At some point the active index becomes too large or too old, and we want to replace it with a new, empty index. The rollover API lets us specify the maximum age or document count the index may reach before it is rolled over.

POST active-logs/_rollover
{
  "conditions": {
    "max_age": "7d",
    "max_docs": 5
  }
}
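The request rolls the index over as soon as any one of the listed conditions is met. The following Python sketch is a local model of that decision and of how the next index name is derived, not the server's implementation. The function names are hypothetical, and the six-digit zero padding reflects the API's naming behavior as I understand it.

```python
import re


def should_roll_over(age_days, doc_count, max_age_days=7, max_docs=5):
    """Return True when ANY condition is met (defaults match the request above)."""
    return age_days >= max_age_days or doc_count >= max_docs


def next_index_name(current):
    """Increment the trailing number of an index name, zero-padded to 6 digits."""
    match = re.fullmatch(r"(.*-)(\d+)", current)
    if match is None:
        raise ValueError("index name must end in '-<number>'")
    prefix, number = match.groups()
    return f"{prefix}{int(number) + 1:06d}"


if should_roll_over(age_days=1, doc_count=5):
    print(next_index_name("active-logs-1"))
```

In this model, an index that is one day old but already holds five documents is rolled over, because the max_docs condition alone is enough.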

Why Is There No max_size Condition?

Since the goal is to produce shards of similar size, you might wonder why a max_size condition is not supported. The answer is that shard size is not a reliable measure: ongoing merges cause temporary growth in shard size, and that growth disappears once the merges complete.

Shrinking the Index

When the active index is no longer used for writing, we can move it to the cold nodes and shrink it to a single shard. We can use the following API call to shrink the index:

POST active-logs-1/_shrink/inactive-logs-1
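Before a shrink request can succeed, the source index must be blocked for writes and a copy of every shard must sit on a single node. The Python sketch below only assembles the two requests involved, using settings from the shrink API documentation. The helper name shrink_requests and the node name some_node_name are assumptions. A real workflow would also wait for shard relocation to finish before sending the shrink request.

```python
import json


def shrink_requests(source, target, node_name):
    """Return (method, path, body) tuples for preparing and shrinking an index."""
    prepare = {
        # Block writes so the source index is read-only.
        "index.blocks.write": True,
        # Require a copy of every shard on one (hypothetical) node.
        "index.routing.allocation.require._name": node_name,
    }
    # Ask for a single primary shard in the target index.
    shrink = {"settings": {"index.number_of_shards": 1}}
    return [
        ("PUT", f"{source}/_settings", prepare),
        ("POST", f"{source}/_shrink/{target}", shrink),
    ]


requests_to_send = shrink_requests("active-logs-1", "inactive-logs-1", "some_node_name")
for method, path, body in requests_to_send:
    print(method, path)
    print(json.dumps(body, indent=2))
```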

Save Space

After shrinking the index, we can remove the search alias from the old index and add it to the new, shrunken index in a single request:

POST _aliases
{
  "actions": [
    {
      "remove": {
        "index": "active-logs-1",
        "alias": "search-logs"
      }
    },
    {
      "add": {
        "index": "inactive-logs-1",
        "alias": "search-logs"
      }
    }
  ]
}

Future Improvements

With the rollover, shrink, and field stats APIs, we provide the building blocks for managing time-based indices efficiently. Of course, many of these steps could be automated to make life easier. Building that automation into Elasticsearch itself is not easy, because someone needs to be notified when something unexpected occurs. That responsibility belongs to tools or applications built on top of Elasticsearch. We can expect to see a corresponding workflow and UI in the Curator index management tool and in X-Pack.