Indexing JIRA


Posted by
Frédéric Cilia

April 24, 2015

On a busy day, JIRA can place a heavy load on its database. To lighten that load and give users the performance and responsiveness they need, the application uses Lucene, an open source library which retrieves the relevant content from the database and caches it as files on the application server. This solution has the advantage of not depending on the network or the database server. However, it does require sufficient disk space, and the indexes must be updated every time an issue changes.

Let’s take a closer look at the principles of re-indexing and the different options available when you re-index the application.

The principles

In JIRA, indexes are mainly used for searching, which significantly improves performance compared to repeatedly querying the database itself. JIRA prompts us to re-index whenever we change the configuration of a field or install an add-on (which might come with a new field type), since either change could affect search results. Across the whole instance, the best performance comes from keeping the indexes in good shape.

What is critical, though, is routinely re-indexing the entire application. The result is far more efficient than indexes built on the fly. The nuclear option, a total rebuild of the indexes, allows greater consolidation and improved data compression.

There are several ways to re-index, each with its pros and cons…

Re-indexing in background

This type of re-indexing is available from JIRA version 5.2 and can be used in most cases. It is worth noting that it does not stop users from working while a configuration change is applied. However, this type of re-indexing does not completely rebuild the indexes; it modifies the existing ones. The indexes of issue history and comments are not rebuilt. This re-indexing can be cancelled at any time. It is not recommended for high-volume instances because it can take several hours, during which the application is less efficient.

Complete re-indexing

If the indexes become corrupted, a complete rebuild is mandatory. Index corruption can be caused by an unexpected shutdown of the application, a disk problem, or an add-on with a design flaw. When search results differ from the real state of the application, the index is corrupted. This type of re-indexing removes the existing indexes and completely rebuilds them, so the result is better than that obtained with a background re-index. Building the indexes also takes less time than a background re-index, but the application is locked during the process and users cannot use it.

Re-indexing a project

A single project can be re-indexed in the background without interruption to users. This option should be used in the same context as a background re-index, but when the configuration change impacts one project only.

Re-indexing for Optimization

While indexes are vital for high-volume instances, the re-indexing time may become a show-stopper. The time required depends mainly on the number of issues and the number of custom fields. For example, it is common for a re-index to take more than an hour on instances with over 500,000 issues. For these instances, re-indexing in the background is often pointless because it can take several hours and, in some cases, make the application unusable. But these instances are often critical, and when they provide a service internationally, it can be difficult to plan a one-hour outage for each configuration change.

It is therefore essential to optimize the re-indexing process to reduce the downtime. The default configuration of JIRA is not sized for servers hosting that many issues, so these servers are often underused during re-indexing. We will therefore change these settings to make full use of the capabilities of our server and reduce indexing time.

The parameters on which we can act are:

| Parameter | Default Value | Description | Comment |
|---|---|---|---|
| jira.index.issue.maxqueuesize | 1000 | The maximum number of issues held in the re-indexing queue | This value should be proportional to the number of threads used for re-indexing. By default, each thread is assigned a maximum of 100 issues (1,000 issues for 10 threads). |
| jira.index.issue.minbatchsize | 50 | The minimum number of issues that must be in the queue before multi-threaded processing is enabled | This value sets the threshold for switching to multi-threaded processing. In our case it matters little. However, it must not be greater than the size of the queue. |
| jira.index.issue.threads | 10 | The number of threads used for re-indexing issues | This value has the greatest impact on re-indexing time. It must be modified carefully so as not to overload the server, and adapted to the number of processors and cores available on the server. |
| jira.index.sharedentity.maxqueuesize | 1000 | The maximum number of shared objects (filters, dashboards, …) held in the re-indexing queue | This value should be proportional to the number of threads used for re-indexing. By default, each thread is assigned a maximum of 100 shared objects (1,000 shared objects for 10 threads). |
| jira.index.sharedentity.minbatchsize | 50 | The minimum number of shared objects that must be in the queue before multi-threaded processing is enabled | This value sets the threshold for switching to multi-threaded processing. In our case it matters little. However, it must not be greater than the size of the queue. |
| jira.index.sharedentity.threads | 10 | The number of threads used for re-indexing shared objects | This value has the greatest impact on re-indexing time. It must be modified carefully so as not to overload the server, and adapted to the number of processors and cores available on the server. |

These parameters can be added directly to the “jira-config.properties” file. They are available from the 5.2 version of JIRA.
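As an illustration, here is a minimal sketch that prints the lines to add to jira-config.properties for a chosen number of threads. It assumes you want to keep the default ratio described above (a queue of 100 entries per thread) and the default minimum batch size of 50. The function name index_settings and the value 20 are only examples, not recommendations; the right thread count depends on your server.

```shell
#!/bin/sh
# Print jira-config.properties lines for a given number of indexing threads.
# Assumption: the queue size keeps the default ratio of 100 entries per thread
# (1,000 for 10 threads) and the minimum batch size stays at its default of 50.
index_settings() {
    threads=$1
    queue=$((threads * 100))
    for scope in issue sharedentity; do
        printf 'jira.index.%s.threads=%s\n' "$scope" "$threads"
        printf 'jira.index.%s.maxqueuesize=%s\n' "$scope" "$queue"
        printf 'jira.index.%s.minbatchsize=50\n' "$scope"
    done
}

# Example: settings for 20 threads (an arbitrary value chosen for illustration)
index_settings 20
```

The output can be reviewed and then appended to the file by hand. Test any change on a non-production instance first, and restart JIRA so that the new values are read.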

In general:

  • Increasing the queue size reduces the load on the database server but increases the CPU workload, and vice versa.
  • Increasing the number of threads increases the load on both the database server and the application server.

These settings need to be adapted to each environment to find the optimum values. Other limiting factors can be the access times of the hard drives and the speed of the network between the application server and the database server.

Scheduling re-indexing

To ensure continuity of service during peak hours, it makes sense to schedule re-indexing outside of them. Since version 6.1 of JIRA, there is a REST API that makes it possible to start a re-index with a simple HTTP call. We’ll use this API to create a Unix script that launches a re-index.

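#!/bin/sh
# Parameters (replace with the credentials of a JIRA administrator)
username=admin
password=admin
baseURL=http://localhost:8080
reindexAPI="/rest/api/2/reindex"

# Generate the Basic authentication credentials (user:password in Base64)
credentials=$(printf '%s:%s' "$username" "$password" | base64)

# Call the URL to start a foreground re-index, discarding the response
curl -s -X POST -H "Authorization: Basic $credentials" -H "Cache-Control: no-cache" -H "X-Atlassian-Token: no-check" "$baseURL$reindexAPI?type=FOREGROUND" > /dev/null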
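To actually schedule the call, the script can be run from cron. The sketch below is one possible variant, not a definitive implementation: it reports a failed HTTP call through its exit status so that the scheduler can notice it. The names BASE_URL, JIRA_USER, JIRA_PASS, reindex_url and trigger_reindex, as well as the script path in the crontab comment, are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical wrapper around the re-index call, meant to be run by cron.
# BASE_URL, JIRA_USER and JIRA_PASS are assumed environment variables,
# not JIRA settings.

# Build the re-index URL: $1 = base URL, $2 = type (e.g. FOREGROUND)
reindex_url() {
    printf '%s/rest/api/2/reindex?type=%s' "$1" "$2"
}

# Start a re-index of the given type.
# curl -f turns HTTP errors (4xx/5xx) into a non-zero exit status.
trigger_reindex() {
    curl -sf -X POST -u "$JIRA_USER:$JIRA_PASS" \
        -H "X-Atlassian-Token: no-check" \
        "$(reindex_url "$BASE_URL" "$1")" > /dev/null
}

# Example crontab entry (every Sunday at 02:00), assuming this file is saved
# as /opt/scripts/jira-reindex.sh and ends with: trigger_reindex FOREGROUND
# 0 2 * * 0 /opt/scripts/jira-reindex.sh || logger "JIRA re-index call failed"
```

Pick a time slot when a locked application is acceptable, since a foreground re-index prevents users from working until it completes.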

If you have any questions or suggestions regarding this topic, please let us know by commenting!

  • Here is a PowerShell script for Windows users!

    =======================================

    $webclient = New-Object System.Net.WebClient;

    $cred = [System.Text.Encoding]::UTF8.GetBytes("user:pass");
    $credentials = [System.Convert]::ToBase64String($cred);

    $webclient.Headers.Add("Authorization", "Basic " + $credentials)
    $webclient.Headers.Add("Cache-Control", "no-cache")

    $webclient.UploadString('http://jira/rest/api/2/reindex', "");

    =======================================