RabbitMQ Process Analysis and Performance Tuning

RabbitMQ is a popular open-source message queuing system that implements the Advanced Message Queuing Protocol (AMQP) standard. It is written in Erlang, which gives it high performance, robustness, and scalability. It is widely used, for example in OpenStack, Spring, and Logstash.

AMQP Model

AMQP is an application-layer protocol specification for asynchronous messaging. AMQP clients can send and receive messages freely, regardless of where a message comes from. The Broker provides message routing and queuing. It consists mainly of the Exchange and the Queue: the Exchange receives messages and forwards them to the queues bound to it, while the Queue stores messages and provides persistence and other functions.
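
As a rough illustration of the Exchange and Queue roles described above, the following Python sketch models a broker that routes by exact routing-key match. The names ToyBroker, declare_queue, bind, publish, and consume are invented for this example and do not belong to any AMQP client library; real exchanges also support other routing rules.

```python
from collections import defaultdict, deque


class ToyBroker:
    """Toy model: an exchange forwards messages to the queues bound to it."""

    def __init__(self):
        self.queues = {}                    # queue name -> stored messages
        self.bindings = defaultdict(list)   # routing key -> bound queue names

    def declare_queue(self, name):
        self.queues.setdefault(name, deque())

    def bind(self, queue_name, routing_key):
        self.declare_queue(queue_name)
        self.bindings[routing_key].append(queue_name)

    def publish(self, routing_key, body):
        """Exchange step: copy the message into every matching queue."""
        targets = self.bindings.get(routing_key, [])
        for name in targets:
            self.queues[name].append(body)
        return len(targets)

    def consume(self, queue_name):
        """Queue step: hand out the oldest stored message, or None if empty."""
        stored = self.queues[queue_name]
        return stored.popleft() if stored else None


broker = ToyBroker()
broker.bind("orders", routing_key="order.created")
broker.bind("audit", routing_key="order.created")
delivered = broker.publish("order.created", "order #1")   # reaches both queues
unrouted = broker.publish("order.deleted", "order #2")    # no binding matches
```

Keeping the bindings separate from the queues mirrors the split in the text: the publisher only names a routing key, and the exchange step decides which queues receive a copy.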

AMQP Channel

Clients communicate with the Broker over Channels. A Channel is an independent, bidirectional data stream inside a multiplexed connection.

RabbitMQ Process Model

RabbitMQ Server implements the Broker part of the AMQP model. Channels and Queues are designed as Erlang processes, and the Exchange function is implemented as computation inside the Channel process. The main processes are:

  • tcp_acceptor: This process handles incoming client connections, creating rabbit_reader, rabbit_writer, and rabbit_channel processes.
  • rabbit_reader: This process receives data from the client connection and parses AMQP frames.
  • rabbit_writer: This process returns data to the client.
  • rabbit_channel: This process parses AMQP methods, routes messages, and sends them to the appropriate queue process.
  • rabbit_amqqueue_process: This is the queue process. It is started when RabbitMQ starts (to restore durable queues) or when a queue is created.
  • rabbit_msg_store: This process is responsible for message persistence.
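
The division of labor among these processes can be sketched in Python as follows. This is only a sequential toy model: the real processes are concurrent Erlang processes that exchange messages, and the "queue|body" frame format, the class names, and the accept function are assumptions made up for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class QueueProcess:
    """Stand-in for rabbit_amqqueue_process: one per queue."""
    name: str
    messages: list = field(default_factory=list)


@dataclass
class Connection:
    """Stand-in for the reader, channel, and writer created per connection."""
    queues: dict
    outbox: list = field(default_factory=list)

    def reader(self, raw_frame):
        """rabbit_reader role: parse the toy 'queue|body' frame."""
        queue_name, body = raw_frame.split("|", 1)
        return queue_name, body

    def channel(self, queue_name, body):
        """rabbit_channel role: route to the matching queue process."""
        target = self.queues.get(queue_name)
        if target is None:
            self.writer("no such queue: " + queue_name)
            return
        target.messages.append(body)
        self.writer("ok")

    def writer(self, reply):
        """rabbit_writer role: collect data going back to the client."""
        self.outbox.append(reply)

    def handle(self, raw_frame):
        self.channel(*self.reader(raw_frame))


def accept(queues):
    """tcp_acceptor role: set up the per-connection processes."""
    return Connection(queues=queues)


queues = {"orders": QueueProcess("orders")}
conn = accept(queues)
conn.handle("orders|order #1")
conn.handle("missing|order #2")
```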

RabbitMQ Flow Control

RabbitMQ lets you set thresholds for memory and disk usage. When a threshold is reached, producers are blocked until the corresponding resource returns to normal. In addition to these two thresholds, RabbitMQ uses a flow control mechanism to keep itself stable.

Credit-Based Flow Control

RabbitMQ uses a credit-based flow control mechanism. Each message-handling process has a credit pair {InitialCredit, MoreCreditAfter}, with a default value of {200, 50}. When a sender process A sends messages to a receiver process B, A's credit decreases by 1 for every message sent; when it reaches 0, A is blocked. On the receiving side, each time B has received MoreCreditAfter messages, it sends A a message that grants A MoreCreditAfter credits. Once A's credit is greater than 0 again, A can continue sending messages to B.
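
The credit exchange between A and B can be simulated with a short Python sketch. The Sender and Receiver classes are hypothetical stand-ins for Erlang processes, and the credit grant is modeled as a direct attribute update rather than a message; only the {200, 50} defaults come from the text.

```python
INITIAL_CREDIT = 200      # default InitialCredit from the text
MORE_CREDIT_AFTER = 50    # default MoreCreditAfter from the text


class Sender:
    """Process A: spends one credit per message and blocks at zero."""

    def __init__(self):
        self.credit = INITIAL_CREDIT
        self.pending = []          # messages sent but not yet handled by B

    @property
    def blocked(self):
        return self.credit == 0

    def send(self, msg):
        if self.blocked:
            return False
        self.credit -= 1
        self.pending.append(msg)
        return True


class Receiver:
    """Process B: grants credit back after every MORE_CREDIT_AFTER messages."""

    def __init__(self):
        self.handled_since_grant = 0

    def handle(self, sender):
        sender.pending.pop(0)
        self.handled_since_grant += 1
        if self.handled_since_grant == MORE_CREDIT_AFTER:
            self.handled_since_grant = 0
            sender.credit += MORE_CREDIT_AFTER   # stands in for the grant message


a, b = Sender(), Receiver()
accepted = sum(a.send(i) for i in range(250))   # B is not consuming yet
blocked_before = a.blocked
for _ in range(MORE_CREDIT_AFTER):              # B catches up on 50 messages
    b.handle(a)
blocked_after = a.blocked
```

In this run A is cut off after 200 messages and resumes only once B has worked through a batch of 50, which is how a slow receiver throttles its sender.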

AMQQueue Process and Paging

The amqqueue process is designed to handle messages efficiently and avoid unnecessary disk IO. To do this, a message can be in one of four states, and the process keeps messages in five internal queues. The four states are:

  • Alpha: Message content and index are both in memory.
  • Beta: Message content is stored on disk, and the index is in memory.
  • Gamma: Message content is stored on disk, and the index is both in memory and on disk.
  • Delta: Message content and index are stored on disk.

The five internal queues are:

  • q1: Message status is alpha.
  • q2: Message status is beta and gamma.
  • Delta: Message status is delta.
  • q3: Message status is beta and gamma.
  • q4: Message status is alpha.

Paging

Paging is triggered when memory is tight: messages are converted from the alpha state to the beta and gamma states. If memory is still tight, messages are then converted from beta and gamma to delta. Paging is a continuous process that involves state transitions for a large number of messages, so its overhead is high and it severely affects system performance.
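
The two paging stages can be modeled as state transitions. This is a minimal Python sketch, not RabbitMQ's implementation: the page_out function and its memory_still_tight flag are hypothetical, and every alpha message is moved to beta for simplicity, although the text allows beta or gamma.

```python
from enum import Enum, auto


class State(Enum):
    ALPHA = auto()
    BETA = auto()
    GAMMA = auto()
    DELTA = auto()


def page_out(states, memory_still_tight):
    """Run one paging pass; return (new_states, number_of_transitions).

    Simplification: every alpha becomes beta (the text allows beta or gamma).
    """
    new_states, transitions = [], 0
    for state in states:
        if state is State.ALPHA:
            target = State.BETA
        elif state in (State.BETA, State.GAMMA) and memory_still_tight:
            target = State.DELTA
        else:
            target = state
        if target is not state:
            transitions += 1
        new_states.append(target)
    return new_states, transitions


queue = [State.ALPHA] * 4 + [State.GAMMA] * 2
after_first, moved_first = page_out(queue, memory_still_tight=False)
after_second, moved_second = page_out(after_first, memory_still_tight=True)
```

Counting transitions per pass gives a feel for the cost: with a backlog of millions of messages, each pass touches a correspondingly large number of entries.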

Analysis

When producers and consumers both behave normally, RabbitMQ's performance under stress testing is very stable and holds a constant rate. When consumers stop consuming or behave abnormally, RabbitMQ becomes very unstable.

Memory Usage

The memory usage figures show that only 18 MB of message content was held in memory. The other messages had been paged to disk, yet the process still occupied 2 GB of memory. Erlang's memory breakdown showed Queues occupying 2 GB and Binaries occupying 2.1 GB.

Garbage Collection

This shows that while messages were being paged from memory to disk (that is, moved from the q2 and q3 queues to delta), the system generated a large amount of garbage, and the Erlang VM did not garbage-collect (GC) it in time. As a result, RabbitMQ miscalculated its memory usage and kept invoking the paging process until the Erlang VM performed an implicit garbage collection.

Memory Management Optimization

RabbitMQ calculates memory usage inside the memory_monitor process, which periodically computes the system's memory usage. Meanwhile, each amqqueue process periodically pulls the memory usage figure; when memory reaches the paging threshold, the amqqueue process starts paging. Once paging has started, the amqqueue process pages its internal queues every time it receives a new message (each paging pass calculates a certain number of messages to write to disk).
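
The interaction between the periodic memory sample and per-message paging can be sketched as below. This is an illustrative Python model only: MemoryMonitor, QueueProcess, the PAGE_BATCH size, and the choice to page the oldest messages first are all assumptions of the sketch, not details of RabbitMQ's code.

```python
class MemoryMonitor:
    """Stand-in for memory_monitor: holds the last periodic sample."""

    def __init__(self, paging_threshold):
        self.paging_threshold = paging_threshold
        self.last_sample = 0

    def sample(self, used_bytes):
        self.last_sample = used_bytes

    def should_page(self):
        return self.last_sample >= self.paging_threshold


class QueueProcess:
    """Stand-in for an amqqueue process with in-memory and on-disk messages."""

    PAGE_BATCH = 2   # hypothetical number of messages written per paging pass

    def __init__(self, monitor):
        self.monitor = monitor
        self.paging = False
        self.in_memory = []
        self.on_disk = []

    def poll_monitor(self):
        """Periodic pull of the memory reading."""
        self.paging = self.monitor.should_page()

    def receive(self, msg):
        self.in_memory.append(msg)
        if self.paging:
            self._page()

    def _page(self):
        batch = self.in_memory[:self.PAGE_BATCH]
        del self.in_memory[:self.PAGE_BATCH]
        self.on_disk.extend(batch)


monitor = MemoryMonitor(paging_threshold=100)
q = QueueProcess(monitor)
for i in range(4):
    q.receive(i)            # below the threshold: nothing is paged
monitor.sample(120)         # a periodic sample crosses the threshold
q.poll_monitor()
for i in range(4, 7):
    q.receive(i)            # each new message now triggers a paging pass
```

Because the queue acts on the last sample it pulled, it keeps paging on every message until a later sample falls below the threshold, so a reading inflated by uncollected garbage prolongs the paging.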

RabbitMQ Tuning Parameters

RabbitMQ's tunable parameters fall into two groups: those of the Erlang runtime and those of RabbitMQ itself.

  • IO_THREAD_POOL_SIZE: With 16 or more CPU cores, set the size of the Erlang asynchronous thread pool to about 100 to improve file IO performance.
  • hipe_compile: Enables the Erlang HiPE compiler (roughly the Erlang equivalent of JIT technology), which can improve performance by 20% to 50%. HiPE has been fairly stable since Erlang R17, and the RabbitMQ project also recommends turning this option on.
  • queue_index_embed_msgs_below: RabbitMQ 3.5 introduced an optimization that stores small messages directly in the queue index (queue_index). Their persistence is handled directly in the amqqueue process rather than through the msg_store process. This optimization improves system performance by about 10%.
  • vm_memory_high_watermark: Configures the memory threshold. A value below 0.5 is recommended, because in the worst case Erlang GC can consume double the memory.
  • vm_memory_high_watermark_paging_ratio: Configures the paging threshold. With a value of 1, paging does not start until the memory threshold itself is reached, at which point producers are blocked directly.
  • queue_index_max_journal_entries: The journal file is a buffer layer (an in-memory file) that queue_index adds to avoid excessive disk seeks. When production and consumption are both normal, the publish and consume records for a message match within the journal and nothing more needs to be saved; when there are no consumers, this file adds one redundant IO operation.
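
To see how the two memory settings relate, note that the paging ratio is applied to the high watermark, not to total memory. The Python sketch below works through the arithmetic; the function name memory_thresholds, the 16 GiB machine, and the 0.4 and 0.5 inputs are example assumptions, not recommendations from the text.

```python
GIB = 1024 ** 3


def memory_thresholds(total_ram_bytes, high_watermark, paging_ratio):
    """Return (block_threshold, paging_threshold) in bytes.

    block_threshold: memory use at which producers are blocked.
    paging_threshold: memory use at which queues start paging to disk.
    """
    block_at = int(total_ram_bytes * high_watermark)
    page_at = int(block_at * paging_ratio)
    return block_at, page_at


# Example inputs: 16 GiB of RAM, watermark 0.4, paging ratio 0.5.
block_at, page_at = memory_thresholds(16 * GIB, high_watermark=0.4, paging_ratio=0.5)

# With a paging ratio of 1 the two thresholds coincide, so paging only
# begins at the point where producers are already being blocked.
same_block, same_page = memory_thresholds(16 * GIB, high_watermark=0.4, paging_ratio=1.0)
```

With these inputs paging starts at half of the blocking threshold, which leaves headroom for queues to move messages to disk before producers are stopped.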

Conclusion

RabbitMQ has been widely adopted because of its high performance, robustness, and scalability. However, in scenarios where consumers behave abnormally, its flow control mechanism and its single amqqueue process per queue conflict to some degree, and this needs further optimization. RabbitMQ also has many other features, such as clustering, HA, reliable delivery, and extension support, which are worth further study.