Building a Distributed Logging System with Flume: A Step-by-Step Guide

Overview

In this article, we will walk through building a distributed logging system with Flume, a popular open-source tool for collecting and aggregating log data. We will cover the installation, configuration, and a first test of Flume on a CentOS 7.0 system with Java 1.8.

Prerequisites

  • CentOS 7.0
  • Java 1.8
  • Flume 1.7.0

Step 1: Download and Install Flume

To download the latest version of Flume, visit the official website at http://flume.apache.org/download.html. At the time of writing, the latest version is 1.7.0, distributed as apache-flume-1.7.0-bin.tar.gz. Once downloaded, extract the archive into the /usr/local folder and rename the resulting directory to flume170.
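The download and installation steps above can be scripted as follows (a sketch; the URL assumes the Apache archive mirror and may need adjusting for your region):

```shell
# Download the Flume 1.7.0 binary release (URL assumes the Apache archive mirror).
cd /tmp
wget https://archive.apache.org/dist/flume/1.7.0/apache-flume-1.7.0-bin.tar.gz

# Extract into /usr/local and rename the directory to flume170.
tar -xzf apache-flume-1.7.0-bin.tar.gz -C /usr/local
mv /usr/local/apache-flume-1.7.0-bin /usr/local/flume170
```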

Step 2: Configure the Environment

To configure the environment, we need to edit the flume-env.sh file in the conf directory and set the JAVA_HOME variable to /usr/lib/jvm/java8. We also need to make Flume available globally by adding the FLUME and PATH variables to the profile file.

Step 3: Verify the Installation

To verify that the installation was successful, run the following command:

flume-ng version

This should display the version of Flume installed on your system.

Step 4: Test a Small Example

To test a small example, we will create a spooling directory and configure the spool.conf file to monitor the directory and read files from it. The spool.conf file should contain the following configuration:

# Name the components on this agent
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Describe the source
a1.sources.r1.type = spooldir
a1.sources.r1.channels = c1
a1.sources.r1.spoolDir = /usr/local/flume170/logs
a1.sources.r1.fileHeader = true

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Describe the sink
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1

Start the Flume agent using the following command:

flume-ng agent -c /usr/local/flume170/conf -f /usr/local/flume170/conf/spool.conf -n a1 -Dflume.root.logger=INFO,console

Append a file to the /usr/local/flume170/logs directory:

echo "spool test1" > /usr/local/flume170/logs/spool_text.log
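Once the agent has consumed the file, the spooling directory source renames it with a .COMPLETED suffix (the agent's log also reports this move), which you can confirm by listing the directory:

```shell
# List the spool directory; the consumed file should now carry the
# .COMPLETED suffix instead of its original name.
ls /usr/local/flume170/logs
```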

In the console, you should see the following information:

14/08/10 11:37:13 INFO source.SpoolDirectorySource: Spooling Directory Source runner has shutdown.
14/08/10 11:37:14 INFO avro.ReliableSpoolingFileEventReader: Preparing to move file /usr/local/flume170/logs/spool_text.log to /usr/local/flume170/logs/spool_text.log.COMPLETED
14/08/10 11:37:14 INFO sink.LoggerSink: Event: { headers:{file=/usr/local/flume170/logs/spool_text.log} body: 73 70 6F 6F 6C 20 74 65 73 74 31 spool test1 }

This shows that the Flume agent has successfully read the file from the spooling directory and sent it to the logger sink.
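As one direction for further exploration, the logger sink can be swapped for another sink type without touching the source or channel. For example, a file_roll sink writes events to files on local disk (a sketch; the output directory and roll interval below are illustrative, and the directory must exist before the agent starts):

```
# Replace the logger sink with a file_roll sink
a1.sinks.k1.type = file_roll
a1.sinks.k1.channel = c1
# Directory where rolled files are written (illustrative path)
a1.sinks.k1.sink.directory = /usr/local/flume170/out
# Start a new output file every 60 seconds
a1.sinks.k1.sink.rollInterval = 60
```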

Conclusion

In this article, we have walked through the process of building a distributed logging system using Flume on a CentOS 7.0 system with Java 1.8. We have covered the installation, configuration, and testing of Flume, and demonstrated how to use the spool.conf file to monitor a directory and read files from it. This is just a small example of what can be achieved with Flume, and we encourage you to explore its capabilities further.