In today’s world of constant cyber attacks, administrators are seeking new and innovative ways to analyze existing log data, learn what normal behavior looks like, and uncover compromise. Until recently, the tools available to centralize log storage and analysis tended to fall into two camps:
Software like Kiwi Syslog Server is reliable and easy to deploy, but doesn’t provide much in the way of robust search and analysis features. More robust services like Splunk are feature-rich, but become very expensive for most enterprises to implement. The recent introduction of the ELK stack and tools like Graylog has made it possible for everyone to access the tools needed to gather useful information from their log data.
This post outlines the basic steps needed to get a Graylog server up, running, and ingesting logs for analysis. Graylog has OVA images available for VMware or VirtualBox, but their VMs are built on Ubuntu. I don’t recommend using Ubuntu for any serious server deployment beyond testing (why is a discussion for another day), and building your own VM from scratch will give you insight into deploying and configuring the various components required for an operational Graylog cluster. So I am outlining the steps for installing on CentOS 7, the community RedHat distribution. The configuration steps are sufficient to get a test VM up and running; later posts will address optimizing each component for production / enterprise deployment.
Here are the basic steps:
I prefer to use the minimal install of CentOS 7 as it comes with the bare minimum required to run Linux. Visit the CentOS ISO download site, choose a mirror, and download the correct ISO image. As of the writing of this post, the correct ISO image is called CentOS-7-x86_64-Minimal-1503-01.iso. Save this image to a location that you can access when building a virtual machine.
The virtual machine that we are going to build is going to handle all the services required for a single node, all-in-one Graylog server. This is intended to be used for basic testing and to get you familiar with the configuration and use of a Graylog deployment. In production, one would separate services and build in redundancy. A recommended deployment for enterprise use will be discussed in a future post.
I prefer to use VMware Workstation or a full VMware environment for my VM deployments but you can use Virtualbox and everything should work just fine. I am using VMware Workstation version 11.
Your VM will be created and should boot immediately into the CentOS 7 installer. Follow the installer steps to get a base minimal install of CentOS 7 on the VM. Configure your VM with the automated partitioning and be sure to enable your network adapter and configure a static IP or allow DHCP on the interface. Do not install any added software components. Once installed and booted, log in and do the following:
# vi /etc/resolv.conf
nameserver 8.8.8.8
Save the file and test pinging to a site on the Internet by name.
# yum upgrade
# yum -y install net-tools
# yum -y install nano
# yum -y install java
NOTE: In a working environment, we will want to install the Oracle Java packages as they are more stable and better at garbage collection than the open source packages.
Elasticsearch just released version 2.0. Unfortunately, Graylog is not yet compatible with Elasticsearch 2.0 so we will configure our VM to install Elasticsearch version 1.7 for compatibility.
First we will import the Elasticsearch GPG key
# rpm --import https://packages.elastic.co/GPG-KEY-elasticsearch
Next, we add the correct Elasticsearch Repository.
# nano /etc/yum.repos.d/elasticsearch.repo
The file will be empty. Paste in the following:
[elasticsearch-1.7]
name=Elasticsearch repository for 1.7.x packages
baseurl=http://packages.elastic.co/elasticsearch/1.7/centos
gpgcheck=1
gpgkey=http://packages.elastic.co/GPG-KEY-elasticsearch
enabled=1
Save the file by pressing CTRL+O then <Enter>, and CTRL+X to exit.
Now we can install Elasticsearch.
# yum -y install elasticsearch
Set Elasticsearch to start when the system starts
# systemctl daemon-reload
# systemctl enable elasticsearch.service
The only configuration setting we care about on this test VM for Elasticsearch is the name of the cluster. Open the /etc/elasticsearch/elasticsearch.yml file and set the cluster.name value to “graylog2“
# nano /etc/elasticsearch/elasticsearch.yml
cluster.name: graylog2
For security, you should disable dynamic scripts by appending the following to the end of the elasticsearch.yml file:
script.disable_dynamic: true
Once the basic configuration is complete, restart the elasticsearch service.
# systemctl restart elasticsearch.service
Once the elasticsearch service has had sufficient time to restart, you can check the status of your cluster using curl.
# curl -X GET http://localhost:9200
{
  "status" : 200,
  "name" : "Franz Kafka",
  "cluster_name" : "graylog2",
  "version" : {
    "number" : "1.7.3",
    "build_hash" : "05d4530971ef0ea46d0f4fa6ee64dbc8df659682",
    "build_timestamp" : "2015-10-15T09:14:17Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.4"
  },
  "tagline" : "You Know, for Search"
}
If you want to see more details about the status of your cluster, use the command below. Elasticsearch must show status “green” in order to work with a single node Graylog install.
# curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'
{
  "cluster_name" : "graylog2",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 4,
  "active_shards" : 4,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0
}
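For scripted monitoring, it can be handy to pull just the status field out of that response. Here is a minimal sketch using standard shell tools; the sample JSON stands in for a live call to the _cluster/health endpoint shown above:

```shell
# Sample health response; on a live node, replace this with:
#   health=$(curl -s 'http://localhost:9200/_cluster/health')
health='{"cluster_name":"graylog2","status":"green","timed_out":false}'

# Extract the value of the "status" field from the JSON.
status=$(echo "$health" | grep -o '"status" *: *"[a-z]*"' | cut -d'"' -f4)
echo "cluster status: $status"
```

A check like this can be dropped into a cron job or monitoring script to alert when the cluster leaves the “green” state.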
MongoDB is distributed in RPM format, but it is not in the default CentOS repositories, so we need to add the MongoDB repository before yum can download and install it.
# nano /etc/yum.repos.d/mongodb-org-3.0.repo
[mongodb-org-3.0]
name=MongoDB Repository
baseurl=http://repo.mongodb.org/yum/redhat/$releasever/mongodb-org/3.0/x86_64/
gpgcheck=0
enabled=1
CTRL+O to save and CTRL+X to exit.
Install MongoDB
# yum -y install mongodb-org
The above command installs the mongodb-org metapackage, which pulls in the MongoDB server (mongodb-org-server), the shell (mongodb-org-shell), the sharding router (mongodb-org-mongos), and the supporting tools (mongodb-org-tools).
CentOS 7 installs SELinux by default so we need to install the following package to allow us to manage elements of the SELinux policy:
# yum -y install policycoreutils-python
Run the following command to configure SELinux to allow MongoDB to start.
# semanage port -a -t mongod_port_t -p tcp 27017
Enable the MongoDB service.
# systemctl enable mongod
Start MongoDB service.
# systemctl start mongod
Graylog server accepts and processes incoming log messages, storing the data in Elasticsearch indexes for visualization and analysis using the Graylog web services or Kibana.
First, use the following to install the correct repository for Graylog:
# rpm -Uvh https://packages.graylog2.org/repo/packages/graylog-1.2-repository-el7_latest.rpm
Now we can install the latest Graylog server.
# yum -y install graylog-server
Once installed, there are a few minimal configuration changes that we need to make.
We will need to generate a secret password used for services in a cluster to exchange information and the default “root user” password for the API and Web interface.
First, generate a password secret of at least 64 characters. We can use pwgen, but we will need to install the EPEL repository and the correct package. If you have another password generator, you can use that and just copy and paste the password into the config file.
Install EPEL
# cd /tmp
# rpm -Uvh https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
Now EPEL should appear in your repo list
# yum repolist
Install the pwgen package
# yum -y install pwgen
Generate a secret
# pwgen -N 1 -s 96
y2396e28v6cka9pa6mb8vk63kpaca69272238fzgb6xc44eyfgh4y7h3hk7mf666vvxg6at9786p3a69hd42a63mfz3qtyh7
Copy the password into a notepad window.
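If you would rather not add EPEL just for pwgen, a secret of the same shape can be generated with tools already present on a minimal install. This is a sketch; any random string of 96 or more characters will do:

```shell
# Draw random bytes from /dev/urandom, keep only lowercase letters
# and digits, and truncate to 96 characters.
secret=$(tr -dc 'a-z0-9' < /dev/urandom | head -c 96)
echo "$secret"
```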
Generate the hash for the default root user password (admin account) for Graylog server and Graylog web
# echo -n TypeYourPasswordHere | sha256sum
769b1626b7b1c3c124a05dc1f48b98d9c86807b8453927d664db20817e946d91
Copy the hash to the same notepad window
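A note on the -n flag above: without it, echo appends a trailing newline and sha256sum hashes a different string, so the login will fail. You can sanity-check the pattern against the well-known SHA-256 test vector for the string "abc" (using printf, which suppresses the newline consistently across shells):

```shell
# sha256sum prints "<hash>  -" for stdin; cut keeps only the hash.
printf '%s' abc | sha256sum | cut -d' ' -f1
# ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
```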
Edit the configuration file.
# nano /etc/graylog/server/server.conf
Copy the generated password into the password_secret value shown below
password_secret = y2396e28v6cka9pa6mb8vk63kpaca69272238fzgb6xc44eyfgh4y7h3hk7mf666vvxg6at9786p3a69hd42a63mfz3qtyh7
Add the hash to the root_password_sha2 value
root_password_sha2 = 769b1626b7b1c3c124a05dc1f48b98d9c86807b8453927d664db20817e946d91
Set the admin user email address
root_email = "user@mydomain.com"
Set the admin user timezone
root_timezone = America/Chicago
Graylog will try to find the Elasticsearch nodes automatically using multicast, but that is not reliable, so we will configure unicast discovery and point it at the IP address of our VM. This is done by setting the following values:
elasticsearch_http_enabled = false
elasticsearch_discovery_zen_ping_unicast_hosts = x.x.x.x:9300
Where x.x.x.x is the IP address of the VM.
Next, we need to set the VM to be the master node (even though we have one Graylog server, this is required).
is_master = true
Now we set the number of messages per index. Several smaller indexes are better than one large one.
elasticsearch_max_docs_per_index = 20000000
We also need to tell the server how many indices to keep open and stored.
elasticsearch_max_number_of_indices = 20
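These two settings together bound how much data the node retains: once 20 indices of 20 million messages each exist, Graylog rotates out the oldest index. Quick shell arithmetic shows the ceiling:

```shell
# Maximum messages retained = docs per index * number of indices
max_docs_per_index=20000000
max_number_of_indices=20
echo $((max_docs_per_index * max_number_of_indices))
# 400000000
```

Adjust either value to trade retention depth against disk usage on your test VM.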
Shards are used by Elasticsearch to spread indices across nodes. We only have one node here so set shards to 1.
elasticsearch_shards = 1
Replicas are used to keep indices running when a node fails. We only have one node so set replicas to 0.
elasticsearch_replicas = 0
MongoDB is installed locally so we don’t need any authentication information.
mongodb_useauth = false
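Putting it together, the minimal set of server.conf changes made above looks like this (the secret, hash, email, timezone, and IP address are the placeholder values from the earlier steps; substitute your own):

```
is_master = true
password_secret = <your 96-character secret>
root_password_sha2 = <your sha256 hash>
root_email = "user@mydomain.com"
root_timezone = America/Chicago
elasticsearch_http_enabled = false
elasticsearch_discovery_zen_ping_unicast_hosts = x.x.x.x:9300
elasticsearch_max_docs_per_index = 20000000
elasticsearch_max_number_of_indices = 20
elasticsearch_shards = 1
elasticsearch_replicas = 0
mongodb_useauth = false
```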
Save the changes to the config file (CTRL+O, CTRL+X) and we are ready to enable and restart the Graylog server service.
# systemctl enable graylog-server
# systemctl restart graylog-server
We can check the status of the service by tailing the log.
# tail -f -n 20 /var/log/graylog-server/server.log
You can also check the status using systemctl.
# systemctl status graylog-server
We are going to install our Web interface for Graylog (aka Graylog web) on the same server as our Graylog server and other components. This is definitely not the recommended deployment method for a production instance but will get us up and running and able to test the viability of including Graylog in our production environments.
We have already installed the necessary repo on our VM so we can move ahead with the install.
# yum -y install graylog-web
Now we edit the configuration file and set a few parameters.
# nano /etc/graylog/web/web.conf
Set the server URIs (the Web APIs for the Graylog servers in our cluster). In this case, the localhost only.
graylog2-server.uris = "http://127.0.0.1:12900"
Set the application secret to the same value you set for the password_secret in the Graylog server configurations.
application.secret ="y2396e28v6cka9pa6mb8vk63kpaca69272238fzgb6xc44eyfgh4y7h3hk7mf666vvxg6at9786p3a69hd42a63mfz3qtyh7"
Save the configuration file. Enable and restart the Graylog web service.
# systemctl enable graylog-web
# systemctl restart graylog-web
You can check the status of the Graylog web service using systemctl.
# systemctl status graylog-web
We only have to enable the firewall to allow access to the Web interface from the outside. All the other services are local in this VM install. The Web interface listens on port 9000.
# firewall-cmd --permanent --zone=public --add-port=9000/tcp
# firewall-cmd --reload
If everything is running correctly, you should now be able to connect to the Graylog web interface at http://x.x.x.x:9000. Log in with the username admin and the password you used to generate your sha256sum value.
Once logged in, you will be directed to the default search page.
We will address configuring inputs and sending data to your test Graylog server in upcoming posts.
Please keep in mind, this post covers a basic install on a single VM for testing and familiarization only. I will create some other posts relating to my experience in deploying larger Elasticsearch and Graylog clusters. I have operational clusters that can handle loads of 100K plus messages per second without issue and will detail some of the tricks I have uncovered as well as specific configuration optimizations and input configurations in upcoming posts.