ELK Stack Tutorial – Discover, Analyze And Visualize Your Data Efficiently

With more and more IT infrastructures switching to the cloud, the need for public cloud security tools and log analytics platforms is also increasing rapidly. Irrespective of the size of an organization, a huge amount of data is generated on a daily basis. A considerable amount of this data is composed of the company's web server logs. Logs are one of the most important and often-neglected sources of information. Each log file contains invaluable pieces of information which are mostly unstructured and make little sense on their own. Without a careful and detailed analysis of this log data, an organization can remain oblivious to both the opportunities and the threats surrounding it. This is where log analysis tools come in handy. ELK Stack, or Elastic Stack, is a complete log analysis solution which helps in deep searching, analyzing and visualizing the logs generated from different machines. Through this blog on ELK Stack tutorial, I will give you insights into it.

But before I start, let me list down the topics I will be discussing:

  1. What Is ELK Stack?
  2. ELK Stack Architecture
  3. ELK Stack Installation
  4. Elasticsearch
  5. Logstash
  6. Kibana

You may go through this ELK Tutorial recording, where our ELK Stack training expert has explained the topics in a detailed manner with examples that will help you understand this concept better.

This Edureka tutorial on What Is ELK Stack will help you in understanding the fundamentals of Elasticsearch, Logstash, and Kibana together and help you in building a strong foundation in ELK Stack.

So, let's quickly get started with this ELK Stack Tutorial blog by first understanding what exactly ELK Stack is.

What Is ELK Stack? – ELK Stack Tutorial

Popularly known as ELK Stack, it has recently been re-branded as Elastic Stack. It is a powerful collection of three open-source tools: Elasticsearch, Logstash, and Kibana.

These three products are most commonly used together for log analysis in different IT environments. Using ELK Stack, you can perform centralized logging, which helps in identifying problems with your web servers or applications. It lets you search through all the logs in a single place and identify issues spanning multiple servers by correlating their logs within a specific time frame.

Let's now discuss each of these tools in detail.

Logstash

Logstash is the data collection pipeline tool. It is the first component of ELK Stack: it collects data inputs and feeds them to Elasticsearch. It collects various types of data from different sources, all at once, and makes it available immediately for further use.

Elasticsearch

Elasticsearch is a NoSQL database which is based on the Lucene search engine and exposes RESTful APIs. It is a highly flexible and distributed search and analytics engine. Also, it provides simple deployment, maximum reliability, and easy management through horizontal scalability. It provides advanced queries to perform detailed analysis and stores all the data centrally for quick searching of documents.

Kibana

Kibana is a data visualization tool. It is used for visualizing Elasticsearch documents and helps developers gain immediate insight into them. The Kibana dashboard provides various interactive diagrams, geospatial data, timelines, and graphs to visualize the complex queries done using Elasticsearch. Using Kibana, you can create and save custom graphs according to your specific needs.

The next section of this ELK Stack Tutorial blog will talk about the ELK Stack architecture and how data flows within it.

ELK Stack Architecture – ELK Stack Tutorial

The architecture of ELK Stack shows the proper order of log flow within ELK. Here, the logs generated from various sources are collected and processed by Logstash, based on the provided filter criteria. Logstash then pipes those logs to Elasticsearch, which indexes the data and makes it searchable. Finally, using Kibana, the logs are visualized and managed as per the requirements.

ELK Stack Installation – ELK Stack Tutorial

STEP I: Go to https://www.elastic.co/downloads.

STEP II: Select and download Elasticsearch.

STEP III: Select and download Kibana.

STEP IV: Select and download Logstash.

STEP V: Unzip all three files to get their respective folders.

Installing Elasticsearch

STEP VI: Now open the elasticsearch folder and go to its bin folder.

STEP VII: Double click on the elasticsearch.bat file to start the elasticsearch server.

STEP VIII: Wait for the elasticsearch server to start.

STEP IX: To check whether the server has started or not go to the browser and type localhost:9200.
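
If the server is up, the browser displays a small JSON document with the node and cluster details. You can check the same thing from the command prompt with curl (or any HTTP client); the node name, cluster name and version number will reflect your own setup, so treat the response below as a rough sketch:

curl http://localhost:9200

{
 "name" : "node-1",
 "cluster_name" : "elasticsearch",
 "version" : {
  "number" : "6.2.4"
 },
 "tagline" : "You Know, for Search"
}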

Installing Kibana

STEP X: Now open the kibana folder and go to its bin folder.

STEP XI: Double click on the kibana.bat file to start the Kibana server.

STEP XII: Wait for the kibana server to start.

STEP XIII: To check whether the server has started or not go to the browser and type localhost:5601.

Installing Logstash

STEP XIV: Now open the logstash folder.

STEP XV: To test your logstash installation, open the command prompt and go to your logstash folder. Now type:

bin\logstash -e 'input { stdin { } } output { stdout {} }'

STEP XVI: Wait until “Pipeline main started” appears on the command prompt.

STEP XVII:  Now enter a message at the command prompt and hit enter.

STEP XVIII: Logstash appends timestamp and IP address information to the message and displays it on the command prompt.
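
The exact output depends on your Logstash version and the codec used by the stdout plugin, so treat the following as a rough sketch. With recent versions, typing hello world produces a rubydebug-style event showing the timestamp, host and message; older versions print the same information on a single line.

hello world
{
      "@version" => "1",
          "host" => "0.0.0.0",
    "@timestamp" => 2020-05-21T10:15:30.123Z,
       "message" => "hello world"
}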

Since we are done with the installation, let's now take a deeper dive into these tools. Let's start with Elasticsearch.

Elasticsearch – ELK Stack Tutorial

As mentioned before, Elasticsearch is a highly scalable search engine which runs on top of the Java-based Lucene engine. It is basically a NoSQL database, which means it stores data in an unstructured, document-oriented format and you cannot run SQL queries against it. In other words, it stores the data inside documents instead of tables and schemas. To get a better picture, compare the Elasticsearch terms with their relational database counterparts: an index corresponds to a database, a type to a table, a document to a row, and a field to a column.

Let’s now get familiar with the basic concepts of Elasticsearch.

When you work with Elasticsearch there are three major steps which you need to follow:

  1. Indexing
  2. Mapping
  3. Searching

Let’s talk about them in detail, one by one.

Indexing

Indexing is the process of adding data to Elasticsearch. It is called 'indexing' because when the data is entered into Elasticsearch, it gets placed into Apache Lucene indexes. Elasticsearch then uses these Lucene indexes to store and retrieve the data. Indexing is similar to the create and update operations of CRUD.

An index scheme consists of name/type/id, where name and type are mandatory fields. In case you do not provide any ID, Elasticsearch will generate an id on its own. This entire query is then appended to an HTTP PUT request, so the final URL looks like: PUT name/type/id. Along with the HTTP request, a JSON document containing the fields and values is sent as the payload.
Following is an example of creating a document for a US-based customer with his details in the fields.

PUT /customer/US/1
{
 "ID" : 101,
 "FName" : "James",
 "LName" : "Butt",
 "Email" : "jbutt@gmail.com",
 "City" : "New Orleans",
 "Type" : "VIP"
}

It will give you the following output:
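
The exact fields vary with the Elasticsearch version, but the acknowledgment looks roughly like this:

{
 "_index" : "customer",
 "_type" : "US",
 "_id" : "1",
 "_version" : 1,
 "result" : "created",
 "_shards" : {
  "total" : 2,
  "successful" : 1,
  "failed" : 0
 }
}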

Here it shows the document has been created and added to the index.

Now if you try to change the field details without changing the id, Elasticsearch will overwrite your existing document with the current details.

PUT /customer/US/1
{
 "ID" : 101,
 "FName" : "James",
 "LName" : "Butt",
 "Email" : "jbutt@yahoo.com",
 "City" : "Los Angeles",
 "Type" : "VVIP"
}
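
Executing this returns an acknowledgment similar to the previous one, except that the version gets incremented and the result field changes; roughly:

{
 "_index" : "customer",
 "_type" : "US",
 "_id" : "1",
 "_version" : 2,
 "result" : "updated",
 "_shards" : {
  "total" : 2,
  "successful" : 1,
  "failed" : 0
 }
}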

Here it shows the document has been updated with the new details in the index.

Mapping

Mapping is the process of setting the schema of the index. Through mapping, you tell Elasticsearch about the data types of the attributes present in your schema. If the mapping is not defined for a specific field before indexing, Elasticsearch dynamically assigns a generic type to that field. But these generic types are very basic and most of the time do not satisfy the query expectations.

Let's now try to define the mapping for our index. Note that mappings of this form are applied when an index is created, so if the customer index already exists from the earlier steps, you would have to delete it first (DELETE /customer) or update the mapping through the _mapping endpoint instead.

PUT /customer/
{
 "mappings": {
  "US": {
   "properties": {
    "ID": { "type": "long" },
    "FName": { "type": "text" },
    "LName": { "type": "text" },
    "Email": { "type": "text" },
    "City": { "type": "text" },
    "Type": { "type": "text" }
   }
  }
 }
}

When you execute your query you will get this type of output.
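
Since this request creates the index together with its mappings, Elasticsearch replies with a short acknowledgment; depending on the version it may also echo the index name:

{
 "acknowledged" : true,
 "shards_acknowledged" : true
}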

Searching

A general search query with a specific index and type will look like: POST index/type/_search

Let's now try to search for the details of all the customers present in our 'customer' index.

POST /customer/US/_search

When you execute this query, the following result will be generated:
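
Every search response follows the same general structure: some metadata about the request, followed by a hits object that contains the matching documents under _source. Timings and scores will differ on your machine, and in Elasticsearch 7 and later hits.total becomes an object, so take this as a rough sketch with only the document indexed earlier:

{
 "took" : 2,
 "timed_out" : false,
 "hits" : {
  "total" : 1,
  "max_score" : 1.0,
  "hits" : [
   {
    "_index" : "customer",
    "_type" : "US",
    "_id" : "1",
    "_score" : 1.0,
    "_source" : {
     "ID" : 101,
     "FName" : "James",
     "LName" : "Butt",
     "Email" : "jbutt@yahoo.com",
     "City" : "Los Angeles",
     "Type" : "VVIP"
    }
   }
  ]
 }
}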

But when you want to search for specific results, Elasticsearch provides several ways to do it. For example, you can combine a full-text match query with a post filter:

POST /customer/_search
{
 "query": {
 "match": {
 "Type": "VVIP"
 }
 },
 "post_filter": {
 "match" : {
 "ID" : 101
 }
 }
}

If you execute this query, it returns only the documents that match the query and pass the filter; in this case, that is the updated record of James Butt (Type "VVIP" and ID 101).

Another way to narrow down the results is through aggregations. For example, the following query groups the customers by their type and returns the count of documents in each group:
POST /customer/_search
{
 "size": 0,
 "aggs": {
  "Cust_Types": {
   "terms": { "field": "Type.keyword" }
  }
 }
}
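
Because size is set to 0, the response skips the individual hits and only returns the aggregation buckets, one per distinct customer type, each carrying a document count. With hypothetical counts, the interesting part of the response looks roughly like this:

{
 "aggregations" : {
  "Cust_Types" : {
   "buckets" : [
    { "key" : "VVIP", "doc_count" : 1 },
    { "key" : "VIP", "doc_count" : 1 }
   ]
  }
 }
}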

Let's now see how to retrieve data from an index.

Getting Data 

To retrieve a document stored within an index, you just need to send an HTTP GET request in the following format: GET index/type/id

Let us try to retrieve the details of the customer with 'id' equal to 2:

GET /customer/US/2

It will give you the following type of result, on executing successfully.
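
Assuming a customer document with id 2 exists in the index, the response wraps the stored fields inside _source along with some metadata, roughly like this:

{
 "_index" : "customer",
 "_type" : "US",
 "_id" : "2",
 "_version" : 1,
 "found" : true,
 "_source" : {
  ...
 }
}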

With Elasticsearch, you can not only browse through the data but also delete or remove documents.

Deleting Data

Using the DELETE convention, you can easily remove unwanted data from your index and free up space. To delete any document, you need to send an HTTP DELETE request in the following format: DELETE index/type/id.

Let's now try to delete the details of the customer with id 2.

DELETE /customer/US/2

On executing this query, you will get the following type of result.
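
If the document exists, Elasticsearch confirms the deletion; the exact fields differ slightly between versions, but the response looks roughly like this:

{
 "_index" : "customer",
 "_type" : "US",
 "_id" : "2",
 "_version" : 2,
 "result" : "deleted",
 "found" : true
}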

So, this concludes the basics of CRUD operations using Elasticsearch. Knowing these basic operations will help you perform different kinds of searches, and you are now ready to proceed with the ELK Tutorial. But if you want to learn Elasticsearch in depth, you can refer to my blog on Elasticsearch Tutorial.

Let's now begin with the next tool of ELK Stack, which is Logstash.

Logstash – ELK Stack Tutorial

As I have already discussed, Logstash is a pipeline tool generally used for collecting and forwarding the logs or events. It is an open source data collection engine which can dynamically integrate data from various sources and normalize it into the specified destinations.

Using a number of input, filter, and output plugins, Logstash enables the easy transformation of various events. At the very least, Logstash needs an input and an output plugin specified in its configuration file to perform the transformations. Following is the structure of a Logstash config file:

input {
 ...
}

filter {
 ...
}

output {
 ...
}

As you can see, the entire configuration file is divided into three sections and each of these sections holds the configuration options for one or more plugins. The three sections are:

  1. input
  2. filter
  3. output

You can apply more than one filter in your config file as well. In such cases, the order of their application will be the same as the order of specification in the config file.

Let's now try to configure Logstash for our US customer data set file, which is in CSV format.

input {
  file {
    path => "E:/ELK/data/US_Customer_List.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    separator => ","
    columns => ["Cust_ID","Cust_Fname","Cust_Lname","Cust_Email","Cust_City","Cust_Type"]
  }
  mutate { convert => ["Cust_ID","integer"] }
}
output {
  elasticsearch {
    hosts => "localhost"
    index => "customers"
    document_type => "US_Based_Cust"
  }
  stdout {}
}

To insert this CSV data into Elasticsearch, you have to start Logstash with this configuration file.

For that follow the below steps:

  1. Open command prompt
  2. Go to the bin directory of Logstash
  3. Type: logstash -f X:/foldername/config_filename.config and hit enter. Once your Logstash server is up and running, it will start pipelining your data from the file into Elasticsearch. (On Windows, you may also need to set sincedb_path to "NUL" instead of "/dev/null" in the config above.)

If you want to check whether your data was inserted successfully or not, go to the Sense plugin (the Dev Tools console in Kibana) and type:
GET /customers/_count

It will give you the number of documents that have been created.
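
The count in the reply should match the number of rows in your CSV file; the count and shard numbers below are just placeholders:

{
 "count" : 500,
 "_shards" : {
  "total" : 5,
  "successful" : 5,
  "failed" : 0
 }
}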

Now if you want to visualize this data, you have to make use of the last tool of ELK Stack, i.e. Kibana. So, in the next section of this ELK Stack Tutorial, I will be discussing Kibana and the ways to use it to visualize your data.

Kibana – ELK Stack Tutorial

As mentioned earlier, Kibana is an open source visualization and analytics tool. It helps in visualizing the data that is piped down by Logstash and stored in Elasticsearch. You can use Kibana to search, view, and interact with this stored data and then visualize it in various charts, tables, and maps. The browser-based interface of Kibana simplifies working with huge volumes of data and reflects changes in the Elasticsearch queries in real time. Moreover, you can easily create, customize, save and share your dashboards as well.

Once you have learned how to work with Elasticsearch and Logstash, learning Kibana becomes no big deal. In this section of the ELK tutorial blog, I will introduce you to the different functions which you need in order to perform analysis on your data.

This concludes this blog on ELK Stack Tutorial. Now you are ready to perform various kinds of search and analysis on any data you want, using Logstash, Elasticsearch, and Kibana.

If you found this ELK Stack Tutorial blog relevant, check out the ELK Stack Training by Edureka, a trusted online learning company with a network of more than 250,000 satisfied learners spread across the globe. The Edureka ELK Stack Training and Certification course helps learners to run and operate their own search cluster using Elasticsearch, Logstash, and Kibana.
Got a question for us? Please mention it in the comments section of this ELK Stack Tutorial blog and we will get back to you as soon as possible.