Where do Big Data tools like Hadoop and Spark come into the picture when we talk about ETL?

0 votes
I have been working on Hadoop for the last 4 months. Now I am curious to know where ETL tools are used alongside Big Data tools like Hadoop and Spark, and for what purpose.
May 3, 2018 in Big Data Hadoop by Shubham
• 13,490 points
612 views

1 answer to this question.

0 votes

ETL stands for extract, transform & load.

A typical ETL pipeline starts with a data source, applies a transformation that filters or cleans the data, and ends in a data sink.

So, in the case of Hadoop and Spark, an ETL flow can be described as follows:

Data comes in from various sources such as databases, Kafka, Twitter, etc.

To get meaningful insights, we filter and clean the data using Spark, MapReduce, Hive, Pig, etc.

Finally, after processing (transformation), the data is stored in a data sink such as HDFS, a Hive table, etc. A minimal sketch of such a pipeline is shown below.
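For illustration, here is a minimal PySpark sketch of such an extract-transform-load flow. The HDFS paths, column names, and filter condition are hypothetical and only stand in for whatever your source data actually looks like:

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, trim

# Start a Spark session (assumes Spark is installed and HDFS is reachable)
spark = SparkSession.builder.appName("simple-etl").getOrCreate()

# Extract: read raw records from a source (hypothetical HDFS path)
raw = spark.read.csv("hdfs:///data/raw/events.csv", header=True, inferSchema=True)

# Transform: clean and filter the data
clean = (raw
         .dropna(subset=["user_id"])                    # drop incomplete rows
         .withColumn("country", trim(col("country")))   # tidy up a string column
         .filter(col("event_type") == "purchase"))      # keep only the events of interest

# Load: write the result to a sink (Parquet files on HDFS in this sketch)
clean.write.mode("overwrite").parquet("hdfs:///data/clean/purchases")

spark.stop()

The same three stages could just as well be expressed with Hive queries or a MapReduce job; Spark is simply one common choice for the transformation step.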

Hope this helps.

answered May 3, 2018 by nitinrawat895
• 11,380 points
