Apache Falcon: New Data Management Platform For Hadoop Ecosystem


Apache Falcon is a framework for managing the data life cycle in Hadoop clusters. It establishes relationships between the various data and processing elements in a Hadoop environment, and also provides feed management services such as feed retention, replication across clusters, and archival.

Let us first discuss how to set up Apache Falcon. Run the command below to clone the Falcon Git repository:

Command: git clone https://git-wip-us.apache.org/repos/asf/falcon.git falcon


To run Falcon, you need to build it first.

Command: cd falcon

Command: export MAVEN_OPTS="-Xmx1024m -XX:MaxPermSize=256m -noverify" && mvn clean install -DskipTests

Command: mvn clean assembly:assembly -DskipTests -DskipITs

Once you have built Falcon, you will find a Falcon package inside the /falcon/distro/target/ directory.
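One thing worth ruling out before starting the build is a missing tool. The sketch below is a small helper, not part of Falcon: the function name check_tools is made up for illustration, and the tool list in the usage line (git, mvn, java) is an assumption based on the commands used in this post.

```shell
# Print each required tool that is not on PATH.
# Returns non-zero if at least one tool is missing.
# check_tools is a hypothetical helper, not something Falcon ships.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Example (assumed tool list for this build):
#   check_tools git mvn java
```

If the function prints nothing, all the listed tools were found and you can go ahead with the build commands.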

The commands for building Falcon look very easy, but you will face a lot of issues before you see the Build Success message. I faced a lot of issues while building it for Hadoop-2.2.0.

So, to spare you the pain of building Falcon, I am giving you a successfully built Falcon package, which you can download using the link below.

https://edureka.wistia.com/medias/xw5cfzqmho/download?media_file_id=124642564

Unzip the file to get the falcon-0.10-SNAPSHOT directory.

Command: unzip falcon-0.10-SNAPSHOT.zip

Set the Falcon environment variables in the .bashrc file.

Command: sudo gedit .bashrc

Command: source .bashrc
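The exact entries are not listed in this post, so the lines below are only a sketch of what the .bashrc additions might look like. The variable name FALCON_HOME and the location under $HOME are assumptions; adjust the path to wherever you unzipped the package.

```shell
# Assumed install location: change this to your own unzip directory.
export FALCON_HOME="$HOME/falcon-0.10-SNAPSHOT"

# Add the Falcon scripts in bin (falcon, falcon-start) to PATH.
export PATH="$PATH:$FALCON_HOME/bin"
```

With the bin directory on PATH, the Falcon scripts can be called from any directory, not only from inside the Falcon directory.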

You can go to the Falcon directory and see the files and directories inside it.

Command: cd falcon-0.10-SNAPSHOT/

Command: ls

You can find the Falcon scripts inside the bin directory.

Run the command below to start Falcon.

Command: ./bin/falcon-start

You will now see a new daemon, FalconServer, running. You can confirm this with jps and then check the Falcon version.

Command: jps

Command: ./bin/falcon admin -version
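If you want to script the jps check instead of reading its output by eye, a minimal sketch could look like this. The function name has_falcon_server is made up for illustration; it reads jps-style output (one "PID ClassName" pair per line) from standard input and looks for the FalconServer daemon mentioned above.

```shell
# Succeed if jps-style input on stdin contains a process named FalconServer.
# has_falcon_server is a hypothetical helper, not a Falcon script.
has_falcon_server() {
  awk '$2 == "FalconServer" { found = 1 } END { exit !found }'
}

# Example on a live machine (needs jps from a JDK on PATH):
#   jps | has_falcon_server && echo "FalconServer is running"
```

Matching on the second column keeps the check from being fooled by a PID or another class name that merely contains the same text.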

Open your browser and go to localhost:15000. You will see the Falcon web UI.

Got a question for us? Mention it in the comment section and we will get back to you.
