Apache Falcon: New Data Management Platform For Hadoop Ecosystem


Apache Falcon is a framework for managing the data life cycle in Hadoop clusters. It establishes relationships between the various data and processing elements in a Hadoop environment, and provides feed management services such as feed retention, replication across clusters, and archival.

Let us first discuss how to set up Apache Falcon. Run the command below to clone the Falcon git repository:

Command: git clone https://git-wip-us.apache.org/repos/asf/falcon.git falcon

To run Falcon, you need to build it first.
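Before building, it may help to confirm that the tools the build commands rely on are installed. The snippet below is a minimal sketch, not part of Falcon itself: `missing_tools` is a hypothetical helper name, and the list of tools (git, mvn, java) is an assumption based on the commands used in this post.

```shell
# Hypothetical helper: print each named tool that is not found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumed tool list, based on the git and mvn commands used in this post.
missing=$(missing_tools git mvn java)
if [ -n "$missing" ]; then
    echo "Missing build tools:" $missing
else
    echo "git, mvn and java are all available"
fi
```

If anything is reported as missing, install it before running the Maven commands.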

Command: cd falcon

Command: export MAVEN_OPTS="-Xmx1024m -XX:MaxPermSize=256m -noverify" && mvn clean install -DskipTests

Command: mvn clean assembly:assembly -DskipTests -DskipITs

Once you have built Falcon, you will find a Falcon package inside the /falcon/distro/target/ directory.

The commands for building Falcon look easy, but you may face a lot of issues before you see the Build Success message. I faced a lot of issues while building it for Hadoop-2.2.0.

So, to spare you the pain of building Falcon, I am providing a successfully built Falcon package, which you can download using the link below.

https://edureka.wistia.com/medias/xw5cfzqmho/download?media_file_id=124642564

Unzip the file to get the falcon-0.10-SNAPSHOT directory.

Command: unzip falcon-0.10-SNAPSHOT.zip

Set the Falcon environment variables in the .bashrc file.

Command: gedit ~/.bashrc

Command: source ~/.bashrc
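The post does not list the exact variables to set, so the lines below are only a sketch of what the .bashrc entries might look like. `FALCON_HOME` is a convenience name chosen here for illustration, and the path assumes the package was unzipped in your home directory; adjust both to match your setup.

```shell
# Assumed install location: the directory produced by unzipping the package.
export FALCON_HOME="$HOME/falcon-0.10-SNAPSHOT"

# Put the Falcon scripts from the bin directory on the PATH.
export PATH="$PATH:$FALCON_HOME/bin"
```

With the bin directory on the PATH, the Falcon scripts can be run from any directory rather than only from inside the Falcon directory.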

You can go to the Falcon directory and see the files and directories inside it.

Command: cd falcon-0.10-SNAPSHOT/

Command: ls

You can find the Falcon scripts inside the bin directory.

Run the command below to start Falcon.

Command: ./bin/falcon-start

You’ll see a new daemon, FalconServer, running now.

Command: jps
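If you would rather check for the daemon from a script than read the jps listing by eye, something like the following could work. It is a sketch under the assumption that the process appears in the jps output under the name FalconServer, as described above; `falcon_running` is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds if the jps output read from stdin
# contains a FalconServer process.
falcon_running() {
    grep -q 'FalconServer'
}

# Only call jps if it is installed; otherwise report the daemon as not found.
if command -v jps >/dev/null 2>&1 && jps | falcon_running; then
    echo "FalconServer is running"
else
    echo "FalconServer not found"
fi
```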

To confirm that the server responds, check its version:

Command: ./bin/falcon admin -version

Open your browser and go to localhost:15000. You should see the Falcon web UI.
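On a machine without a browser, the same check can be approximated from the command line. This is a minimal sketch that assumes curl is installed and that the server listens on the host and port given above; `falcon_http_status` is a hypothetical helper name.

```shell
# Assumed address, taken from the URL mentioned in this post.
FALCON_URL="http://localhost:15000"

# Hypothetical helper: print the HTTP status code returned by the given URL.
# curl prints 000 when it cannot connect at all.
falcon_http_status() {
    curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1" || true
}

status=$(falcon_http_status "$FALCON_URL")
echo "Falcon UI returned HTTP status: $status"
```

A status of 000 suggests that nothing is listening on that port, in which case it is worth checking whether the FalconServer daemon actually started.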

