How to assess and compare Hadoop for Business Intelligence?

Question

I am considering various technologies for data warehousing and business intelligence, and have come upon this radical tool called Hadoop. Hadoop doesn't seem to be built specifically for BI purposes, but there are references to it having potential in this field. I have found little information on the internet, but a friend tells me that Hadoop could become a disruptive technology in the space of traditional BI solutions.

Since information on this topic is sparse, I wanted to gather the gurus' thoughts here on the potential of Hadoop as a BI tool compared to traditional backend BI infrastructure like Oracle Exadata, Vertica, etc. For starters, I would like to ask the following question.

Design Considerations: How would designing a BI solution with Hadoop be different from designing one with traditional tools? I know it should be different, as I read that one cannot create schemas in Hadoop. I also read that a major advantage would be the complete elimination of ETL tools with Hadoop (is this true?). Do we need Hadoop + Pig + Mahout to get a BI solution?

Frankie · Answer

Hadoop is a great tool to have as part of a BI solution. It is not, itself, a BI solution. What Hadoop does is take in Data_A and output Data_B. Whatever is needed for BI but is not in a useful form can be processed using MapReduce and written out in a useful form, be it CSV, Hive, HBase, MSSQL, or anything else used to view data.

I believe Hadoop is supposed to be the ETL tool. That's what we are using it for. We process gigs of log files every hour, store the results in Hive, and run daily aggregations that are loaded into an MSSQL server and viewed through a visualization layer.

The major design considerations I've run up against are:

- Data Flexibility: Do you want your users to view pre-aggregated data, or to have the flexibility to adjust the query and look at the data however they want?
- Speed: How long do you want your users to wait for the data? Hive (for example) is slow. It takes minutes to generate results, even on fairly small data sets. The more data a query has to traverse, the longer it takes to generate a result.
- Visualization: What type of visualization do you want to use? Do you want to custom build a lot of pieces, or be able to use something off the shelf? What constraints and flexibility are needed for your visualization? How flexible and changeable does the visualization need to be?
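To make the "takes in Data_A and outputs Data_B" idea concrete, here is a minimal sketch of the kind of daily log aggregation described above, written in plain Python as a stand-in for a real MapReduce job. It does not use any Hadoop API. The log line format (`timestamp method path status`), the sample data, and all function names are assumptions made up for illustration, not the pipeline from this answer.

```python
import csv
import io
from collections import defaultdict


def map_line(line):
    """Map step: emit ((day, status), 1) for one well-formed log line.

    Assumes a hypothetical format: '<ISO timestamp> <method> <path> <status>'.
    """
    parts = line.split()
    if len(parts) != 4:
        return []  # skip malformed lines instead of failing the whole job
    timestamp, _method, _path, status = parts
    day = timestamp.split("T")[0]
    return [((day, status), 1)]


def reduce_counts(pairs):
    """Reduce step: sum the emitted values for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def aggregate_daily(lines):
    """Run the map step over every line, then reduce the emitted pairs."""
    pairs = []
    for line in lines:
        pairs.extend(map_line(line))
    return reduce_counts(pairs)


def to_csv(totals):
    """Render the aggregates as CSV text that a downstream store could load."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["day", "status", "hits"])
    for (day, status), hits in sorted(totals.items()):
        writer.writerow([day, status, hits])
    return buf.getvalue()


# Hypothetical sample input (Data_A): raw log lines, one of them malformed.
SAMPLE_LOG = [
    "2024-01-15T10:23:45 GET /index.html 200",
    "2024-01-15T10:24:02 GET /missing 404",
    "2024-01-15T11:00:00 GET /index.html 200",
    "2024-01-16T09:00:00 POST /login 200",
    "garbage line",
]

if __name__ == "__main__":
    # Data_B: one pre-aggregated row per (day, status).
    print(to_csv(aggregate_daily(SAMPLE_LOG)))
```

The output is pre-aggregated, which ties back to the Data Flexibility and Speed trade-off: users get fast answers from the small summary table, but can only ask the questions the aggregation anticipated.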


Related Questions In Big Data Hadoop

How to install and configure a multi-node Hadoop cluster?

How to create a FileSystem object that can be used for reading from and writing to HDFS?

How to get started with Hadoop and do some development using Eclipse IDE?

How to find hadoop distribution and version?

Hadoop Mapreduce word count Program

hadoop fs -put command?

Hadoop dfs -ls command?

Is there a way to copy data from one Hadoop distributed file system (HDFS) to another HDFS?

How do I get connected to Hadoop and Geo Spatial connector?

How to choose between Cassandra, Membase, Hadoop, MongoDB and RDBMS?
