Using Hadoop for Data Analytics.


I have a question about using Hadoop in one of my projects. The requirement is that we receive a batch of logs every day containing information about videos (when a video was played, when it stopped, which user played it, etc.).

We have to analyze these files and return statistics in response to an HTTP request. Example request: http://somesite/requestData?startDate=someDate&endDate=anotherDate. This request asks for the count of all videos played within a date range.

My question is: can we use Hadoop to solve this?

I have read in various articles that Hadoop is not real time. So, to handle this scenario, should I use Hadoop in conjunction with MySQL?

What I have thought of doing is to write a MapReduce job that stores the count for each video for each day in MySQL. The Hadoop job can be scheduled to run once a day. The MySQL data can then be used to serve the request in real time.

Is this approach correct? Is Hive useful here in any way? Please provide some guidance.

Sep 28, 2018 in Big Data Hadoop by Neha

1 answer to this question.

Yes, your approach is correct: you can generate the per-day data with a MapReduce job or Hive and store it in MySQL to serve requests in real time.
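
As a rough illustration of the daily aggregation, here is a minimal Python sketch of the map and reduce steps. The log format (tab-separated timestamp, user id, video id, event) and the function names `map_plays` and `reduce_counts` are assumptions made for the example, since the question does not show the actual log layout. It runs locally on a few sample lines; in a real job the same logic would sit in the mapper and reducer.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

# Assumed (hypothetical) log line: ISO timestamp, user id, video id, event,
# separated by tabs, e.g. "2018-09-27T10:15:00\tu42\tv7\tplay".
Key = Tuple[str, str]  # (day, video_id)


def map_plays(lines: Iterable[str]) -> Iterator[Tuple[Key, int]]:
    """Map step: emit ((day, video_id), 1) for every 'play' event."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 4:
            continue  # skip malformed lines
        timestamp, _user_id, video_id, event = fields
        if event == "play":
            # The first 10 characters of an ISO timestamp are YYYY-MM-DD.
            yield (timestamp[:10], video_id), 1


def reduce_counts(pairs: Iterable[Tuple[Key, int]]) -> Dict[Key, int]:
    """Reduce step: sum the emitted values per (day, video_id) key."""
    totals: Dict[Key, int] = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


sample_log = [
    "2018-09-27T10:15:00\tu42\tv7\tplay",
    "2018-09-27T10:20:00\tu42\tv7\tstop",
    "2018-09-27T11:00:00\tu13\tv7\tplay",
    "2018-09-28T09:00:00\tu13\tv9\tplay",
]

daily_counts = reduce_counts(map_plays(sample_log))
for (day, video_id), plays in sorted(daily_counts.items()):
    print(day, video_id, plays)
```

Each output row (day, video, play count) is what the daily job would write to the serving table.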

However, newer versions of Hive configured with Tez can provide decent query performance. You could try storing your per-day data in Hive and serving it directly from there. If the query is a simple SELECT, it should be fast enough.
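
Whichever store holds the per-day table, the date-range request comes down to a sum over pre-aggregated rows. The sketch below uses Python's built-in `sqlite3` purely as a stand-in so that it runs anywhere; the table name `daily_video_plays`, its columns, and the helper `total_plays` are hypothetical, and the sample rows are made up for illustration.

```python
import sqlite3

# sqlite3 is only a stand-in for the real serving store (MySQL or Hive).
# Table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE daily_video_plays ("
    " play_date TEXT NOT NULL,"      # YYYY-MM-DD
    " video_id TEXT NOT NULL,"
    " plays INTEGER NOT NULL,"
    " PRIMARY KEY (play_date, video_id))"
)

# Rows like these would be written by the daily batch job (made-up values).
conn.executemany(
    "INSERT INTO daily_video_plays VALUES (?, ?, ?)",
    [
        ("2018-09-26", "v7", 5),
        ("2018-09-27", "v7", 2),
        ("2018-09-28", "v9", 1),
    ],
)


def total_plays(connection: sqlite3.Connection, start_date: str, end_date: str) -> int:
    """Return the total play count between two dates, both inclusive."""
    row = connection.execute(
        "SELECT COALESCE(SUM(plays), 0) FROM daily_video_plays"
        " WHERE play_date BETWEEN ? AND ?",
        (start_date, end_date),
    ).fetchone()
    return row[0]


print(total_plays(conn, "2018-09-27", "2018-09-28"))
```

Because the dates are stored as YYYY-MM-DD strings, plain string comparison orders them correctly, and the query only touches one row per video per day instead of the raw logs.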
answered Sep 28, 2018 by Frankie
