Looking for project ideas on Hadoop for machine learning and data mining

I am a graduate CS student (data mining and machine learning) with good exposure to core Java (>4 years). I have read a fair amount about Hadoop and Map/Reduce.

I would now like to do a project with these tools (in my free time, of course) to get a better understanding.

Any good project ideas would be really appreciated. I just wanna do this to learn, so I don't really mind re-inventing the wheel. Also, anything related to data mining/machine learning would be an added bonus (fits with my research) but absolutely not necessary.
Aug 13, 2018 in Big Data Hadoop by Neha

1 answer to this question.


You haven't said much about your specific interests. I know that graph mining algorithms have been implemented on top of the Hadoop framework. The PEGASUS software (http://www.cs.cmu.edu/~pegasus/) and the paper "PEGASUS: A Peta-Scale Graph Mining System - Implementation and Observations" may give you a starting point.

Further, this link discusses something similar to your question, although its example is in Python: http://atbrox.com/2010/02/08/parallel-machine-learning-for-hadoopmapreduce-a-python-example/. There is also a very good paper co-authored by Andrew Ng, "Map-Reduce for Machine Learning on Multicore".
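To give a feel for why many learning algorithms fit the map/reduce model, here is a minimal sketch of the "summation form" idea discussed in that paper: when a model can be fitted from sums over the data points, each mapper can compute partial sums on its own shard and a reducer can add them up. The example fits a one-variable least-squares line. It is plain Python with no Hadoop dependency, and the function names (`map_partial_sums`, `reduce_sums`, `fit_line`) and the toy data are made up for illustration, not taken from any of the linked resources.

```python
from functools import reduce


def map_partial_sums(shard):
    """Mapper: one pass over a shard of (x, y) pairs, returning partial sums."""
    n = sx = sy = sxx = sxy = 0.0
    for x, y in shard:
        n += 1
        sx += x
        sy += y
        sxx += x * x
        sxy += x * y
    return (n, sx, sy, sxx, sxy)


def reduce_sums(a, b):
    """Reducer: combine two tuples of partial sums element by element."""
    return tuple(u + v for u, v in zip(a, b))


def fit_line(shards):
    """Fit y = slope * x + intercept from the globally aggregated sums."""
    n, sx, sy, sxx, sxy = reduce(reduce_sums, map(map_partial_sums, shards))
    denom = n * sxx - sx * sx  # zero if all x values are identical
    slope = (n * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / n
    return slope, intercept


# Hypothetical toy data split across two "nodes"; every point lies on y = 2x + 1.
shards = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0)],
]
slope, intercept = fit_line(shards)
print(slope, intercept)  # 2.0 1.0
```

Because the reducer only adds fixed-size tuples, the result does not depend on how the data is split across shards. In a real Hadoop job the same two functions would roughly correspond to a Mapper and a Reducer (or Combiner) class, which could be a reasonable first project in Java.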

There was also a NIPS 2009 workshop on a similar topic, "Large-Scale Machine Learning: Parallelism and Massive Datasets". You can browse some of the papers to get an idea.

Edit: There is also Apache Mahout (http://mahout.apache.org/): "Our core algorithms for clustering, classification and batch based collaborative filtering are implemented on top of Apache Hadoop using the map/reduce paradigm"

answered Aug 13, 2018 by Frankie
