What is the best functional language to do Hadoop Map-Reduce?

Question

I'm doing an assignment for a course that requires me to implement a parallel MapReduce engine in a functional language and then use it to solve certain simple problems.

Which functional language do you think I should use? Here are my requirements:

- It should be relatively easy to learn, since I have only about 2 weeks for this assignment.
- It should have existing MapReduce implementations that can be found on the web. My course does not forbid me from using open-source code or internet resources in general.
- It should fit the problem and be an overall worthwhile language to learn (a relatively popular language).

I am currently considering Haskell and Clojure, but both languages are new to me, and I have no idea whether either of them is actually appropriate for the situation.

Frankie · Answer

Both Clojure and Haskell are definitely worth learning, for different reasons. If you get a chance, I would try both. I'd also suggest adding Scala to your list.

If you have to pick one, I would choose Clojure, for the following reasons:

- It's a Lisp, and everyone should learn a Lisp.
- It has a unique approach to concurrency; see http://www.infoq.com/presentations/Value-Identity-State-Rich-Hickey
- It's a JVM language, which makes it immediately useful from a practical perspective: the library & tool ecosystem on the JVM is extremely good, better than any other platform IMHO. If you want to do serious tech work in the enterprise or startup space, it is very helpful to gain a good knowledge of the JVM. FWIW, Scala also falls into this category of "interesting JVM languages".

Also, Clojure makes parallel map-reduce very easy. Here's one to start with:

(reduce + (pmap inc (range 1000)))
=> 500500

Using pmap rather than map is enough to give you a parallel mapping operation. There are also parallel reducers if you use Clojure 1.5 or later; see the reducers framework for more details.

Apart from that, you can also use Scalding, a Scala abstraction on top of Cascading that hides low-level Hadoop details. It was developed at Twitter and seems mature enough today that you can start using it without too much trouble.

Here is an example of how you would do a word count in Scalding:

package com.twitter.scalding.examples

import com.twitter.scalding._

class WordCountJob(args : Args) extends Job(args) {
  TextLine( args("input") )
    .flatMap('line -> 'word) { line : String => tokenize(line) }
    .groupBy('word) { _.size }
    .write( Tsv( args("output") ) )

  // Split a piece of text into individual words.
  def tokenize(text : String) : Array[String] = {
    // Lowercase each word and remove punctuation.
    text.toLowerCase.replaceAll("[^a-zA-Z0-9\\s]", "").split("\\s+")
  }
}

I think it's a good candidate: because it uses Scala, it's not too far from regular Java Map/Reduce programs, and even if you don't know Scala, it's not too hard to pick up.
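
Since the assignment asks for a parallel MapReduce engine rather than just a Hadoop job, here is a minimal in-memory sketch of the idea in Scala, using only the standard library. It is not Scalding or Hadoop code and is not distributed: the names MiniMapReduce, mapReduce and wordCount are hypothetical, the map phase simply runs one Future per input on the global thread pool, and the one-minute timeout is an arbitrary assumption. The tokenize helper follows the one in the Scalding job above, with an extra filter for empty tokens.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical toy engine: illustrates the map / shuffle / reduce phases
// in a single JVM. It is a sketch, not a replacement for Hadoop.
object MiniMapReduce {

  def mapReduce[A, K, V](inputs: Seq[A])(mapper: A => Seq[(K, V)])(reducer: (V, V) => V): Map[K, V] = {
    // Map phase: apply `mapper` to every input in parallel, one Future each.
    val mapped: Future[Seq[Seq[(K, V)]]] =
      Future.traverse(inputs)(in => Future(mapper(in)))
    // Assumption: one minute is plenty for small in-memory inputs.
    val pairs: Seq[(K, V)] = Await.result(mapped, 1.minute).flatten

    // Shuffle phase: collect all values emitted for the same key.
    val grouped: Map[K, Seq[V]] =
      pairs.groupBy(_._1).map { case (k, kvs) => k -> kvs.map(_._2) }

    // Reduce phase: combine the values of each key (groups are never empty).
    grouped.map { case (k, vs) => k -> vs.reduce(reducer) }
  }

  // Lowercase the text, strip punctuation, and split on whitespace.
  // The filter drops the empty token produced by leading whitespace.
  def tokenize(text: String): Array[String] =
    text.toLowerCase.replaceAll("[^a-zA-Z0-9\\s]", "").split("\\s+").filter(_.nonEmpty)

  // Word count expressed with the toy engine: emit (word, 1), then sum.
  def wordCount(lines: Seq[String]): Map[String, Int] =
    mapReduce(lines)(line => tokenize(line).toSeq.map(word => (word, 1)))(_ + _)

  def main(args: Array[String]): Unit = {
    val counts = wordCount(Seq("Hello, world!", "hello Hadoop"))
    counts.foreach { case (word, n) => println(s"$word\t$n") }
  }
}
```

With this sketch, the Clojure one-liner above corresponds roughly to MiniMapReduce.mapReduce(0 until 1000)(x => Seq("sum" -> (x + 1)))(_ + _), which yields a map with the single entry "sum" -> 500500. A real engine would also need to partition the reduce phase across workers, which this version does sequentially.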


