Other Distributed Alternatives to Big Data Hadoop

Question

I have recently started working with Hadoop and have a question out of curiosity.

Are there other distributed, scalable solutions that could serve as alternatives to Hadoop? Basically, I am referring to implementations similar to HDFS that use commodity hardware and provide fault-tolerant storage. There should also be a processing engine on top of it to perform batch and real-time processing. I have heard about Spark as an alternative, but I want a data warehousing solution that is distributed, fault tolerant, and scalable. Suggestions are welcome. Thanks :)

coldcode · Answer 1 · Mar 27, 2018

Yes, there are a lot of alternatives to Hadoop that provide scalable, fault-tolerant, and cost-effective solutions to Big Data problems. Let me list a few for you:

Cluster MapReduce: Developed by Chitika, a Massachusetts-based online ad company, Cluster MapReduce provides a Hadoop-like framework for running MapReduce jobs in a distributed environment.
HPCC (High Performance Computing Cluster): An open source parallel processing platform that incorporates a data refinery cluster called Thor and a query cluster called Roxie, plus middleware components, external communications, and client interfaces.
Hydra: A distributed task processing system developed by the social bookmarking service AddThis. It is available under the open source Apache license and can tackle some Big Data tasks that Hadoop struggles with.

Some other notable mentions are Sphere and Riak.
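
For context, the MapReduce model that Hadoop and Cluster MapReduce both build on can be sketched in a few lines of plain Python. This is only a single-process illustration of the map, shuffle, and reduce steps, not the API of any tool listed above; the function names (map_phase, shuffle, reduce_phase, word_count) are made up for this example.

```python
from collections import defaultdict


def map_phase(record):
    """Emit a (word, 1) pair for every word in one input line."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, the step a framework performs between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all counts emitted for one word into a single total."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle, and reduce in sequence on an in-memory list of lines."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data on hadoop", "big data without hadoop"]))
```

In a real cluster the map and reduce calls run on many machines and the shuffle moves data across the network, which is where the fault tolerance and scalability you asked about come in.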