Growth of data is a big challenge in today’s world. Across the globe, we are generating terabytes and petabytes of data every day. Go on a vacation or buy a product, and you end up creating some data. You take photos and upload them to Facebook, tweet about your trip, and create lots of emails and documents on the web. All this data is now available for us to store and analyze.
The data is growing in volume, and we need a mechanism with which we can not only store the data but also process these huge volumes and get some insight out of them. That’s where big data technologies come into the picture: they help you process huge volumes of data and extract meaningful insight from it.
Sources of Big Data
When we talk about big data, it comes from several different sources:
- Transaction data from systems such as OLTP, SAP, and other ERP applications
- Data from social media, like Facebook, Twitter, and other web interfaces, as well as email
- Different kinds of documents prepared whenever there’s a new release of a product
- Mobile phones, another source from which a lot of data is generated
All the data generated from these sources gives rise to serious Big Data challenges. To store and analyze such large data sets, we need a mechanism that can keep pace with this rate of data generation and can analyze the data to build an understanding of it.
Emergence of Hadoop
The good thing is that, to handle all this big data, we have a technology like Hadoop, which provides a framework like MapReduce, tools such as Hive and Pig, and NoSQL databases such as HBase and Cassandra. Using these, we can process huge volumes of data with distributed computing. The challenge, then, is how to understand, store, and analyze the data.
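The MapReduce model mentioned above can be illustrated with a toy word count in plain Python. This is only a single-machine sketch of the idea, not Hadoop itself; the function names (`map_phase`, `shuffle`, `reduce_phase`) are illustrative, but they mirror the three stages a Hadoop job goes through.

```python
from collections import defaultdict

def map_phase(documents):
    # Map stage: emit a (word, 1) pair for every word in every document.
    for doc in documents:
        for word in doc.split():
            yield (word, 1)

def shuffle(pairs):
    # Shuffle stage: group all values by key, as Hadoop does
    # between the map and reduce stages.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    # Reduce stage: sum the counts for each word.
    return {word: sum(counts) for word, counts in grouped.items()}

docs = ["big data is big", "hadoop processes big data"]
counts = reduce_phase(shuffle(map_phase(docs)))
print(counts)  # {'big': 3, 'data': 2, 'is': 1, 'hadoop': 1, 'processes': 1}
```

On a real cluster, the map and reduce stages run in parallel on many machines, each handling a slice of the data; that parallelism is what lets the same simple model scale to terabytes.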
Since the beginning of the Big Data era, data volumes have been growing and the problem of processing them has been mounting. Initially, organizations with huge volumes of data tried to solve this problem using existing RDBMS solutions, but those could not scale out in terms of speed. The need to process big data and get insights out of it led to a technology like Hadoop, and its widespread use has now reduced Big Data challenges to a large extent.
Got a question for us? Mention it in the comments section and we will get back to you.