Most answered questions in Big Data Hadoop

0 votes
1 answer

Hive query: Join tables based on ID

After creating the tables a1 and b1 ...READ MORE

Jul 25, 2019 in Big Data Hadoop by Tarun
1,016 views
0 votes
1 answer

How can I append data to an existing file in HDFS?

You have to do some configurations as ...READ MORE

Jul 25, 2019 in Big Data Hadoop by ravikiran
• 4,620 points
9,172 views
0 votes
1 answer

Limit for Namenode Quantity

Each file schema = 150 bytes; block schema ...READ MORE

Jul 25, 2019 in Big Data Hadoop by ravikiran
• 4,620 points
506 views
0 votes
1 answer

Current Date in Hive

Try this instead: select from_unixtime(unix_timestamp()); If you have an ...READ MORE

Jul 25, 2019 in Big Data Hadoop by Tina
2,961 views
0 votes
1 answer

Frequently used classes implementing InputFormat

FileInputFormat : Base class for all file-based InputFormats Other ...READ MORE

Jul 24, 2019 in Big Data Hadoop by Reshma
814 views
0 votes
1 answer

Mapreduce: custom Input format

Here's a list of Input Formats: CombineFileInputFormat CombineS ...READ MORE

Jul 24, 2019 in Big Data Hadoop by Veer
1,213 views
0 votes
1 answer

Hive: unable to load data into table

The command you are typing is incorrect. ...READ MORE

Jul 24, 2019 in Big Data Hadoop by Firoz
1,579 views
0 votes
1 answer

Index LZO files

You can do it using the following ...READ MORE

Jul 24, 2019 in Big Data Hadoop by Rishi
1,003 views
0 votes
1 answer

Produce compressed data from map reduce

It is straightforward and you can achieve ...READ MORE

Jul 24, 2019 in Big Data Hadoop by John
719 views
0 votes
1 answer

Running Mapreduce on compressed data

It is very straightforward, no need ...READ MORE

Jul 24, 2019 in Big Data Hadoop by Nanda
1,092 views
0 votes
1 answer

How to Compress Map output?

With MR2, we should now set conf.setBoolean("mapreduce.map.output.compress", true)  conf.set("mapreduce.output.fileoutputformat.compress", ...READ MORE

Jul 24, 2019 in Big Data Hadoop by Varun
1,515 views
+1 vote
1 answer
0 votes
1 answer

How to download Large Hadoop Data? [closed]

The best thing with the Million Song Dataset ...READ MORE

Jul 23, 2019 in Big Data Hadoop by ravikiran
• 4,620 points

edited Jul 25, 2019 by ravikiran
1,206 views
0 votes
1 answer

How can we retrieve/get the complete HQL Hive query from Hive, Spark and Tez?

To get the full query running for the ...READ MORE

Jul 23, 2019 in Big Data Hadoop by Lohit
3,727 views
0 votes
1 answer

HBase shell failed to connect

Enter the below command in the terminal ...READ MORE

Jul 23, 2019 in Big Data Hadoop by Joshua
1,838 views
0 votes
1 answer

Unable to run Sqoop script to copy data to Cassandra

Unfortunately, this can't be achieved with open ...READ MORE

Jul 23, 2019 in Big Data Hadoop by Shri
1,251 views
0 votes
1 answer

How to import data into an HBase database?

First you have to have the file ...READ MORE

Jul 23, 2019 in Big Data Hadoop by Kiran
3,215 views
0 votes
1 answer

How to restart failed Namenode?

You need to solve the issue which ...READ MORE

Jul 23, 2019 in Big Data Hadoop by Ishan
3,546 views
0 votes
1 answer

Name node RAM metadata

For the above requirement, the memory consumption ...READ MORE

Jul 23, 2019 in Big Data Hadoop by Reshma
1,654 views
0 votes
1 answer

How can we ignore header line while loading data into Pig?

You can use the following code: A = ...READ MORE

Jul 22, 2019 in Big Data Hadoop by kiran
758 views
0 votes
1 answer

How to use Sqoop in a Java Program?

You can use the following sample code for ...READ MORE

Jul 22, 2019 in Big Data Hadoop by ravikiran
• 4,620 points
1,980 views
0 votes
1 answer

How many partitions can one table have?

Well, there are two kinds of partitions: 1. ...READ MORE

Jul 22, 2019 in Big Data Hadoop by Kunal
685 views
0 votes
1 answer

Can we use different input and output format classes?

Yes, InputFormatClass and OutputFormatClass are independent of ...READ MORE

Jul 22, 2019 in Big Data Hadoop by Jishan
786 views
0 votes
1 answer

Why do we use job.waitForCompletion(true) ?

The main reason job.waitForCompletion exists is that ...READ MORE

Jul 22, 2019 in Big Data Hadoop by Kiran
1,181 views
0 votes
1 answer

Mapreduce: What is the use of setting the name of the job?

Job job = new Job(conf,"job_name"), is just ...READ MORE

Jul 22, 2019 in Big Data Hadoop by Madhu
1,354 views
0 votes
1 answer

Output types of mapper and reducer do not match

job.setOutputValueClass will set the types expected as ...READ MORE

Jul 22, 2019 in Big Data Hadoop by Reena
3,018 views
0 votes
1 answer

Not a host:port pair: PBUF?

Hey. This error usually occurs when the ...READ MORE

Jul 22, 2019 in Big Data Hadoop by Esha
1,533 views
0 votes
1 answer

Multiple Output Format in Hadoop

Each reducer uses an OutputFormat to write ...READ MORE

Jul 19, 2019 in Big Data Hadoop by ravikiran
• 4,620 points
606 views
0 votes
1 answer

How to fix corrupt files on HDFS

1 - Spark follows a slave/master architecture. So ...READ MORE

Jul 18, 2019 in Big Data Hadoop by ravikiran
• 4,620 points
1,399 views
0 votes
1 answer

Sqoop export not working

The issue that you might be getting ...READ MORE

Jul 16, 2019 in Big Data Hadoop by Tina
3,138 views
0 votes
1 answer

Spark Vs Hive LLAP Question

While Apache Hive and Spark SQL perform ...READ MORE

Jul 16, 2019 in Big Data Hadoop by Karan
3,885 views
0 votes
1 answer

Pig: Difference between inner bag and outer bag

Outer Bag: An outer bag is nothing but ...READ MORE

Jul 16, 2019 in Big Data Hadoop by Firoz
3,194 views
0 votes
1 answer

CSV integration with Hadoop

For integrating Hadoop with CSV, we can use ...READ MORE

Jul 16, 2019 in Big Data Hadoop by Krish
1,251 views
0 votes
1 answer

RDBMS integration with Hadoop

About integrating RDBMS with Hadoop, you can ...READ MORE

Jul 16, 2019 in Big Data Hadoop by Nanda
732 views
0 votes
1 answer

How to execute combiner and partitioning program without adding package name?

Yes, it is possible to do so ...READ MORE

Jul 16, 2019 in Big Data Hadoop by Raman
1,076 views
0 votes
1 answer

Output Splitting problem in Hadoop

When you are loading two different files, ...READ MORE

Jul 16, 2019 in Big Data Hadoop by Sayni
1,329 views
0 votes
1 answer

Copy file from local to hdfs from the spark job in yarn mode

Please refer to the below code: import org.apache.hadoop.conf.Configuration import ...READ MORE

Jul 16, 2019 in Big Data Hadoop by Raj
11,801 views
0 votes
1 answer

How does data get split in Sqoop?

I will drop the answer in the ...READ MORE

Jul 16, 2019 in Big Data Hadoop by ravikiran
• 4,620 points
9,098 views
0 votes
1 answer

How to store images and videos in HDFS?

HDFS is capable of accepting data in ...READ MORE

Jul 16, 2019 in Big Data Hadoop by ravikiran
• 4,620 points
3,237 views
0 votes
1 answer

Pig script not dumping the data correctly

The first column is denoted by $0, ...READ MORE

Jul 15, 2019 in Big Data Hadoop by George
1,355 views
0 votes
1 answer

How to Import the complete directory into HDFS?

First, you need to create a directory in Hadoop: $ ...READ MORE

Jul 15, 2019 in Big Data Hadoop by Hiran
902 views
0 votes
1 answer

Which one is better, MR or Spark?

The above difference clearly points out that ...READ MORE

Jul 15, 2019 in Big Data Hadoop by Daniel
1,952 views
0 votes
1 answer

Sqoop: -Dsqoop.export.records.per.statement option

-Dsqoop.export.records.per.statement=1 is, as the name suggests, how many ...READ MORE

Jul 15, 2019 in Big Data Hadoop by Krish
3,484 views
0 votes
1 answer

Load data into Teradata using Sqoop

The general syntax to do this is as ...READ MORE

Jul 15, 2019 in Big Data Hadoop by Ritu
2,300 views
0 votes
1 answer

Import JSON file into Hive

There are two ways to load JSON ...READ MORE

Jul 15, 2019 in Big Data Hadoop by Guru
14,936 views
0 votes
1 answer

Hive Query to sort data

If you are trying to sort first ...READ MORE

Jul 14, 2019 in Big Data Hadoop by Tina
1,069 views
0 votes
1 answer

Creating a Hive script and executing it in the Edureka cloudlab.

Please remove the -f option from the hive arguments and use the hql extension ...READ MORE

Jul 14, 2019 in Big Data Hadoop by Karan
918 views
0 votes
1 answer

Unable to Locate WinUtils Library in Hadoop binary path.

If you are facing this problem while running a ...READ MORE

Jul 11, 2019 in Big Data Hadoop by ravikiran
• 4,620 points
2,211 views
0 votes
1 answer

Primary keys in Apache Spark

from pyspark.sql.functions import monotonically_increasing_id df.withColumn("id", monotonically_increasing_id()).show() Verify the second ...READ MORE

Jul 11, 2019 in Big Data Hadoop by ravikiran
• 4,620 points
1,550 views
0 votes
1 answer

How to remove duplicate records from Hive table?

A record is a duplicate if there are ...READ MORE

Jul 11, 2019 in Big Data Hadoop by Bhuvan
10,106 views