Bucketing in Hive

Question

How many buckets does Hive create by default in the HDFS location when data is inserted, if no buckets are defined in the CREATE TABLE statement?

Omkar · Answer 1 · Feb 11, 2019

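Hive has no default bucket count. If the CREATE TABLE statement has no CLUSTERED BY (col) INTO n BUCKETS clause, the table is simply not bucketed: rows are not hash-distributed, so the data effectively behaves like a single bucket, and the number of files written to the HDFS location depends on how many tasks produce the insert output. That is not efficient for joins or sampling on large tables. When you do define buckets, each bucket is typically stored as one file per table (or per partition), so the number of buckets drives the number of files. Too many buckets (for example, one bucket for every small file's worth of data) makes storage very inefficient, because HDFS handles many small files poorly. So the optimal number of buckets should be decided based on the total size of the data and the file size you want each bucket to have.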
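To make the trade-off concrete, here is a minimal sketch of how rows could be spread across bucket files. It is a simplified model, not Hive's actual code: it assumes an integer bucketing column whose hash is the value itself and a bucket index computed as the non-negative hash modulo the bucket count (newer Hive versions use a different hash function). The names `bucket_for`, `bucket_sizes`, and `user_ids` are made up for illustration.

```python
from collections import Counter


def bucket_for(key: int, num_buckets: int) -> int:
    """Return the bucket index for an integer key (simplified model)."""
    # Assumption: the hash of an integer key is the key itself.
    # Mask to a non-negative 31-bit value, then take the remainder.
    return (key & 0x7FFFFFFF) % num_buckets


def bucket_sizes(keys, num_buckets: int) -> list:
    """Count how many rows land in each bucket, i.e. in each bucket file."""
    counts = Counter(bucket_for(k, num_buckets) for k in keys)
    return [counts.get(b, 0) for b in range(num_buckets)]


# Hypothetical sample data: 1,000 rows keyed by user id 1..1000.
user_ids = range(1, 1001)

# 4 buckets: a few larger files.
print(bucket_sizes(user_ids, 4))

# 64 buckets: many small files holding only a handful of rows each.
print(bucket_sizes(user_ids, 64)[:4])
```

With the same 1,000 rows, 4 buckets hold 250 rows each, while 64 buckets hold only 15 or 16 rows each. On real data the same effect shows up as many tiny files, which is why the bucket count should follow the data size.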

