How to check size of HDFS directory?

0 votes

On Linux filesystems we use du -sh. Is there a similar way to check the size of a directory on HDFS?

May 3, 2018 in Big Data Hadoop by kurt_cobain
• 9,240 points
7,059 views

11 answers to this question.

0 votes

You can view the size of the files and directories in a specific directory with the du command. The command shows the space (in bytes) used by the files that match the file pattern you specify; if the path is a single file, you get that file's length. The syntax of the du command is as follows:

hdfs dfs -du -h /path/to/hdfs/directory


Note the following about the output of the du -h command shown here:

The first column shows the raw size of the files that users have placed in the various HDFS directories.

The second column shows the total space those files consume on disk across all of their replicas (raw size × replication factor).
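As a sketch, the two columns can be read like this; the path and sizes below are made up, and the pipeline just relabels a sample `-du -h` line locally, with no cluster needed:

```shell
# A made-up `hdfs dfs -du -h /user` output line: raw size, then
# space consumed on disk across all replicas, then the path.
printf '128.0 M  384.0 M  /user/alice/data\n' \
  | awk '{ print "raw:", $1 $2, "| on-disk (x replication):", $3 $4, "|", $5 }'
```

With the default replication factor of 3, the second column is typically three times the first.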

Hope this will answer your query to some extent.

answered May 3, 2018 by nitinrawat895
• 10,510 points
0 votes
hdfs dfs -du [-s] [-h] URI [URI …]

answered Dec 7, 2018 by Nishant
0 votes

hadoop fs -du -s -h /path/to/dir

answered Dec 7, 2018 by abhijeet
0 votes

To get the size in GB, you can try this (note that `**` is a GNU awk extension; the portable exponent operator is `^`):

hdfs dfs -du PATHTODIRECTORY | awk '/^[0-9]+/ { print int($1/(1024**3)) " [GB]\t" $2 }'
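The awk arithmetic can be checked locally on made-up byte counts (no cluster needed); `^` is used here as the portable exponent operator:

```shell
# Simulated `hdfs dfs -du` output lines: <bytes> <path>
# 5368709120 bytes = 5 GiB; 536870912 bytes = 0.5 GiB (truncated to 0 by int()).
printf '5368709120 /data/logs\n536870912 /data/tmp\n' \
  | awk '/^[0-9]+/ { print int($1/(1024^3)) " [GB]\t" $2 }'
```

Because of the int() truncation, directories smaller than 1 GiB all report 0 [GB].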
answered Dec 7, 2018 by Narayan
0 votes
hadoop fs -du /user/hadoop/dir1 \
    /user/hadoop/file1 \
    hdfs://domain.com/user/hadoop/dir1 
answered Dec 7, 2018 by Nisha
0 votes
hdfs dfs -du -s -h /$DirectoryName
answered Dec 7, 2018 by Chunnu
0 votes

The following reports overall filesystem usage (size, used, available, and Use%) rather than the size of a single directory:

sudo -u hdfs hadoop fs -df
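The Use% figure is just used divided by total size; as a sketch on made-up numbers (the namenode URI and byte counts are invented for illustration):

```shell
# Made-up `hadoop fs -df` row: <filesystem> <size-bytes> <used-bytes> <available-bytes>
printf 'hdfs://nn:8020 1000000000 250000000 750000000\n' \
  | awk '{ printf "%s Use%%: %.0f%%\n", $1, 100 * $3 / $2 }'
```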
answered Dec 7, 2018 by Khush
0 votes

To check the size under a particular directory:

sudo -u hdfs hadoop fs -du -h /user
answered Dec 7, 2018 by Bunty
0 votes
hdfs dfs -du -s dir_name
answered Dec 7, 2018 by Yadav
0 votes

Another way to show the size in GB (note that `-dus` is deprecated; `hadoop fs -du -s` is the modern equivalent):

hadoop fs -dus /path/to/dir | awk '{print $2/1024**3 " G"}'
answered Dec 7, 2018 by Anil
+1 vote

It is the same syntax as on Linux. Use the following command:

hadoop fs -du -s [DIR_NAME]
answered Jun 6 by Sowmya
