How to resolve corrupted blocks in Hadoop?

Question

Hi. I am new to Hadoop and have stored some data on HDFS. Now the data in some of the blocks has been corrupted. How can I recover it?

Nandini · Answer

You can use

    hdfs fsck /

to determine which files are having problems. Look through the output for missing or corrupt blocks (ignore under-replicated blocks for now). This command is very verbose, especially on a large HDFS filesystem, so I normally get down to the meaningful output with

    hdfs fsck / | egrep -v '^\.+$' | grep -v eplica

which ignores lines with nothing but dots and lines talking about replication.

Once you find a file that is corrupt, run

    hdfs fsck /path/to/corrupt/file -locations -blocks -files

Use that output to determine where the blocks might live. If the file is larger than your block size, it will span multiple blocks.

You can use the reported block numbers to search the datanodes and the namenode logs for the machine or machines on which the blocks lived. Try looking for filesystem errors on those machines: missing mount points, a datanode that is not running, a file system that was reformatted or reprovisioned. If you can find a problem that way and bring the block back online, that file will be healthy again.

Lather, rinse, and repeat until all files are healthy or you have exhausted all alternatives in looking for the blocks.

Once you have determined what happened and cannot recover any more blocks, use the

    hdfs dfs -rm /path/to/file/with/permanently/missing/blocks

command to get your HDFS filesystem back to a healthy state, so you can start tracking new errors as they occur.
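The filtering step above can be wrapped in a small script so that only the affected file paths remain. This is a minimal sketch, not a definitive tool: the function names `filter_fsck` and `corrupt_paths` are made up for illustration, and the script assumes that fsck reports problem files on lines shaped like `<path>: CORRUPT ...` or `<path>: MISSING ...`, which may differ between Hadoop versions.

```shell
#!/usr/bin/env bash
# Sketch: reduce `hdfs fsck /` output to the unique paths of files that
# fsck flags as CORRUPT or MISSING. Function names are hypothetical.

# Drop progress lines made up only of dots, and lines about replication
# (the same filter as the egrep/grep pipeline in the answer).
filter_fsck() {
  grep -Ev '^\.+$' | grep -v 'eplica'
}

# Print the unique file paths found on lines flagged CORRUPT or MISSING.
# Assumption: such lines look like "<path>: CORRUPT ..." or "<path>: MISSING ...".
corrupt_paths() {
  filter_fsck \
    | grep -E ': (CORRUPT|MISSING) ' \
    | sed -E 's/: (CORRUPT|MISSING) .*$//' \
    | sort -u
}

# Usage against a live cluster (not executed here):
#   hdfs fsck / | corrupt_paths
```

Each path it prints can then be passed to `hdfs fsck <path> -locations -blocks -files` to see where the blocks were stored. Keeping the parsing in functions that read standard input means the logic can be checked against saved fsck output without touching the cluster.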


