How are blocks created while writing a file in HDFS?

Suppose I have a 1 GB file that I want to store in HDFS. When I copy it to HDFS, how is the file divided and stored?
Dec 21, 2018 in Big Data Hadoop by slayer

1 answer to this question.

Suppose we want to write a 1 GB file to HDFS. That 1 GB file is broken into multiple blocks of 128 MB each (the default block size).
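A 1 GB file is 1024 MB, so at the default block size it works out to 1024 / 128 = 8 blocks. As a rough sketch (the paths below are placeholders, not from the question), you can copy a file in and then ask HDFS to list its blocks:

    # copy the local file into HDFS
    hdfs dfs -put /local/path/bigfile.dat /user/hadoop/bigfile.dat

    # list the blocks the file was split into
    hdfs fsck /user/hadoop/bigfile.dat -files -blocks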

The write operation takes place in a pipeline.

Each block of data is written to a DataNode and then replicated to other DataNodes; by default, the replication factor is 3, so every block ends up on three DataNodes.
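For example (same placeholder path as above), you can check and change the replication factor of a file from the command line:

    # the second column of the listing is the replication factor
    hdfs dfs -ls /user/hadoop/bigfile.dat

    # change the replication factor of the existing file to 2 and wait for it to finish
    hdfs dfs -setrep -w 2 /user/hadoop/bigfile.dat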

So basically the replicas are also written as 128 MB blocks, but at least one replica is placed on a DataNode in a different rack. The data is not compressed or reduced in this process.
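To see where the replicas of each block actually landed (again a sketch using the placeholder path), fsck can print the DataNode locations and their racks:

    # show each block, the DataNodes holding its replicas, and their rack topology
    hdfs fsck /user/hadoop/bigfile.dat -files -blocks -locations -racks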

Once the client finishes writing one particular block on all three nodes, it repeats the same process for the next block of data.

Thus the writing happens block by block, in a pipeline.
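If you want to experiment with the defaults mentioned above, both the block size and the replication factor can be overridden for a single write (a sketch with placeholder paths; 67108864 bytes = 64 MB):

    # write with a 64 MB block size and replication factor 2 instead of the defaults
    hdfs dfs -D dfs.blocksize=67108864 -D dfs.replication=2 -put /local/path/bigfile.dat /user/hadoop/bigfile2.dat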
answered Dec 21, 2018 by Omkar
