Re: What is default replication factor in Hadoop( big-data-h
The default replication factor in Hadoop HDFS is 3, and it is also the commonly recommended value. It can be changed through the dfs.replication property in the hdfs-site.xml configuration file.
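To make the configuration part concrete, here is a minimal sketch in Python. It embeds a sample hdfs-site.xml fragment and reads dfs.replication from it, falling back to 3 when the property is not set. The helper name read_replication and the sample value 2 are only illustrations and are not part of Hadoop itself.

```python
import xml.etree.ElementTree as ET

# Sample hdfs-site.xml content. The value 2 is only an illustration.
SAMPLE_HDFS_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
"""

# HDFS uses 3 when dfs.replication is not overridden.
HDFS_DEFAULT_REPLICATION = 3


def read_replication(xml_text: str) -> int:
    """Return dfs.replication from an hdfs-site.xml string (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name", "").strip() == "dfs.replication":
            return int(prop.findtext("value", "").strip())
    # Property not present: fall back to the HDFS default.
    return HDFS_DEFAULT_REPLICATION


if __name__ == "__main__":
    print(read_replication(SAMPLE_HDFS_SITE))
    print(read_replication("<configuration></configuration>"))
```

Note that the value in hdfs-site.xml applies to files written after the change. For files that already exist, the replication factor can be changed with the hdfs dfs -setrep command.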
Why 3 is a good replication factor: HDFS stores multiple copies of every data block on different nodes of the cluster. (Input splits are a logical MapReduce concept and are not what gets replicated.) Because HDFS is rack aware, the copies are also spread across racks: with the default placement policy, one replica goes on a node in one rack and the other two go on two different nodes of a second rack. So even if a whole rack goes down, the block can still be read from a node on the other rack. Likewise, if any two of the Datanodes holding a block are unavailable at the same time, the third Datanode still has a copy of the block.
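The fault tolerance argument can be checked with a small simulation. This is only a simplified model of rack-aware placement (first replica on the writer's node, second on a node in another rack, third on a different node of that same remote rack), not the actual NameNode code. The cluster layout and the function names place_replicas and surviving are assumptions made up for the example.

```python
import random

# Hypothetical two-rack cluster: rack name -> Datanodes in that rack.
CLUSTER = {
    "rack1": ["dn1", "dn2", "dn3"],
    "rack2": ["dn4", "dn5", "dn6"],
}


def place_replicas(cluster, writer_node, rng):
    """Pick 3 Datanodes for one block using a simplified rack-aware policy."""
    rack_of = {node: rack for rack, nodes in cluster.items() for node in nodes}
    local_rack = rack_of[writer_node]
    # Replica 1: the node that is writing the block.
    first = writer_node
    # Replica 2: a node on a different rack.
    remote_rack = rng.choice([r for r in cluster if r != local_rack])
    second = rng.choice(cluster[remote_rack])
    # Replica 3: another node on the same remote rack as replica 2.
    third = rng.choice([n for n in cluster[remote_rack] if n != second])
    return [first, second, third]


def surviving(replicas, failed_nodes):
    """Return the replicas that are still reachable after the given failures."""
    return [node for node in replicas if node not in failed_nodes]


if __name__ == "__main__":
    rng = random.Random(0)
    replicas = place_replicas(CLUSTER, "dn1", rng)
    print("replicas:", replicas)
    # Lose a whole rack at a time and see what is left.
    for rack, nodes in CLUSTER.items():
        print(rack, "down ->", surviving(replicas, set(nodes)))
```

In this model, losing either rack, or any two Datanodes, still leaves at least one copy of the block. With a replication factor of 2 that would no longer hold if both replica nodes failed together.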