This blog explains why hard disk drives moved from 512-byte sectors to 4096-byte (4K) sectors, and why a 4K sector disk is a good choice for OS installation. It first describes the sector layout to show why the migration was needed, then gives the reasoning behind the migration, and finally covers the benefits of 4K sector drives over 512-byte sector drives.
A sector is the minimum storage unit of a hard disk drive and is a subdivision of a track. The sector size is an important factor in operating system design because it represents the atomic unit of I/O operations on a hard disk drive. In Linux, you can check the sector size of a disk with the "fdisk -l" command.
Figure 1: The disk sector size in Linux
As shown in Figure 1, both the logical and physical sectors are 512 bytes long on this Linux system.
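The same two values that "fdisk -l" reports can also be read from sysfs. The sketch below is a minimal illustration, not part of the original setup: the function name read_sector_sizes and the device name "sda" are assumptions, and the code relies on the Linux sysfs files queue/logical_block_size and queue/physical_block_size being present for the device.

```python
from pathlib import Path


def read_sector_sizes(device, sysfs_root="/sys/block"):
    """Return (logical, physical) sector sizes in bytes for a block device.

    `device` is a name such as "sda" (an assumed example). `sysfs_root`
    is a parameter only so the function can be pointed at another
    directory, for instance in a test.
    """
    queue = Path(sysfs_root) / device / "queue"
    logical = int((queue / "logical_block_size").read_text().strip())
    physical = int((queue / "physical_block_size").read_text().strip())
    return logical, physical


# Example call on a Linux system (replace "sda" with a device listed
# under /sys/block):
#     logical, physical = read_sector_sizes("sda")
#     print(f"logical={logical} bytes, physical={physical} bytes")
```

On a system like the one in Figure 1, both values would be 512.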
The sector layout is structured as follows:
Each sector stores a fixed amount of user data, traditionally 512 bytes for hard disk drives. Newer HDDs store 4096 bytes (4K) in each sector, because larger sectors allow better data integrity at higher densities and more robust error correction.
Areal density is the number of bits stored per unit area of the disk surface. Increasing areal density is a long-running trend in the disk drive industry, not only because it allows more data to be stored in the same physical space but also because it improves the transfer speed of the medium. As areal density increases, each sector occupies a smaller and smaller area on the disk surface. This creates a problem: the physical size of the sectors has shrunk, but media defects have not. When a sector's data occupies a smaller area, error correction becomes harder, because a media defect of the same size damages a larger percentage of the data in a small sector than in a large one.
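The effect can be illustrated with a toy model. The sketch below is only an illustration under simplifying assumptions: it treats a sector as a one-dimensional run of bits along a track, ignores the non-data fields, and uses made-up values for the defect length and bit densities. The function name damaged_fraction is hypothetical.

```python
def damaged_fraction(defect_um, sector_bytes, bits_per_um):
    """Fraction of one sector's length covered by a fixed-size defect.

    Toy model: the sector is a straight run of bits along the track,
    so its physical length is (bits in sector) / (bits per micrometre).
    """
    sector_length_um = sector_bytes * 8 / bits_per_um
    return min(1.0, defect_um / sector_length_um)


# Hypothetical numbers, chosen only to show the trend.
DEFECT_UM = 2.0
for bits_per_um in (10, 20, 40):
    f512 = damaged_fraction(DEFECT_UM, 512, bits_per_um)
    f4k = damaged_fraction(DEFECT_UM, 4096, bits_per_um)
    print(f"{bits_per_um:>3} bits/um: 512-byte sector {f512:.2%}, "
          f"4K sector {f4k:.2%}")
```

In this model the damaged fraction of a sector grows in proportion to the density, and at any given density the same defect covers one eighth as much of a 4K sector as of a 512-byte sector.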
There are two approaches to solving this problem. The first is to devote more disk space to ECC bytes to maintain data reliability. However, spending more space on ECC bytes lowers the disk format efficiency, which is defined as (number of user data bytes × 100) / total number of bytes on disk. A further disadvantage is that the more ECC bits are included, the more processing power the disk controller needs to run the ECC algorithm.
The second approach is to increase the size of the data block and slightly increase the ECC bytes for each data block. With a larger data block, the per-sector overhead for control information such as the gap, sync and address mark sections is spread over more user data, so the total overhead falls. The ECC bytes per sector increase, but the total ECC bytes required for the disk decrease because there are fewer, larger sectors. Reducing the total space used for error correction code improves format efficiency, and the larger ECC field in each sector makes it possible to use more efficient and more powerful error correction algorithms. The transition to a larger sector size therefore has two benefits: improved reliability and greater disk capacity.
From a throughput perspective, the ideal block size is roughly equal to the characteristic size of a typical data transaction. The average file size today is well above 512 bytes, and applications on modern systems work with data in blocks much larger than the traditional 512-byte sector. Block sizes that are too small cause too much transaction overhead, while block sizes that are too large make each transaction transfer a large amount of unnecessary data.
The size of a standard transaction in relational database systems is 4K. The consensus in the hard disk drive industry has been that a physical block size of 4K provides a good compromise. It also corresponds to the page size used by operating systems and processors.
Figure-3: Format Efficiency improvement in 4K disk
Section | 512 byte sector format | 4096 byte sector format |
Gap, sync & address mark | 15 bytes | 15 bytes |
User data | 512 bytes | 4096 bytes |
Error-correcting code | 50 bytes | 100 bytes |
Total | 577 bytes | 4211 bytes |
Format Efficiency | 88.7% | 97.3% |
Table 1: Format Efficiency improvement in 4K disk
As shown in Figure-2, 4K sectors are 8 times as large as traditional 512-byte sectors. For the same data payload, the 4K format therefore needs only one eighth of the gap, sync and address mark space (15 bytes instead of 8 x 15 = 120 bytes) and one quarter of the error correction code space (100 bytes instead of 8 x 50 = 400 bytes). Reducing the space used for error correction code and other non-data sections improves the format efficiency of the 4K format. The improvement is shown in Figure-3 and Table-1: format efficiency rises from 88.7% for a 512-byte sector disk to 97.3% for a 4K sector disk, a gain of roughly 8.5 percentage points.
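The figures in Table 1 can be recomputed with a short sketch that applies the format efficiency definition given earlier to the byte counts in the table. The function name format_efficiency is an illustrative choice. Note that subtracting the unrounded efficiencies gives a gain of about 8.5 percentage points; subtracting the rounded table values (97.3 and 88.7) gives 8.6.

```python
def format_efficiency(user_bytes, overhead_bytes, ecc_bytes):
    """Format efficiency in percent: user data bytes * 100 / total bytes."""
    total = user_bytes + overhead_bytes + ecc_bytes
    return 100.0 * user_bytes / total


# Byte counts taken from Table 1.
legacy = format_efficiency(user_bytes=512, overhead_bytes=15, ecc_bytes=50)
four_k = format_efficiency(user_bytes=4096, overhead_bytes=15, ecc_bytes=100)

print(f"512-byte format: {legacy:.1f}%")                   # 88.7%
print(f"4K format:       {four_k:.1f}%")                   # 97.3%
print(f"gain:            {four_k - legacy:.1f} points")    # 8.5 points

# Non-data bytes needed to store 4096 bytes of user data in each format.
print("512-byte format:", 8 * 15, "overhead bytes,", 8 * 50, "ECC bytes")
print("4K format:      ", 15, "overhead bytes,", 100, "ECC bytes")
```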
Figure-4: Effect of media defect on disk density
As shown in Figure-4, a media defect has a greater effect on a disk with higher areal density than on a disk with lower areal density. As areal density increases, more ECC bytes are needed to retain the same level of error correction capability. The 4K format provides enough space to expand the ECC field from 50 to 100 bytes to accommodate new ECC algorithms. The enhanced ECC coverage improves the ability to detect and correct data errors beyond the 50-byte defect length associated with the 512-byte sector format.
4K data disks are supported on Windows Server 2012, but as boot disks they are supported only in UEFI mode. On Linux, 4K hard drives require a minimum of RHEL 6.1 or SLES 11 SP2, and 4K boot drives are likewise supported only in UEFI mode. Kernel support for 4K drives is available in kernel versions 2.6.31 and above. The PERC H330, H730, H730P, H830, FD33xS, and FD33xD cards support 4K block size disk drives, which lets you use storage space efficiently. 4K disks can be used on Dell PowerEdge servers that support these PERC cards.
The physical size of each sector on the disk has become smaller as areal densities in disk drives have increased. If the size of disk defects does not shrink at the same rate, more sectors can be expected to be corrupted, and each sector needs stronger error correction capability. Disk drives with larger physical sectors and more ECC bytes per sector provide enhanced data protection and allow stronger correction algorithms. The 4K format achieves better format efficiency and improves reliability and error correction capability. This transition results in a better user experience, so a 4K drive is a good choice for OS installation.