Dell Power Solutions

The Advantages of Diskless HPC Clusters Using NAS

Baris Guler; Munira Hussain; Tau Leng, Ph.D.; and Victor Mashayekhi, Ph.D. (November 2002)

Diskless high-performance computing (HPC) clusters offer an alternative to standard HPC clusters. This article compares diskless HPC clusters and standard HPC clusters, examines how a diskless cluster operates, and presents results from benchmark tests that reveal the performance characteristics of the diskless cluster. The diskless HPC configuration examined in this article used network attached storage (NAS) as the central repository for the operating system (OS) image and as the file-system mount point for each compute node.

A diskless high-performance computing (HPC) cluster comprises compute nodes that have no local hard disk or internal storage. The compute nodes boot by using a centrally located device over a local area network (LAN). The diskless HPC cluster configuration offers many advantages, including a centralized operating system (OS).

Management, backup, and security of a system using a centralized OS image and centralized files are much easier than maintenance of an individual OS and data on each compute node—especially in heterogeneous cluster environments, where different compute nodes use different operating systems and configurations.

In addition, eliminating the hard drive(s) from each compute node reduces the total hardware cost of the cluster and can reduce costs related to drive maintenance and recovery. Operating costs also decrease because diskless clusters generate less heat and noise and consume less power.

However, a diskless HPC cluster configuration generates more network traffic than a standard HPC cluster. A network-based file system using the computation network can interfere with data communication during the computation and reduce the cluster performance. Moreover, if the network connection or the centralized file server is not available, none of the compute nodes will be accessible. Remedies exist for these drawbacks. For example, creating a RAM disk on each compute node allocates a portion of the node's main memory as a partition for the file system; the RAM disk will be used for storing the most frequently accessed files. Therefore, the compute node can access some files from local memory instead of through the network.
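As a rough illustration of the RAM disk remedy, the following Python sketch picks which files to stage in a RAM disk of a fixed size, favoring the most frequently accessed files. The function name, the sample paths and access counts, and the greedy selection policy are all assumptions made for illustration; the article does not specify how the files are chosen.

```python
def pick_ramdisk_files(files, budget_bytes):
    """Greedily choose files for a RAM disk, most frequently accessed first.

    files: iterable of (path, size_bytes, access_count) tuples (hypothetical data).
    budget_bytes: amount of main memory set aside for the RAM disk.
    Returns the chosen paths in selection order.
    """
    chosen = []
    remaining = budget_bytes
    # Highest access count first; ties are broken in favor of smaller files.
    for path, size, _count in sorted(files, key=lambda f: (-f[2], f[1])):
        if size <= remaining:
            chosen.append(path)
            remaining -= size
    return chosen


if __name__ == "__main__":
    # Hypothetical access statistics for a compute node image.
    sample = [
        ("/lib/libc.so.6", 1_500_000, 900),
        ("/bin/bash", 600_000, 400),
        ("/usr/share/doc/big.txt", 8_000_000, 2),
        ("/etc/hosts", 1_000, 700),
    ]
    print(pick_ramdisk_files(sample, 4_000_000))
```

A greedy choice like this is simple but not optimal; it only shows the trade-off between RAM disk size and the number of file accesses that still cross the network.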

Swapping is probably the least desirable situation that can occur during computation. When main memory is too small for the application running on the compute node, the node swaps to hard disk and performance decreases dramatically. Swapping is particularly harmful in diskless clusters: because the swap files are created on the centralized storage space, every data access caused by swapping must travel through the network.

Understanding the components of a diskless cluster

A diskless compute node is a server, workstation, or PC that resides on a LAN and does not have its own disk. Instead, it stores files on a network file server. Diskless nodes can reduce the overall cost of an HPC cluster because one large-capacity disk drive or storage enclosure, even with an attached server, is usually less expensive than several low-capacity drives. In addition, these nodes can simplify backups and security because all files are in one place—the file server. Also, accessing data from a large remote file server is often faster than accessing data from a small Network File System (NFS) server. However, one major disadvantage of diskless compute nodes is that they are not accessible if the network fails.

Depending on the functionality of the node's hardware, administrators can boot the node by various methods, including:

  • Network boot from the BIOS using Preboot Execution Environment (PXE) or Remote Program Load (RPL)
  • Network boot from the EPROM of the network interface card (NIC)
  • Local boot from a disk-on-chip flash device
  • Local boot using a disklike device such as a floppy or CD-ROM drive

Identifying a typical diskless HPC cluster

In the simplest diskless HPC cluster configuration, the master node functions as the Dynamic Host Configuration Protocol (DHCP) server, the boot image server, and the file server. The administrator first installs the Linux® OS on the master node with the DHCP, NFS, and Trivial File Transfer Protocol (TFTP) services enabled and then configures these services. Next, the administrator uses the root file system of the master node or another machine to create the OS images for the diskless compute nodes.

After configuring the master node and each diskless compute node, the administrator should power up the nodes one at a time, leaving a 5- to 10-second interval between power-ups. This staggered power-up helps to ensure that DHCP assigns the proper sequential IP address to each compute node.

The sequential booting of each compute node during the initial configuration is necessary to establish a DHCP IP lease list. After the initial boot, any node can be powered up or down at any time without losing the sequential IP assignment, provided that the DHCP service has defined a long IP lease time.
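The effect of boot order on address assignment can be sketched with a toy lease table in Python. This is a simplified model and not how any particular DHCP server is implemented: the class name, the rule that the first request receives the lowest free address, and the sample MAC addresses and address range are all assumptions made for illustration.

```python
from ipaddress import IPv4Address


class ToyLeaseTable:
    """Toy model of a DHCP lease list (assumed policy: lowest free address first)."""

    def __init__(self, first_ip, pool_size):
        start = IPv4Address(first_ip)
        self._free = [start + i for i in range(pool_size)]
        self._leases = {}  # MAC address -> leased IP address

    def request(self, mac):
        # A node that already holds a lease keeps its address,
        # which models a long lease time that never expires.
        if mac in self._leases:
            return self._leases[mac]
        if not self._free:
            raise RuntimeError("address pool exhausted")
        ip = self._free.pop(0)
        self._leases[mac] = ip
        return ip


if __name__ == "__main__":
    table = ToyLeaseTable("192.168.1.101", 32)  # hypothetical address range
    rack_order = ["00:00:00:00:00:01", "00:00:00:00:00:02", "00:00:00:00:00:03"]

    # Initial power-up, one node at a time in rack order.
    for mac in rack_order:
        print(mac, table.request(mac))

    # Later reboots in any order return the same addresses.
    for mac in reversed(rack_order):
        print(mac, table.request(mac))
```

In this model, the mapping from node to address is fixed by the order of the first requests, which is why only the initial power-up needs to be sequential.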

Using network attached storage in diskless clusters

Large, scalable disk-storage facilities are necessary to handle output from data-intensive HPC clusters without degrading application performance. Building these facilities requires a network technology that can meet the high-performance demands in a less complex and more easily managed infrastructure. One such technology is network attached storage (NAS), a high-performance data storage method that uses hardware and software to share storage on a network. NAS devices are optimized to perform one specific task—file service.

An HPC cluster configuration using NAS can increase storage space without losing performance and without requiring a change in cluster topology or the creation of new file servers. A NAS device connects directly to the cluster network and is usually platform independent, using its own embedded OS. Such a centralized storage device is often cost-effective, highly available, and simple to configure.

To evaluate the performance of a diskless HPC cluster using NAS, Dell conducted a study that compared standard and diskless HPC clusters. The primary NAS device used in this Dell™ study was the Dell|EMC IP4700, which is designed to optimize information sharing over IP networks. The IP4700 NAS server supports both the Common Internet File System (CIFS) used by the Microsoft® Windows NT® OS and NFS used by the UNIX® OS. Its features include the following:

  • Two storage processors with equal memory
  • 1 GB RDRAM (4 x 256 MB per storage processor)
  • Two Intel® Pentium®  III processors at 733 MHz per storage processor (four total per IP4700)
  • 256 KB of internal level 2 (L2) cache per Intel Pentium III processor
  • 10 Fibre Channel disk drives and 730 GB of raw storage capacity
  • Internal failover (processors, power supplies, pathways) to help ensure data availability
  • Two integrated 10/100BaseT Ethernet LAN ports for management and backup (one per storage processor)
  • Two quad 10/100BaseT Ethernet LAN ports or two Gigabit Ethernet LAN ports (one per storage processor)
  • Single low-voltage differential (LVD) SCSI port to support local tape backup (one per storage processor)
  • Two serial connections (one per storage processor)
  • Support for RAID-0, RAID-1, and RAID-5
  • Remote diagnostics through EMC® CLARalert®/IP software

The Dell|EMC IP4700 NAS server can provide high-performance file serving in mixed OS environments, consolidated storage for clients, and cross-platform file sharing. Its online data warehousing provides detailed information about storage volume, paths used, online additions, and file system expansion. The IP4700 NAS server also provides IP traffic redirection as well as failover and failback for IP, NetBIOS, and shared volumes.

Comparing standard and diskless clusters

The Dell team configured two HPC clusters, one diskless and one standard, to compare performance and to identify which kinds of applications are suitable for each configuration. Figure 1 shows the clusters' configuration details. Both clusters used 32 Dell PowerEdge™ 2650 servers as the compute nodes. Each server had 2 GB of memory and two Intel Xeon™ processors at 2.4 GHz. In the standard HPC cluster, each compute node contained one Ultra3 SCSI hard disk, which was factory installed with the Red Hat® Linux 7.3 OS.

Figure 1. Standard and diskless HPC cluster configurations

In the diskless HPC cluster, the hard disks were removed from each compute node. The compute node image was created from the hard disk of one of the standard compute nodes. The master node of the diskless cluster was a PowerEdge 2550 enabled with DHCP, NFS, and TFTP services. After the Dell|EMC IP4700 NAS server was attached to the fabric, the NFS service of the master node was transferred to the NAS server for better handling. All compute nodes and the NAS server were connected through a high-speed, blade-type Foundry Networks® FastIron® II+ Gigabit Ethernet switch. The root file systems of all the diskless compute nodes, as well as the /usr, /home, and /opt directories, physically resided on the NAS server.

The Dell team first ran the HINT benchmark on a single-processor version of the PowerEdge 2650 to identify any performance differences between standard and diskless compute node configurations. Later, the team measured system performance by using the High-Performance Linpack (HPL) benchmark, which is best known for evaluating the performance of supercomputers and clusters for the TOP500 Supercomputer Sites list.

Testing performance with the HINT benchmark
For this benchmark, two identical compute nodes—one with a hard disk and one without—were prepared. The Dell team set the boot parameter to mem=256M in order to limit the physical memory usage to 256 MB. They also created a 512 MB swap file on each compute node. The diskless node's swap file resided on the NAS server and the standard node's swap file resided locally as one of the partitions.

The team compiled the source code of the HINT benchmark on the compute nodes using the GNU Compiler Collection (GCC). They set the maximum memory usage to 512 MB, which shortens the length of the benchmark run but provides enough information to understand how each setup performs.

In Figure 2, the performance results of the two runs are plotted as quality improvement per second (QUIPS) versus memory usage. As this figure shows, both compute nodes performed almost identically until reaching the physical memory boundary. After that point, the performance of the diskless node dropped much more than the performance of the standard node, because the diskless node had to access its swap file on the NAS server, whereas the standard node could swap locally. These results suggest that clusters should not swap over NFS, even if they are using a fast interconnect such as Gigabit Ethernet.
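The physical memory boundary in this test follows from simple arithmetic, sketched below in Python using the 256 MB memory limit and 512 MB swap file from the test setup. The function name is hypothetical, and the model is deliberately simplified: it ignores memory used by the kernel and other processes, so the real boundary falls somewhat below 256 MB.

```python
def swap_spill_mb(working_set_mb, physical_mb, swap_mb):
    """Return how many MB of a working set must be held in swap.

    Simplified model: everything up to physical_mb stays in memory and
    the remainder goes to swap. Raises MemoryError if the working set
    exceeds physical memory plus swap.
    """
    if working_set_mb > physical_mb + swap_mb:
        raise MemoryError("working set exceeds physical memory plus swap")
    return max(0, working_set_mb - physical_mb)


if __name__ == "__main__":
    PHYSICAL_MB = 256  # from the mem=256M boot parameter
    SWAP_MB = 512      # size of the swap file on each node

    for working_set in (128, 256, 384, 512):
        spill = swap_spill_mb(working_set, PHYSICAL_MB, SWAP_MB)
        print(f"{working_set} MB working set -> {spill} MB in swap")
```

On the diskless node, every megabyte reported as spilled is read and written over NFS to the NAS server, which is consistent with the steeper performance drop seen past the boundary.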

Figure 2. HINT performance results for a standard compute node and a diskless compute node

Testing performance with the HPL benchmark
The Dell team ran the HPL benchmark on dual-processor nodes. One node was configured for the diskless HPC cluster and another for the standard HPC cluster. The results shown in Figure 3 display the performance of each compute node with respect to the order of coefficient matrix A. For larger problem sizes, the standard node performed approximately 5 percent better than the diskless node. For the smaller problem sizes, the results were not consistent with the rest of the data, and more study is required to determine accurate performance results.
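The order of the coefficient matrix determines how much memory an HPL run needs, because the matrix holds N x N double-precision values of 8 bytes each. The Python sketch below estimates the largest order that fits in a node with 2 GB of memory. The function names are hypothetical, and the 80 percent memory fraction is a common rule of thumb assumed here, not a setting reported in the article.

```python
import math


def hpl_matrix_bytes(n):
    """Memory needed for an n x n matrix of 8-byte double-precision values."""
    return 8 * n * n


def largest_order_within(memory_bytes, fraction=0.8):
    """Largest matrix order whose storage fits in a fraction of memory.

    fraction is an assumed rule of thumb that leaves room for the OS
    and the benchmark's own buffers.
    """
    budget = int(memory_bytes * fraction)
    return math.isqrt(budget // 8)


if __name__ == "__main__":
    node_memory = 2 * 1024**3  # 2 GB per compute node, as in the study
    n = largest_order_within(node_memory)
    print(n, hpl_matrix_bytes(n))
```

Choosing an order above this estimate risks pushing the node into swap, which the HINT results indicate is especially costly on a diskless node.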

Figure 3. HPL performance results for a dual-processor standard compute node and a dual-processor diskless compute node

The team ran a second HPL benchmark test to measure the performance of a midsize HPC cluster consisting of 32 dual-processor nodes. This test allowed the team to determine whether scalability issues with the NFS server would limit diskless HPC cluster configurations to a small scale (fewer than 16 nodes). No scalability, manageability, or operating problems occurred with this diskless configuration on the midsize HPC cluster.

Figure 4 shows the performance results for both the diskless HPC cluster and the standard HPC cluster. The diskless configuration did more than match the standard one: it outperformed the standard configuration by a few gigaflops (GFLOPS). Moreover, the problem scaled very well from 1 node to 32 nodes, with approximately 4.5 GFLOPS per node.
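The per-node figure can be put in context with a short calculation, sketched below in Python. The function names are hypothetical. The sketch assumes two double-precision floating-point operations per clock cycle per processor, a commonly quoted figure for this processor generation that does not come from the article; the aggregate value is simply 32 times the approximate per-node figure, not a measured result.

```python
def peak_gflops_per_node(processors, clock_ghz, flops_per_cycle):
    """Theoretical peak of one node in GFLOPS."""
    return processors * clock_ghz * flops_per_cycle


def hpl_efficiency(measured_gflops, peak_gflops):
    """Fraction of theoretical peak achieved by a measured HPL result."""
    return measured_gflops / peak_gflops


if __name__ == "__main__":
    NODES = 32
    MEASURED_PER_NODE = 4.5  # approximate GFLOPS per node, from the study
    FLOPS_PER_CYCLE = 2      # assumption, not stated in the article

    peak = peak_gflops_per_node(processors=2, clock_ghz=2.4,
                                flops_per_cycle=FLOPS_PER_CYCLE)
    print(f"peak per node:      {peak:.1f} GFLOPS")
    print(f"efficiency:         {hpl_efficiency(MEASURED_PER_NODE, peak):.0%}")
    print(f"implied aggregate:  {NODES * MEASURED_PER_NODE:.0f} GFLOPS")
```

Under these assumptions, each node reaches a little under half of its theoretical peak, and the per-node rate stays roughly constant as nodes are added, which is what scaling well means here.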

Figure 4. HPL performance results for a midsize standard cluster and a midsize diskless cluster

Diskless HPC clusters versus standard HPC clusters

Diskless HPC clusters have several advantages, and some disadvantages, over standard HPC clusters. Figure A details these differences.

Figure A. Comparing diskless and standard HPC clusters

Discovering an attractive alternative to standard HPC clusters

These Dell studies show that the performance of a diskless HPC cluster configuration suffers most when swapping over NFS occurs. However, swapping is not common in most HPC environments. Therefore, if an HPC cluster will primarily run parallel programs, the diskless HPC cluster configuration is more attractive because of its cost and infrastructure advantages and its comparable performance. In the single-node HPL test, the dual-processor diskless node performed approximately 5 percent below the dual-processor standard node at larger problem sizes, but diskless nodes can address this shortfall by using a RAM disk.

In general, the diskless HPC cluster is easier to configure, install, and upgrade than a standard HPC cluster. It also offers a better price/performance ratio and is more environmentally friendly. HPC clusters that move local storage to a central storage unit should not suffer a performance penalty as long as they use a fast interconnect, such as Gigabit Ethernet or the Myricom™ Myrinet™ technology. In addition, creating a network specifically for I/O allows a diskless HPC cluster configuration to separate the storage network from the computational network. Diskless configurations, as well as the new modular blade technology, are promising advancements for HPC environments.

Baris Guler (baris_guler@dell.com) is a systems engineer and advisor in the Scalable Systems Group at Dell. His current research interests are parallel processing, diskless HPC clusters, performance benchmarking, reservoir engineering and simulation, and numerical methods. Baris has a B.S. in Petroleum and Natural Gas Engineering (PNGE) from the Middle East Technical University in Turkey, and an M.S. in PNGE from Pennsylvania State University. He is currently a Ph.D. candidate in Petroleum and Geosystems Engineering at the University of Texas at Austin.

Munira Hussain (munira_hussain@dell.com) is a systems engineer in the Scalable Systems Group at Dell. She has a B.S. in Electrical Engineering with a minor in Computer Science from the University of Illinois at Urbana-Champaign.

Tau Leng, Ph.D. (tau_leng@dell.com) is the lead engineer for HPC clustering in the Scalable Systems Group at Dell. His current research interests are parallel processing, distributed computing systems, compiler optimization, and performance benchmarking. Tau has a B.S. in Mathematics from the Fu Jen Catholic University in Taiwan, an M.S. in Computer Science from Utah State University, and a Ph.D. in Computer Science from the University of Houston.

Victor Mashayekhi, Ph.D. (victor_mashayekhi@dell.com) is a senior technical manager of the Enterprise Computing Solutions Group at Dell. His product development responsibilities at Dell have included all the cluster product offerings from Dell. His current research interests are distributed systems, database management systems, computer-supported cooperative work, concurrent software engineering, multimedia systems, clustering, and interconnect technologies. Victor has a B.A., M.S., and Ph.D. in Computer Science from the University of Minnesota.

For more information

Linux-based diskless workstations: http://www.naos.co.nz/papers/diskless/index.html

The FreeBSD Documentation Project. "Diskless Operation." FreeBSD Handbook. http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/diskless.html
