Article written by Jyothi Bhaskar of HPC and AI Innovation Lab in June 2019
With this blog we announce the availability of the Dell Ready Solution for Lustre with Cascade Lake processors. We present the updated technical specifications of the Lustre solution, initial performance results for the updated solution, and a comparison between the current and previous results. We configured the solution stack with the new updates listed in Table 1, using an EDR InfiniBand interconnect, verified that the installation worked as expected, and ran performance checks.
The architecture diagram for the Large base configuration is shown below in Figure 1.
Please note that the server and storage models remain the same as presented earlier. Only the new updates are shown in Table 1.
Figure 1: Dell Ready Solution for HPC Lustre Storage: Architecture diagram of the Large base configuration
Table 1 : Updated technical specifications of the Ready Solution for Lustre and a quick comparison with the previous release
Hardware/Software Component | Current | Previous
--- | --- | ---
Processors in Object Storage Server (OSS) and Metadata Server (MDS) | 2 x Intel Xeon Gold 6230, 20 cores @ 2.10 GHz | 2 x Intel Xeon Gold 6136, 12 cores @ 3.00 GHz
Processor in Integrated Manager for Lustre (IML) server | 2 x Intel Xeon Gold 5218, 16 cores @ 2.3 GHz | 2 x Intel Xeon Gold 5118, 12 cores @ 2.3 GHz
Memory DIMMs in OSS and MDS | 12 x 32 GiB 2933 MT/s DDR4 RDIMMs | 24 x 16 GiB 2666 MT/s DDR4 RDIMMs
Memory DIMMs in IML server | 12 x 8 GiB 2666 MT/s DDR4 RDIMMs | 12 x 8 GiB 2666 MT/s DDR4 RDIMMs
BIOS | 2.1.8 or later | 1.4.5 or later
OS Kernel | 3.10.0-957.1.3 | 3.10.0-862
Lustre Version | 2.10.7 | 2.10.4
IML Version | 4.0.10.0 | 4.0.7.0
Mellanox OFED Version | 4.5-1.0.1.0 | 4.4-1
Performance Results
We configured the updated Ready Solution as listed in Table 1 and ran performance checks with IOzone sequential, IOzone random, and MDtest benchmarks to verify the performance of the updated solution. The test methodology, including the benchmark commands for all tests, was identical to the method used and described previously.
For all tests we used the client test bed described in Table 2 below.
Table 2 : Client test bed
Component | Value
--- | ---
Number of client nodes | 8
Client node | C6420
Processors per client node | 2 x Intel Xeon Gold 6248, 20 cores @ 2.50 GHz
Memory per client node | 12 x 16 GiB 2933 MT/s RDIMMs
BIOS | 2.2.6
OS Kernel | 3.10.0-957.10.1
Lustre version | 2.10.7
Mellanox OFED | 4.5-1.0.1.0
Sequential IOzone Performance
We ran sequential IOzone version 3.487 using the clients listed in Table 2. We ran tests from a single thread up to 256 threads, with multiple threads per client beyond 8 threads. As per the test method, the aggregate data size for the test was 2 TB. For thread counts below 32, a Lustre stripe count of 32 was used; for thread counts of 32 and above, the stripe count was set to 1. Caching effects were minimized as described in the previous blog.
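The stripe-count rule above can be expressed as a small helper; this is a minimal sketch, and the function name and the commented `lfs setstripe` target path are illustrative assumptions, not the scripts actually used in our test runs.

```shell
#!/bin/sh
# Pick the Lustre stripe count used for the sequential IOzone runs:
# stripe count 32 for thread counts below 32, stripe count 1 otherwise.
stripe_count_for() {
  threads=$1
  if [ "$threads" -lt 32 ]; then
    echo 32
  else
    echo 1
  fi
}

# Example: apply the chosen stripe count to a test directory
# (requires a mounted Lustre client; the path below is hypothetical):
# lfs setstripe -c "$(stripe_count_for 16)" /mnt/lustre/iozone_run
```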
The Lustre client-side tuning parameters used for this test are listed below:
lctl set_param osc.*.checksums=0
lctl set_param timeout=600
lctl set_param at_min=250
lctl set_param at_max=600
lctl set_param ldlm.namespaces.*.lru_size=2000
lctl set_param osc.*OST*.max_rpcs_in_flight=16
lctl set_param osc.*OST*.max_dirty_mb=1024
lctl set_param osc.*.max_pages_per_rpc=1024
lctl set_param llite.*.max_read_ahead_mb=1024
lctl set_param llite.*.max_read_ahead_per_file_mb=1024
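The exact IOzone command lines were given in the previous blog. As a rough sketch of a sequential N-N write pass, the helper below builds a plausible invocation for a 2 TB aggregate with 1 MB records; the flags shown are standard IOzone throughput-mode options, but the client-list filename and the helper itself are illustrative assumptions, not the commands actually used.

```shell
#!/bin/sh
# Sketch of a sequential IOzone throughput run (assumed flags).
# The aggregate data size is fixed at 2 TB (2048 GB), so the
# per-thread file size shrinks as the thread count grows.
AGGREGATE_GB=2048

iozone_seq_cmd() {
  threads=$1
  per_thread_gb=$((AGGREGATE_GB / threads))
  # -i 0: sequential write   -c -e: include close/fsync in timing
  # -w: keep files           -r: record size
  # -s: per-thread file size -t: thread count
  # -+n: no retest           -+m: client list for distributed mode
  echo "iozone -i 0 -c -e -w -r 1024k -s ${per_thread_gb}g -t ${threads} -+n -+m ./clientlist"
}

iozone_seq_cmd 64
```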
Figure 2: Sequential N-N Writes. A comparison of previous results with current results using Cascade Lake Lustre servers and clients
Figure 3: Sequential N-N Reads. A comparison of previous results with current results using Cascade Lake Lustre servers and clients
Figures 2 and 3 present the IOzone sequential write and read performance of the latest Cascade Lake based solution and compare these results to the previous Skylake based solution. At thread counts below 32, we see a performance improvement of up to slightly more than 2x in both sequential writes and reads with Cascade Lake based clients and Lustre servers. We believe this performance delta can be attributed to the hardware mitigations for side-channel exploits included in Cascade Lake processors (ref link). However, other contributing factors could be the faster memory in the new solution and the updated software versions.
It can also be noted that the sequential performance at higher thread counts remains very similar to the previous solution. This is because the enhancements in Cascade Lake processors do not contribute to additional performance uplift once the solution is operating at the full potential of the backend storage controllers.
Random IOzone Performance
We ran random IOzone version 3.487 using the clients listed in Table 2, running performance checks with 16, 64, and 256 threads. As in the previous test method, the aggregate data size was 2 TB and the stripe size was set to 4 MB. Caching effects were minimized as described in the previous blog.
The Lustre client-side tuning parameters used for this test are listed below:
lctl set_param osc.*OST*.max_rpcs_in_flight=256
lctl set_param osc.*.max_pages_per_rpc=1024
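For reference, the 4 MB stripe size can be applied with `lfs setstripe` before the random pass. The fragment below is a hedged sketch only: it requires a mounted Lustre client, the target path and file sizes are hypothetical, and the IOzone flags shown are standard random-I/O options rather than the exact command from our runs.

```shell
# Illustrative only (assumed path and sizes; needs a Lustre mount).
# Set the 4 MB stripe size used for the random tests:
lfs setstripe -S 4M /mnt/lustre/iozone_random

# -i 2: random read/write  -O: report results in ops/sec
# -I: direct I/O           -r 4k: 4 KiB records
# -s: per-thread size      -t: thread count
iozone -i 2 -O -I -w -r 4k -s 32g -t 64 -+n -+m ./clientlist
```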
Figure 4: IOzone Random N-N Reads. A comparison of previous results with current results using Cascade Lake Lustre servers and clients
Figure 4 plots the results of the random I/O tests. Comparing previous and current results, we see that the trend remains the same and that the observed performance delta is not statistically significant given run-to-run variation.
Metadata MDtest Performance
MDtest version 1.9.3 was used to evaluate the metadata performance of the system. The MPI distribution used was Intel MPI. The tests were run using DNE (Distributed Namespace) with 2 MDTs and directory striping. The test methodology, the command used, and the number of files and directories created were identical to what was explained in the previous blog.
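A DNE striped test directory across both MDTs can be created with `lfs mkdir`, after which MDtest is launched under MPI. The fragment below is a hedged sketch: the mount point, process count, and per-process item count are illustrative assumptions, not the exact values from our runs.

```shell
# Illustrative only (assumed path and counts; needs a Lustre mount).
# Create a directory striped across both MDTs (DNE striped directory):
lfs mkdir -c 2 /mnt/lustre/mdtest_dir

# Launch MDtest under Intel MPI:
# -n: files/directories per process  -i: iterations  -d: target directory
mpirun -np 256 ./mdtest -n 4096 -i 1 -d /mnt/lustre/mdtest_dir
```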
Figure 5: Metadata operations with MDtest. A comparison of previous results with current results using Cascade Lake Lustre servers and clients
Figure 5 presents the results of the metadata tests. Comparing the current results with the previous ones, we see that the trend for all three metadata operations remains the same. We note a 75.4% improvement in peak file create operations, an 18% drop in peak file remove operations, and a negligible performance delta in file stat operations. These deltas can likely be attributed to the software and hardware updates to the solution stack listed in Table 1.
Conclusion
We have verified and validated the updates to the Lustre Ready Solution with regard to configuration, installation, and performance, and have included the gathered performance data in this blog.
Comparing previous results to current results with Cascade Lake based Lustre servers and clients:
1) Sequential IO: We see up to slightly more than a 2x performance improvement in sequential writes and reads at thread counts below 32. Peak performance remains similar to the previous Skylake based solution.
2) Random IO: Read and write performance follow a very similar trend, with a performance delta that is not statistically significant given run-to-run variation.
3) Metadata performance: File create operations improve by up to 75.4% at the peak. File stat operations remain very close to previously observed results, with a negligible delta. File remove operations drop by about 18% at the peak, while the general trend remains the same and the delta is negligible at other thread counts.
References
1) IOzone benchmark
2) MDtest benchmark