Dell EMC Ready Solution for HPC Lustre Storage : Cascade Lake Refresh




Article written by Jyothi Bhaskar of HPC and AI Innovation Lab in June 2019



With this blog we announce the availability of the Dell EMC Ready Solution for Lustre with Cascade Lake processors. We present the updated technical specifications of the Lustre solution, initial performance results for the updated solution, and a comparison between the current and previous results. We configured the solution stack with the updates listed in Table 1 using an EDR interconnect, verified that the installation worked as expected, and ran performance checks.

The architecture diagram for the Large base configuration is shown in Figure 1. For the other available base configurations and how they scale, please refer to the earlier Ready Solution blogs listed in the References.
The server and storage models remain the same as presented earlier; only the new updates are shown in Table 1.

Figure 1 : Dell EMC Ready Solution for HPC Lustre Storage : Architecture diagram of the Large base configuration

Table 1 : Updated technical specifications of the Ready Solution for Lustre and a quick comparison with the previous release

| Hardware/Software Component | Current | Previous |
| --- | --- | --- |
| Processors in Object Storage Server (OSS) and Metadata Server (MDS) | 2 x Intel Xeon Gold 6230 with 20 cores @ 2.10 GHz per OSS/MDS | 2 x Intel Xeon Gold 6136 with 12 cores @ 3.00 GHz |
| Processors in Integrated Manager for Lustre (IML) server | 2 x Intel Xeon Gold 5218 with 16 cores @ 2.3 GHz | 2 x Intel Xeon Gold 5118 with 12 cores @ 2.3 GHz |
| Memory DIMMs in OSS and MDS | 12 x 32 GiB 2933 MT/s DDR4 RDIMMs | 24 x 16 GiB 2666 MT/s DDR4 RDIMMs |
| Memory DIMMs in IML server | 12 x 8 GiB 2666 MT/s DDR4 RDIMMs | 12 x 8 GiB 2666 MT/s DDR4 RDIMMs |
| BIOS | 2.1.8 or later | 1.4.5 or later |
| OS kernel | 3.10.0-957.1.3 | 3.10.0-862 |
| Lustre version | 2.10.7 | 2.10.4 |
| IML version | 4.0.10.0 | 4.0.7.0 |
| Mellanox OFED version | 4.5-1.0.1.0 | 4.4-1 |

Performance Results


We configured the updated Ready Solution as listed in Table 1 and ran the IOzone sequential, IOzone random, and MDtest benchmarks to verify its performance. The test methodology, including the benchmark commands for all tests, was identical to the method described previously.

For all tests we used the client test bed described in Table 2.

Table 2 : Client test bed

| Component | Value |
| --- | --- |
| Number of client nodes | 8 |
| Client node | C6420 |
| Processors per client node | 2 x Intel Xeon Gold 6248 with 20 cores @ 2.50 GHz |
| Memory per client node | 12 x 16 GiB 2933 MT/s RDIMMs |
| BIOS | 2.2.6 |
| OS kernel | 3.10.0-957.10.1 |
| Lustre version | 2.10.7 |
| Mellanox OFED | 4.5-1.0.1.0 |

Sequential IOzone Performance

We ran sequential IOzone version 3.487 using the clients listed in Table 2. We ran tests from a single thread up to 256 threads, with multiple threads per client beyond 8 threads. As in the previous test method, the aggregate data size for each test was 2 TB. For thread counts below 32, a Lustre stripe count of 32 was used; for thread counts of 32 and above, the Lustre stripe count was set to 1. Caching effects were minimized as described in the previous blog.
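The thread layout and stripe-count rule described above can be sketched in a few lines of shell. This is only an illustration of the arithmetic, not the benchmark script itself: the function names are hypothetical, the power-of-two thread counts and the even spread of threads across the 8 clients are assumptions, and the actual IOzone commands are the ones given in the previous blog.

```shell
#!/bin/sh
# Illustrative helpers only; the names are hypothetical and not part of the test harness.

CLIENTS=8   # number of client nodes from Table 2

# Stripe count rule from the text: 32 below 32 threads, 1 at 32 threads and above.
stripe_count_for() {
    if [ "$1" -lt 32 ]; then
        echo 32
    else
        echo 1
    fi
}

# Threads per client, ASSUMING threads are spread evenly across clients
# (one thread per client until all 8 clients are in use).
threads_per_client() {
    if [ "$1" -le "$CLIENTS" ]; then
        echo 1
    else
        echo $(( $1 / CLIENTS ))
    fi
}

# Thread counts below are an assumed power-of-two sweep from 1 to 256.
for t in 1 2 4 8 16 32 64 128 256; do
    printf 'threads=%-3d stripe_count=%-2d threads_per_client=%d\n' \
        "$t" "$(stripe_count_for "$t")" "$(threads_per_client "$t")"
done
```

Under these assumptions, the switch from wide striping to a stripe count of 1 happens at 32 threads, which corresponds to 4 threads on each client.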

The Lustre client-side tuning parameters used for this test are listed below:

lctl set_param osc.*.checksums=0
lctl set_param timeout=600
lctl set_param at_min=250
lctl set_param at_max=600
lctl set_param ldlm.namespaces.*.lru_size=2000
lctl set_param osc.*OST*.max_rpcs_in_flight=16
lctl set_param osc.*OST*.max_dirty_mb=1024
lctl set_param osc.*.max_pages_per_rpc=1024
lctl set_param llite.*.max_read_ahead_mb=1024
lctl set_param llite.*.max_read_ahead_per_file_mb=1024
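One way to confirm that settings like these took effect is to read them back with lctl get_param. The sketch below only prints the read-back commands, so it can be reviewed before being run on a client. The function name is hypothetical, and the parameter list is a subset copied from the settings above.

```shell
#!/bin/sh
# Hypothetical helper: turn "name=value" settings into lctl get_param read-back
# commands. It prints the commands instead of running them, so it is safe to
# try on a host without Lustre installed.
print_readback_commands() {
    for setting in "$@"; do
        name=${setting%%=*}     # strip "=value" to keep only the parameter name
        # Quote the name so the shell does not expand the "*" wildcards.
        printf "lctl get_param '%s'\n" "$name"
    done
}

print_readback_commands \
    'osc.*.checksums=0' \
    'osc.*OST*.max_rpcs_in_flight=16' \
    'osc.*OST*.max_dirty_mb=1024' \
    'osc.*.max_pages_per_rpc=1024' \
    'llite.*.max_read_ahead_mb=1024'
```

The printed commands can then be run on each client and the reported values compared against the values that were set.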





Figure 2 : Sequential N-N Writes. A comparison of previous results with current results using Cascade Lake Lustre servers and clients





Figure 3 : Sequential N-N Reads. A comparison of previous results with current results using Cascade Lake Lustre servers and clients


Figures 2 and 3 present the IOzone sequential write and read performance of the latest Cascade Lake based solution and compare these results to the previous Skylake based solution. With Cascade Lake based clients and Lustre servers, both sequential reads and sequential writes improve at thread counts below 32, where the improvement reaches slightly more than 2x. We believe this performance delta can be attributed to the hardware mitigations for side-channel exploits included in Cascade Lake processors. However, the faster memory in the new solution and the updated software versions could also be contributing factors.

Sequential performance at higher thread counts remains very similar to the previous solution, because the enhancements in Cascade Lake processors provide no additional uplift once the solution is operating at the full potential of the backend storage controllers.



Random IOzone Performance

We ran random IOzone version 3.487 using the clients listed in Table 2, with 16, 64 and 256 threads. As in the previous test method, the aggregate data size was 2 TB and the stripe size was set to 4 MB. Caching effects were minimized as described in the previous blog.
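To make the data sizes concrete, the sketch below divides the 2 TB aggregate across the three thread counts and prints one way a 4 MB stripe size could be requested with lfs setstripe. It assumes 1 TB = 1024 GiB and an even split of data across threads; the function name and the directory path are hypothetical, and the command is printed rather than run.

```shell
#!/bin/sh
# Illustrative arithmetic only; the exact IOzone invocation is in the previous blog.

AGGREGATE_GIB=2048   # 2 TB aggregate, ASSUMING 1 TB = 1024 GiB

# Data written per thread if the aggregate is split evenly (an assumption).
per_thread_gib() {
    echo $(( AGGREGATE_GIB / $1 ))
}

for t in 16 64 256; do
    printf '%d threads -> %d GiB per thread\n' "$t" "$(per_thread_gib "$t")"
done

# Print (do not run) a command that sets a 4 MB stripe size on a test directory.
# /mnt/lustre/iozone_random is a placeholder path, not taken from the article.
echo "lfs setstripe -S 4M /mnt/lustre/iozone_random"
```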

The Lustre client-side tuning parameters used for this test are listed below:

lctl set_param osc.*OST*.max_rpcs_in_flight=256
lctl set_param osc.*.max_pages_per_rpc=1024



Figure 4 : IOzone Random N-N Reads. A comparison of previous results with current results using Cascade Lake Lustre servers and clients

Figure 4 plots the results of the random IO tests. Comparing previous and current results, we see that the trend remains the same, and the observed performance delta is within run-to-run variation and therefore not statistically significant.



Metadata MDtest Performance


MDtest version 1.9.3 was used to evaluate the metadata performance of the system, with Intel MPI as the MPI distribution. The tests were run using DNE (Distributed Namespace Environment) with 2 MDTs and directory striping. The test methodology, the command used, and the number of files and directories created were identical to those described in the previous blog.
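For readers unfamiliar with DNE directory striping, the sketch below prints an lfs setdirstripe command that would stripe a new test directory across two MDTs. It is an assumption about how such a directory could be created, not the exact command used for these tests (that is documented in the previous blog). The function name and the /mnt/lustre/mdtest path are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: build (but do not run) a command that creates a
# directory striped across a given number of MDTs.
dne_dir_command() {
    mdt_count=$1
    dir=$2
    printf 'lfs setdirstripe -c %d %s\n' "$mdt_count" "$dir"
}

# Two MDTs, as in the tests described above; the path is a placeholder.
dne_dir_command 2 /mnt/lustre/mdtest
```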

Figure 5: Metadata operations with MDtest. A comparison of previous results with current results using Cascade Lake Lustre servers and clients

Figure 5 presents the results of the metadata tests. Comparing the current results with the previous ones, we see that the trend for all three metadata operations remains the same. We note a 75.4% improvement in peak file create operations, an 18% drop in peak file remove operations, and a negligible performance delta in file stat operations. These deltas are possibly attributable to the software and hardware updates to the solution stack listed in Table 1.
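The percentages quoted above are relative changes in peak operations per second between the previous and current runs. A minimal sketch of that calculation follows. The function name is hypothetical, and the input values are made-up placeholders chosen only to reproduce the quoted percentages; they are not the measured results.

```shell
#!/bin/sh
# Relative change from a previous value to a current value, in percent.
# Positive means improvement, negative means a drop.
percent_change() {
    awk -v prev="$1" -v cur="$2" \
        'BEGIN { printf "%.1f\n", (cur - prev) / prev * 100 }'
}

# Placeholder ops/sec values (NOT measured data), picked to illustrate the math.
percent_change 100000 175400   # file create: improvement
percent_change 100000 82000    # file remove: drop
```

With these placeholder inputs the two calls print 75.4 and -18.0, matching the form of the create and remove deltas reported for Figure 5.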

Conclusion

We have verified and validated the updates to the Lustre Ready Solution with regard to configuration, installation and performance, and have included the performance data gathered in this blog.

Comparing previous results to current results with Cascade Lake based Lustre servers and clients:

1) Sequential IO : We see slightly more than a 2x improvement in sequential writes and sequential reads at thread counts below 32. Peak performance remains similar to the previous Skylake based solution.
2) Random IO : We see a very similar trend in read and write performance, with a performance delta that is not statistically significant given run-to-run variation.
3) Metadata performance tests : We see an improvement of up to 75.4% in file create operations at the peak. File stat operations remain very close to the previous results, with a negligible performance delta. File remove operations drop by about 18% at the peak, while the general trend remains the same and the delta is negligible at other thread counts.

References

1) Lustre Ready Solution blog
2) Lustre Ready Solution white paper
3) Lustre Scalability blog
4) IOzone benchmark
5) MDtest benchmark









Article ID: SLN317174

Last Date Modified: 07/13/2019 11:31 AM
