Dell EMC Ready Solutions for AI – Deep Learning with NVIDIA v1.1 and the corresponding reference architecture guide were released in February 2019. This blog quantifies the deep learning training performance of this reference architecture using the imaging benchmarks in the MLPerf suite. The evaluation was performed on up to eight nodes. The results show that Dell EMC’s scale-out solution can achieve performance comparable to other scale-up solutions for imaging models.
After the initial version 1.0 of Dell EMC Ready Solutions for AI – Deep Learning with NVIDIA was released, the solution was updated to version 1.1 in February 2019. Detailed information about the solution and the infrastructure can be found in the architecture guide "Dell EMC Ready Solution for AI – Deep Learning with NVIDIA". Briefly, the major differences in v1.1 are that the GPU server was updated from configuration K to configuration M, and that GPU memory was increased from 16 GB to 32 GB. The MLPerf v0.6 benchmark suite was chosen to evaluate the performance of the solution. All the available MLPerf v0.6 training benchmarks are listed in Table 1, but this blog focuses only on the ResNet-50, SSD and Mask R-CNN models.
The hardware and software details used for this evaluation are summarized in Table 2.
Figures 1 through 3 show the training time in minutes for the C4140-M-32GB in the Ready Solution v1.1 on the different MLPerf benchmarks. Testing was scaled from one node (4 V100 GPUs) to eight nodes (32 V100 GPUs). The Dell EMC Ready Solution for AI – Deep Learning with NVIDIA is a scale-out solution, which can use more resources as more nodes are added to it. Other vendors offer an alternative, the scale-up solution, which places more GPUs within a single server. These figures also compare our scale-out solution with other vendors’ scale-up solutions*. The following conclusions can be drawn from these figures:
In this blog, we quantified the performance of the Dell EMC Ready Solution for AI – Deep Learning with NVIDIA v1.1 using the latest MLPerf benchmarks. The results show that the scale-out solution can achieve performance comparable to other scale-up solutions for imaging models. The additional EDR InfiniBand card does not provide a significant performance benefit.
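For readers who want to run the same kind of scale-out analysis on their own measurements, the following is a minimal sketch, not the method used in this evaluation. It computes speedup and scaling efficiency from training times recorded at different node counts, relative to the smallest node count measured. The function name `scaling_report` is hypothetical, the default of four GPUs per node follows the C4140 configuration described above, and the timing values in the usage example are placeholders rather than results from this blog.

```python
def scaling_report(minutes_by_nodes, gpus_per_node=4):
    """Summarize scale-out behavior from time-to-train measurements.

    minutes_by_nodes maps a node count to the training time in minutes.
    Returns a list of (nodes, gpus, speedup, efficiency) tuples, where
    speedup and efficiency are relative to the smallest node count given.
    """
    if not minutes_by_nodes:
        raise ValueError("need at least one measurement")

    base_nodes = min(minutes_by_nodes)
    base_minutes = minutes_by_nodes[base_nodes]

    rows = []
    for nodes in sorted(minutes_by_nodes):
        minutes = minutes_by_nodes[nodes]
        # Speedup: how many times faster than the baseline run.
        speedup = base_minutes / minutes
        # Ideal (linear) speedup is proportional to the added nodes.
        ideal = nodes / base_nodes
        # Efficiency: fraction of the ideal speedup actually achieved.
        efficiency = speedup / ideal
        rows.append((nodes, nodes * gpus_per_node, speedup, efficiency))
    return rows


if __name__ == "__main__":
    # Placeholder timings for illustration only; NOT measured results.
    placeholder_minutes = {1: 100.0, 2: 55.0, 4: 30.0, 8: 18.0}
    for nodes, gpus, speedup, eff in scaling_report(placeholder_minutes):
        print(f"{nodes} node(s), {gpus} GPUs: "
              f"speedup {speedup:.2f}x, efficiency {eff:.0%}")
```

An efficiency close to 100% means that adding nodes shortens training time almost linearly. Lower values suggest that communication or input-pipeline overhead is absorbing part of the added GPU capacity.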
*The data for the scale-up systems was publicly available on the MLPerf v0.6 results web page.
Article ID: SLN319504
Last Date Modified: 11/18/2019 01:19 PM