Article Number: 124151

Dell EMC Ready Solutions for HPC Life Sciences: BWA-GATK Pipeline performance tests with BeeGFS

Article Content



This blog provides performance results for the BWA-GATK pipeline benchmark on Dell EMC Ready Solutions for HPC BeeGFS Storage. We were not able to set up enough compute nodes or a large enough BeeGFS storage system to allow a direct comparison with the performance results previously published for Lustre storage. Nevertheless, the results are helpful for estimating the computational resources needed for a given variant calling workload.


The test cluster configurations are summarized in Table 1.

Table 1 Tested compute node configuration

Server: Dell EMC PowerEdge C6420
Processor: 2x Xeon® Gold 6248 20 cores 2.5 GHz (Cascade Lake)
Memory: 12x 16GB at 2933 MTps
Operating system: Red Hat Enterprise Linux Server release 7.4 (Maipo)
Interconnect: Mellanox EDR InfiniBand
BIOS System Profile: Performance Optimized
Logical Processor:
Virtualization Technology:

The tested compute nodes were connected to the BeeGFS storage through Mellanox EDR InfiniBand switches. The BeeGFS storage is connected to a bridge EDR switch, and this bridge is connected to an additional EDR switch to which all compute nodes are attached. The storage configuration is summarized in Table 2.


Table 2 BeeGFS solution hardware and software specifications

Management server: 1 x Dell EMC PowerEdge R640
Metadata servers (MDS): 2 x Dell EMC PowerEdge R740
Storage servers (SS): 2 x Dell EMC PowerEdge R740
Processors:
  Management server: Dual Intel Xeon Gold 5218
  MDS and SS servers: Dual Intel Xeon Gold 6230
Memory:
  Management server: 12 x 8 GB 2666 MT/s DDR4 RDIMMs
  MDS and SS servers: 12 x 32 GB 2933 MT/s DDR4 RDIMMs
Local disks and RAID controller:
  Management server: PERC H740P Integrated RAID, 8GB NV cache, 6x 300GB 15K SAS hard drives (HDDs) configured in RAID10
  MDS and SS servers: PERC H330+ Integrated RAID, 2x 300GB 15K SAS HDDs configured in RAID1 for OS
InfiniBand HCA: Mellanox ConnectX-6 HDR100 InfiniBand adapter
External storage controllers:
  On each MDS: 2 x Dell 12 Gb/s SAS HBAs
  On each SS: 4 x Dell 12 Gb/s SAS HBAs
Object storage enclosures: 4 x Dell EMC PowerVault ME4084 fully populated with a total of 336 drives
Metadata storage enclosure: 1 x Dell EMC PowerVault ME4024 with 24 SSDs
RAID controllers: Duplex RAID controllers in the ME4084 and ME4024 enclosures
Drives:
  On each ME4084 enclosure: 84 x 8 TB 3.5 in. 7.2 K RPM NL SAS3
  ME4024 enclosure: 24 x 960 GB SAS3 SSDs
Operating system: CentOS Linux release 8.1.1911 (Core)
Kernel version:
Mellanox OFED version:
BeeGFS file system version: 7.2 (beta2)


The test data was chosen from one of Illumina’s Platinum Genomes. ERR194161 was sequenced on an Illumina HiSeq 2000, was submitted by Illumina, and can be obtained from EMBL-EBI. The DNA identifier for this individual is NA12878. The description of the data on the linked website states that this sample has a >30x depth of coverage; the actual depth is ~53x.


Performance Evaluation

Multiple Sample/Multiple Nodes Performance

A typical way of running an NGS pipeline is to process multiple samples on each compute node and to use multiple compute nodes to maximize throughput. The tests used eight C6420 compute nodes with seven samples per node. Hence, up to 56 samples were processed concurrently to estimate the maximum number of genomes per day without a job failure.

As shown in Figure 1, a single C6420 compute node can process 3.69 whole human genomes at 50x coverage per day when 7 samples are processed together. Each sample is allocated 5 cores and 20 GB of memory.
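The per-sample allocation can be checked against the node resources listed in Table 1. The following is a minimal sketch, assuming the node totals are simply 2 x 20 cores and 12 x 16 GB; the function name `node_allocation` is hypothetical. The figures are allocations, not measured peak usage, so the free memory shown is only nominal headroom.

```python
# Node totals are taken from Table 1; per-sample figures from the text.
CORES_PER_NODE = 2 * 20        # 2x Xeon Gold 6248, 20 cores each
MEMORY_GB_PER_NODE = 12 * 16   # 12x 16GB DIMMs
CORES_PER_SAMPLE = 5
MEMORY_GB_PER_SAMPLE = 20


def node_allocation(samples):
    """Return allocated and remaining cores/memory for `samples` concurrent samples."""
    cores = samples * CORES_PER_SAMPLE
    memory_gb = samples * MEMORY_GB_PER_SAMPLE
    return {
        "cores_allocated": cores,
        "cores_free": CORES_PER_NODE - cores,
        "memory_gb_allocated": memory_gb,
        "memory_gb_free": MEMORY_GB_PER_NODE - memory_gb,
    }


if __name__ == "__main__":
    # Seven samples per node, as used in the tests.
    for key, value in node_allocation(7).items():
        print(f"{key}: {value}")
```

With seven samples, 35 of the 40 cores and 140 GB of the 192 GB are allocated.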


Figure 1 Throughput tests with up to 8x C6420s with BeeGFS

Eight C6420 compute nodes can process 56 whole human genomes at 50x coverage in ~54 hours. In other words, the test configuration delivers 25.11 genomes per day for whole human genomes with 50x depth of coverage.
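The relationship between batch size, wall-clock time, and genomes per day is plain arithmetic and can be reproduced as follows. This is a sketch using only the figures quoted in this article; the function names are hypothetical. Note that the rounded ~54 hours gives about 24.9 genomes per day, so the reported 25.11 corresponds to a run time of roughly 53.5 hours.

```python
HOURS_PER_DAY = 24


def genomes_per_day(genomes, wall_hours):
    """Throughput when `genomes` samples finish in `wall_hours` hours."""
    return genomes * HOURS_PER_DAY / wall_hours


def hours_for(genomes, rate_per_day):
    """Wall-clock hours implied by a throughput of `rate_per_day`."""
    return genomes * HOURS_PER_DAY / rate_per_day


def scaling_efficiency(cluster_rate, single_node_rate, nodes):
    """Measured cluster throughput relative to ideal linear scaling."""
    return cluster_rate / (single_node_rate * nodes)


if __name__ == "__main__":
    # Figures quoted in the article: 56 genomes, ~54 hours, 25.11 genomes/day,
    # 3.69 genomes/day on a single node, 8 nodes.
    print(f"rate at 54 h: {genomes_per_day(56, 54):.2f} genomes/day")
    print(f"hours implied by 25.11/day: {hours_for(56, 25.11):.1f}")
    print(f"scaling efficiency: {scaling_efficiency(25.11, 3.69, 8):.1%}")
```

Compared with eight times the single-node rate (8 x 3.69 = 29.52 genomes per day), the measured 25.11 genomes per day is about 85% of ideal linear scaling.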



The data size of WGS has been growing constantly. The current average WGS data set has about 55x coverage, which is 5 times larger than a typical WGS data set 4 years ago, when we started to benchmark the BWA-GATK pipeline. The increasing data size does not strain storage-side capacity because most applications in the pipeline are bounded by CPU clock speed. Hence, the pipeline runs longer with larger data rather than generating heavier I/O.

However, more temporary files are generated during processing because the larger data must be split for parallel processing, and the increased number of temporary files open at the same time can exhaust the open file limit of the Linux operating system. One of the applications fails silently when it hits the open file limit. A simple solution is to increase the limit to >150K.
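The open file limit can be inspected and, within the hard limit, raised from inside a job wrapper before the pipeline starts. The following is a minimal sketch using Python's standard `resource` module; the threshold of 150,000 is an assumption standing in for ">150K", and the function names are hypothetical. An unprivileged process can only raise its soft limit up to the hard limit; raising the hard limit itself is an administrator task (for example through `/etc/security/limits.conf`).

```python
import resource

# Assumed threshold standing in for the ">150K" recommended in the text.
REQUIRED_NOFILE = 150_000


def target_soft_limit(soft, hard, required=REQUIRED_NOFILE):
    """Pick the soft limit to request, never exceeding the hard limit."""
    unlimited = resource.RLIM_INFINITY
    if soft == unlimited or soft >= required:
        return soft                # already high enough
    if hard == unlimited:
        return required
    return min(required, hard)     # cap at what the hard limit allows


def raise_nofile_soft_limit(required=REQUIRED_NOFILE):
    """Raise this process's open file soft limit and return the resulting value."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    target = target_soft_limit(soft, hard, required)
    if target != soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return target


if __name__ == "__main__":
    new_soft = raise_nofile_soft_limit()
    if new_soft != resource.RLIM_INFINITY and new_soft < REQUIRED_NOFILE:
        print(f"soft limit is {new_soft}; the hard limit must be raised by an administrator")
    else:
        print(f"open file soft limit: {new_soft}")
```

Child processes inherit the limit, so calling this before launching the pipeline steps applies it to them as well. Checking the result explicitly helps surface the problem early, since the affected application fails without reporting an error.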

The results in Figure 1 show that the throughput tests did not reach the maximum capacity of the system. Since there was no sign of a significant slowdown as more samples were added, it should be possible to process more than 7 samples per node if the compute nodes are set up with more memory. Overall, BeeGFS storage is suitable scratch storage for NGS data processing.

Article Properties

Last Published Date

23 Nov 2020


