We are using Dell ECS's S3-compatible store as our persistent store for files.
While testing the performance of our APIs, we found that increasing the number of connections to S3 does not increase throughput.
Dell ECS Object Store performance test scenarios:

| # | Scenario | Number of S3 connections | Total files transferred | Average response time | Throughput per second |
|---|----------|--------------------------|-------------------------|-----------------------|-----------------------|
| 1 | Evaluate performance of single-part file uploads with S3 object store for varying file sizes | 100 | 27000 | 1 hour 1 min | |
| 2 | Evaluate performance of single-part file uploads with S3 object store for varying file sizes | | | | |
| 3 | Evaluate performance of single-part file uploads with S3 object store for varying file sizes | | | | |
This impacts the scalability of file transfer: when we add more nodes behind our API, we will not see any increase in throughput.
Please suggest how we can scale the Object Store so that adding nodes or increasing the number of S3 connections gives us more throughput.
Let us know if you need any more information.
Can you explain your test harness in any more detail? Note that some SDKs artificially limit the number of connections per host (sometimes even as low as two). You can see the actual number of concurrent connections to ECS by using netstat on the client system. Also, is there a load balancer involved?
What is the source of data? You could be limited by your data source.
What is your ECS configuration?
For testing performance, internally we use Mongoose: https://github.com/emc-mongoose/mongoose
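To verify the actual concurrent-connection count mentioned above, a small stdlib-only Python sketch can parse `netstat` output on the client. Assumptions: port 9020 (ECS's default S3 HTTP port; use 9021 for HTTPS or your own endpoint port) and typical Linux `netstat -tn` column layout.

```python
import subprocess

def count_established(netstat_output: str, port: int) -> int:
    """Count ESTABLISHED connections whose remote address ends in :port."""
    count = 0
    for line in netstat_output.splitlines():
        fields = line.split()
        # Typical Linux `netstat -tn` line:
        # tcp  0  0 10.0.0.5:53422  10.0.0.9:9020  ESTABLISHED
        if len(fields) >= 6 and fields[5] == "ESTABLISHED":
            remote = fields[4]
            if remote.rsplit(":", 1)[-1] == str(port):
                count += 1
    return count

def live_connection_count(port: int = 9020) -> int:
    """Run netstat on this client and count connections to the ECS S3 port."""
    out = subprocess.run(["netstat", "-tn"], capture_output=True, text=True).stdout
    return count_established(out, port)
```

Comparing the result against the thread count configured in the load generator shows whether the SDK is silently capping connections.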
Comments are inline:
1) Can you explain your test harness in any more detail?
--We are building an HTTP API that uploads and downloads files from Dell ECS S3-compatible storage. We tested the storage directly to see whether it is scalable, by generating load from multiple JMeter nodes and uploading multiple 10MB files to the S3 store in parallel as streams. The test results show that even if we increase the number of connections to S3, throughput does not increase. This could prove a bottleneck when we build a scalable REST API in front of the S3 store.
2) Note that some SDKs artificially limit the number of connections per host (sometimes even as low as two). You can see the actual number of concurrent connections to ECS by using netstat on the client system.
-- Yes, the number of actual connections seen via netstat matches the number configured for the test.
3) Also, is there a load balancer involved?
--- No load balancer is involved.
4) What is the source of data? You could be limited by your data source.
--We are just uploading files from the local file system to the S3 store as application/octet-stream.
5) What is your ECS configuration?
--- We are using the default ECS configuration. What is the correct configuration for scalability?
We are testing the performance of ECS from external systems/machines.
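As a cross-check outside JMeter, a minimal Python sketch can drive N parallel single-part uploads and report aggregate throughput. The endpoint, bucket, and key prefix below are hypothetical placeholders; boto3 is assumed to be installed. Note that botocore caps its HTTP connection pool at 10 per client by default, which is exactly the kind of SDK limit mentioned above, so `max_pool_connections` must be raised to at least the thread count.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def throughput_mb_s(total_bytes: int, seconds: float) -> float:
    """Aggregate throughput in MB/s."""
    return (total_bytes / (1024 * 1024)) / seconds

def run_upload_test(upload_one, payload: bytes, n_files: int, n_threads: int) -> float:
    """Upload n_files copies of payload with n_threads workers; return MB/s."""
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        list(pool.map(upload_one, range(n_files)))  # propagate any exceptions
    return throughput_mb_s(n_files * len(payload), time.monotonic() - start)

def make_s3_uploader(endpoint: str, bucket: str, payload: bytes):
    """Build an uploader bound to one shared S3 client (boto3 assumed installed)."""
    import boto3
    from botocore.config import Config
    client = boto3.client(
        "s3",
        endpoint_url=endpoint,  # e.g. the ECS S3 endpoint; placeholder here
        config=Config(max_pool_connections=64),  # default is only 10
    )
    def upload_one(i: int) -> None:
        client.put_object(Bucket=bucket, Key=f"perf-test/{i}", Body=payload)
    return upload_one
```

Running `run_upload_test` with the same payload at, say, 16, 32, and 64 threads shows directly whether MB/s scales with connection count or flattens out.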
It's pretty critical that some sort of load balancer is used so the load is spread across all of the nodes in your ECS cluster.
Uploading files from a single, local filesystem is not a good test either... It's quite possible that your local filesystem is the bottleneck. Instead, create a random 10MB byte array and use that to eliminate the disk as a bottleneck.
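The in-memory payload suggested above is a one-liner in Python; generating it once and sharing it across all upload threads also keeps per-request randomness off the client CPU:

```python
import os

# One random 10MB buffer, generated once and reused by every upload thread,
# so neither local disk reads nor payload generation skews the measurement.
PAYLOAD_SIZE = 10 * 1024 * 1024
payload = os.urandom(PAYLOAD_SIZE)
```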
What is your client configuration? For our testing, we generally have 1:1 clients connected directly to the ECS switch to remove any external factors using the same network speed (e.g. 10 or 25GbE).
When I ask the configuration, I'm referring to the ECS HW. For 10MB writes, a 4-node U400 should be able to process 160MB/s (40MB/s/node) with 16 threads/node. An 8-node U4000 can process about 1.3GB/s for writes (~160MB/s/node) again with 16 threads/node. Increasing threads beyond that will have only minor increases in throughput.
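The sizing arithmetic above can be sanity-checked in a couple of lines. The per-node figures are taken directly from the reply; the sketch simply multiplies them out rather than deriving them:

```python
def expected_cluster_mb_s(nodes: int, mb_s_per_node: float) -> float:
    """Aggregate write throughput if each node sustains mb_s_per_node."""
    return nodes * mb_s_per_node

# Figures quoted above: 4-node U400 at 40 MB/s/node, 8-node U4000 at ~160 MB/s/node.
u400 = expected_cluster_mb_s(4, 40.0)    # 160 MB/s
u4000 = expected_cluster_mb_s(8, 160.0)  # 1280 MB/s, i.e. ~1.3 GB/s
```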