mmr11408
1 Nickel

S3 vs. REST performance

I am evaluating the S3 and REST APIs as interfaces to Atmos and see a significant performance difference. The S3 test uses jets3t over HTTPS; the REST test uses HTTP.

1000 files (100 bytes each) were archived to the same EMC Atmos system via each API:

S3:    1061 ms per file

REST:  179 ms per file

Is that expected? Has anyone seen different performance, or does anyone have suggestions for improvement?
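
For scale, the two per-file figures above can be turned into a slowdown factor and a single-uploader throughput. The following is only a small arithmetic sketch; the class and method names are illustrative, and it assumes files are uploaded one after another rather than in parallel.

```java
// Illustrative arithmetic only; class and method names are hypothetical.
class LatencyComparison {

    /** How many times slower the first per-file latency is than the second. */
    static double slowdown(double slowerMs, double fasterMs) {
        return slowerMs / fasterMs;
    }

    /** Files per second for one sequential uploader at the given per-file latency. */
    static double filesPerSecond(double perFileMs) {
        return 1000.0 / perFileMs;
    }

    public static void main(String[] args) {
        double s3Ms = 1061.0;   // measured S3 (jets3t, HTTPS) time per file
        double restMs = 179.0;  // measured Atmos REST (HTTP) time per file

        System.out.printf("S3 slowdown vs REST: %.1fx%n", slowdown(s3Ms, restMs));
        System.out.printf("S3 throughput:   %.2f files/s%n", filesPerSecond(s3Ms));
        System.out.printf("REST throughput: %.2f files/s%n", filesPerSecond(restMs));
    }
}
```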

Accepted Solutions
8 Krypton

Re: S3 vs. REST performance

The S3 service running on Atmos hardware was written as a compatibility layer on top of the native Atmos API service, so it will be inherently slower.  To what degree largely depends on the workload and usage pattern.

Generally speaking, we recommend that anyone creating new applications choose the S3 API because of its proliferation (it is the most popular object protocol by a large margin).  However, as I mentioned, you may see a slight performance drop on Atmos hardware.  So you have to decide what is most important to you.

If you are planning on migrating to ECS, definitely pick S3.  If you have a large scale application and do not foresee moving to another platform any time soon, pick Atmos REST and use objectspace (Atmos-generated object IDs).  Note that ECS also supports this API, so migrating is still possible.

If you decide to use S3, there are a few things you can check that may improve performance.  First, if you don't need traffic encryption, consider using HTTP instead of HTTPS (S3 runs on port 8080 on Atmos).  Next, check that HTTP keep-alive is enabled on both the server and the client, and on the load balancer if you are using one.  Keep-alive matters most with SSL/HTTPS, because every new connection otherwise pays for a fresh TLS handshake.
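
As a rough illustration of the HTTP suggestion, the sketch below builds a candidate set of jets3t properties using only `java.util.Properties`. The property names are given as I recall them from the JetS3t configuration guide and should be verified against the jets3t version in use; the host name `atmos.example.com` and the class and method names are placeholders, not values from this thread.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.util.Properties;

// Sketch only: builds text that could be pasted into jets3t.properties.
// Property names are assumptions to verify against your jets3t version.
class Jets3tHttpConfigSketch {

    /** Candidate settings for talking to the Atmos S3 service over plain HTTP. */
    static Properties plainHttpProperties(String endpointHost) {
        Properties p = new Properties();
        p.setProperty("s3service.s3-endpoint", endpointHost);     // placeholder host
        p.setProperty("s3service.https-only", "false");           // allow plain HTTP
        p.setProperty("s3service.s3-endpoint-http-port", "8080"); // S3 HTTP port on Atmos
        return p;
    }

    public static void main(String[] args) throws IOException {
        Properties p = plainHttpProperties("atmos.example.com");
        StringWriter out = new StringWriter();
        p.store(out, "candidate jets3t.properties for plain HTTP");
        System.out.print(out);
    }
}
```

Only switch to plain HTTP if the traffic really does not need encryption; with HTTPS, connection reuse is the setting to look at first.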

If you still have performance issues or further questions, reply here or feel free to email the customer advisory team as well (ecs.customer.advisory.team@emc.com).

mmr11408
1 Nickel

Re: S3 vs. REST performance

Thanks. I ran the test over HTTP, bypassing the load balancer, and got better results (roughly 800 ms per file). But this is still more than 3 times the per-file time of the REST interface.

In the log I see:

"Keep-Alive: timeout=15, max=100, Connection: Keep-Alive"

What is the recommended value for timeout?

How is keep-alive enabled on the client when using jets3t? I tried a few combinations of httpclient properties to no avail.

8 Krypton

Re: S3 vs. REST performance

Can you try using a different tool (like s3cmd or awscli) and see if you get the same response times?  Are you using the IP of the node or a host name?  SSL or non-SSL?  Are you constructing a new instance of the jets3t client for each upload or re-using a global client instance?  All of these can affect application response time.  If possible, enable debug logging to see the headers and the timestamps of each request and response.  This will tell you whether the delay is local or not.

"log4j.logger.org.apache.http.headers=DEBUG" in the log4j.properties file will put the headers in the log.
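
To illustrate the question about client reuse, here is a minimal Java sketch that contrasts building a client per upload with sharing one instance, and times both loops. `StubClient` is a hypothetical stand-in that does no network I/O; it is not a jets3t class, and all names here are assumptions made for the example. With a real client, a fresh instance per upload would typically also mean a fresh connection pool, so keep-alive could not help.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class ClientReuseSketch {

    /** Hypothetical stand-in for an expensive-to-build storage client. */
    static final class StubClient {
        static final AtomicInteger constructed = new AtomicInteger();

        StubClient() {
            constructed.incrementAndGet(); // count how many clients get built
        }

        void upload(String key, byte[] data) {
            // No-op: a real client would send the object here.
        }
    }

    /** Lazily created, process-wide client instance. */
    static final class SharedClient {
        private static StubClient instance;

        static synchronized StubClient get() {
            if (instance == null) {
                instance = new StubClient();
            }
            return instance;
        }
    }

    /** Uploads `files` 100-byte objects; returns how many clients were constructed. */
    static int uploadAll(int files, Supplier<StubClient> clients) {
        int before = StubClient.constructed.get();
        byte[] payload = new byte[100];
        for (int i = 0; i < files; i++) {
            clients.get().upload("file-" + i, payload);
        }
        return StubClient.constructed.get() - before;
    }

    /** Wall-clock duration of an operation in nanoseconds. */
    static long timeNanos(Runnable op) {
        long start = System.nanoTime();
        op.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        long perUpload = timeNanos(() ->
                System.out.println("clients built (new per upload): "
                        + uploadAll(1000, StubClient::new)));
        long shared = timeNanos(() ->
                System.out.println("clients built (shared): "
                        + uploadAll(1000, SharedClient::get)));
        System.out.println("new-per-upload loop ns: " + perUpload);
        System.out.println("shared-client loop ns:  " + shared);
    }
}
```

The same `timeNanos` wrapper can be put around a real upload call to compare client-side elapsed time with the timestamps in the header debug log.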
