Web Server Sizing
By John W. Graham (Issue 3 2001)
Many factors contribute to your choice of Web server configurations and sizing. This article provides guidelines and describes tools to help you determine the appropriate Web server for your business.
Four commonly accepted bottlenecks in computing also apply to Web server sizing and capacity: network bandwidth, CPU, memory, and I/O. Another important factor to consider is content type: static or dynamic.
This article describes each type of bottleneck, shows how to quantify the bottlenecks through common formulas and calculations, and provides advice for sizing Web servers. It also discusses scalability, availability, and future planning considerations, including guidelines for selecting Dell® servers for various usage models.
Sizing the network
Network sizing provides the most formulaic approach to Web server sizing and capacity planning. The goal of network sizing is to ensure that no bottleneck occurs anywhere on the path between the Web server, its network cards, the network, and the client requests and responses.
Determining the size of the network is based on the server use and characteristics. For example:
- The number and frequency of requests sent to the server (hits)
- The size of each request (client request)
- The size of each response (Web server response): the total size of the average Web page, including all objects
At a high level, basic network calculation can be expressed as follows:
b = h x s
- b is the required network bandwidth per second (bits per second)
- h is the number of Web server hits per second
- s is the average size of each hit in bits
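As a minimal sketch, the bandwidth formula above can be evaluated directly. The function name and the sample workload (50 hits per second at 20,000 bytes per hit) are hypothetical values chosen for illustration, not figures from this article. The hit size is supplied in bytes and converted to bits, since the formula expects s in bits.

```python
def required_bandwidth_bps(hits_per_second: float, avg_hit_size_bytes: float) -> float:
    """Return b = h x s, where s is converted from bytes to bits."""
    avg_hit_size_bits = avg_hit_size_bytes * 8  # 8 bits per byte
    return hits_per_second * avg_hit_size_bits


# Hypothetical workload: 50 hits per second, 20,000 bytes per hit.
b = required_bandwidth_bps(50, 20_000)
print(f"Required bandwidth: {b / 1_000_000:.1f} Mbps")  # 8.0 Mbps
```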
Bits and bytes
Networking components measure traffic differently from servers and storage components. Networking components usually measure traffic in bits per second, while servers and storage measure it in bytes per second. For example, a 100 Mbps network can transfer 12.5 MB per second (100 megabits divided by 8 bits per byte). See the calculations in Figure 1. Figure 2 shows the utilization.
Figure 1. Calculating transfer rate for network
Figure 2. Network utilization
Finally, network overhead accounts for approximately 30 percent of network utilization. This overhead results from protocol traffic that carries no page content, such as packet headers. Several vendors offer network card solutions that provide Transmission Control Protocol (TCP) acceleration; that is, network overhead can be off-loaded to a hardware card optimized for establishing and tearing down TCP connections.
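The bits-to-bytes conversion and the overhead allowance can be combined in one short calculation. The sketch below assumes the article's approximate 30 percent overhead figure as a default and treats 1 MB as 1,000,000 bytes; the function name is illustrative only.

```python
def payload_bytes_per_second(link_bps: float, overhead_fraction: float = 0.30) -> float:
    """Convert a link speed in bits/second to usable payload bytes/second.

    overhead_fraction is the share of the link consumed by protocol
    overhead (the article's rough figure is 30 percent).
    """
    usable_bps = link_bps * (1 - overhead_fraction)
    return usable_bps / 8  # 8 bits per byte


link = 100_000_000  # 100 Mbps
raw = payload_bytes_per_second(link, overhead_fraction=0.0)
usable = payload_bytes_per_second(link)
print(f"Raw:    {raw / 1_000_000:.2f} MB/s")     # 12.50 MB/s
print(f"Usable: {usable / 1_000_000:.2f} MB/s")  # 8.75 MB/s
```

Under these assumptions, a 100 Mbps link delivers roughly 8.75 MB per second of page content rather than the raw 12.5 MB per second.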
Estimating number of hits
Accurate sizing requires a good estimate of the average number of hits per day. There are two approaches to estimating the number of hits per day. The first approach presumes that you have access to Web traffic information. In this case, calculate Web traffic and break the day into four-hour windows. Select the number of hits in the highest four-hour window. Multiply this figure by 6 (see Figure 3).
Figure 3. Web server traffic: 24 hours
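Use the peak four-hour period as the reference point. Because a day contains six four-hour windows, multiplying the peak window by 6 gives a hits-per-day figure that assumes the whole day runs at the peak rate. Then divide by 24 to determine hits per hour.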
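The peak-window method can be sketched as follows. The six traffic counts are made-up sample data, and the function name is hypothetical; only the "peak window x 6, then divide by 24" arithmetic comes from the text.

```python
from typing import Dict, List


def estimate_hits(hits_per_four_hour_window: List[int]) -> Dict[str, float]:
    """Size for the peak: treat every four-hour window as if it were the busiest one."""
    if len(hits_per_four_hour_window) != 6:
        raise ValueError("expected six four-hour windows covering 24 hours")
    peak = max(hits_per_four_hour_window)
    per_day = peak * 6          # six four-hour windows in a day
    per_hour = per_day / 24
    per_second = per_hour / 3600
    return {"peak_window": peak, "per_day": per_day,
            "per_hour": per_hour, "per_second": per_second}


# Hypothetical traffic log, one count per four-hour window.
sample = [12_000, 8_000, 30_000, 54_000, 48_000, 20_000]
result = estimate_hits(sample)
print(result["per_day"], result["per_hour"], result["per_second"])
# 324000 13500.0 3.75
```

The per-second figure is the value of h to use in the network bandwidth calculation.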
The second approach most likely applies to building an infrastructure from scratch, which requires a different set of steps. Identify the architecture of the Web site, including number of pages, objects per page, and the target customer. If it is an internal site on an intranet, your calculations can be based on the customer base and company or department size. If the Web site is an Internet site, talk with other people who have similar Web sites and can help you to estimate traffic characteristics.
Determining the processor size
Processor sizing is straightforward if steady-state CPU utilization of less than 80 percent is assured. This is crucial to maintaining efficient response times, because the higher the CPU utilization, the longer the queue, and consequently, the longer the response time per request. If the CPU utilization is above 80 percent, queues grow far faster than linearly as utilization approaches 100 percent. (See Little's Law for more information about queues.)
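To illustrate why queues lengthen so sharply at high utilization, the sketch below assumes the simplest textbook queueing model, a single server with random arrivals and service times (M/M/1). The article does not specify a model, so this is an assumption made for illustration, and real Web servers will behave differently in detail. In that model the mean number of requests in the system is u / (1 - u) for utilization u, and the mean response time is the service time divided by (1 - u), which is consistent with Little's Law.

```python
def mean_requests_in_system(utilization: float) -> float:
    """Mean number of requests queued or in service (M/M/1 assumption)."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return utilization / (1 - utilization)


def mean_response_time(service_time_s: float, utilization: float) -> float:
    """Mean time a request spends waiting plus being served (M/M/1 assumption)."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return service_time_s / (1 - utilization)


# Hypothetical request that needs 10 ms of CPU time when the server is idle.
for u in (0.50, 0.80, 0.90, 0.95):
    print(f"{u:.0%} busy: {mean_requests_in_system(u):5.1f} requests in system, "
          f"{mean_response_time(0.010, u) * 1000:6.1f} ms response")
```

Under this model, the 10 ms request takes about 50 ms at 80 percent utilization and about 200 ms at 95 percent, which is why the 80 percent threshold matters.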
Server-generated dynamic content, such as CGI or Java™ servlets, makes it difficult to size the processor. The processing cost of this dynamic content will be high enough that the processing cost of the Web server itself is not significant. This assumes that the Web server has been correctly configured.
Benchmarking data for the dynamic content generation that will be performed is essential. This data can be used to estimate the number of CPUs and the CPU speed necessary to serve the dynamic content. The estimate also requires knowing the proportion of hits that cause dynamic content to be served, so that the average computational effort per hit can be calculated.
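One way to turn such benchmark data into a CPU count is sketched below. All of the numbers (CPU seconds per static and dynamic hit, the 30 percent dynamic share, 100 hits per second) are hypothetical placeholders for your own benchmark results, and the function names are illustrative. The 80 percent utilization ceiling comes from the guideline earlier in this section.

```python
import math


def avg_cpu_seconds_per_hit(dynamic_fraction: float,
                            dynamic_cpu_s: float,
                            static_cpu_s: float) -> float:
    """Weighted average CPU cost of one hit, given the dynamic/static mix."""
    return dynamic_fraction * dynamic_cpu_s + (1 - dynamic_fraction) * static_cpu_s


def cpus_needed(hits_per_second: float, cpu_seconds_per_hit: float,
                max_utilization: float = 0.80) -> int:
    """Number of CPUs that keeps steady-state utilization under the ceiling."""
    cpu_demand = hits_per_second * cpu_seconds_per_hit  # CPU-seconds needed per second
    return math.ceil(cpu_demand / max_utilization)


# Hypothetical benchmark results: 40 ms per dynamic hit, 2 ms per static hit.
cost = avg_cpu_seconds_per_hit(dynamic_fraction=0.30,
                               dynamic_cpu_s=0.040, static_cpu_s=0.002)
print(f"Average cost per hit: {cost * 1000:.1f} ms of CPU time")
print(f"CPUs needed at 100 hits/s: {cpus_needed(100, cost)}")
```

The sketch treats CPUs as interchangeable and ignores multiprocessor scaling losses, so its result is best read as a starting point to confirm with load testing.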
Sizing the memory
Knowing the resources to be used can help to determine the amount of memory required. For Web servers, consider the byte size of all the software resources running on the server, beginning with the operating system. On Windows, the Available Bytes performance counter can serve as a guide to the size of the unused portion of memory. Allow for 10 percent of memory to be free; this is also known as memory free space.
As a rule of thumb, you can never have too much memory, because server performance degrades heavily when the system has insufficient memory. The amount of memory is generally a balance with the monetary cost. The availability of sufficient memory prevents the server from accessing the disk frequently. It also enhances the end-user experience while resulting in less work on the server.
If users request information that is stored in memory, that information can be retrieved directly from memory rather than accessing the disk.
To calculate memory requirements, add the required memory for the following consumers of RAM:
Operating system and Web server memory usage. See documentation for the applicable operating system and Web server to determine memory usage. In addition, remember that memory is directly affected by the number of concurrent connections.
Generating dynamic content. Generating dynamic content requires extra memory compared to serving static pages. If applicable, consider the amount of memory required to generate dynamic pages and the number of concurrent connections to estimate how much memory is required.
Calculating dynamic and static content ratios. When calculating the ratio of dynamic to static content served on a site, it is easy to overestimate the percentage of dynamic material, particularly because some content that appears dynamic is actually served as static content. A good rule of thumb is that dynamic content is loaded from a database, while static content loads from HTML, text, or other static types of files. Note that many Web sites today have a dynamic front-end and link to static pages. For example, visit www.microsoft.com and drill down a level or two and you will find Active Server™ Pages (ASP) delivering the documents.
Financial sites that provide up-to-the-minute stock market quotations are good examples of a high dynamic content ratio. These sites typically provide 30-40 percent dynamic to 60-70 percent static content. Look closely at your site to assess the ratio of dynamic to static content.
Operating system and Web server caching. Two types of caching are key to Web server performance: operating system file system caching and Web server caching. Be sure to allocate and account for memory for these caches. The system administrator usually controls these settings.
Putting it all together
Once the memory requirement is determined, it should be considered an absolute lower limit below which the server will fail. Add some contingency, then track the actual memory performance (especially the amount of disk access) when the site is implemented. Use the guidelines in Figure 4.
Figure 4. Calculating lower limit of acceptable memory
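The contents of Figure 4 are not reproduced here, so the following is only a sketch based on the memory consumers listed in the text above: operating system, Web server, per-connection memory, dynamic content generation, and caches. Every figure in the example is a hypothetical placeholder for values taken from your own documentation and measurements. The 10 percent free-memory allowance is the guideline given earlier in this section.

```python
def memory_lower_limit_mb(os_mb: float, web_server_mb: float,
                          concurrent_connections: int, per_connection_mb: float,
                          dynamic_content_mb: float, cache_mb: float) -> float:
    """Sum the RAM consumers to get the lower limit of acceptable memory."""
    return (os_mb + web_server_mb
            + concurrent_connections * per_connection_mb
            + dynamic_content_mb + cache_mb)


def with_free_headroom(required_mb: float, free_fraction: float = 0.10) -> float:
    """Total RAM such that free_fraction of it stays free once required_mb is in use."""
    return required_mb / (1 - free_fraction)


# Hypothetical server: all values below are placeholders, not measurements.
lower_limit = memory_lower_limit_mb(os_mb=128, web_server_mb=64,
                                    concurrent_connections=500, per_connection_mb=0.25,
                                    dynamic_content_mb=100, cache_mb=256)
print(f"Lower limit:        {lower_limit:.0f} MB")                       # 673 MB
print(f"With 10% kept free: {with_free_headroom(lower_limit):.0f} MB")   # 748 MB
```

In practice the result would be rounded up to the next available memory configuration.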
Sizing the hard disk drive
Sizing the disk drives and number of drives is as important as determining memory and processor speed. It is important never to exceed 85 percent usage of disk drive space. Base the selection on 85 percent used space and 15 percent free space. For example, with 8 GB of data: 8 GB (data) + 1.2 GB (15 percent of 8 GB, reserved as free space) = 9.2 GB. In this case, select one disk drive close to the 9.2 GB calculation or two smaller disks with a combined capacity of about 9.2 GB.
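The worked example above can be expressed as a short calculation. The first function follows the article's arithmetic (data size plus 15 percent of the data size). The second is an alternative shown only for comparison: it divides by 0.85 so that the data occupies exactly 85 percent of the drive, which gives a slightly larger figure. Both function names are illustrative.

```python
def disk_size_article_method(data_gb: float, free_fraction: float = 0.15) -> float:
    """Article's example: data size plus free_fraction of the data size."""
    return data_gb + data_gb * free_fraction


def disk_size_strict(data_gb: float, max_used_fraction: float = 0.85) -> float:
    """Alternative: smallest capacity at which data fills at most max_used_fraction."""
    return data_gb / max_used_fraction


data_gb = 8
print(f"Article method: {disk_size_article_method(data_gb):.2f} GB")  # 9.20 GB
print(f"Strict 85% cap: {disk_size_strict(data_gb):.2f} GB")          # 9.41 GB
```

The two results differ by only about 0.2 GB here, so either leads to the same drive choice in most cases.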
Optimizing disk performance. Sizing and performance are not the same. Sizing is based on a conservative percentage of performance capabilities to allow for peaks and spikes in usage. Industry practice supports using no more than 80 percent of disk capacity and maintaining 20 percent of disk space for other network requirements. Choose the size of the disk based on the site size, considering the 80/20 ratio. However, Dell recommends employing a 60/40 ratio for multiple data centers, whether existing or planned for the future.
RAID (Redundant Array of Inexpensive Disks) is a disk subsystem that provides disk performance, reliability, or both. RAID is a set of two or more hard disks and a specialized disk controller that contains the RAID functionality.
The theory behind RAID is that instead of using one large drive to store all of the data, a set of smaller drives provides the flexibility to add redundancy and/or increase performance. There are many varieties of RAID, but the most commonly used RAID types are 0, 1, and 5.
RAID-0 offers better performance. RAID-0 is disk striping only, interleaving data across multiple disks for better performance. It does not provide safeguards against failure. For that reason, it is technically not "true" RAID because it does not provide fault tolerance. RAID-0 requires at least two hard disk drives (HDDs) and a RAID controller card.
RAID-1 provides high reliability. RAID-1 is disk mirroring that provides 100 percent duplication of data and provides a small performance benefit. Because both drives contain the same information, the RAID controller can read data from one drive while concurrently requesting data from the other drive. However, write speeds are slower since the controller must write all data twice. While RAID-1 offers high reliability, it doubles storage cost. The system will keep running if a hard disk fails. RAID-1 requires at least two HDDs and a RAID controller card.
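RAID-5 provides both performance and fault tolerance. RAID-5 is one of the most commonly used RAID types. Data is striped across three or more drives for performance, and parity information is used for fault tolerance. In a three-drive array, for example, the parity calculated from the data blocks on two drives is stored on the third, and the parity blocks are rotated across all of the drives so that no single drive holds all of the parity. RAID-5 requires at least three HDDs and a RAID controller card.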
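The capacity cost of each RAID level described above can be compared with a small helper. This is a sketch that assumes identical drives and the standard layouts: RAID-0 uses all drives for data, RAID-1 keeps a full copy on every drive, and RAID-5 gives up the equivalent of one drive to parity. The 18 GB drive size is a hypothetical example.

```python
def usable_capacity_gb(raid_level: int, n_disks: int, disk_gb: float) -> float:
    """Usable capacity of an array of identical disks for RAID levels 0, 1, and 5."""
    if raid_level == 0:
        if n_disks < 2:
            raise ValueError("RAID-0 requires at least two disks")
        return n_disks * disk_gb            # striping only, no redundancy
    if raid_level == 1:
        if n_disks < 2:
            raise ValueError("RAID-1 requires at least two disks")
        return disk_gb                      # every disk holds the same data
    if raid_level == 5:
        if n_disks < 3:
            raise ValueError("RAID-5 requires at least three disks")
        return (n_disks - 1) * disk_gb      # one disk's worth of space holds parity
    raise ValueError("only RAID levels 0, 1, and 5 are covered")


# Hypothetical 18 GB drives.
print(usable_capacity_gb(0, 2, 18))  # 36
print(usable_capacity_gb(1, 2, 18))  # 18
print(usable_capacity_gb(5, 3, 18))  # 36
```

When sizing a RAID array, apply the free-space guideline to the usable capacity rather than to the raw total of the drives.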
Scalability, availability, and planning for growth
Business requirements increase over time. By adding processors, memory, and disks, Dell's PowerApp.web servers and general-purpose servers can scale up within each server, then scale out modularly. Businesses then have the option to scale their network by purchasing servers as needed, thereby realizing a return on investment (ROI) in a shorter time, rather than purchasing one large Web server and hoping to realize ROI.
Recent tests show that the PowerApp.web 120 Windows Powered systems were able to handle 1,110 concurrent users and 30,940 ASP requests per minute with moderate load on the processors, as shown in Figure 5. See the article in Dell Power Solutions for details.
Figure 5. Test results from PowerApp.web 120
Dell PowerApp.cache can off-load Web servers from frequently accessed Web pages, and PowerApp.BIG-IP™ can provide functions in the areas of high availability, Secure Sockets Layer (SSL) termination, and application and content verification. To understand how the PowerApp™ family works together, see "Building a Scalable, Highly Available e-Business with PowerApp Server Appliances" in Dell Power Solutions, Issue 2, 2001.
Suggested Web servers
Dell offers several rack-optimized platforms for Web servers, shown in Figure 6. The following overview shows Dell® platforms and factors to consider.
Figure 6. PowerEdge and PowerApp.web servers
PowerApp.web 110 and PowerApp.web 120. The PowerApp.web 110 and 120 are based on the PowerEdge® 350 and 1550 platforms, respectively. The PowerApp.web family is designed specifically for Web serving.
The family of servers includes a pre-installed OS and Web server. The Kick-Start feature provides the ability to auto-discover remote PowerApp.web servers and configure them over the network, which allows a Web server to be up and running on the network in minutes. PowerApp.web servers also include a secure browser-based Admin Tool.
PowerEdge 1550. This server offers functionality beyond Web serving, such as providing both network infrastructure services and Web serving.
PowerEdge 2550. This server is ideal for customers who want additional capacity and services beyond Web serving, such as infrastructure services, redundant power supply, and/or additional PCI slots.
John W. Graham (email@example.com) is a product marketing manager in the Enterprise Systems Group at Dell, where he manages the PowerApp.web product line and Application Center 2000. Previously he worked at Microsoft on Windows 2000 Advanced Server. John holds degrees in Computer Information Science, and Law and Society.