Monday 3 May 2010

1. How Much Bandwidth Do You Need?

Server bandwidth is the single most important factor in the performance of your web site. The math to determine what bandwidth you need is, in essence, very simple:

hits/second * average size of a hit in bits = bits/second

That is, you need some estimate of the number of hits per second you want to be able to serve. Then you need to know the average size of one of these hits. From this, you know what sort of network bandwidth you need.
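The arithmetic is simple enough to sketch in a few lines of Python. The traffic numbers below are hypothetical, chosen only to illustrate the calculation:

```python
# A minimal sketch of the bandwidth estimate above. The example traffic
# figures (25 hits/second, 10,000-byte hits) are hypothetical.

def required_bandwidth_bps(hits_per_second, avg_hit_bytes):
    """Return the network bandwidth needed, in bits per second."""
    return hits_per_second * avg_hit_bytes * 8  # 8 bits per byte

# 25 hits/second of 10,000-byte pages -> 2,000,000 bits/second,
# i.e., more than a single T-1 (1.544 Mbps) can carry.
print(required_bandwidth_bps(25, 10000))  # 2000000
```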

1.1. Latencies Are More Important than Bandwidth

Once users are beyond ordinary dial-up modems, the number of packets becomes a more significant determinant of web performance than raw bandwidth. This is because each packet must be acknowledged, and while bandwidth keeps increasing, the speed of light is fixed. It may take 20 milliseconds for a 1500-byte packet to travel to a PC on a DSL line, but only 12 milliseconds to move it from the network into the PC. It then takes another 20 milliseconds for the acknowledgment to get back to the sender. So in this case the 40 milliseconds of round-trip latency costs more than three times as much as the transfer time itself, and latency will only grow in relative importance.
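The example above can be sketched as a rough model of per-packet time; the figures (a 1 Mbps line rate, which moves 1500 bytes in 12 ms, and 20 ms of one-way latency) are the ones assumed in the text:

```python
# A rough model of the DSL example above: per-packet time is dominated by
# round-trip latency rather than by serialization (bandwidth) time.

def packet_time_ms(packet_bytes, bandwidth_bps, one_way_latency_ms):
    serialization_ms = packet_bytes * 8 / bandwidth_bps * 1000
    # latency to deliver the packet, plus latency for the acknowledgment
    return 2 * one_way_latency_ms + serialization_ms

# 1500-byte packet on a 1 Mbps line with 20 ms one-way latency:
# 12 ms of serialization time versus 40 ms of latency.
print(packet_time_ms(1500, 1_000_000, 20))  # 52.0
```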

This is why it is so important to keep the number of individual items on a page to a minimum. Still, because most browsers are multithreaded, some latencies can happen in parallel. Experimentation suggests that the best number of embedded images on a page is about the same as the number of threads the browser uses. For example, Netscape uses four threads, so you may get the best performance by breaking a single large image into four smaller ones, so that transfers and their acknowledgments proceed in parallel rather than strictly serially. But this holds only where the browser uses HTTP persistent connections ("keepalives") to avoid the overhead of setting up a TCP connection for each of the four smaller images.

Here are some numbers to help think about latency. While the latency from CPU to memory is on the order of 100 nanoseconds, LAN latency is usually about 1 millisecond, or 10,000 times slower. Going across a campus with its own network, latencies are about 5 milliseconds. Going across the Internet, latencies range from 10 to 500 milliseconds. Satellite links can take a whole second or more.

1.2. Thinking About Bandwidth

You can get some perspective when thinking about bandwidth from the following table. Note that it uses the decimal million (1,000,000) and not the binary "mega" (2^20 = 1,048,576).

Bandwidth comparison

Mode of data transfer                                             Million bits/second  Comments
Fast typist                                                       0.000035   70 words/min × 5 chars/word × 6 bits/char × M/10^6 × min/60 seconds.
4800 bps modem                                                    0.004800   4800 bits/sec × M/10^6. This is also about maximum human reading speed.
POTS sampling rate                                                0.064000   Voice over plain old telephone service is sampled at 64 kbps.
ISDN, two channels bonded                                         0.128000
One million 10,000-byte web hits in one day, evenly distributed   0.925925   10^6 hits/day × 80,000 bits/hit × day/86,400 seconds × M/10^6.
Audio CD                                                          1.411200   44,100 samples/second × 16 bits/sample × 2 (stereo) × M/10^6.
T-1 (DS-1 or primary ISDN)                                        1.544000   Carries 24 POTS channels with 8 kbps overhead.
Ethernet                                                          10.00000
Token Ring                                                        16.00000
IDE hard disk                                                     16.00000   Sustained throughput.
T-3 (DS-3)                                                        44.73600   672 DS-0s, 28 DS-1s, or 7 DS-2s.
FDDI and Fast Ethernet                                            100.0000
ISA bus                                                           128.0000   16 bits @ 8 MHz.
Broadband ISDN                                                    135.0000
ATM                                                               154.0000
250 million 10,000-byte web hits in one day, evenly distributed   231.4812   About one hit for every U.S. citizen.
EISA bus                                                          264.0000
Wide Ultra SCSI disk controller                                   320.0000
100-nanosecond RAM                                                320.0000   32 bits / (100 × 10^-9 seconds) × M/10^6.
Gigabit Ethernet                                                  1000.000
PCI bus                                                           2112.000   64 bits @ 33 MHz.
AT&T Sonet                                                        2400.000   Long-distance fiber trunk.
CPU                                                               3200.000   Hypothetical CPU processing 32-bit instructions at 100 MHz, one per clock cycle.
Human eye-to-brain rate                                           5600.000   Estimated.
Highest achieved fiber optics                                     16000.00   Bell Labs.
Theoretical fiber capacity                                        64000.00

This chart ignores latency, which varies even from bit to bit, and can be huge, especially upon startup of any component. If you're a bit romantic, you can imagine a blurry picture of virtual reality coming into focus over the years in this chart, from short symbol transmissions twenty years ago, to the limit of human audio perception today, to the limit of visual perception in the coming years.

1.3. Estimating Web Server Network Bandwidth

The following table estimates how many hits of a given size (rows) per second a given amount of bandwidth (columns) can handle, after a 30 percent deduction for TCP/IP and other network overhead. Here 10bT and 100bT are 10- and 100-megabit Ethernet. Numbers are truncated to integers, so 0 means "less than one per second" rather than truly zero.

Web server bandwidth requirements
Hit size   28.8K   56K   ISDN(2)   T1    10bT   T3     100bT
1 KB       2       4     10        132   854    3845   8544
2 KB       1       2     5         66    427    1922   4272
4 KB       0       1     2         33    213    961    2136
8 KB       0       0     1         16    106    480    1068
16 KB      0       0     0         8     53     240    534
32 KB      0       0     0         4     26     120    267
64 KB      0       0     0         2     13     60     133
128 KB     0       0     0         1     6      30     66
256 KB     0       0     0         0     3      15     33
512 KB     0       0     0         0     1      7      16
1 MB       0       0     0         0     0      3      8
2 MB       0       0     0         0     0      1      4

You can use the table to estimate, for example, how many 4K files per second your T1 line can handle. The answer is 33. Keep in mind that the table refers just to the network capacity and does not say whether the load was generated by static HTML or CGIs. That is, network capacity says nothing about the capacity of your disk or server CPU, or of a database behind the web server.
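The table entries can be reproduced with a short calculation. This sketch assumes the line rates used in the table (T3 taken as 45 Mbps, 10bT as 10 Mbps) and truncates the result, which matches most, though not quite all, of the entries:

```python
# A sketch of how the table entries are derived: usable capacity is taken
# as 70% of the raw line rate (a 30% deduction for TCP/IP and other
# overhead), and the result is truncated to an integer.

def hits_per_second(line_rate_bps, hit_size_bytes, overhead=0.30):
    usable_bps = line_rate_bps * (1 - overhead)
    return int(usable_bps / (hit_size_bytes * 8))

# A T3 (~45 Mbps) serving 4 KB hits, matching the table's 961:
print(hits_per_second(45_000_000, 4 * 1024))   # 961
# 10bT Ethernet serving 1 KB hits, matching the table's 854:
print(hits_per_second(10_000_000, 1024))       # 854
```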

In fact, the capacity of a server to fill a network connection is distinctly nonlinear. Smaller packets require a larger overhead in terms of interrupts and packet header processing. This means that sending two packets will require more of the server than combining them into one larger packet.

The table is also a bit deceptive in that you will rarely see a smooth distribution of hits filling your network capacity. Rather, there will be peaks of three to four times the average rate per second and long troughs of no load at all.

To scale any of these connection types, you can add more network cards or modems until you reach the maximum number of cards or modems the server has room for. Then you can move up to the next connection type or to a bigger server. This is the easy way to do things, throwing more hardware into a single server. Scaling across multiple servers is typically more complicated, requiring load balancing strategies.

Tuesday 27 April 2010

O-Notation?

O-notation is the most common notation used to express an algorithm's performance in a formal manner. Formally, O-notation expresses the upper bound of a function within a constant factor. Specifically, if g(n) is an upper bound of f(n), then for some constant c it is possible to find a value of n, call it n0, for which any value of n ≥ n0 will result in f(n) ≤ cg(n).
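The definition can be checked numerically. This sketch uses f(n) = 3n^2 + 10n + 10 and g(n) = n^2 (the same function that appears in the example later in this entry); the pair c = 4, n0 = 11 is just one workable choice of witnesses:

```python
# A numeric check of the O-notation definition: f(n) <= c*g(n) for all
# n >= n0, with f(n) = 3n^2 + 10n + 10, g(n) = n^2, c = 4, n0 = 11.

f = lambda n: 3 * n**2 + 10 * n + 10
g = lambda n: n**2
c, n0 = 4, 11

# The bound holds from n0 onward (checked here over a finite range);
# at n = 10 it still fails: f(10) = 410 > 4 * g(10) = 400.
assert f(10) > c * g(10)
assert all(f(n) <= c * g(n) for n in range(n0, 100_000))
print("f(n) <= 4*g(n) holds for all n >= 11 in the checked range")
```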

Normally we express an algorithm's performance as a function of the size of the data it processes. That is, for some data of size n, we describe its performance with some function f(n). However, while in many cases we can determine f exactly, usually it is not necessary to be this precise. Primarily we are interested only in the growth rate of f, which describes how quickly the algorithm's performance will degrade as the size of the data it processes becomes arbitrarily large. An algorithm's growth rate, or order of growth, is significant because ultimately it describes how efficient the algorithm is for arbitrary inputs. O-notation reflects an algorithm's order of growth.

1. Simple Rules for O-Notation

When we look at some function f(n) in terms of its growth rate, a few things become apparent. First, we can ignore constant terms, because as the value of n becomes larger and larger, constant terms eventually become insignificant. For example, if T(n) = n + 50 describes the running time of an algorithm, and n, the size of the data it processes, is only 1024, the constant term already constitutes less than 5% of the running time. Second, we can ignore constant multipliers of terms, because they too become insignificant as the value of n increases. For example, if T1(n) = n² and T2(n) = 10n describe the running times of two algorithms for solving the same problem, n only has to be greater than 10 for T1 to become greater than T2. Finally, we need only consider the highest-order term, because, again, as n increases, higher-order terms quickly outweigh the lower-order ones. For example, if T(n) = n² + n describes the running time of an algorithm, and n is 1024, the lower-order term constitutes less than 0.1% of the running time. These ideas are formalized in the following simple rules for expressing functions in O-notation.

  • Constant terms are expressed as O(1). When analyzing the running time of an algorithm, apply this rule when you have a task that you know will execute in a certain amount of time regardless of the size of the data it processes. Formally stated, for some constant c:

    O(c) = O(1)

  • Multiplicative constants are omitted. When analyzing the running time of an algorithm, apply this rule when you have a number of tasks that all execute in the same amount of time. For example, if three tasks each run in time T(n) = n, the result is O(3n), which simplifies to O(n). Formally stated, for some constant c:

    O(cT) = cO(T) = O(T)

  • Addition is performed by taking the maximum. When analyzing the running time of an algorithm, apply this rule when one task is executed after another. For example, if T1(n) = n and T2(n) = n² describe two tasks executed sequentially, the result is O(n) + O(n²), which simplifies to O(n²). Formally stated:

    O(T1) + O(T2) = O(T1 + T2) = max(O(T1), O(T2))

  • Multiplication is not changed but often is rewritten more compactly. When analyzing the running time of an algorithm, apply this rule when one task causes another to be executed some number of times for each iteration of itself. For example, in a nested loop whose outer iterations are described by T1 and whose inner iterations by T2, if T1(n) = n and T2(n) = n, the result is O(n)O(n), or O(n²). Formally stated:

    O(T1)O(T2) = O(T1T2)

2. O-Notation Example and Why It Works

The next section discusses how these rules help us in predicting an algorithm's performance. For now, let's look at a specific example demonstrating why they work so well in describing a function's growth rate. Suppose we have an algorithm whose running time is described by the function T(n) = 3n² + 10n + 10. Using the rules of O-notation, this function can be simplified to:

O(T(n)) = O(3n² + 10n + 10) = O(3n²) = O(n²)

This indicates that the term containing n² will be the one that accounts for most of the running time as n grows arbitrarily large. We can verify this quantitatively by computing the percentage of the overall running time that each term accounts for as n increases. For example, when n = 10, we have the following:

Running time for 3n²: 3(10)²/(3(10)² + 10(10) + 10) = 73.2%
Running time for 10n: 10(10)/(3(10)² + 10(10) + 10) = 24.4%
Running time for 10: 10/(3(10)² + 10(10) + 10) = 2.4%

Already we see that the n² term accounts for the majority of the overall running time. Now consider when n = 100:

Running time for 3n²: 3(100)²/(3(100)² + 10(100) + 10) = 96.7%
Running time for 10n: 10(100)/(3(100)² + 10(100) + 10) = 3.2%
Running time for 10: 10/(3(100)² + 10(100) + 10) < 0.1%

Here we see that the n² term accounts for almost all of the running time, while the significance of the other terms diminishes further. Imagine how much of the running time this term would account for if n were 10^6.
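The percentages above can be reproduced with a few lines of Python:

```python
# A sketch that reproduces the per-term percentages above for
# T(n) = 3n^2 + 10n + 10.

def term_shares(n):
    terms = {"3n^2": 3 * n**2, "10n": 10 * n, "10": 10}
    total = sum(terms.values())
    return {name: 100 * value / total for name, value in terms.items()}

print(term_shares(10))   # the n^2 term accounts for ~73.2%
print(term_shares(100))  # ...and ~96.7%; lower-order terms fade quickly
```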

Java or C++?

I believe in honesty, not advocacy, and the honest answer is that there is no single answer. What is the organization's policy regarding languages? Must there be one and only one official language? What is the skill level of the staff? Are they gung-ho developers with advanced degrees in computer science/engineering or people who understand the business and have survival skills in software? Do they already have OO skills? In which language(s)? What sort of software development is being done: extending someone else's framework or building from scratch for resale? What sort of performance constraints does the software have? Is it space constrained or speed constrained? If speed, is it typically bound by I/O, network, or CPU? Regarding libraries and tools, are there licensing considerations? Are there strategic partnership relationships that affect the choice of languages? Many of these questions are nontechnical, but they are the kind of questions that need to be answered before the “which language” issue can be addressed.

Regarding the choice between C++ and Java, Java is a simpler language and thus is generally easier to use. However, C++ is more established and allows finer control over resources (for example, memory management), and this is required for some applications. Also, C++ has language features such as destructors, conversion operators, and operator overloading that can be intimidating to many developers but are essential for building frameworks that hide complexity from users. Building the same framework in Java will necessarily expose some internal complexity to the developers who need to extend the framework. So C++ will always be needed, but it isn't the only answer and it isn't always the best answer.

The messages can be summarized neatly. (1) Avoid language bigotry and language wars. (2) Never, never present language trade-offs to management exclusively using technical criteria—when choosing between languages, the most important issues are typically business issues, not technical issues. (3) Don't assume that the organization wants to be on the leading edge. (4) Don't assume that the organization has a primary goal of furthering the careers of its developers.