How Not To Measure Efficiency
In computer science, the cost of an algorithm, or how much computing power and time it takes to run, is a central concern. As programmers and computer scientists, we find it necessary to be able to compare two algorithms to determine which has a smaller cost.
There are many less-than-adequate ways to measure the cost of an algorithm. The most common of these is to measure the real running time of the algorithm, that is, how many seconds it takes to run. While two algorithms can be compared empirically this way, doing so has many drawbacks and significant difficulties.
Different implementations of the same algorithm can give different empirical results. Timing results depend on the language used to write the algorithm, the compiler used to compile it, what data structures and methods the programmer used in coding the algorithm, the innate talent of the programmer, etc. Two implementations of the same “algorithm” can yield extremely different timing results.
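To make this concrete, here is a minimal sketch in Python. The function names (total_indexed, total_builtin, time_call) and the input size are made up for illustration and are not part of the original discussion. Both functions carry out the same algorithm, a single left-to-right pass that adds up a list, yet one is written as an explicit loop and the other hands the work to a built-in. The timings you see will depend on your machine and your Python version, so treat them only as a demonstration that the two numbers differ.

```python
import timeit


def total_indexed(values):
    """Sum the values with an explicit index loop."""
    total = 0
    i = 0
    while i < len(values):
        total += values[i]
        i += 1
    return total


def total_builtin(values):
    """Sum the values with the built-in sum (the same linear scan)."""
    return sum(values)


def time_call(func, values, repeats=5):
    """Return the best wall-clock time in seconds over several runs."""
    return min(timeit.repeat(lambda: func(values), number=1, repeat=repeats))


if __name__ == "__main__":
    data = list(range(200_000))  # arbitrary size, chosen only for the demo
    # Same algorithm, same answer...
    assert total_indexed(data) == total_builtin(data)
    # ...but the measured times will usually not match.
    print("indexed loop:", time_call(total_indexed, data))
    print("built-in sum:", time_call(total_builtin, data))
```

Neither number says anything about the algorithm itself; the gap between them comes entirely from how the same steps were coded.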
Platform dependency is also a hurdle for empirical data. Let’s say I tell you that algorithm 1 ran in 10 seconds on computer 1 and algorithm 2 ran in 20 seconds on computer 2. Which algorithm is better? If you’re able to give me an answer, think again. I haven’t told you anything about either machine. One of them could be using a 25 MHz processor while the other could be using a 1000 MHz processor. One of them could be using a RISC chip while the other could be using a CISC chip (if this doesn’t make any sense to you, don’t worry about it). One of the machines could have many users using it concurrently while the other’s resources could be allocated exclusively for this algorithm.
“But wait,” you say, “why can’t we just run both algorithms on the same machine? Won’t this solve the problem?” Yes, it will solve THIS problem. But there are others.
Algorithms do something. That may seem like a simple and dumb statement, but it really isn’t. The purpose of an algorithm is to solve some problem, to do something. But how big is this problem? In other words, what is the input size? One algorithm may run better on small inputs while another runs better on large ones. Let’s say we have two sorting algorithms, and we run them both on the same machine. We have algorithm 1 sort 100 data elements, and it takes 100 seconds. We have algorithm 2 sort 100 data elements, and it takes 200 seconds. So is algorithm 1 better? Now let’s run them both on 1,000 data elements. Algorithm 1 takes 10,000 seconds and algorithm 2 takes 2,000 seconds. What happened? Is algorithm 2 now better? As you can see, the ratio of their running times depends on the input size: algorithm 1 took half as long as algorithm 2 on 100 elements, but five times as long on 1,000.
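The arithmetic behind that example can be checked with a short Python sketch. The timings are the ones quoted in the text; the helper names (ratio, growth) are hypothetical and exist only for this illustration.

```python
# Timings in seconds, taken from the example in the text.
timings = {
    100: {"algorithm 1": 100, "algorithm 2": 200},
    1000: {"algorithm 1": 10_000, "algorithm 2": 2_000},
}


def ratio(n):
    """Time of algorithm 1 divided by time of algorithm 2 at input size n."""
    return timings[n]["algorithm 1"] / timings[n]["algorithm 2"]


def growth(name, small=100, large=1000):
    """Factor by which one algorithm's time grew between two input sizes."""
    return timings[large][name] / timings[small][name]


for n in timings:
    print(f"n = {n}: algorithm 1 / algorithm 2 = {ratio(n)}")
for name in ("algorithm 1", "algorithm 2"):
    print(f"{name}: time grew by a factor of {growth(name)}")
```

The ratio is 0.5 at 100 elements and 5.0 at 1,000. The input grew by a factor of 10 in both cases, but algorithm 1’s time grew by a factor of 100 while algorithm 2’s grew by only a factor of 10. A single timing at a single input size hides exactly this difference.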