How many calculations can a computer chip perform per second? A friend needs to know for an article.
Posted by Jane Galt at December 9, 2003 6:06 PM
I think you're going to need to whittle that question down just a smidge. It's kind of like asking how much horsepower an engine can generate. Simple question. Complex answer.
Depends on the computer. My old Apple IIgs could do 0.07.
Your friend might find this link helpful. Lists a whole bunch of computers in calcs per second form.
http://www.innovationwatch.com/waves_digit_time.htm
For example: The Intel Pentium IV chip performs 42 million calculations per second
The answer depends on the clock speed of the processor (such as 1GHz, 2.2GHz, etc.) and the average number of instructions that can be completed per clock cycle (also known as IPC). The two numbers are multiplied together.
The IPC depends on a lot of factors: what type of software you are running, how good the CPU is at predicting what will happen next, but mainly on how often it has to fetch information from main memory instead of from its registers or cache. Such an event is called a cache miss, and its frequency is usually modeled as a probability.
All this is calculated as a MIPS number (millions of instructions per second). The manufacturer of a processor will usually have this in their specifications. You can only trust the manufacturer's numbers so much though--they'll over-exaggerate.
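As a rough sketch of the multiplication described above (clock speed times IPC, expressed as MIPS), something like the following could be used. The function names and the IPC values are illustrative assumptions, not manufacturer figures.

```python
def instructions_per_second(clock_hz: float, ipc: float) -> float:
    """Clock rate (cycles per second) times average instructions per cycle."""
    return clock_hz * ipc


def to_mips(ips: float) -> float:
    """Convert instructions per second to millions of instructions per second."""
    return ips / 1e6


if __name__ == "__main__":
    # Clock speeds are the examples from the comment; IPC values are assumed.
    for clock_hz, ipc in [(1.0e9, 0.8), (2.2e9, 1.5)]:
        mips = to_mips(instructions_per_second(clock_hz, ipc))
        print(f"{clock_hz / 1e9:.1f} GHz at IPC {ipc}: about {mips:.0f} MIPS")
```

The same clock speed can give very different MIPS figures depending on the IPC assumed, which is one reason published numbers are hard to compare.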
To complicate it even further, you also have different types of computer instructions, such as floating-point operations (measured in FLOPS), which are usually executed in different parts of the processor, often at the same time as other instructions but at a different rate...
So the answer is that there isn't really a general answer to this question because it varies with every computer.
As other comments have said, it all depends on the processor.
The Virginia Tech supercomputer, built entirely of Power Macs, goes at 9.55 trillion operations a second:
http://www.wired.com/news/mac/0,2125,61005,00.html
By comparison, my poor HP Athlon can only get a little over 4000 mips. The Athlon is close to the Intel P3, I think.
A good order of magnitude answer for current technology is about a billion, depending on your definition of every other word in the sentence (especially calculation and processor). If you're talking state of the art, then it could be 10 or 100 billion per second, again, depending on how persnickety you set your definitions. Most current processors can theoretically perform several operations per clock cycle, and reach not too too far from that optimum in actual operation. So with 2-3 GHz processors, and a few operations per clock cycle, you end up with maybe 10 billion operations per second, give or take.
Note that Zygote's link either contains erroneous information or contains rather a lot of typos (perhaps they mean billion instead of million).
What, nobody's taking advantage of this teachable moment???
A CPU's speed (usually in gigaHertz, or billions of cycles per second, these days) refers to the number of cycles per second the CPU goes through. Back when I was in school, things were pretty simple, because the CPU and motherboard ran on the same cycle, and the shortest operation took one cycle (some ops took 3 or 4 cycles, as I recall). An operation is something like "transfer this number from here to a register", "add these two registers", or "transfer this register to that place in memory".
So, my old Apple II, which ran at 1.0 megaHertz (1,000,000 cycles per second), was probably doing around 400k operations per second.
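The Apple II estimate above is just the clock rate divided by the average number of cycles per operation. A minimal sketch follows; the 2.5-cycle average is an assumption chosen because it reproduces the 400k figure, not a measured value.

```python
def ops_per_second(clock_hz: float, avg_cycles_per_op: float) -> float:
    """Operations per second for a CPU that needs several cycles per operation."""
    return clock_hz / avg_cycles_per_op


if __name__ == "__main__":
    clock_hz = 1_000_000  # 1.0 MHz, as stated in the comment
    for cycles in (1.0, 2.5, 4.0):  # assumed averages, for illustration only
        print(f"{cycles} cycles/op: {ops_per_second(clock_hz, cycles):,.0f} ops/sec")
```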
Nowadays, the CPU runs faster than the motherboard, so (as Travis says above), when the CPU needs to access regular memory, it has to wait a while. I think a typical motherboard bus runs at 200 megaHertz to support a 2.4 gigaHertz CPU, for example. With the CPU running 10 times the speed of the mobo, you can see a lot will depend on how often the CPU needs to access system memory, and how often the number it needs is available in local cache. If the CPU constantly needed to access memory, it would handle only 10% of the operations it could if it always had the values in cache.
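The 10% figure above can be reproduced with a very simple stall model. This sketch assumes every operation needs one operand, a cache hit costs one CPU cycle, and a miss costs as many CPU cycles as the CPU-to-bus clock ratio. Real memory systems are more complicated, and the function name and hit rates are made up for illustration.

```python
def relative_throughput(hit_rate: float, clock_ratio: float) -> float:
    """Throughput relative to the all-hits case.

    hit_rate    -- fraction of operand fetches served from cache (0.0 to 1.0)
    clock_ratio -- CPU cycles spent waiting on one trip to system memory
    """
    avg_cycles = hit_rate * 1.0 + (1.0 - hit_rate) * clock_ratio
    return 1.0 / avg_cycles


if __name__ == "__main__":
    ratio = 10  # the round number used in the comment (2.4 GHz / 200 MHz is 12)
    for hit_rate in (1.0, 0.99, 0.9, 0.5, 0.0):
        share = relative_throughput(hit_rate, ratio)
        print(f"hit rate {hit_rate:.0%}: {share:.0%} of peak throughput")
```

Under these assumptions even a 90% hit rate costs nearly half the peak throughput, which is why cache behaviour dominates the real-world numbers.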
And, to make it a bit more complicated, Pentium chips have added capabilities to perform operations in fewer cycles, but I'm not up on that these days.
Now, don't we all feel educated? (Savor the moment before people start correcting me.)
Another problem: Megan asked about "calculations" rather than operations. When you talk about, say, multiplying two numbers together, that's one calculation but it's many many operations. Zygote's link talks about "calculations per second", but I can't find any backup on the number displayed. (42 million calculations per second sounds about right, but I'm suspicious because there are 42 million transistors in the Pentium IV; an odd coincidence?)
Actual results using QuickBASIC4 program below.
This is an old DOS program, but the ~10:1 increase in processor speed gives only a 4:1 improvement in calculation speed.
I would expect that a modern Java or C program would be considerably faster.
Intel 2GHz Celeron: 1.14 secs/million floating point operations [megaflops]
Intel 230MHz Pentium: 4.28 secs/megaflops
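' Time 100 million COS() calls (10000 x 10000 loop passes).
DIM starttim AS SINGLE
DIM endtim AS SINGLE
DIM j AS INTEGER
DIM k AS INTEGER
DIM x AS SINGLE
CLS
PRINT TIME$
starttim = TIMER          ' TIMER returns seconds since midnight
FOR j = 1 TO 10000
  FOR k = 1 TO 10000
    x = COS(k)            ' counted here as one floating-point operation
  NEXT k
NEXT j
endtim = TIMER
PRINT TIME$
PRINT "100 megaflops took "; endtim - starttim; " seconds."
BEEP
END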
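For anyone without a QuickBASIC interpreter handy, here is a rough Python sketch of the same kind of timing loop, plus the ratio arithmetic behind the "~10:1" and "4:1" figures quoted above. It is not a port of the original benchmark: the loop count is an assumption (1 million calls rather than 100 million), and Python's interpreter overhead means the timings are not comparable with the BASIC results.

```python
import math
import time


def time_cos_loop(n_calls: int) -> float:
    """Return the seconds taken to evaluate math.cos n_calls times."""
    start = time.perf_counter()
    x = 0.0
    for k in range(1, n_calls + 1):
        x = math.cos(k)  # result is discarded, as in the BASIC loop
    return time.perf_counter() - start


def speedup(old_secs: float, new_secs: float) -> float:
    """How many times faster the new timing is than the old one."""
    return old_secs / new_secs


if __name__ == "__main__":
    n_calls = 1_000_000  # assumed loop count, smaller than the original
    elapsed = time_cos_loop(n_calls)
    print(f"{n_calls} cos() calls took {elapsed:.3f} seconds")
    # Figures quoted in the comment: 2GHz Celeron vs 230MHz Pentium.
    print(f"clock ratio: {2000 / 230:.1f}:1")
    print(f"measured speedup: {speedup(4.28, 1.14):.1f}:1")
```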
I think your friend is asking the wrong question.
Modern microprocessors don't execute instructions directly the way old ones did, where each requested operation corresponded to a precise number of steps in the CPU, performed in the order that each request arrived. Rather, the instructions -- which are still with us for compatibility reasons -- are broken down into a simpler, internally-recognizable format (decoded), re-ordered for the best possible execution efficiency given the requested commands and the processor's available execution resources (scheduled), actually manipulated (executed), and then finally completed (retired) at which stage the output is delivered.
The Pentium4 complicates things a bit by using a small "trace cache" at the decode stage, but I'm already beyond the scope of the question. Point being, a "calculations per second" figure is correspondingly hard to state because it may vary substantially depending on the code being fed into the machine.
If your friend can change his/her phraseology to "operations per second," s/he will probably have much better luck finding a good citation by simply performing a
site:intel.com pentium 4 operations per second
search, or similar, in Google.
Jane, Jane, Jane...
Type "calculations per second" into fricking Google!!!
The third match will work just fine.
I'm thinking about starting a blog so sycophantic twentysomething males can drool over my picture and I can have my readers do my homework.
Free Mindles!!!
AUTHORITATIVE ANSWER:
(I work on the Pentium 4 Processor professionally.)
The question is underspecified because there are many different kinds of calculations, as others have already pointed out. Since this is for "an article" I assume your friend is just looking for the most impressive-looking number.
Ignoring all regard for measuring the amount of useful work accomplished, and calculating the theoretical top sustainable throughput of the simplest possible operations, the answer is 9.6 billion calculations per second (for a 3.2GHz Pentium 4 Processor).
I stress that this number is nearly devoid of useful information. You cannot use it meaningfully in any comparisons. It is not a predictor of the performance of useful workloads. It should only be used for the purpose of getting Ooohs and Aaahs from people who do not care about the details.
It's quite possible that there's a DSP (digital signal processor) somewhere that has a higher calculation rate, but that would be a special-purpose device and I assume not what your friend is interested in.
Alternatively, if you want to consider useful calculations, you could use figures from the G5 supercomputer, which has 1100 dual 2GHz Mac G5s in it and does 10.28 trillion calculations/second, which gives a remarkably similar 9.3 billion per second per machine.
9 billion is a fine number to use - remember Clarke's 'The Nine Billion Names of God'?
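The two headline figures quoted above can be checked with simple division. This sketch uses only numbers that appear in the comments; the function name is made up, and reading 9.6 billion as "operations per cycle times clock rate" is an interpretation, not something the Pentium 4 commenter spelled out.

```python
def per_unit(total: float, units: float) -> float:
    """Divide a total rate evenly across a number of units."""
    return total / units


if __name__ == "__main__":
    # 9.6 billion calculations/sec on a 3.2 GHz Pentium 4.
    ops_per_cycle = per_unit(9.6e9, 3.2e9)
    print(f"implied simple operations per clock cycle: {ops_per_cycle:.1f}")

    # 10.28 trillion calculations/sec spread over 1100 dual-processor G5s.
    per_machine = per_unit(10.28e12, 1100)
    print(f"per G5 machine: {per_machine / 1e9:.2f} billion/sec")
    print(f"per G5 processor: {per_machine / 2 / 1e9:.2f} billion/sec")
```

Note that the 9.3 billion figure is per dual-processor machine, so per chip it comes out at roughly half that.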
"All this is calculated as a MIPS number (millions of instructions per second)."
Oh, sure, that's what they say. But we all know it really means Meaningless Indication of Processor Speed.
"I'm thinking about starting a blog so sycophantic twentysomething males can drool over my picture and I can have my readers do my homework."
Well, what's stopping you?
I believe the question involved calculations performed by a computer chip, which pretty much disqualifies the cluster machines cited above.
And, as has been pointed out, that number can vary quite a bit. PCs are, as also has been pointed out, limited in their speed and access to memory. With mainframes, this is a bit different, so a "slower" mainframe might actually have a much faster throughput. So, obviously the higher clock-rate could win a contest where a single calculation is done, but lose big where it comes to a stream of calculations.
I suggest this site as a place where you can compare processors.
As was also mentioned (albeit in passing), there are a number of things that the processor has to do before it can actually perform a calculation. First, it has to fetch an instruction from memory. Second, it has to execute that instruction, which may involve multiple (dunno what the maximum is now; I cut my assembly-language teeth on an 8080) instructions, and then place the result in memory. So, it has to access memory at least twice, and execute several clock-cycles' worth of machine code. You're going to have at least 2 times when you're hitting the memory off-cycle and have to wait for it, followed by the write time delay and bus propagation delay...
"Underspecified" questions leading to "Overexaggerated" results?
There can be only one solution. Poetry.
More than the grains of rice in a rich man's bowl.
More than the leaves in summer upon the trees of a dozen forests.
But fewer than the number of errors a Congressman can commit within a single term of office and still be re-elected.
AUTHORITATIVE ANSWER:
∀x ∀y ((x is even ∧ y is even) → (x + y is even))
x = 2m where m is an integer [x is even]
y = 2n where n is an integer [y is even]
x + y = 2m + 2n [substituting the definitions from the first two lines]
x + y = 2(m + n) where m + n is an integer [factoring out 2]
2 | x + y ["2 divides x + y"]
so x + y is even [the conclusion]
Here's a more algebraic method:
(p → q) ∧ (p → q') → p'
(p' ∨ q) ∧ (p' ∨ q') → p'
(p' ∧ (p' ∨ q')) ∨ (q ∧ (p' ∨ q')) → p'
(p' ∧ p') ∨ (p' ∧ q') ∨ (q ∧ p') ∨ (q ∧ q') → p'
p' ∨ (p' ∧ q') ∨ (q ∧ p') ∨ F → p'
p' ∨ (p' ∧ q') ∨ (q ∧ p') → p'
p' ∨ (p' ∧ (q' ∨ q)) → p'
p' ∨ (p' ∧ T) → p'
p' → p'
p'
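For the skeptical, both claims above can be checked mechanically. This sketch brute-forces the truth table for "if p implies q and p implies not-q, then not-p", and spot-checks the even-plus-even claim over a small range. The range limit and function names are arbitrary choices, and a finite check of the arithmetic claim is of course not a proof.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material implication: a -> b."""
    return (not a) or b


def is_tautology(formula, n_vars: int) -> bool:
    """True if formula holds for every assignment of its n_vars booleans."""
    return all(formula(*vals) for vals in product([False, True], repeat=n_vars))


def contradiction_rule(p: bool, q: bool) -> bool:
    """((p -> q) and (p -> not q)) -> not p"""
    return implies(implies(p, q) and implies(p, not q), not p)


def even_sums_are_even(limit: int) -> bool:
    """Check x + y is even for all even x, y with absolute value below limit."""
    evens = range(-limit + (limit % 2), limit, 2)
    return all((x + y) % 2 == 0 for x in evens for y in evens)


if __name__ == "__main__":
    print("rule is a tautology:", is_tautology(contradiction_rule, 2))
    print("even + even is even (|x|, |y| < 100):", even_sums_are_even(100))
```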
In the poetry of the "Pouncer" I have finally found one more cynical than I. Thank you.
Ed, I think you misplaced your quotation marks. It should read:
In the "poetry" of the Pouncer.
If you are wondering how fast they can actually get transistors to run, they are up to 509GHz.
http://www.news.uiuc.edu/scitips/03/1106feng.html
It will be a while before processors are going that fast though.
I am really puzzled that this question was asked...seems easy to google up.
"Calculations" can mean many things, but probably the most relevant is "Floating Point Operations" (FLOPS). You get faster results if you stick to integer calculations in the size most favored by the machine, but that's far less useful than floating point calculations. There are benchmarks for measuring FLOPS - but you really don't know until you run the particular code on a particular machine. Often it takes longer to read the operands and store the results than to do the calculation, so the memory speed, pattern of data storage in memory, and the operation of the cache have larger effects in MMX Pentiums than the speed of the arithmetic circuits. Also watch out for compilers that recognize the most common benchmark programs and short-cut the process - e.g., instead of multiplying 1.23456E30 * 9.0001E-16 a billion times, it calculates once and recalls this result 999,999,999 times... Or it rearranges a long series of calculations so all the data can be held in registers. The "9.6 billion calculations" mentioned above is certainly holding all the data in registers (so it's not applicable to real-world requirements for billions of different numbers), and might be integer rather than floating point too.
For a given number of transistors, a DSP will beat the hell out of a general purpose CPU chip on array calculations - because the DSP is optimized just for doing arithmetic. Also, the DSP is designed to pump array data from memory through the arithmetic unit and back to memory as efficiently as possible, maybe even using multiple address/data busses. But the DSP sucks at other normal CPU work like servicing I/O devices, running the operating system, making decisions, and slinging strings around. Pentium 4's might beat a DSP in FLOPS, but that's because DSP design has lagged behind. There isn't much call for a DSP that is priced like a P4.
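The compiler short-cut mentioned above (calculating a repeated product once and reusing it) can be illustrated by counting the multiplications in each version. This is a hand-written sketch of the idea, not the output of any real compiler; the function names and the loop count are assumptions.

```python
def multiply_every_time(a: float, b: float, n: int):
    """What the benchmark author intended: n separate multiplications."""
    result = 0.0
    multiplications = 0
    for _ in range(n):
        result = a * b
        multiplications += 1
    return result, multiplications


def multiply_once(a: float, b: float, n: int):
    """What an optimizer may do: compute the invariant product a single time."""
    result = a * b
    multiplications = 1
    for _ in range(n - 1):
        pass  # the remaining passes just reuse the stored result
    return result, multiplications


if __name__ == "__main__":
    a, b = 1.23456e30, 9.0001e-16  # operands from the comment
    n = 100_000  # assumed loop count, far below the comment's billion
    print(multiply_every_time(a, b, n))
    print(multiply_once(a, b, n))
```

Both versions produce the same answer, so a benchmark that only checks the result and the elapsed time cannot tell how much arithmetic was actually done.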
Yeah, markm, I'm sure that's the answer she was looking for.
The real question is: How much wood can a woodchuck chuck if a woodchuck could chuck wood?
How many boards
would the Mongols hoard
if the Mongol hordes
got bored?
- Calvin (& Hobbes)
A woodchuck would chuck as much wood as a woodchuck could chuck if a woodchuck could chuck wood.
BTW, a processor spends a lot of time moving info in and out of registers before it performs an actual calculation. There are also pipelines, speculative execution, etc., going on. You cannot measure calculation by speed directly, much as the person working on the PIV implied.
Man, lots of Geeks™ here. Cool.
All day yesterday I was waiting and hoping Brad DeLong would take this one up as one of his "One Hundred Interesting Mathematical Calculations" ...
*sigh*.
I'm reminded of a psychology/usability study which questioned how much a computer user actually noticed the speed of the programs he was using. AFAICR, the average user noticed when the response time (the interval between pushing the key and the screen being refreshed and released to accept new input) INCREASED by as little as 40%. They noticed it slow down, that is. However, the response time could DECREASE by around 700% (make the machine faster and more responsive) before most users would report noticing any improvement at all.
Dunno, Pouncer. If I was running a simulation that took two minutes per run, and was able to obtain even a 300% increase in processor speed, I think I'd notice it right away.
Just like benchmarks, you have to take this sort of result in context.
Pouncer:
"However, the response time could DECREASE by around 700% (make the machine faster and more responsive) before most users would report noticing any improvement at all."
I dunno... I think I'd notice if a machine went from responding in x seconds to responding in merely -x seconds, let alone -6x seconds.
Ah, but percents don't go negative when starting from a positive, do they? :)
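The percentage quibble above comes down to the difference between "the response time decreased by N%" and "the machine got N times faster". A small sketch of both conversions follows; the function names are made up, and reading the study's figure as an 8x speed-up is a guess at what was meant, not something the study is known to say.

```python
def time_after_decrease(original_secs: float, percent: float) -> float:
    """Response time after literally subtracting percent% of the original."""
    return original_secs * (1.0 - percent / 100.0)


def percent_decrease_for_speedup(factor: float) -> float:
    """Percent drop in response time when the machine is factor times faster."""
    return 100.0 * (1.0 - 1.0 / factor)


if __name__ == "__main__":
    # Taken literally, a 700% decrease from 1 second gives a negative time.
    print(time_after_decrease(1.0, 700))
    # A machine 8 times faster (a 700% increase in speed) cuts the time by:
    print(percent_decrease_for_speedup(8.0), "percent")
```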
Wow, that's a lot of brain power up there. I think for the purposes of your friend's paper, you could say either "millions" or "trillions" and be accurate. They are even abbreviated MIPS and TIPS in the computer world for Millions(Trillions) of Instructions Per Second.
Quantum materiae materietur marmota monax si marmota monax materiam possit materiari?
I wasn't sure which storage file I had this in so I tried google. 1,620 hits, which is at least as relevant to the discussion as the translation itself.
On my old Apple II at 1,000,000 megahertz. The machine instructions took 2 to 6 cycles (MH) so I always figured the average at about 3.? - which is maybe 300,000 per second.
The people doing the programming used a lot of machine code - so the programs ran very fast.
Nowadays nearly all programming is done in high-level languages which translate into UNBELIEVABLY inefficient machine code. That's why people keep having to buy faster machines. The machine code for newer programming is OBSCENELY cumbersome.
An Apple II running at 1 terahertz! This I absolutely must see -- does it finish a game of Pacman before actually loading it? :)