Lawrence Stewart

The early Cray machines were so fast for two reasons. The first was that Seymour Cray was a genius of a computer architect. The second was that he was willing to burn a lot of power.

I heard him speak once at Stanford, just wow. You can study the machine designs yourself.

The other factor I know a little about. The Cray machines were built with emitter-coupled logic, which was very, very fast, but used a lot of power compared to other circuit designs for logic gates. The transistors in ECL operate in their linear regions and never saturate. This lets them switch very fast, much faster than other schemes, but the downside is that they are always using a constant amount of power, even when just sitting there. ECL designs are usually careful to always have balanced circuits, so when current increases somewhere, it decreases somewhere else, which leads to constant-current power supplies. A third design point is that ECL gates are capable of driving low-impedance loads, which lets you arrange the wiring of the computer as transmission lines. You get incident-wave switching, which cuts wire delays in half, again at the expense of power.

It is a general rule of thumb that you can build a computer with a clock cycle that permits about ten gate delays between clock edges. The Cray 1 ran at 80 MHz, which, more or less, requires roughly 1 nanosecond gates at a time when most people were building computers out of TTL with 10 or 15 nanosecond gates, plus TTL's horrible signal qualities and asymmetric output stages. Schottky TTL was a little faster, but with even worse signalling, because the high switching rates would cause glitches everywhere in nearby high-impedance wiring.
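A quick back-of-the-envelope check of that rule of thumb (my arithmetic, not part of the original answer):

    1 / 80 MHz          = 12.5 ns per clock cycle
    12.5 ns / ~10 gates ≈ 1.25 ns per gate delay

so the logic family really did need roughly nanosecond gates, which ECL could deliver and the TTL of the day could not.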

ECL signalling is usually 50 ohm transmission lines, with beautifully smooth switching transients. A moderate level of discipline among the board designers can get you reliable logic boards without any drama.

In the same way that true expert generals study logistics, Cray studied power and cooling, the logistics of computer systems.

Brett Bergan

A Cray supercomputer or any supercomputer these days is an extremely high speed network with enormous amounts of RAM and millions of compute cores.

Each cabinet of a Cray supercomputer is about the size of a large refrigerator and houses dozens of rackmount CPU/GPU nodes, each often equipped with 1 TB or more of RAM. Each node of the Summit supercomputer at Oak Ridge has a pair of 22-core IBM Power9 CPUs and six water-cooled Volta V100 GPUs, for a total of 44 CPU cores and 30,720 GPU cores per node.

One node of a Cray supercomputer is a blade housing multiple CPUs and multiple GPUs along with about a terabyte of RAM.

Each cabinet contains 24 of these “nodes,” which are connected via extremely high-speed NVLink 2.0 interconnects. The number of cabinets can vary from dozens to hundreds.

Joshua Gross

I'll add a bit more detail to expand Jim's answer. I am not sure that Seymour Cray ever worked for IBM; I've never seen evidence of that claim. He worked for a rival of IBM, the companies that became Sperry Rand, and later helped found Control Data Corporation (CDC). Cray was one of the most gifted computer engineers and businessmen of the 1950s to early 80s.

He began at ERA, an early computing company that was bought by Remington Rand, the company that produced the UNIVAC. That company merged with Sperry, forming Sperry Rand, which "survives today as Unisys." Frustrated, Cray joined several colleagues who founded CDC. Cray worked to build faster and faster computers, focusing on the speed of the system, not just a fast CPU. This led to the CDC 6600, the first supercomputer. In 1972, Seymour Cray left to start his own company, Cray Research. Cray was well known for not being a team player, and he eventually started other companies, continuing as a contractor for Cray Research.

In the 1970s, 1980s, and into the 1990s, a Cray usually held the record for the fastest supercomputer. Not long before Cray's untimely death, changes in the supercomputing market led Cray Research to be sold to SGI, with most of the assets later being sold to Tera Computer Company in 2001. Tera was renamed Cray Inc., but by this time, Seymour Cray's remaining contribution was indirect. In 2019, the company was purchased by Hewlett Packard Enterprise, and it operates as a wholly owned subsidiary. Cray continues to play a role at the top end of the supercomputer market.

Regardless of the status of the companies he founded, Seymour Cray remains one of the most influential computer engineers, even decades after his death.

Tony Li

The other answers here miss out on the key architectural aspect:

Cray supercomputers were designed for vector processing. They had the amazing ability to take an array of numbers and stuff them through the floating point unit and produce one result every cycle because of the pipelining of the floating point unit.

Pipelining is an incredibly important and easy way to get parallelism and parallelism gives you enormous performance gains.

We do this a bit today. Internally, processors are deeply pipelined so that even a normal instruction stream has multiple operations going on in parallel. However, we don’t code for this case, so we end up with bubbles in the pipeline all of the time. We achieve more parallelism by having multiple cores, but this is only a small multiplier for performance.
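To make the vector idea concrete, here is a minimal sketch in C (my illustration; Cray codes of that era were typically Fortran) of the kind of loop the vector hardware was built for:

    /* y[i] = a*x[i] + y[i] over a whole array: no branches and no
       dependences between iterations, so the loads, multiply, add,
       and store can all be overlapped in pipelined functional units,
       retiring roughly one result per clock once the pipeline fills. */
    #include <stddef.h>

    void saxpy(size_t n, double a, const double *x, double *y)
    {
        for (size_t i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }

On a Cray this became vector loads, a chained multiply-add over 64-element vector registers, and vector stores; on today's CPUs the same loop is what SIMD units and auto-vectorizing compilers chew through.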

Jim Mowreader

Seymour Cray, in addition to being one of the smartest men to ever work on computers, was a firm believer in the Brute Force and Ignorance school of computer design. The whole Cray-1 computer only contains four integrated circuit part numbers: an ECL-technology NOR chip with two gates in it (a 5-input and a 4-input) used for computation; a Monolithic ECL-technology NOR chip with the same gate configuration used for addressing; a 16x4-bit static RAM chip he used as a shift register; and a 1024x1-bit static RAM chip he used as main memory. ECL and static RAM are exquisitely fast, but they’re also power hungry.

Cray had a really slick method of powering the system. You didn’t wire the thing directly to the power grid; ECL chips, in addition to being power-hungry, are completely intolerant of electrical fluctuations. The Cray system revolves around the Motor-Generator scheme. There is a 150-kilowatt generator head in the system being driven by a 300-horsepower electric motor. The machine is liquid cooled…and it seems that when you bought a Cray-1, they decommissioned your furnace during installation of the computer. The computer produces so much heat it’ll keep your building warm. The curious thing is, Cray loved to boast his computer was so good it didn’t have a regulated power supply, which leads to two questions: (1) where the hell are you going to find a 150kW voltage regulator that doesn’t need to be bolted to a concrete slab out back, and (2) what do you think the MG set is doing?
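For a sense of scale (my conversion, not part of the original answer): 300 horsepower works out to

    300 hp × 0.746 kW/hp ≈ 224 kW

of mechanical input spinning a 150-kilowatt generator head, so there is comfortable margin, and the rotating mass of the motor-generator set is what rides through the line fluctuations the ECL parts cannot tolerate.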

Valdis Klētnieks

What ever happened to the Cray supercomputers?

The famous early ones were, of course, taken out to pasture decades ago, because the average smartphone is 100 times faster. There’s still a fair number of them in various computer museums.

However, the company is still in business. After a trip through ownership by SGI, it was sold to another company, which then renamed itself Cray Inc. That company then bought Appro, which was the #3 supplier of systems on the Top500 list at the time. Meanwhile, HP sold off its PC business and rebadged itself HPE (Hewlett Packard Enterprise), and then bought Cray in September 2019.

They landed the Titan system at the top of the Top500 list in 2012; Titan was just decommissioned in July 2019. Cray/HPE is still the #5 vendor on the Top500 as of November 2019, with a healthy 7% of the systems on the list.

Eugene Miya

They, by and large, run a version of Unix called UNICOS (tm; briefly named CX-OS), which is a version of Unix that does not use virtual memory.

LLNL briefly ran an important new set of ideas called NLTSS. Quora User covers some of this (we were both briefly in the High Speed Processor group). I think LLNL also briefly experimented with running a version of Plan 9. I need to talk to ken about that (2 weeks).

Someone built a 2nd Cray X-MP emulator and got COS 1.17 running (why, I have no idea; UNICOS runs more efficiently than it and with more memory). And I need to talk to Chris, who did a 1/10th-scale Cray. Pretty much all the earlier Cray OSes, like NCAROS, the one which the NSA developed (ask me if you insist on the name), etc., got dropped.

Addendum: 3 people outside Cray Research are directly responsible for asking for Unix: myself (informally, in another Ames organization), Creon Levit (fresh out of Washington Univ.), and his officemate Mark Aaker (who is possibly the most important person in getting NASA running TCP/IP). Creon and Mark were employees 5 and 6 at the new NAS supercomputer facility. Their bosses, Don Senzig and Ron Bailey, gave consideration to a new LLNL OS named NLTSS using a souped-up, new, untested protocol called LINCS.

To appreciate any of this, you have to understand supercomputers and the supercomputer market. They ignore instruction set compatibility. It’s speed above all else. IBM on the other hand could not bring themselves to do this following the consent decree and didn’t even try to get back into supercomputers until 1988.

[The difference between the Cray-1 and the Cray-2 was dropping the B- and T-registers in favor of a new high-speed “Local Memory”. See Russell’s CACM paper on the Cray-1 to understand these, or at least see the architectural diagram. An LM is NOT a cache. I should have introduced Cray to Alan Jay Smith (“Mr Cache Memory”); I did not. The Cray-3 would also use Local Memory, but Cray was not able to use it effectively, or to place B- and T-registers back into the Cray-4, before his untimely death. This is part of why the 1 and 2 were in no way instruction set (object code) compatible. You might think this was stupid, but you probably could never have competed with Cray the man.]

NASA and other users of Cray hardware wanted an OS like VMS (DEC’s proprietary 32-bit OS, written in BLISS). They didn't have a clue why this was not possible on a technical and administrative level. Mel Kalos (with Nick Metropolis) also wrote that he wanted a VAX/VMS system with 10,000x the speed to do Monte Carlo computations (I have that reference someplace).

At first Creon drank the NLTSS Kool-Aid(tm). I convinced him this was a bad idea, and even gave him a copy of Lions' book (people who recall v6 will know what this meant). And Mark, his officemate, saw the light of DARPA’s TCP/IP over SNA, X.25, and protocols like DECnet. The other later important addition Mark suggested, as the hardware neared completion, was to include an Ethernet controller (it might have been slower than HYPERchannel and UltraNet and other things), but it was simple, as a back door to repair things.

Back to Unix. The good thing was that Bell Labs had purchased a Cray-1 for circuit design and simulation. They would later upgrade to an X-MP/24(?). Most of their users used C, so the port of the pcc(1) compiler was done before CRI even considered C a useful language. By the X-MP (one CPU running COS and the other running the Guest OS, as the K&P Software Tools port was called), the color (you buy, you get to select) was instead wallpaper of a Bellmac-32.

Needless to say, NASA management outside the NAS, and many in the NASA supercomputer user community, were critical of the Unix decision. Numerous 3rd-party consultations took place, most notably Gene Levin of RIACS, who endorsed the NAS’s (Ron Bailey’s (Ron was SC’89 chair)) decision. I was one of many, along with Creon, Mark, Don, Ron, Andy, and others, who took verbal arrows in our backs.

The DOE was particularly incensed. And there were CRI-internal people who thought you needed assembly-language-based OSes for best performance (NLTSS wasn’t; it was written in LRLTRAN/CVC, a version of Fortran with dynamic memory). They offered a “no confidence” vote. And 3 firms, labelled “Crayettes” (the term was printed in Datamation): Scientific Computer Systems (SCS, makers of the SCS-40 and SCS-30 (CHM needs one)), firm #2 (I need to recall the name, maybe American Supercomputer with Mike Flynn), and Mike Fung’s Supertek, while not producing true supercomputers (mini-supercomputers), would be able to take up the slack. CTSS would be the fallback, and they were also instruction-set compatible with (and addressing-limited to) the Cray-1/X. And none of these systems used virtual memory (too slow). I have ex-DOE people ticked off at me to this day. They never understood the dynamics of Silicon Valley despite being distinguished physicist bomb designers.

Unix on a Cray (first the 2, then X-MPs, and later Y-MPs) was a success. A secret contracted study (Yate’s Ventures) told CRI that they would sell “30–40 machines” if Unix were running on a Cray. About 30 Cray-2s were made. Almost 100 Xs and 1s were sold, and over 100 Ys and subsequent C-90s with varying numbers of CPUs.

In 1985, I invited ken (Thompson) by Ames. We had been acquaintances (dmr was separate; that’s another, different story) since about 1977/8. I got ken to give a rare seminar. He had also just won the Turing Award. The talk was on the condition that he NOT talk about Unix; that was old material. I said OK, and said, hey, how about telling us about Belle, your chess engine? Sure. But you know you will get Unix questions. That was acceptable. Let this be a lesson to people inviting researchers to give talks on history. ken’s was the only seminar I arranged and introduced which got a standing ovation before he said a word. I put a recent photo of ken on a dmr question.

The main competition in the USA was a spin-off of CDC (where Cray came from), ETA Systems, which had to write a new OS from scratch (EOS); big mistake. CDC users wanted the Cyber 205 OS, VSOS (another mistake), and the FTN200 Fortran compiler (too old).

The main competition outside the USA were the Japanese makers: Fujitsu, Hitachi, and NEC, with IBM compatible machines. They produced advanced compilers; they did their US homework (academic), and it took a few years for the rest of the USA to catch up (e.g., Convex). These were very respectable machines in their own right. I ran on 2 of these (because I had an IBM JCL background). But by this time, all the Unix based workstations (before the IBM RS/6000 RISC) were dominating the US market.

I should also note that there was an internal effort at Cray Research called Cray Labs in Boulder, CO (which portended a future move to Colorado Springs) for the Cray-2, which also suggested Unix as the 2’s OS. The Labs closed later. Give my friend Gordon Garb credit for this (trying).

This is the short, linearized version of events. Some of this is technically covered in CRI’s Tim Hoel’s Winter Usenix paper of 1986. I was 1 of 2 paper chairs (the other guy was an IBM guy). History is rarely this clean sounding.

I’ve written too much.

Roy Longbottom

I have just submitted a report to ResearchGate:

https://www.researchgate.net/publication/359171179_Cray_1_Supercomputer_Performance_Comparisons_With_Home_Computers_Phones_and_Tablets

A brief description is:

The main comparisons are based on benchmark results used to verify performance of the first Cray 1, with variations from two similar vintage benchmarks. Maximum MFLOPS performance, 100% vectorisation, and multiprocessor effects are also considered. Samples of historic benchmark results are included along with others for the latest compilations. Bottom line: a mid-range 2021 laptop is indicated as being between 226 and 2671 times faster than the Cray 1, across 12 comparisons, dependent on the particular application. For lower-range ARM processors, the 2020 Raspberry Pi produced gains between 25 and 400 times. My 2020 mid-cost Android-based phone achieved 74 to 757 times. There is no one answer.

Included is my history, identifying my qualifications to answer this question, with hands-on Cray 1 experience.

Dani Richard

The second half of your question / statement is incorrect.

Regular computers are NOT powerful enough to solve the problems that supercomputers solve.

The purpose of supercomputers is to solve problems regular computers cannot solve, because regular computers lack the necessary computing power.

The budget for a supercomputer is very, very large, as they are very, very expensive.

Only governments and some very large companies can afford to buy or build a supercomputer.

There are many physics problems that require supercomputers to solve.

I suggest you look up "Computational Physics" for examples of such physics problems that may require supercomputers.

Doug Freyburger

Just another box running just another version of UNIX. I supported the JPL Cray running UNICOS for Caltech campus users around 1986–7. No different from my other server computers, really.

Physically the story was different. Liquid cooled electronics. So the users had posters of the cool looking hardware that acted like a fast version of any other server computer.

John Maria

The question is vague. Better at what? A super computer is like a train, capable of hauling massive loads to a few predefined depots. Like a train, a supercomputer takes a lot of effort to load and schedule. Highly skilled and experienced people operate trains.

Servers are like big rigs, they cannot carry as much as a train, yet have more flexibility as to where they can travel, how fast, and how often.

PCs are like cars and pickup trucks, some fast, some slower, yet easy to drive, can be used almost any time and almost anywhere.

So define better. A super computer is better at simulating weather patterns, a genome, or crash testing a new automobile design. A $400 desktop or laptop is more cost effective at accessing web pages and playing solitaire.

Jim Wetterau

Cray supercomputers are a brand of large scale computers made by the Cray Computing company. Seymour Cray was one of the developers of the IBM 360 who then went off on his own to found Cray Computers to compete with IBM mainframes. He was one of the few competitors to be successful in the mainframe arena because his machines were more powerful and lower cost than their IBM competitors.

I stand corrected. Cray did indeed work for Control Data, and it was Gene Amdahl who worked for IBM before founding Amdahl Corporation. Both of them were pioneers of the supercomputer era.

Eugene Miya

Was?

Depends which model, and depends how many people you are sharing it with. A shared supercomputer, i.e. a fraction of a supercomputer, stops being a supercomputer at some fraction.

It takes serious thought if you expect fully efficient performance. Each model was also the fastest scalar machine of its time. Arrays become your primary data structure, and you have to give serious thought to branching and loops. You tend to have to avoid recursion (it was still the fastest LISP machine, faster than hardware LISP machines).
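As a concrete illustration of the branching point, here is a small hypothetical C sketch (mine, not Eugene's, and the codes of the day were Fortran): a data-dependent if inside a loop rewritten as a branch-free merge, so the whole loop can stay in the vector unit:

    #include <stddef.h>

    /* Branchy form a vectorizer may refuse:
       for (i = 0; i < n; i++)
           if (x[i] < 0.0) y[i] = 0.0; else y[i] = x[i];            */
    void clamp_negatives(size_t n, const double *x, double *y)
    {
        for (size_t i = 0; i < n; i++) {
            double keep = (double)(x[i] >= 0.0);  /* 1.0 or 0.0, no branch */
            y[i] = keep * x[i];                   /* merge by multiplying  */
        }
    }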

It never used virtual memory, so you never had to worry about demand paging. If you had to ask about the costs, you could not afford it. Striped disks were faster than normal disks.

Various models had some great extra architectural features not found on commodity mainframes or micros or minis, like the Hardware Performance Monitor (HPM; it got me a thumbs up from Knuth, and got me my 3rd invitation to speak at NSA). There are other features on non-standard Crays barely documented in the open literature.

Lawrence Stewart

I can’t speak for the classic Crays, but I’ve used the 160,000-core “Hopper” Cray XE6 at Lawrence Berkeley Labs (but only at scales up to 8,192 cores!). It was a very busy machine, so for the most part you submit jobs and wait. That was kind of irritating. On the other hand, 5 GB/sec between any two nodes. Sweet!

As close as I got to the classic Seymour Cray machines was hearing him speak at Stanford and visiting the National Cryptologic Museum at Fort Meade. I did get a chance in the ’90s to design hardware with ECL though, and that was a very pleasant experience.

Michael Bauers

There were multiple Cray processors, but we can talk about the first Cray.

Wikipedia says it was the first computer to “successfully” implement vector processing. But it was not the first outright; “successfully” is something I won’t get into here.

Vector processing, I believe, relies on pipelining, which you can think of like an assembly line, like one I have seen at a sub sandwich shop.

The Cray did use some integrated circuits, but this isn’t the VLSI seen later on with microprocessors.

I had read that the C shape of the Cray was about getting shorter wires. At high enough speeds, distance starts to matter more in getting data from point A to point B.
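Rough numbers to back that up (my arithmetic, not the answerer's): signals in wire travel at something like two-thirds the speed of light, and the Cray-1 clock was 80 MHz:

    signal speed in wire ≈ 0.67 × 30 cm/ns ≈ 20 cm per ns
    clock period         = 1 / 80 MHz      = 12.5 ns
    1 m of extra wire    ≈ 5 ns of flight time ≈ 40% of a cycle

so shaving tens of centimetres off the longest runs bought back a meaningful slice of every clock period, which is what the C-shaped chassis was for.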

Quora User

Seymour Cray is no longer alive and Cray Inc. has become a subsidiary of the ailing HPE company. Cray Computer Corporation went bankrupt. He also designed some successful systems for the former Control Data Corporation. CDC was broken up, but part of it lives on as Ceridian.

They may make fast computers, but it’s not a large organisation, nor do they have high revenues. Today HPE Cray is the vendor of 20% of the Top500 systems, including the current fastest computer on the planet, ORNL Frontier.

Ken Cain

They are incredibly detailed and built by hand to order. You rarely walk in and say “I'd like to buy a Cray. I'll pick it up this afternoon.”

Brett Bergan

Nothing particularly special.

One of my favorite modern tragedies is the story of the Roadrunner at Los Alamos. In 2008 it was the fastest computer in the world, being the first to ever break the 1.0 petaFLOPS barrier. Sadly, by 2013 it was decommissioned, torn down, and sold for scrap. All of the electronics were shredded rather than recycled. The cost to build Roadrunner was $100M.

The OS and most computation ran on 13,824 AMD (dual-core) Opteron processors. In tandem with each Opteron were a pair of IBM PowerXCell 8i “accelerator” chips that had eight cores each.

Today, the use of AMD EPYC processors is very widespread. Today’s newest “Turin” EPYC 9965 has 192 CPU cores on each socketed processor. A pair of these offers 384 cores, 768 threads, and 768 MB of L3 cache, running at a rather low peak of 3.7 GHz.

The Cray Frontier, however, does not have nearly such ambitious processors. The newest and fastest Oak Ridge system only uses the 64-core EPYC 7713, running at a peak core clock of 2 GHz.

With 9,472 processors, this is still a whopping 606,208 total CPU cores. Frontier uses a staggering 21 megawatts of power, which is 10X the power Roadrunner used. However, with its 7nm Zen 3 superchips, it manages to eke out 1000X the compute power, making it overall 100X more efficient. Frontier was not only the fastest computer in the world until two months ago, but it was also the most efficient.
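Running the answer's own figures through the obvious ratio (my arithmetic, nothing new added):

    efficiency gain ≈ (1000X the compute) / (10X the power) = 100X the FLOPS per watt

which is where the overall 100X efficiency claim comes from.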

The new king of the hill is El Capitan, with its AMD EPYC APU design. Each 24-core APU has a 228-compute-unit Instinct GPU on the same package. El Capitan must use about 43,000 processors, compared to about 9,400 in Frontier, due to the smaller chips. El Capitan uses 40 megawatts to not quite double the performance of Frontier. Consequently, Frontier remains the more efficient of the two.

David Ecale

Eugene Miya has a very good answer. I’d like to add to it:

There are a few extra points that need to be made. LTSS, which evolved into CTSS, worked quite well with the single-CPU Cray-1 line. Cray, however, developed the Cray Operating System (COS) in order to sell its systems commercially, as LTSS & CTSS were US Government-owned operating systems.

In about 1984 (2 years before I joined Cray), the Board decided that neither CTSS nor COS was the future. While COS worked well with multiple-CPU Crays (the X-MP line), the future expansion into ever more exotic systems was putting a strain on the company.

A solution was suggested: AT&T System V UNIX. Cray obtained a license, and ken & Dennis Ritchie came to Mendota Heights to perform the port. The basic port worked, and the Cray system folks went to work. … And that’s where things got interesting.

Cray UNICOS (as it was later named) development split into two armed camps: the X-MP line, to be followed by the Y-MP and C90 lines, and the Cray-2 line. Each had specific ideas on what was important to optimize.

Needless to say, this rivalry went on for a loooooooooooong time. The Cray-2 was somewhat unique in that it had a “foreground” processor which dispatched jobs to the other processors (there were 2 single-CPU Cray-2 prototypes (called Quarter Horses for obvious reasons), 27 4-CPU Cray-2s, and one 8-CPU Cray-2 (sn2028)).

The UNICOS X-MP crowd ultimately prevailed for a time, as the Cray-2 UNICOS variant was spun off with the Cray-3 (which never made it to market) when Mr. Cray left the company.

But then, what happened? The Cray MPP, and also the development of large clusters of blade microprocessor systems. UNICOS couldn’t keep up, and in the end, the successor company that took over Cray went to Linux.

//DISCLAIMER: I learned Cray Assembler programming on a Cray X-MP48 running COS with a front-end UNIX box from Pyramid Computer Systems in early 1986. I watched Cray change over the next 18 years. My one negative observation is that the OS folks tried to optimize too much. UNIX was designed to be an 80/20 operating system: 80% mid/high-level and 20% hardware-specific. Both UNICOS development teams tried to push this to a 60/40 split. And, while the objective was noble, the speed of hardware change and the advent of the SuperMicros ultimately put paid to all of that extra work.

Mark Hahn

Of course not - it never did. There were competitors such as NEC.

The big difference is that “classic” supercomputers were basically bespoke devices, assembled in fairly low quantities (hundreds) from relatively small building blocks, often with fairly exotic technology (ECL). All of that changed when RISC and Dennard Scaling took CMOS to the forefront. “Attack of the killer micros” was the phrase of the day (well, decade - the entire ’90s).

Nowadays, supercomputer engineering is mainly about effective clustering: efficient interconnect, efficient packaging, efficient acceleration, applied to very conventional-looking CMOS microprocessors.

I’m not saying Cray was unimportant - those machines obviously contributed a lot to the eventual path forward (through RISC, which might be described as “putting Cray-like architectural approaches onto CMOS”).

Today, Cray produces fine systems, and there’s certainly something to be said for a company with deep history. They seem to aim for a “premium” segment of the supercomputer market, though.

Yohan Levi

Supercomputers are incredibly powerful machines used for a vast range of tasks that require immense computational power. Their purposes are diverse and span numerous fields, generally focusing on solving complex problems that traditional computers simply can't handle. Here are some key areas where supercomputers play a crucial role:

Scientific Research:

  • Simulating complex physical phenomena: From modeling the universe's evolution to designing new materials, supercomputers tackle intricate simulations that can unlock deeper scientific understanding.
  • Weather forecasting and climate change research: They create advanced weather models to predict future events and analyze vast datasets to study climate change trends.
  • Drug discovery and development: By simulating interactions between molecules, they aid in designing new drugs and personalized medicine approaches.
  • Genomic sequencing and analysis: Supercomputers process massive amounts of genetic data to understand diseases, develop targeted therapies, and personalize healthcare.

Engineering and Design:

  • Aerodynamics and aerospace engineering: Simulating aircraft and spacecraft designs for improved performance and fuel efficiency.
  • Nuclear fusion research: Modeling complex plasma states to achieve sustainable energy production through nuclear fusion.
  • Automotive engineering: Optimizing car designs for safety, performance, and fuel efficiency using simulations.
  • Structural engineering: Analyzing the behavior of bridges, buildings, and other structures under various conditions.

Artificial Intelligence and Machine Learning:

  • Training large language models like me! The immense processing power of supercomputers enables training AI models on vast datasets, leading to more advanced capabilities.
  • Analyzing satellite imagery and other big data: Extracting insights from massive datasets for applications like self-driving cars, anomaly detection, and financial forecasting.
  • Development and advancement of AI algorithms: Supercomputers accelerate the research and development of new AI algorithms and techniques.

Other Applications:

  • Cryptography and cybersecurity: Breaking and designing encryption codes for securing sensitive information and defending against cyber threats.
  • Financial modeling and risk assessment: Analyzing vast economic datasets to predict market trends and assess financial risks.
  • Oil and gas exploration: Modeling geological formations to identify potential oil and gas reserves.
Eugene Miya

I’m out on a scientific field expedition in AK, but I can only outline something here quick and get back to it.

Older Crays were single-processor (or small-n), hardwired, byte-ignorant (a good thing for them) vector processors with fast memories lacking caches (if your computer had a cache, you could not afford fast memory; no joke, more on that in a moment). They started with a number of custom OSes (if your org bought a Cray, your people had to know how to write an OS; then I came along and convinced our management to ask for Unix, and the rest is a little ugly history which helped the company (++)). Fortran was the typical language. No virtual memory. VM was evil slow. Also, the Cray was the fastest LISP machine ever (Dick Gabriel’s MIT PhD thesis).

If you thought you could build a fast memory, I can assure you Cray could build a memory an order of magnitude, or even two, faster. Also, big memories were and remain important. No toy memories, no toy-sized problems. I’ve watched maybe half a dozen attempts all fall short (this does not mean that they didn’t have interesting ideas, like the Tera, and the Denelcor HEP before that).

This is why the Cray-2 was, and is, a quite respectable machine in its own right. You had to place an application code using more than 1..16 MW (megawords) of memory onto the Cray-2 to fully appreciate it. You didn’t do this on the basis of tiny benchmarks. Otherwise, for 2-D and 3-D codes on the Cray-1 or X-MPs, you had to use overlays. (What’s that?) This is another reason why Cray programmers were seen as balls-out macho.

Excepting specialized modern machines, which I know a little about, I’ll describe the unclassified machines known from the Top500 and a few classified machines. It makes a difference.

Modern supercomputers are clusters of tightly coupled, commodity Intel CPUs (with a multitude of “cores” [this is confusing because early computer memories were also called cores, which obscures facts in ways that muddy a number of software questions]), using commodity interconnects and slightly modified commodity software. People care less about efficiency now. The problems are still largely the same (see Grand Challenges on Wikipedia).

Now these machines use message-passing systems, starting with PVM, then MPI, and other systems. Be assured that, despite what some people like Wolfram, Seitz, and Fox say, much of the algorithm development community does not like message passing. They program it grudgingly. Vector machines were popular for reasons (I have paper references).
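For readers who have never seen it, here is a minimal, hypothetical MPI sketch in C of what programming it grudgingly looks like: every data movement between processes must be spelled out by hand, unlike a vector loop where the hardware does the streaming for you.

    /* Build with an MPI compiler wrapper (e.g. mpicc) and run with at
       least two ranks (e.g. mpirun -np 2 ./a.out).                    */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        double buf[4] = {1.0, 2.0, 3.0, 4.0};

        if (rank == 0) {
            /* the programmer, not the hardware, decides who sends what to whom */
            MPI_Send(buf, 4, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(buf, 4, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("rank 1 received %g ... %g\n", buf[0], buf[3]);
        }

        MPI_Finalize();
        return 0;
    }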

Some specialized supercomputers: Deep Crack, the Chudnovskys’ Pi machine, the Celera genomics (formerly Paracel) machines, specialized graphics engines for real-time flight simulators (not PC gaming machines), optical-bench FFT machines, etc. A few analog machines. I am also leaving a couple out.

You wanted supercomputers for designing your next thermonuclear weapon or reading your superpower opponent’s encrypted messages (they use a special word for those). Oh, and a decent weather (wx) forecast.

Gotta run, will get back.

Michael Rutledge

They basically went the way of all single system based supercomputers.

That is to say they were outpaced by commodity general purpose servers operating in a distributed system architecture.

This style of architecture allowed for far greater versatility and expansion. They allowed for systems to be designed to fit the intended workload and not the other way around. Also this allowed for far greater lifecycle management since the individual components could be upgraded as new technology became available.

These days almost all supercomputers are designed around this style of architecture; I doubt you’ll find any SSI-style systems in the Top500 listings, or at least extremely few of them.

Dani Richard

When I got my G4 Cube in 2000 at 500 MHz, it computed 136 MFLOPS on Linpack. It was equal to the Cray-1S from 1975.

When I got my G5 iMac in 2005 at 1.68 GHz, it computed 1 GFLOPS (1 billion FLOPS, 1,000 MFLOPS) on Linpack. A four-processor Cray X-MP from 1985 was 900 MFLOPS.

My current iMac from 2017 is 3.6 GFLOPS for each of the 4 processors (3.6 x 4 = 14.4 GFLOPS).

Samuel Harley

The Cray-1, introduced in 1976 (that’s just 43 years ago), was considered a blockbuster advance. It was constructed of ECL (emitter-coupled logic) gates, which are faster but more power-hungry than the TTL gates of the day. It employed a pipelined vector-processing architecture, which cut down on memory access for a large class of problems.


It had a 64-bit processor that operated at 80 MHz (note the ‘M’), had 8.39 Megabytes of memory, a 303 Megabyte disc drive, and achieved 160 MFLOPS.
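Those headline figures are consistent with each other (my arithmetic, not the answer's): the Cray-1's floating-point add and multiply pipelines could each deliver one result per clock when chained, so

    80 MHz × 2 floating-point results per clock = 160 MFLOPS peak.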

Cray sold nearly a hundred of these machines at $7.9 million apiece (1977 dollars).

Today you can purchase a desktop computer with TFLOPS performance, gigabytes of memory, and giant disc drives for a few thousand dollars.

The ‘benches’ around the base of the Cray 1 house the power supplies, and the circular configuration kept the interconnect cables as short as possible.

Sukhdeva Negi

Cray supercomputers have used a variety of processors over the years, including:

Early Processors

1. Cray-1 (1976) : 80 MHz Cray-1 processor, a 64-bit vector processor.

2. Cray X-MP (1982) : 105 MHz Cray X-MP processor, a 64-bit vector processor.

3. Cray Y-MP (1988) : 167 MHz Cray Y-MP processor, a 64-bit vector processor.

Later Vector and RISC Processors

1. Cray C90 (1991) : 244 MHz Cray C90 processor, a 64-bit vector processor.

2. Cray J90 (1994) : 100 MHz Cray J90 processor, a 64-bit vector processor.

3. Cray T3E (1995) : 300 MHz DEC Alpha 21164 processors, 64-bit RISC processors.

x86 Processors

1. Cray XT3 (2004) : 2.4 GHz AMD Opteron processor, a 64-bit x86 processor.

2. Cray XT4 (2006) : 2.6 GHz AMD Opteron processor, a 64-bit x86 processor.

3. Cray XT5 (2008) : 2.3 GHz AMD Opteron processor, a 64-bit x86 processor.

4. Cray XE6 (2010) : 2.0 GHz AMD Opteron processor, a 64-bit x86 processor.

Intel Processors

1. Cray XC30 (2012) : 2.6 GHz Intel Xeon E5-2600 processor, a 64-bit x86 processor.

2. Cray XC40 (2014) : 2.3 GHz Intel Xeon E5-2600 v3 processor, a 64-bit x86 processor.

3.Cray XC50 (2016) : 2.2 GHz Intel Xeon E5-...

Dani Richard

Question: How fast is a Cray supercomputer?

Answer: Which one?

The Cray-1S from 1975 was rated at 160 MFLOPS. My 2000 Macintosh G4 500 MHz Cube, I measured at 136 MFLOPS.

The Alabama Supercomputer Center has a Cray X-MP with 4 processors, and it was rated at 900 MFLOPS. My 2005 G5 1.86 GHz iMac was measured at 1 billion FLOPS (1,000 MFLOPS).

Kishore Prabhala

At the top with the USA, PARAM is on a roll, as the Centre for Development of Advanced Computing (CDAC), a firm floated by the Government of India, put a great team in place, with IISc and IUT having a massive role to play.

DRDO has a much greater interest, as space is the frontier where India needs to be on par with the USA, Russia, China, France, and the UK, along with weapons development.

There has been great research going on, as the RISC-V architecture will grow to more than 496 cores, and parallel processing is still appealing.

There is enough brainpower in India to sustain this, as TCS, Infosys, and other firms continue to be involved, and the government needs them.

Quora User

Where does India stand in developing supercomputers and why?

Thanks for the A2A.

Param 1, designed by CDAC (Centre for Development of Advanced Computing) in the 1980s, was one of the world's fastest supercomputers; later we never heard anything more about either the supercomputer or CDAC.

Now computing has grown enormously; it's like 2G competing with a 4G network. We all know how much the government funds these agencies; people have almost forgotten these institutions of excellence. Still, they have a lot of courses on advanced computing, embedded systems, cloud, etc., which are costly and tough to join via an entrance exam.

What can we expect in a country where the Jio Institute can be named the best university even before it admitted its first batch of students.

Sashank Reddy

India has supercomputer technology in limited numbers and has rarely featured in the top 50. Supercomputers are used in compute-intensive applications like weather modelling, scientific research, aerospace, and drug research. IISc hosts the only known civilian academic supercomputer in India, and it does not feature in the top 500. The low number of supercomputers and the low rankings are mainly due to our attitude toward research, which is seen in our research budgets. We spend less than 1% of GDP on research.

A single new supercomputer needs at least $2 billion to be anywhere in the top 100. India has only 2 of them, both fielded by the weather departments.

1. Indian Institute of Tropical Meteorology : rank 39
2. National Centre for Medium Range Weather Forecasting : rank 66

There is no other Indian system in the top 500 list.

We are simply not spending on supercomputing let alone researc...
