Did Sony Help Build the 360's Processor?

Keane Ng

New member
Sep 11, 2008
5,892
0
0
Did Sony Help Build the 360's Processor?



Sony [http://www.sony.com/index.php]'s years-long quest to build the world's most powerful chip, the PS3's Cell, may have inadvertently led to the creation of the Xbox 360's own processor, according to a new book by the people who designed both chips.

The story, related in "The Race for a New Game Machine [http://www.amazon.com/Race-New-Game-Machine-Playstation/dp/0806531010]" by chip designers David Shippy and Mickie Phipps, goes like this: Microsoft approached IBM and asked them to build a chip for their own next-gen console, what would later become the Xbox 360. In 2003, IBM's Adam Bennett showed Microsoft the specs for the Cell, which was still in the midst of development. Microsoft then contracted IBM to build a chip, using what had been built thus far for the Cell as the core of the design.

Sony and IBM had agreed that IBM would be able to sell the Cell to whoever it wanted, but Sony hadn't figured IBM would be so eager to do so, and especially not to its biggest rival. The deal led to some awkward office situations: "IBM employees hiding their work from Sony and Toshiba engineers in the cubicles next to them; the Xbox chip being tested a few floors above the Cell design teams," The Wall Street Journal [http://online.wsj.com/article/SB123069467545545011.html] wrote in its review. It wasn't all fun and games, though. Shippy wrote that he felt "contaminated" as "he sat down with the Microsoft engineers, helping them to sketch out their architectural requirements with lessons learned from his earlier work on PlayStation."

In the end it sounds like another cautionary tale of the corporate hubris commonly associated with Sony's handling of the PS3. IBM eventually delivered both designs to manufacturing on time, but there was a problem with the first run. Microsoft had a plan B and had placed a backup manufacturing order at another facility. Sony, for whatever reason, didn't, and had to wait for the IBM factory to get up and running again. So, in a sense, Microsoft got the chip that Sony had helped design before Sony did. The 360 made its launch date and reached the market first, the PS3 suffered delays, and the rest is history.

Shippy and Phipps are quick to stop their story from turning into a tale of how Sony may have lost the console war, however. "Both Sony and Microsoft were extremely successful at achieving their goals," they write. Whether or not you believe that, here's a whole new perspective on how we got where we are today.

[Via Eurogamer [http://www.eurogamer.net/articles/sony-helped-design-360-processor]]



Joeshie

New member
Oct 9, 2007
844
0
0
Wait, I thought that the Cell processor and the Xenon processor were extremely different from one another. I've always assumed that's one of the reasons why the PS3 was so much more difficult to program for compared to the 360.
 

CoverYourHead

High Priest of C'Thulhu
Dec 7, 2008
2,514
0
0
So why don't they bond and make a mega console of death? So then I can have my Metal Gear and... I don't know, whatever franchise happens to be the talk of the town on the 360.
 

Jumplion

New member
Mar 10, 2008
7,873
0
0
Sorry guys, you're a little late to the punch [http://www.escapistmagazine.com/forums/read/9.82108?page=1] :p
 

Blank__

New member
Oct 9, 2008
78
0
0
"More difficult to program for" is because of shoddy APIs, not the hardware. That's why Dreamcast was/is the pimp-poop; Microsoft ported DirectX for it. Sony, the big bunch of whiners, must have had to develop their own interfaces for PS2/3 development, as they'd never pay Microsoft money to use their excellent, well-established suite of interfaces known as DirectX.

But as #2 pointed out, if they were writing in pure assembly (which, um, doesn't happen), development for the two would be virtually identical -- they're both RISC processors manufactured by IBM!

I guess this just proves the real winner is IBM.
 

Aardvark

New member
Sep 9, 2008
1,721
0
0
After working for IBM, I can see them copying from one project to another to save R&D time.

That doesn't explain why the things explode and burn people alive, though.
 

dragon376

New member
Jan 6, 2009
1
0
0
Yes, they are both equal indeed... except that one of them is an unfinished copy of the other, built cutting so many corners that it had a 33% failure rate... so similar and yet so different...
 

BloodSquirrel

New member
Jun 23, 2008
1,263
0
0
Joeshie said:
Wait, I thought that the Cell processor and the Xenon processor were extremely different from one another. I've always assumed that's one of the reasons why the PS3 was so much more difficult to program for compared to the 360.
Both processors are made up of multiple cores.

The 360 uses 3 general purpose cores, while the PS3 uses one general purpose core and 6 (IIRC) over-glorified floating point cores. The general purpose cores are based on the same design.

This is, btw, very old news.

Blank__ said:
"More difficult to program for" is because of shoddy APIs, not the hardware.
This is a bit of misinformation that is being spread by people who don't know much about low-level programming.

With the 360, you can run any code on any of the cores, with each core being able to directly access memory. The Cell only has one general purpose core, and if you want to get any use out of the PPEs, you have to write code specifically for them (they do not have the same instruction set as the general purpose core), *and* you have to deal with the fact that they can't directly access memory, and have to go through the general purpose core. You basically have to write code on a reduced instruction set that can be compartmentalized onto each PPE's small amount of dedicated memory. That's intrinsically more complicated and time consuming than just writing normal threaded code (which is intrinsically more complicated than writing non-threaded code).
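To make that concrete, here's roughly what a trivial SPE program looks like. This is a sketch using Cell-SDK-style names written from memory (mfc_get, mfc_put and friends from spu_mfcio.h), so treat the details as approximate rather than gospel. The point is that it's a separate program with its own main, built with the SPU compiler instead of the PPU one:

/* Minimal sketch of an SPE program (Cell-SDK-style names, approximate).
   argp is the effective address of a float array in main memory. */
#include <spu_mfcio.h>

#define TAG 1
static float local[1024] __attribute__((aligned(128)));

int main(unsigned long long speid, unsigned long long argp,
         unsigned long long envp)
{
    /* The SPU can't just dereference a main-memory pointer: it has to
       DMA the data into its 256 KB local store first... */
    mfc_get(local, argp, sizeof(local), TAG, 0, 0);
    mfc_write_tag_mask(1 << TAG);
    mfc_read_tag_status_all();        /* block until the DMA completes */

    for (int i = 0; i < 1024; i++)    /* ...crunch on it locally... */
        local[i] *= 2.0f;

    /* ...and DMA the results back out when it's done. */
    mfc_put(local, argp, sizeof(local), TAG, 0, 0);
    mfc_write_tag_mask(1 << TAG);
    mfc_read_tag_status_all();
    return 0;
}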
 

Blank__

New member
Oct 9, 2008
78
0
0
BloodSquirrel said:
Blank__ said:
"More difficult to program for" is because of shoddy APIs, not the hardware.
This is a bit of misinformation that is being spread by people who don't know much about low-level programming.
Right, yeah, that's me; the computer science student who has studied how the x86 works, written MACHINE CODE for a course, and writes things in MASM32. Since you went ahead and dropped an ad hominem fallacy, let me join you at your level and call you a dick. You use intelligent words to mask the fact that you are the one spreading misinformation. Nice, but ineffective when you write such things to a person who sees through them.

The Cell only has one general purpose core, and if you want to get any use out of the PPEs, you have to write code specifically for them (they do not have the same instruction set as the general purpose core)
Everything I've read on this subject states that the PowerPC heart, the PPE (the 8 other cores are the SPEs, by the way), controls everything. Given that it's a RISC processor and the SPUs are RISC processors, why would they have separate instruction sets? Why would you have to write things specifically for each core? Even if there were different instruction sets, are you insinuating a compiler would be unable to compile things into the two different sets?

There's an entire processor dedicated to keeping simple matters out of the SPEs hands so they can crunch like crazy on their simple, parallel processes. Let's see what IBM has to say, shall we?

IBM said:
Power Architecture compatibility to provide a conventional entry point for programmers, for virtualization, multi-operating-system support, and the ability to utilize IBM experience in designing and verifying symmetric multiprocessors.

Single-instruction, multiple-data (SIMD) architecture, supported by both the vector media extensions on the PPE and the instruction set of the SPEs, as one of the means to improve game/media and scientific performance at improved power efficiency.
Woah.. So IBM wanted to keep it simple for the programmers, huh?

*and* you have to deal with the fact that they can't directly access memory, and have to go through the general purpose core.
This is wrong. Each SPE is made up of a processor (the SPU) and a memory controller. The Cell even has another memory controller on the die, but each SPU goes through its own memory controller to access memory. That's not direct enough for you?

Let's see what IBM has to say:
IBM said:
Whereas most processors reduce latency to memory by employing caches, the SPU in the CBEA implements a small local memory rather than a cache. This approach requires approximately half the area per byte and significantly less power per access, as compared to a cache hierarchy. In addition, it provides a high degree of control for real-time programming. Because the latency and instruction overhead associated with direct memory access (DMA) transfers exceeds that of the latency of servicing a cache miss, this approach achieves an advantage only if the DMA transfer size is sufficiently large and is sufficiently predictable (that is, DMA can be issued before data is needed).
Woah.. So the direct memory access in each SPE's SPU lets it... directly access memory and exceed the speed of accessing a cache? Wow! I'm glad I don't know much about low-level programming, 'cause if I did, I might think you were wrong, Squirrel!

Woah, there's even more?

Some smart guys said:
Each SPE features 256KB of local memory, more specifically, not cache. The local memory doesn't work on its own. If you want to put something in it, you need to send the SPE a store instruction. Cache works automatically; it uses hard-wired algorithms to make good guesses at what it should store. The SPE's local memory is the size of a cache, but works just like a main memory. The other important thing is that the local memory is SRAM based, not DRAM based, so you get cache-like access times (6 cycles for the SPE) instead of main memory access times (e.g. 100s of cycles).

What's the big deal then? With the absence of cache, but the presence of a very low latency memory, each SPE effectively has controllable, predictable memory latencies. This means that a smart developer, or smart compiler, could schedule instructions for each SPE extremely granularly. The compiler would know exactly when data would be ready from the local memory, and thus, could schedule instructions and work around memory latencies just as well as an out-of-order microprocessor, but without the additional hardware complexity. If the SPE needs data that's stored in the main memory attached to the Cell, the latencies are just as predictable, since once again, there's no cache to worry about mucking things up.

Making the SPEs in-order cores made a lot of sense for their tasks. However, the PPE being in-order is more for space/complexity constraints than anything else. While the SPEs handle more specified tasks, the PPE's role in Cell is to handle all of the general purpose tasks that are not best executed on the array of SPEs. The problem with this approach is that in order to function as a relatively solid performing general purpose processor, it needs a cache - and we've already explained how cache can hurt in-order cores. If there's a weak element of the Cell architecture it's the PPE, but then again, Cell isn't targeted at general purpose computing, despite what some may like to spin it as.
Hmmm.. That doesn't seem to support your claims.
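For what it's worth, the way that predictability gets exploited in practice is double-buffering: kick off the DMA for the next chunk while you crunch on the current one. A rough sketch (again, Cell-SDK-style names written from memory, so approximate; crunch is a hypothetical kernel):

/* Double-buffered streaming on an SPU: overlap DMA with compute.
   The "cache management" is explicit, but the latency is predictable. */
#include <spu_mfcio.h>

#define CHUNK 16384   /* 16KB: the maximum size of a single DMA transfer */
static char buf[2][CHUNK] __attribute__((aligned(128)));

extern void crunch(char *data, unsigned size);   /* hypothetical kernel */

void stream(unsigned long long ea, unsigned nchunks)
{
    unsigned cur = 0;
    mfc_get(buf[cur], ea, CHUNK, cur, 0, 0);       /* prefetch chunk 0 */
    for (unsigned i = 0; i < nchunks; i++) {
        unsigned nxt = cur ^ 1;
        if (i + 1 < nchunks)                       /* start the next DMA early */
            mfc_get(buf[nxt], ea + (unsigned long long)(i + 1) * CHUNK,
                    CHUNK, nxt, 0, 0);
        mfc_write_tag_mask(1 << cur);              /* wait only on the current tag */
        mfc_read_tag_status_all();
        crunch(buf[cur], CHUNK);                   /* compute while the other DMA runs */
        cur = nxt;
    }
}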

You basically have to write code on a reduced instruction set that can be compartmentalized onto each PPE's small amount of dedicated memory. That's intrinsically more complicated and time consuming than just writing normal threaded code (which is intrinsically more complicated that writing non-threaded code).
Ridiculous notions! Each SPE has 256 kB of local memory. That's more than most processors' L1 caches, including the PPE's. Oh, noes! My program didn't compile down to 256kB, how will my processor ever be able to run it? I guess I'll have to write a multi-threaded Freecell game that uses less than 256kB of memory per core. What? That's not how things work in reality? Code can be larger than a processor's cache? Oh, nuts! =(

Your whole post is absurd. If programmers were expected to write their own compilers for their applications, your points would be somewhat valid. However, compilers were already designed with this impressive chip in mind; the compilers do all the work. Pretending like EA has a bunch of guys cranking out assembly to bring Madden '10 to life is a bit of misinformation that is being spread by people who don't know much about software development.

UGGGGH. You dragged me into an internet argument. Someone quote that stupid XKCD comic on this matter.

Sources:
http://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/76CA6C7304210F3987257060006F2C44/$file/SPU_ISA_v1.2_27Jan2007_pub.pdf
http://researchweb.watson.ibm.com/journal/rd/494/kahle.pdf
http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=2379&p=3
 

John Funk

U.N. Owen Was Him?
Dec 20, 2005
20,364
0
0
Indigo_Dingo said:
Jumplion said:
Sorry guys, you're a little late to the punch [http://www.escapistmagazine.com/forums/read/9.82108?page=1] :p
But remember, Mods are allowed to make threads without bothering to see if it has already been done several times...for reasons that probably make sense if anyone were to ever try and fathom them....
Threads =/= newsposts.
 

Riceman

New member
May 14, 2008
8
0
0
Blank__ said:
(snipped all the rant, you know, saving space :P)
I applaud you, good sir, for making a logical rant AND citing sources so I can find out for myself.

OK, as for the processors being made by the same people... well, the contractor has a right to make a profit, "don't cha know." They may be a similar design, but they are not the same. I don't really see a problem with it; it would be like comparing two computers with the same CPUs but different hardware. You don't really hear people complaining about the processors being the same, do you... unless it's a Mac!
 

John Funk

U.N. Owen Was Him?
Dec 20, 2005
20,364
0
0
Indigo_Dingo said:
CantFaketheFunk said:
Indigo_Dingo said:
Jumplion said:
Sorry guys, you're a little late to the punch [http://www.escapistmagazine.com/forums/read/9.82108?page=1] :p
But remember, Mods are allowed to make threads without bothering to see if it has already been done several times...for reasons that probably make sense if anyone were to ever try and fathom them....
Threads =/= newsposts.
Bit of po-tay-to, po-tah-to there. What would probably be better is if the thread that did it first was edited slightly and put on the front page. Seems simpler to me. Now of course we just have to have the exact same arguments over again.
Forum threads are separate from newsposts. News posters are not moderators. There are certain guidelines we follow for writing up news and presenting it that forum thread starters do not have to.
 

Art Axiv

Cultural Code-Switcher
Dec 25, 2008
662
0
0
Woot for OT!

Um, the question I pose: why should a regular user care?
 

BloodSquirrel

New member
Jun 23, 2008
1,263
0
0
TL;DR version:

Blank__ claims to be an expert, misunderstands the concept of RISC, confuses caches and dedicated memory, and posts a lot of quotes that he doesn't understand from various places.


Blank__ said:
Right, yeah, that's me; the computer science student who has studied how the x86 works, written MACHINE CODE for a course, and writes things in MASM32. Since you went ahead and dropped an ad hominem fallacy, let me join you at your level and call you a dick. You use intelligent words to mask the fact that you are the one spreading misinformation. Nice, but ineffective when you write such things to a person who sees through them.
Well guess what? I've designed 2 microprocessors, written 3 operating systems, and have designed power armor from a box of scraps. Sure is fun to dick wave and name call on the internet as a replacement for actually demonstrating expertise!

Everything I've read on this subject states that the PowerPC heart, the PPE (the 8 other cores are the SPEs, by the way), controls everything. Given that's it a RISC processor and the SPUs consist of RISC processors, why would they have separate instruction sets?
Awww, that's cute. You heard the phrase "Reduced Instruction Set Computer" somewhere, and thought that it had anything to do with this.

RISC vs. CISC is a different distinction from general purpose vs. specialized. RISC vs. CISC is about how many instructions you need to perform an operation, while general purpose vs. specialized is about what operations the processor can do versus how fast it can do them. RISC might mean that you need 2 instructions instead of one to read a value from memory and add to it, while specialized might mean that you can't do bitwise operations on the processor.
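A toy illustration of that point, with the same C expression on each style of machine (the instruction sequences in the comments are illustrative of the two styles, not exact compiler output):

int load_add(int *p, int x)
{
    return *p + x;
    /* x86 (CISC): one instruction can read memory and add:
           mov  eax, esi
           add  eax, [rdi]
       PowerPC (RISC): loads and ALU ops are separate instructions:
           lwz  r5, 0(r3)
           add  r3, r5, r4
       Note that neither style says anything about *which* operations
       exist -- that's the general-purpose vs. specialized axis. */
}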

And it's actually 7. One of the cores is disabled to improve chip yields.

Woah.. So IBM wanted to keep it simple for the programmers, huh?
Pro tip: posting marketing fluff isn't going to make you look like you know what you're talking about.

Woah.. So the direct memory access in each SPE's SPU let's it... directly access memory and exceed the speed of accessing a cache? Wow! I'm glad I don't know much about low-level programming, 'cause if I did, I might think you were wrong, Squirrel!
Hate to break this to you, but that "local memory" they're talking about is separate from the system's main memory, which was rather the entire point, and which went rather over your head. Are you even reading the quotes that you're posting?


What's the big deal then? With the absence of cache, but the presence of a very low latency memory, each SPE effectively has controllable, predictable memory latencies. This means that a smart developer, or smart compiler, could schedule instructions for each SPE extremely granularly. The compiler would know exactly when data would be ready from the local memory, and thus, could schedule instructions and work around memory latencies just as well as an out-of-order microprocessor, but without the additional hardware complexity. If the SPE needs data that's stored in the main memory attached to the Cell, the latencies are just as predictable, since once again, there's no cache to worry about mucking things up.
Actually, no, there isn't a compiler in the world that can predict what code is going to do to that degree, unless you are specifically writing it to be predictable, which is -get this- more difficult. If compilers could, we wouldn't have out-of-order microprocessors with their additional hardware complexity. There are specific types of tasks where you can do this, but reducing your game code to those tasks is no trivial, or in some cases even realistic, feat.

You're going to have to come up with a better source than "some smart guy" on that.

Ridiculous notions! Each SPE has 256 kB of local memory. That's more than most processors' L1 caches, including the PPE's. Oh, noes! My program didn't compile down to 256kB, how will my processor ever be able to run it? I guess I'll have to write a multi-threaded Freecell game that uses less than 256kB of memory per core. What? That's not how things work in reality? Code can be larger than a processor's cache? Oh, nuts! =(
I didn't realize that most processors have a non-software-transparent cache. Oh, wait, they don't. Even if you're writing in assembly, you're not writing code to load and unload the cache. With an SPE and its local memory, you are. You've quite impressively managed to fail to come by any type of relevant point at all.
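Side by side, the difference looks something like this (the SPU names are Cell-SDK-style from memory, so take them as approximate):

#include <spu_mfcio.h>

/* Any cached CPU: the cache is invisible to the code.
   The hardware pulls lines in as you touch them. */
void scale_cached(float *a, int n)
{
    for (int i = 0; i < n; i++)
        a[i] *= 2.0f;
}

/* SPU: nothing is in local store until you explicitly DMA it there,
   and nothing leaves until you explicitly DMA it back. */
static float ls[2048] __attribute__((aligned(128)));

void scale_spu(unsigned long long ea, int n)  /* assumes n <= 2048, n a multiple of 4 */
{
    mfc_get(ls, ea, n * sizeof(float), 0, 0, 0);    /* load it yourself */
    mfc_write_tag_mask(1 << 0);
    mfc_read_tag_status_all();                      /* wait for the DMA */
    for (int i = 0; i < n; i++)
        ls[i] *= 2.0f;
    mfc_put(ls, ea, n * sizeof(float), 0, 0, 0);    /* write it back yourself */
    mfc_write_tag_mask(1 << 0);
    mfc_read_tag_status_all();
}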

Your whole post is absurd. If programmers were expected to write their own compilers for their applications, your points would be somewhat valid. However, compilers were already designed with this impressive chip in mind; the compilers do all the work. Pretending like EA has a bunch of guys cranking out assembly to bring Madden '10 to life is a bit of misinformation that is being spread by people who don't know much about software development.
There's a reason why developers can get so much out of console hardware, despite it being technically inferior to PC hardware, and it isn't because they're writing highly abstracted code and hoping that the compiler will sort it all out. I'm sorry, but you can't project your experience writing "Hello world" apps in Java onto writing video games which are expected to be cutting edge on limited hardware.