The problem is almost always a limitation in the software's design, not a limitation in the hardware's capability. However, devising clever software that can efficiently do the calculation is a very difficult thing to do, and most developers are on such a deadline that they just can't afford to do it. But, regardless, allow me to give an example that demonstrates what I mean.
I once had to write some code that ran a simulation on 1 million point particles for 1 million iterations each. That's a total of 1 trillion iterations! Fortunately, certain conditions could occur that meant a particle never needed to be simulated again after a certain point. Now, the naive solution to this problem is to set up a double for-loop that iterates over each step of the simulation and then over each particle, with an if-statement inside the inner loop to check whether the particle still needs any calculation done on it. The strange thing is, this solution still takes an hour to run, because the computer is still doing 1 trillion iterations, even if some of those iterations don't do anything. This was indeed how I did the simulation, at first.
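In spirit, the naive version looked something like the sketch below (the Particle fields, advance(), and the drop-out condition are hypothetical stand-ins for the real simulation):

```cpp
#include <vector>

// Hypothetical particle: real code had more state than this.
struct Particle {
    double x, v;
    bool   active;   // false once the particle has dropped out for good
};

void advance(Particle& p) { p.x += p.v; /* real physics goes here */ }

void simulate_naive(std::vector<Particle>& particles, long steps) {
    for (long step = 0; step < steps; ++step) {   // 1 million steps
        for (Particle& p : particles) {           // 1 million particles
            if (!p.active) continue;  // skipped, but still an iteration
            advance(p);
            // ...some condition may set p.active = false permanently...
        }
    }
}
```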
Then a clever thing dawned on me. Since once a particle was taken out of the simulation it stayed out, why bother iterating over it at all? The easiest way to remove a particle from the iteration entirely, with no check needed, was to store my particles in a linked list. Then, instead of a for-loop, I used a while loop that iterated until it reached the end of the list. When a particle was forced out of the calculation, it was simply unlinked from the list. This meant that as the calculation proceeded, fewer and fewer iterations actually had to be done to achieve the equivalent of the full 1 trillion naive iterations. Better still, the linked list was allocated as a single contiguous array, so the particle data stayed packed together in memory, reducing cache thrashing (which can bring even the fastest processor to its knees).
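Here's a minimal sketch of that idea, assuming the same hypothetical particle as before: each particle stores the index of the next live particle, so dead particles are unlinked once and never visited again, while the data itself stays in one contiguous allocation:

```cpp
#include <vector>

struct Particle {
    double x, v;
    int    next;     // index of the next live particle, -1 at the end
};

void advance(Particle& p) { p.x += p.v; /* real physics goes here */ }
bool finished(const Particle& p) { return p.x > 1000.0; }  // stand-in drop-out condition

void simulate_linked(std::vector<Particle>& ps, long steps) {
    // Build the initial chain: 0 -> 1 -> 2 -> ... -> -1, all in one array.
    int head = ps.empty() ? -1 : 0;
    for (size_t i = 0; i < ps.size(); ++i)
        ps[i].next = (i + 1 < ps.size()) ? int(i + 1) : -1;

    for (long step = 0; step < steps && head != -1; ++step) {
        int prev = -1;
        for (int i = head; i != -1; i = ps[i].next) {
            advance(ps[i]);
            if (finished(ps[i])) {
                // Unlink: this particle is never iterated over again.
                if (prev == -1) head = ps[i].next;
                else            ps[prev].next = ps[i].next;
            } else {
                prev = i;
            }
        }
    }
}
```

Because the links are just indices into one contiguous vector, removing a particle costs a single pointer update and the remaining data stays cache-friendly.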
The combination of a linked list whose links are updated as particles are removed from the simulation, and particle data kept contiguous in memory, took the code from doing 1 trillion iterations in 1 hour to doing the equivalent calculation in only 3 minutes!! From 60 minutes of run-time to 3 minutes! A factor-of-20 improvement, simply from recognizing a better, more efficient way to do the same calculation (the linked list) while also playing to the strengths of the hardware (contiguous data to prevent cache thrashing).
As an example of the effect of cache thrashing, I had an earlier program that did some calculations on a grid of numbers. Following the equations for the calculation, I was initially iterating by column and then by row, that is, column iteration on the outer loop and row iteration on the inner loop. This was exactly the wrong order for how the grid was laid out in memory: every access jumped a whole row's width ahead, so the processor kept guessing wrong about which line of data it would need next. Consequently, it was constantly forced to evict and reload cache lines from main memory instead of having the data ready in the processor cache. This is cache thrashing. My code took 35 minutes to run on the entire grid.
Then, I figured out how to switch the iteration order without invalidating the calculation, iterating by row and then by column. Doing this meant the accesses walked through memory sequentially, so the processor could correctly predict the next line of data and have it preloaded while it was still working on the current one. Doing it this way, my code only took 5 minutes to run! A factor-of-7 improvement!
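The difference is just loop order. A rough sketch of the two orders over a row-major grid (the grid size and the summing operation here are illustrative, not my original calculation):

```cpp
// grid[r][c] is row-major: consecutive c values are adjacent in memory.
constexpr int ROWS = 2048, COLS = 2048;
static double grid[ROWS][COLS];

double sum_column_major() {             // the slow order from my first attempt
    double total = 0.0;
    for (int c = 0; c < COLS; ++c)      // outer loop over columns
        for (int r = 0; r < ROWS; ++r)  // each access jumps a whole row ahead
            total += grid[r][c];        // touches a different cache line almost every time
    return total;
}

double sum_row_major() {                // the fast order
    double total = 0.0;
    for (int r = 0; r < ROWS; ++r)      // outer loop over rows
        for (int c = 0; c < COLS; ++c)  // walks memory sequentially
            total += grid[r][c];        // prefetcher keeps the next line ready
    return total;
}
```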
Both these incidents happened a decade or more ago.
The point here is that the hardware is already hella, stupidly fast, and has been for a long time now. However, our approach to developing efficient code, which sometimes requires some clever tricks, is what significantly lags behind.