Profiling Tips

performance
optimization
profiling
fps

#1

On Facebook, a friend was asking about learning to code to get better at profiling his indie projects and identifying when things are CPU vs. GPU bound. I wrote up a bit about my experiences profiling effects and thought it might be apropos for discussion here too. Do any of you have other opinions or tips on profiling troubled scenes? Discuss below!

Here’s what I wrote:

“More so than learning to code, I’d recommend he look toward becoming an expert at profiling tools first. Things like PIX for Xbox and Razor for PS4. There’s similar tooling for PC dev; I’ve got a message out to my former CTO about the name of it.

As an effects artist, I found those profilers invaluable for identifying where a scene went wrong. Often, when effects are involved, it’s a GPU problem where a particle system is filling the screen with too many large particles. Sometimes it’s a physics or FX-light overage. More rarely, it’s too many instances spawning at once, or overuse of a very expensive technique like our custom time-of-flight physics particles. I could capture a lagging frame with PIX, see nanosecond timings for every single draw call, and step through them, watching as each one draws. That gives you a targeted way to identify the worst offender: look for the layer with the largest timing, like over 1,000,000 ns (1 ms) of our roughly 3 ms FX GPU budget. When you spot a layer that is over budget, stepping through lets you visually identify which effect it’s coming from. There’s also a material name associated with that line item, which narrows the problem to which layer within that effect is the offender.

This is about as deep as I’d go with PIX and Razor, but they’re just as useful for finding non-FX issues.

Though, toward the end of my time at IW, our CTO had us (with the FX programmer) moving as much of this debug data as possible directly into in-game overlays. For example, we added the ability to see the CPU load of each effect, the physics cost, etc., in addition to the list of live effects with their instance counts and particle counts.

Another tip, which I’m not sure translates to other engines and is conceptually a bit confusing, was our quickie “is it GPU or CPU bound” trick. Here’s how it worked: when a scene has a dipping framerate, we would send a command to the engine to disable framerate capping (the COD engine locks the game to 60 fps or less). If the FPS then jumps to above 60, you were CPU bound; if it doesn’t move, you were GPU bound. Why? It’s been months since I left, so the details are getting blurry (and I may have the result flipped), but the gist is that the displayed framerate reflects the longer of the CPU and GPU frame times. If you are GPU bound, the GPU can’t draw the scene in the 16.6 ms that 60 fps requires; whether or not you cap the framerate has no bearing on that, so the FPS doesn’t change. If uncapping makes the framerate rise, then it wasn’t the GPU holding you at the lower number, so the limit was CPU-side.
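A toy model of the uncap trick, under my assumption that the 60 fps cap behaves like vsync (presented frame times round up to a multiple of the 16.6 ms refresh interval). Nothing here is real engine code; it just makes the “does the number move?” logic concrete:

```python
import math

def fps_with_vsync(frame_ms, refresh_hz=60):
    # Under vsync, the presented frame time rounds UP to the next
    # refresh interval: a 17 ms frame displays at 30 fps, not ~58.
    interval = 1000.0 / refresh_hz
    return 1000.0 / (interval * math.ceil(frame_ms / interval))

def fps_uncapped(frame_ms):
    return 1000.0 / frame_ms

# GPU over budget (20 ms of work): uncapping moves the number a bit,
# but it stays pinned below 60 either way.
print(round(fps_with_vsync(20.0)), round(fps_uncapped(20.0)))  # 30 50

# Whole frame fits in 16.6 ms: uncapped FPS jumps well above 60, so
# the GPU wasn't what was holding the framerate down.
print(round(fps_with_vsync(12.0)), round(fps_uncapped(12.0)))  # 60 83
```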

Anyway, I’d recommend your friend first look into all the debug overlays that are already available in Unity or Unreal or whatever he’s using. Then have him dig into PIX (or PIX for Windows), Razor, etc. I learned all of these through people at work who knew them, and I documented it on our internal wiki. I wonder if anyone has made intro YouTube videos?"


#2

Another invaluable debug tool we added late in the project was a set of filtering methods. Here’s how it worked:

Play through the game until you hit a point where the framerate is fubared. We’ll say it’s dipping from a solid 60 fps down to 40 fps. There’s a ton of effects going off at that moment: explosions everywhere, missile trails, atmospherics, exploding vehicles, etc.

-Pause the game at that fubar moment (assuming you already have FPS counters running, you see that it’s 40 here).

-Hit a dev hotkey to bring up the FX list, showing all the effects (or the most expensive 20 or so that will fit on a page of the debug) with their instance counts, particle counts, CPU load, etc.

-In our effects tool, in the filter field… type “weapon”. You see the list of effects narrow to only the ones that have weapon in their name or path.

-Hit another dev hotkey (or button on the fx tool ui) to “Use list” meaning that only the filtered effects in the list are drawn in the scene. Now we are seeing only the muzzle flashes, tracers and impacts with all the atmospherics, explosions, vehicle deaths, etc… suppressed from drawing in the scene.

-You look at the framerate, and see that the framerate now shows 60. This tells you that the load causing the drop to 40 is NOT coming from the weapon effects.

-So, you continue typing filter terms based on the major sub-folders of your effects. Type “atmospherics” to see only the load of the atmospherics: it’s 60, not the offender. Type “explosions”: it goes to 58. We know the explosions are having an impact, but they’re not the main culprit! Next, type “vehicles” and the framerate goes to 42. There’s your problem! One of the vehicle deaths is blowing the framerate.

-Next, continue narrowing. Type “vehicles/land” into the filter to subdivide the problem between the flying and land vehicles. The framerate is still 42, so we know it IS a land vehicle. At this point you have probably narrowed the live effects list from maybe over 100 down to under 8, and can likely spot the problem effect just by looking at the list.

-But, to stay empirical, I’d continue narrowing until I found the one effect (or two or three, which is almost always the case; it’s a few bad apples) causing the dip, with all other possibilities eliminated.

-At this point the filtering has done its job, but I’d often continue the same approach within the effect itself to discover which of its internal layers is the offender. We had no internal filtering like this, so I’d just manually suppress half the layers and keep A/B narrowing, or step through turning on one layer at a time, to learn the load of every single layer and understand where cleanup would be useful vs. have no bearing.

Fix the offender, framerate’s back at 60… high-fives!
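The narrowing loop above can be sketched as a tiny model. The effect paths, per-effect costs, and the 15.6 ms baseline are all invented for the demo; in the real workflow the “measurement” is just reading the FPS counter after applying each filter:

```python
# Draw only the effects whose path contains the filter term, then
# report the resulting framerate (hypothetical numbers throughout).
def fps_with_filter(live_effects, term, baseline_ms=15.6):
    drawn_ms = sum(fx["cost_ms"] for fx in live_effects if term in fx["path"])
    return 1000.0 / (baseline_ms + drawn_ms)

live = [
    {"path": "effects/weapon/muzzle_flash",      "cost_ms": 1.0},
    {"path": "effects/explosions/barrel",        "cost_ms": 2.0},
    {"path": "effects/vehicles/land/tank_death", "cost_ms": 12.0},
]

# Each filter isolates one sub-folder's load; the term that tanks the
# framerate points at the offending branch of the effects library.
for term in ("weapon", "explosions", "vehicles", "vehicles/land"):
    print(term, round(fps_with_filter(live, term)))
```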


#3

Also… here were the most common things that would be wrong and how to fix them:

  1. Too much fill. Large particles filling the whole screen when the player is close to the effect. Seven-ish is fine, but sometimes 50, 60, or 100 full-screen particles would be discovered. Lower the particle count to fix!

  2. FX lights: too many, too big. Make them smaller, and lower the count to the bare minimum.

  3. Physics. Too many particles with physics on. Lower the count, or even better, split the layer into two and disable physics on one. Let the layer with physics have 1/6th of the count and the layer without have 5/6ths. A few things bouncing usually gives the player the impression that everything is; you don’t need them all bouncing.

  4. Too many of an expensive type being used simultaneously. We had a GPU particle type we called “spark fountain” that was used for lots of gushing, bouncing sparks. They were crazy expensive: they calculated the parabolas for the motion of every particle’s whole life starting on the spawn frame, distributed over several frames to manage load. Fire 20 of these on the same frame and your framerate is going to take a nasty spike. Work with design to ensure that the things calling effects with spark fountains in them aren’t things that will cluster; otherwise, remove them from the effects.

  5. Too many trails. We would eat the budgets of our “geotrails,” the trails used on RPG missiles and flying vehicles, all the time. These had to be very individually load balanced. How many fighter jets are in the scene? Generally no more than two, so make their trails dense. Then a designer makes an invasion scene with 50 live jets! Make a custom “cheap” version that dies sooner, with far fewer segments.

  6. The designer didn’t think the effect was beefy enough, so he’s playing 5 instances of it at once. /facepalm. Here’s a custom, beefier version; only play one, please.
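The physics split in tip 3 can be sketched like this. The layer fields are illustrative, not a real particle-system format:

```python
# Split one layer into a small physics-enabled layer and a large
# physics-free layer (1/6th vs 5/6ths of the original count).
def split_physics_layer(layer):
    heavy_count = layer["count"] // 6
    heavy = dict(layer, count=heavy_count, physics=True)
    cheap = dict(layer, count=layer["count"] - heavy_count, physics=False)
    return heavy, cheap

debris = {"name": "concrete_chunks", "count": 120, "physics": True}
heavy, cheap = split_physics_layer(debris)
print(heavy["count"], cheap["count"])  # 20 100
```

A few bouncing chunks sell the whole effect; the other hundred can fly through the floor and nobody notices.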

What other issues do you typically find are the offenders?


#4
  1. Shader complexity is another big one. We didn’t run into this much at IW because we had no shader editor which (intentionally) kept us out of trouble. Solution: simplify your shader, or ensure that complex shaders only live on things with very low counts and/or size.

#5

Sorry this is becoming ranty now, but it’s something I’ve been passionate about.

Although I’ve gotten very good at isolating and fixing problems once they’re known, the problem I haven’t wrapped my head around yet, and haven’t seen good tools for, that I’d love to see tackled, is: how do you know problems exist at all, short of getting “framerate’s fubared” bugs, or the head of your code team coming to your desk and asking you to look at a scene he discovered was problematic?

Worst case problems that are literally tanking framerate are so bad that there’s a forcing function that gets them identified and fixed.

But there’s a whole layer of “not good” in our effects libraries: effects that are quietly wasting framerate and in aggregate costing you milliseconds, but since they don’t rise to the level of “screwing the whole game,” they never get identified, and they cost you load everywhere.

What are some ideas for finding where the waste is? For discovering where we should be improving things? Doing optimization passes at the end of a project is something I’ve done before, but I’d love to see a more systemic method for discovery.

Maybe… heatmaps (with facing vector) coming from analytics teams showing framerate dips, tagging lists of the effects that were in the frustum at those moments. Then get reports on which effects show up in those lists too frequently.

Any other ideas?


#6

So I think the software I’ve heard people use the most for PC profiling is RenderDoc.

Other than that I just wanted to drop a couple of links that I always refer back to when I’m starting to do an optimisation pass:

Both are Unreal, obviously, but all the core concepts remain the same across engines.


#7

Fantastic thread! I love stuff like this

If anyone has any introductory resources on how to use PIX/Razor/RenderDoc for debugging, I’d love to see them. I’ve poked around in all of them, but the interfaces are kind of daunting and I’m not sure what I should be looking for.

One idea is to perform SQL-like queries on all files within your vfx asset database. Assuming your file format supports it, having a tool that allows you to make queries such as “Find any effect using noise that has over 500 particles” or “Find any effect that lives longer than 10 seconds with a cull radius larger than 100” can quickly give you an idea of outliers. I find this approach to be a useful way of doing broad-strokes optimization where you’re just making sure that your effects are following performance guidelines. Then, of course, you’ll still have to spend a lot of time in-game trying to find that one effect that’s messing everything up…
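A sketch of that query idea, assuming effects are stored as JSON files with fields like "lifetime" and "cull_radius" (invented names; adapt to whatever your effect format actually stores):

```python
import json, pathlib, tempfile

def query(root, predicate):
    # A poor man's SQL WHERE clause over a folder of effect files.
    return [p.name for p in sorted(pathlib.Path(root).rglob("*.json"))
            if predicate(json.loads(p.read_text()))]

# Demo library with two effects in a temp folder.
root = tempfile.mkdtemp()
pathlib.Path(root, "smoke.json").write_text(
    json.dumps({"lifetime": 12, "cull_radius": 150}))
pathlib.Path(root, "spark.json").write_text(
    json.dumps({"lifetime": 2, "cull_radius": 40}))

# "Find any effect that lives longer than 10 s with a cull radius over 100"
print(query(root, lambda fx: fx["lifetime"] > 10 and fx["cull_radius"] > 100))
```

Because the predicate is just a function, the same scanner covers "over 500 particles with noise on" or any other guideline you want to sweep for.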


#8

Bookmarked, thanks a bunch for all this info DJ!


#9

Oskar Swierad also made a couple profiling vids that I thought were great


#10

“One idea is to perform SQL-like queries on all files within your vfx asset database.”

We started to do some stuff like this on Infinite Warfare. The magnificent Andy Lomerson of Vicarious Visions and a tech artist whose name is escaping me right now developed tools to do regex-like, expression-based global edits across all of the effects files in our library, or subsets of it. We used them to tune some optimization work that came in late in the project, like a “simulation pause when out of frustum” flag that was not appropriate for very large effects like skybox elements. We could do stuff like: if the folder is a sub-folder of effects/global/vehicles/space and the effect size < 2000, set param “pauseSim” = true; if the folder is not under effects/global/vehicles/space, then if size < 500, set “pauseSim” = true; else set “pauseSim” = false. Something like that. It isn’t exactly a method for identifying bad things, but it was a great approach for fixing globally wrong things and making changes across massive numbers of files as tech advanced toward ship.
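That rule could be sketched like this, treating each effect as a JSON file with a "size" field. The thresholds and the pauseSim param come straight from the post above; the file format and everything else is invented, not the real IW tool:

```python
import json, pathlib, tempfile

SPACE = "effects/global/vehicles/space"

def should_pause_sim(path, effect):
    # Huge skybox-scale effects under the space folder keep simulating;
    # small effects everywhere pause when out of frustum.
    if SPACE in path.as_posix():
        return effect.get("size", 0) < 2000
    return effect.get("size", 0) < 500

def apply_pause_sim_rule(root):
    for p in pathlib.Path(root).rglob("*.json"):
        effect = json.loads(p.read_text())
        effect["pauseSim"] = should_pause_sim(p, effect)
        p.write_text(json.dumps(effect))

# Demo: one giant space effect, one small ground effect.
root = pathlib.Path(tempfile.mkdtemp())
(root / SPACE).mkdir(parents=True)
(root / SPACE / "skybox_debris.json").write_text(json.dumps({"size": 3000}))
(root / "small_impact.json").write_text(json.dumps({"size": 100}))
apply_pause_sim_rule(root)
for p in sorted(root.rglob("*.json")):
    print(p.name, json.loads(p.read_text())["pauseSim"])
```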

I’d love to move our effects tools toward letting these kinds of relationships be authored directly into effects as systemic, interrelated databases rather than thousands of unrelated files. For example, I’d love to set up a muzzle flash “lighting compensation” flag as a shared param living across lots of files that can be changed in one place, rather than just a number that’s used over and over again across a pile of files.

I once started giving common params that I thought might need to be globally edited strange one-off decimal values that I tracked in a spreadsheet, like “HDR Overdrive = 1.200765,” where the 765 was essentially a key that allowed me to use Visual Studio (or UltraEdit) to “find and replace in files” that value across hundreds of files if need be. That’s a very hacky approach to something I’d love to see built into our authoring environments. I’m wondering if Niagara’s Custom Parameters can pull this off?!? If I can figure it out, expect a video!!
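The keyed-decimal hack boils down to a sweep like this; the filenames, param name, and the replacement value are invented for the demo:

```python
import pathlib, tempfile

def replace_keyed_value(root, old, new, pattern="*.fx"):
    # The "key" digits (e.g. ...765) make the old value unique enough
    # that a plain text replace can't hit unrelated numbers.
    changed = 0
    for p in pathlib.Path(root).rglob(pattern):
        text = p.read_text()
        if old in text:
            p.write_text(text.replace(old, new))
            changed += 1
    return changed

root = tempfile.mkdtemp()
pathlib.Path(root, "muzzle_a.fx").write_text("HDR_Overdrive = 1.200765\n")
pathlib.Path(root, "muzzle_b.fx").write_text("HDR_Overdrive = 1.200765\n")
print(replace_keyed_value(root, "1.200765", "1.350765"))  # 2 (both files touched)
```

Note the new value keeps the same 765 key, so the param stays findable for the next global edit.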