PPRuNe Forums - View Single Post - Transferring CPU load to the GPU!

29th April 2023 | 06:57

MechEngr

Joined: Oct 2019

: Non-Aircrew

Posts: 1,701

Likes: 1,084

From: USA

GPUs are generally massively parallel processors - you put in some data that needs to be acted upon in, well, parallel, and a serial stream of results comes out.

For example - you can load them with the vertices of triangles and the related colors and texture map pointers. The individual processors in the GPU then apply a view transformation matrix (rotation, scaling, perspective) to every vertex and fill in the resulting triangles pixel by pixel. The bulk of the animation is making small changes to the data or to the transformation matrix, so the amount of data flowing in per frame is small.
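To make the "same matrix, lots of independent data" idea concrete, here is a rough sketch in plain Python. The function names (`rotation_z`, `transform_vertex`) and the sample triangle are made up for illustration, and a real GPU would run this in shader hardware rather than a loop. The point is only that each vertex is transformed without looking at any other vertex, which is what lets the work be spread across many processors.

```python
import math

def rotation_z(angle_rad):
    """Build a 4x4 matrix that rotates about the z axis (illustrative only)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [
        [c,   -s,  0.0, 0.0],
        [s,    c,  0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def transform_vertex(matrix, vertex):
    """Apply a 4x4 matrix to one (x, y, z) vertex using homogeneous coordinates."""
    x, y, z = vertex
    v = (x, y, z, 1.0)
    out = [sum(matrix[r][c] * v[c] for c in range(4)) for r in range(4)]
    w = out[3]
    return (out[0] / w, out[1] / w, out[2] / w)

# One triangle; a real scene would hold many thousands of these.
triangle = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

# Per-frame animation often changes only this small matrix, not the vertex data.
view = rotation_z(math.pi / 2)

# Each vertex is independent of the others, so this loop could run in any
# order, or all at once on separate processors.
transformed = [transform_vertex(view, v) for v in triangle]
```

Rotating the view a little more on the next frame means sending a new 16-number matrix, not a new copy of the triangle data.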

In 2D animations it is possible that most of the data is changing. Encoded into the data is what color each pixel is. One might have 2D scaling, a pretty simple process, but the bulk of the processing is decompressing the incoming image data, an operation that isn't easy to parallelize. Typical compression is that each pixel is a delta from the previous one - it's tough to start in the middle of the chain of deltas because each value depends on all the ones before it, and it saves no time to break the task up when the pieces still have to be finished in order. Imagine breaking up the task of knitting a scarf into individual stitches or small pieces and then putting all of them together.
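A toy version of the delta idea, again in plain Python, shows why you can't just jump into the middle. This is a deliberately simplified scheme with made-up names (`delta_encode`, `delta_decode`) and made-up pixel values, not how any particular video codec actually works.

```python
def delta_encode(pixels):
    """Store each pixel as the difference from the previous one (first is from 0)."""
    deltas = []
    previous = 0
    for value in pixels:
        deltas.append(value - previous)
        previous = value
    return deltas

def delta_decode(deltas):
    """Rebuild pixels by keeping a running total; each step needs the one before it."""
    pixels = []
    previous = 0
    for d in deltas:
        previous += d
        pixels.append(previous)
    return pixels

row = [10, 12, 12, 15, 14]
encoded = delta_encode(row)          # small numbers after the first entry
decoded = delta_decode(encoded)      # walking the whole chain recovers the row

# Starting partway through the chain loses the running total, so the
# result no longer matches the original pixels.
decoded_from_middle = delta_decode(encoded[2:])
```

Here `decoded` matches `row`, but `decoded_from_middle` does not match `row[2:]`, because the decoder never saw the deltas that established the starting value.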

However, if you ever see a TV using HDMI suddenly break into big blocks of random junk, you see that the input stream can have restart points. Those points add overhead, making the stream larger than it would be without them. You may also notice that in highly dynamic scenes the image gets blockier and fine details disappear - data is discarded to keep the deltas smaller and therefore reduce the amount of data required in the stream. It may be tempting to think of these as blocks that can be independently worked on, but the data arrives in a serial manner and has to be decoded as fast as it comes in.
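Restart points can be sketched with the same kind of toy delta scheme: every so often the stream carries a full value instead of a delta. The names (`encode_with_restarts`, `decode_chunk`), the interval of 4, and the pixel values are all assumptions for illustration. The sketch shows the two effects described above: the extra full values cost space, and damage in one chunk stops at the next restart point.

```python
def encode_with_restarts(pixels, interval):
    """Split into chunks; each chunk starts with a full value, then deltas."""
    chunks = []
    for start in range(0, len(pixels), interval):
        block = pixels[start:start + interval]
        chunk = [block[0]] + [b - a for a, b in zip(block, block[1:])]
        chunks.append(chunk)
    return chunks

def decode_chunk(chunk):
    """Decode one chunk using only its own restart value and deltas."""
    out = [chunk[0]]
    for d in chunk[1:]:
        out.append(out[-1] + d)
    return out

row = [10, 12, 12, 15, 14, 14, 20, 21]
chunks = encode_with_restarts(row, interval=4)

# Simulate a transmission error in the first chunk only.
damaged = [list(c) for c in chunks]
damaged[0][1] = 99

first_bad = decode_chunk(damaged[0])    # junk from the error to the end of the chunk
second_ok = decode_chunk(damaged[1])    # the restart value resets the chain
```

The second chunk comes out clean despite the damage, at the price of sending a full value (14) where a pure delta stream would have sent a small difference.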

The decoding part, after the decryption stage that many video sources impose, can be handled by a chip that sells for less than $10 in quantity. Even that chip may not be necessary.

Many CPUs include dedicated support for decoding video streams, because a computer user is certain to have a CPU but may well not have a massively parallel GPU. Think cell phones, tablets, low end laptops - so I can see why the video stream would be as compressed as possible and targeted at the widest audience. See https://www.intel.com/content/www/us...and-newer.html for Intel Core Gen 7 processor capabilities. Also see https://en.wikipedia.org/wiki/Intel_Quick_Sync_Video, which goes back to 2011.
