[Tutorial] GameOfLife OpenGL/OpenCL


OpenCL is a GPGPU (general-purpose computing on graphics processing units) framework that lets us use the GPU for massively parallel programming. Problems in computer vision are very often per-pixel or per-feature, which makes them a good fit. Compared to OpenGL, which is an API for (3D) computer graphics, OpenCL defines everything in more general terms so that non-graphics applications can benefit from the power of the GPU.

In practice we often want to run some fancy or expensive computation on images and show the result in real time. The naive way is to use OpenCL to apply the algorithm and then OpenCV to show the image. What happens is that we upload the source image to the GPU, let the kernel run on it and then download the result into RAM. After that we show the image by uploading it to the GPU again as a texture and drawing it (in most cases this happens implicitly).

The Problem

So every frame we download the image and then upload it again, which consumes more time than most kernel executions. We can avoid this overhead by sharing GPU memory between OpenCL and OpenGL.
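To get a feeling for the amount of data involved, the round trip can be estimated with a few lines. This is only back-of-the-envelope arithmetic: the helper name is made up for illustration, and real transfer times depend on the bus and the driver.

```cpp
#include <cassert>
#include <cstdint>

// Bytes moved across the bus per second by the naive pipeline:
// one download (GPU -> RAM) plus one upload (RAM -> GPU) per frame.
// Hypothetical helper, for illustration only.
std::uint64_t roundTripBytesPerSecond(std::uint64_t width, std::uint64_t height,
                                      std::uint64_t bytesPerPixel,
                                      std::uint64_t framesPerSecond)
{
    assert(bytesPerPixel > 0);
    const std::uint64_t bytesPerImage = width * height * bytesPerPixel;
    return 2 * bytesPerImage * framesPerSecond;
}
```

For a single-channel 3840×2160 image at 44 frames per second this gives 729,907,200 bytes, roughly 0.73 GB per second that never has to leave the GPU when the memory is shared.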

Conway's Game of Life, or something similar, is a good example for doing that.

4K resolution (3840×2160) @ ~44 FPS with a GeForce GTX 970. On the CPU you can't get 44 FPS even at VGA resolution (640×480).
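For reference, one generation on the CPU can be computed as below. This is a plain single-threaded sketch of the rules, not the code from the forked project. The function name, the 0/1 cell encoding and the rule that everything outside the grid counts as dead are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One generation of Conway's Game of Life on the CPU.
// cells holds width * height values, row by row: 1 = alive, 0 = dead.
// Assumption: everything outside the grid counts as dead.
std::vector<std::uint8_t> lifeStepCpu(const std::vector<std::uint8_t>& cells,
                                      int width, int height)
{
    assert(width > 0 && height > 0);
    const std::size_t w = static_cast<std::size_t>(width);
    const std::size_t h = static_cast<std::size_t>(height);
    assert(cells.size() == w * h);

    std::vector<std::uint8_t> next(cells.size(), 0);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            int neighbours = 0;
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    if (dx == 0 && dy == 0) continue;  // skip the cell itself
                    const int nx = x + dx;
                    const int ny = y + dy;
                    if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
                    neighbours += cells[static_cast<std::size_t>(ny) * w + nx] != 0 ? 1 : 0;
                }
            }
            const std::size_t index = static_cast<std::size_t>(y) * w + x;
            const bool alive = cells[index] != 0;
            // Birth with exactly 3 neighbours, survival with 2 or 3.
            next[index] = (neighbours == 3 || (alive && neighbours == 2)) ? 1 : 0;
        }
    }
    return next;
}
```

Every output cell depends only on the previous generation, which is why the same rule maps so well onto one GPU work item per pixel.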

Use existing projects

I forked a GitHub project to get some existing code that used OpenCL and OpenMP.

Here is the forked GitHub project (code here!).

Ugly code, just for demo.

Context Sharing

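After we have created a window with GLUT, the next step is to get the OpenGL context. On Windows, wglGetCurrentContext() returns the current context. OpenCL needs this handle, together with the device context from wglGetCurrentDC(), when the shared context is created.

First initialize the window like this: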
glutInitWindowSize(800, 600);

Create Textures

We need to do this twice, once for each of the two textures (one holds the current generation, the other receives the next one).

We only need GL_LUMINANCE as we don't have colors. glGenTextures gives us the texture handle and glTexImage2D triggers the upload.
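The pixel data for the first texture could be prepared as in the following sketch. The helper name, the seed parameter, the roughly half-alive random fill and the 255/0 encoding are assumptions for the demo, not taken from the original project.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Builds the initial single-channel image, one byte per pixel:
// 255 = alive, 0 = dead. Hypothetical helper; the same seed always
// produces the same pattern.
std::vector<std::uint8_t> makeInitialCells(int width, int height, std::uint32_t seed)
{
    assert(width > 0 && height > 0);
    std::mt19937 rng(seed);
    std::vector<std::uint8_t> pixels(static_cast<std::size_t>(width) *
                                     static_cast<std::size_t>(height));
    for (std::uint8_t& pixel : pixels) {
        pixel = (rng() & 1u) ? 255 : 0;  // lowest random bit decides alive/dead
    }
    return pixels;
}
```

The returned buffer would then be passed as the last argument of glTexImage2D, with GL_LUMINANCE as format and GL_UNSIGNED_BYTE as type. Since each pixel is a single byte, calling glPixelStorei(GL_UNPACK_ALIGNMENT, 1) beforehand avoids skewed rows when the width is not a multiple of four. The second texture only needs storage, so a null data pointer is enough for it.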

Create OpenCL Handle

Now we need an OpenCL handle (a memory object) that refers to the OpenGL texture.

Acquire ownership of the objects

OpenCL needs to acquire the texture from OpenGL before a kernel may touch it.

Now everything is prepared and we can set up the parameters for the kernel.
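Ownership also has to be handed back before OpenGL draws the texture again. One way to make that hard to forget is a small scope guard, sketched below. The class name is made up, and the two callables are placeholders: in a real program they would wrap clEnqueueAcquireGLObjects and clEnqueueReleaseGLObjects (or the matching cl::CommandQueue methods) for the shared textures.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Scope guard for shared objects: runs `acquire` on construction and
// `release` on destruction, so the release cannot be skipped by an
// early return. Both callables are placeholders (assumptions).
class GlObjectLock {
public:
    GlObjectLock(const std::function<void()>& acquire, std::function<void()> release)
        : release_(std::move(release))
    {
        assert(acquire && release_);
        acquire();
    }

    ~GlObjectLock() { release_(); }

    // Copying would release twice, so it is disabled.
    GlObjectLock(const GlObjectLock&) = delete;
    GlObjectLock& operator=(const GlObjectLock&) = delete;

private:
    std::function<void()> release_;
};
```

The kernel launch goes inside the guarded scope. Unless an event-based extension is used, pending OpenGL work on the texture should be completed (for example with glFinish) before acquiring, and the OpenCL queue should be finished after releasing and before OpenGL renders.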

Execute the kernel

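Run the kernel and wait for the execution to finish. The global work size is the dimension of the image, so one work item handles one pixel. The local work size (cl::NullRange here, which leaves the choice to the implementation) defines work-groups whose work items can synchronize with each other (e.g. line by line), but we don't need any synchronization.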
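Conceptually, a 2D global range without a local range behaves like the loop nest below. It is a sequential stand-in written for illustration (the function name is made up); on the GPU the work items run in parallel and in no particular order. In the C++ bindings the launch itself would look something like queue.enqueueNDRangeKernel(kernel, cl::NullRange, cl::NDRange(width, height), cl::NullRange), where queue and kernel are assumed to exist.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Sequential stand-in for a width x height global range.
// `workItem` plays the role of the kernel body and receives what
// get_global_id(0) and get_global_id(1) would return on the device.
void forEachWorkItem(std::size_t width, std::size_t height,
                     const std::function<void(std::size_t, std::size_t)>& workItem)
{
    assert(workItem);
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            workItem(x, y);  // exactly one call per pixel
        }
    }
}
```

Because every work item writes only its own pixel and reads only the previous generation, no ordering between the calls is required.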

When working with textures, we need to make clear in the kernel that we access them through a texture unit (an image type plus a sampler). This has some advantages (hardware interpolation, bounds handling, etc.) which we don't use here. Note that within one kernel a texture can be either read or written, not both (in OpenCL 1.x).
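A kernel along these lines could look like the following sketch. It is not the kernel from the forked project. The kernel name, the argument names, the manual bounds check and the assumption that alive cells read back as 1.0 are all mine. The OpenCL C source is kept in a C++ string so it can be compiled at run time, for example through cl::Program.

```cpp
#include <string>

// OpenCL C source for one Game of Life generation (illustrative sketch).
// src is read-only and dst is write-only, matching the rule that one
// image cannot be read and written in the same kernel.
const std::string kLifeKernelSource = R"CLC(
// Integer pixel coordinates, no address handling, no interpolation.
__constant sampler_t directAccess =
    CLK_NORMALIZED_COORDS_FALSE | CLK_ADDRESS_NONE | CLK_FILTER_NEAREST;

__kernel void life(__read_only image2d_t src, __write_only image2d_t dst)
{
    const int x = (int)get_global_id(0);
    const int y = (int)get_global_id(1);
    const int width = get_image_width(src);
    const int height = get_image_height(src);
    if (x >= width || y >= height) return;

    int neighbours = 0;
    for (int dy = -1; dy <= 1; ++dy) {
        for (int dx = -1; dx <= 1; ++dx) {
            const int nx = x + dx;
            const int ny = y + dy;
            if (dx == 0 && dy == 0) continue;
            // Bounds are checked by hand: outside the image counts as dead.
            if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
            if (read_imagef(src, directAccess, (int2)(nx, ny)).x > 0.5f) ++neighbours;
        }
    }
    const int alive = read_imagef(src, directAccess, (int2)(x, y)).x > 0.5f;
    const int next = (neighbours == 3) || (alive && neighbours == 2);
    write_imagef(dst, (int2)(x, y), (float4)(next ? 1.0f : 0.0f));
}
)CLC";
```

After each generation the roles of the two textures are swapped, so the image that was just written becomes the source of the next step.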


Now let's render the texture onto a quad.
