Terrain Normal Update Multithreading

JoshMK

Multithreading is very useful for work that can be split into many independent parts, such as image and video processing. I wanted to speed up normal updates for the new terrain system, so I added a new thread creation function that accepts any function as its input. This lets me use std::bind with it, the same way I already use std::bind to send instructions between threads:

shared_ptr<Thread> CreateThread(std::function<void()> instruction);
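For readers without the engine source, here is a minimal sketch of how a function like this could be built on the standard library. Everything in it is an assumption for illustration: the Thread class is a hypothetical stand-in based on std::thread (the engine's real class is not shown in this post), and AddInto and RunBoundExample are made-up names. The Resume() and Wait() methods mirror the calls used later in the post.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <thread>
#include <utility>

// Hypothetical stand-in for the engine's Thread class (assumption, not the real API).
class Thread
{
public:
    explicit Thread(std::function<void()> instruction)
        : instruction_(std::move(instruction)) {}

    // Join on destruction so a running std::thread is never destroyed.
    ~Thread() { Wait(); }

    // Start running the stored instruction on a new OS thread.
    void Resume()
    {
        assert(!worker_.joinable()); // this sketch only supports starting once
        worker_ = std::thread(instruction_);
    }

    // Block until the instruction has finished (no-op if never started).
    void Wait()
    {
        if (worker_.joinable()) worker_.join();
    }

private:
    std::function<void()> instruction_;
    std::thread worker_;
};

// The thread is created suspended; nothing runs until Resume() is called.
std::shared_ptr<Thread> CreateThread(std::function<void()> instruction)
{
    return std::make_shared<Thread>(std::move(instruction));
}

// Made-up example function with parameters.
void AddInto(int* target, int amount) { *target += amount; }

// std::bind packs the function and its arguments into a void() callable.
int RunBoundExample()
{
    int total = 0;
    auto thread = CreateThread(std::bind(AddInto, &total, 5));
    thread->Resume();
    thread->Wait();
    return total;
}
```

Because std::function<void()> accepts any callable with no arguments, the same CreateThread signature works with a lambda, a free function, or a std::bind expression that wraps a member function.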

The terrain normal update function has two overloads. One accepts parameters for the exact area to update; if no parameters are supplied, the entire terrain is updated:

virtual void UpdateNormals(const int x, const int y, const int width, const int height);
virtual void UpdateNormals();

This is what the second overloaded function looked like before:

void Terrain::UpdateNormals()
{
	UpdateNormals(0, 0, resolution.x, resolution.y);
}

And this is what it looks like now:

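void Terrain::UpdateNormals()
{
	// Split the terrain into a 4x4 grid of tiles, one thread per tile
	const int MAX_THREADS_X = 4;
	const int MAX_THREADS_Y = 4;
	std::array<shared_ptr<Thread>, MAX_THREADS_X * MAX_THREADS_Y> threads;

	// The resolution must divide evenly so the tiles cover the terrain exactly
	Assert((resolution.x / MAX_THREADS_X) * MAX_THREADS_X == resolution.x);
	Assert((resolution.y / MAX_THREADS_Y) * MAX_THREADS_Y == resolution.y);
	const int tilewidth = resolution.x / MAX_THREADS_X;
	const int tileheight = resolution.y / MAX_THREADS_Y;

	// Select the four-parameter overload explicitly so std::bind can resolve it
	auto updatearea = static_cast<void(Terrain::*)(int, int, int, int)>(&Terrain::UpdateNormals);

	for (int y = 0; y < MAX_THREADS_Y; ++y)
	{
		for (int x = 0; x < MAX_THREADS_X; ++x)
		{
			threads[y * MAX_THREADS_X + x] = CreateThread(std::bind(updatearea, this, x * tilewidth, y * tileheight, tilewidth, tileheight));
		}
	}

	// Start every thread first, then block until all tiles have finished
	for (const auto& thread : threads)
	{
		thread->Resume();
	}
	for (const auto& thread : threads)
	{
		thread->Wait();
	}
}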
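The tiling step can also be tried outside the engine. The sketch below is a standalone illustration using std::thread rather than the engine's Thread class. Tile, SplitIntoTiles and ForEachTileParallel are hypothetical names that do not appear in the engine. Unlike the function above, it computes each tile edge from the next tile's start position, so it also covers resolutions that do not divide evenly; that is a design choice of this sketch, not something the terrain code does.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Rectangular region of the grid (hypothetical helper type).
struct Tile { int x, y, width, height; };

// Split a resolutionX x resolutionY grid into tilesX x tilesY tiles.
// Each tile ends where the next one starts, so the whole grid is covered
// exactly once even when the resolution is not a multiple of the tile count.
std::vector<Tile> SplitIntoTiles(int resolutionX, int resolutionY, int tilesX, int tilesY)
{
    assert(tilesX > 0 && tilesY > 0);
    std::vector<Tile> tiles;
    tiles.reserve(static_cast<std::size_t>(tilesX) * static_cast<std::size_t>(tilesY));
    for (int ty = 0; ty < tilesY; ++ty)
    {
        const int y0 = ty * resolutionY / tilesY;
        const int y1 = (ty + 1) * resolutionY / tilesY;
        for (int tx = 0; tx < tilesX; ++tx)
        {
            const int x0 = tx * resolutionX / tilesX;
            const int x1 = (tx + 1) * resolutionX / tilesX;
            tiles.push_back({ x0, y0, x1 - x0, y1 - y0 });
        }
    }
    return tiles;
}

// Run work(tile) on one thread per tile and wait for all of them.
// The work function must only write inside its own tile to avoid data races.
void ForEachTileParallel(const std::vector<Tile>& tiles,
                         const std::function<void(const Tile&)>& work)
{
    std::vector<std::thread> workers;
    workers.reserve(tiles.size());
    for (const Tile& tile : tiles)
    {
        workers.emplace_back([&work, &tile] { work(tile); });
    }
    for (std::thread& worker : workers)
    {
        worker.join();
    }
}
```

Calling SplitIntoTiles(2048, 2048, 4, 4) yields sixteen 512x512 tiles, matching the grid used in the function above.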

Here are the results, using a 2048x2048 terrain. Multithreading dramatically reduced the update time. Interestingly, four threads run more than four times faster than a single thread. Sixteen threads looks like the sweet spot, at least on this machine, with a 10x improvement in performance.

[Image: Image1.png]
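The post does not show how the timings were taken. One simple way to collect comparable numbers is a wall-clock helper like the sketch below. MeasureMilliseconds is a hypothetical name, and std::chrono::steady_clock is an assumption here, not necessarily what was used for the chart.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Run work() once and return the elapsed wall-clock time in milliseconds.
// steady_clock is used because it never jumps backwards.
double MeasureMilliseconds(const std::function<void()>& work)
{
    assert(work && "work must be a valid callable");
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Timing the same update once per thread count and averaging a few runs per configuration gives a curve that can be compared across machines.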

Recommended Comments

The reason four threads took less than 25% of the time of a single thread is that some calculations were being skipped. I fixed that, and the numbers are a little higher now, but they still form the same curve.
