Improving performance with data parallelism

I know that it’s been a long time since I’ve written on multithreading (I’ve been really busy in JavaScript land!), but that is still an area which interests me and that’s why I’ll be writing on it from time to time.

As we’ve seen along this series, there are several things you might do to improve the performance of your apps. One of the most used techniques for improving performance is relying on something called data parallelism. This technique says that you should divide the received data into smaller chunks and then give each chunk to a different processor in order to maximize the throughput of your algorithm. Generally, this is a good approach for those cases where you’ve several items and need to perform some sort of operation over each one of them. In other words, this might be a good option when you need to parallelize a loop.

Notice that there are several things you should keep in mind before starting to apply this technique. The first thing you need to understand is that you should only apply this when you have a big enough collection of items (big enough depends on the work you’ll be doing with each item and you should always check your “gut feeling” with appropriate measurements). Don’t forget that data parallelism adds overhead to the main operation you’re performing: you’ll be splitting data and then you’ll probably need to get it all back again before returning the processed data back to the user.

If you’ve already decided that this is the way to go, then you need to make sure that the body of the for loop is thread safe. It goes without saying that thread safe shouldn’t be achieved through the use of locks. Doing that would completely defeat the purpose of executing several operations in parallel and that would mean that the total time of execution wouldn’t really be improved (in many scenarios, you’d probably end up degrading your algorithms). 

There’s still one last thing you should keep in mind: you should only use this technique when the order of execution for each item in the loop doesn’t matter. If you look at existing literature, you’ll see that some authors tend to call this as non-associative loops.

And that’s it for this short introduction. In the next couple of posts we’ll be seeing some code for  improving the performance of your loops (always keep in mind the observations I’ve made in this post). Keep tuned for more on multithreading.


~ by Luis Abreu on October 4, 2009.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: