Forms and Patterns of Concurrency

4/23/2015

To most developers, the word "concurrency" automatically implies "multithreading" or "parallel programming". And while multithreading is a form of concurrency, it is not the only form of concurrency. Concurrency means doing more than one thing at a time. While multithreading is a form of concurrency that uses multiple threads of execution, its not the only form of concurrency. Other forms of concurrency are asynchronous programming and reactive programming. This article is intended to highlight the different forms of concurrency and when/how to introduce them in the right places in your application.

Parallel Programming

Parallel programming takes a computation task and breaks it up into smaller tasks/chunks that get executed in parallel using multiple threads. Parallel programming aims to take advantage of multiple CPU cores in modern machines. For computations that make use of parallel programming, CPU throughput is increased which should translate to increased performance. Take the example of a news aggregator application which pulls down articles as HTML pages and is then tasked with parsing each article. To improve the performance of the app, it would be prudent to perform the parsing of the collection of articles using parallel programming. Here are a couple of factors that indicate that a task would benefit from being parallelized;
  - The task to be parallelized should be a CPU intensive/bound task. If the computation task is I/O bound such as making a long running web service call or database query, then you should be using a different form of concurrency instead; asynchronous programming. Even if you have multiple I/O bound operations that you want to execute in parallel, async programming is still a better fit.
  - The chunked tasks should be independent of each other. There should be minimal (preferrably zero) shared state between the tasks and the execution of each task should produce no side effects. This model of computation is characteristic of functional programming and is why the functional style is often used for high performance applications/servers. The chunked task code should be as functional as possible. Shared state would require synchronization, which reduces the parallelism and efficiency of the application. Side effects are also problematic as tasks can be run in different threads causing unpredictability.
  - The chunked tasks should be relatively long-running enough to mitigate the performance overhead of task division and thread provisioning or scheduling.
Sometimes the parent thread is required to wait for the completion of the sub-tasks, most modern languages these days have higher level abstractions (than Thread.join) to make this easier. In Java, the CountDownLatch class is great for this. In C#, this functionality is built into the Task class via the Wait* methods. Results from sub-tasks, if any, should be added to a thread-safe collection. These collections are optimized for multithreaded access and are often more efficient than synchronizing access to a non-thread-safe collection using a mutex / lock.

Asynchronous Programming

Asynchronous Programming is a form of concurrency in which "callbacks" or "futures" are used to respond to some operation that will complete in the future thereby freeing up an application to do other work. A "future" is a type/interface that represents the operation that will complete in the future. The model for asynchronous programming typically follows the following pattern, some operation is started that will complete at some later time. While waiting for the operation to complete, the current process is not blocked from doing other work. When the operation is completed, it notifies its "future" or invokes its callback method to let the process know that the operation has been completed.

Async programming avoids creating unnecessary threads and therefore avoids the cost of thread provisioning and context switching. Instead of creating a separate thread to execute the "future operation", async programming allows developers to both continue doing other work and respond to the "future operation" when it completes using a single thread. Java, C#, Python, Javascript and many other high level programming languages provide abstractions to make it easier to do async programming. These obviously won't be covered here but we should note what makes an operation a good candidate for async programming as opposed to parallel programming.

  - Async tasks should be I/O bound. Because CPU rates are orders of magnitude faster than disk or network rates, processes that synchronously perform I/O bound tasks (such as data transfers from a database or a remote server in the form of database or web calls) typically spend a lot of time in a blocked state, waiting for the transfer to complete. Performing such tasks asynchronously allows the CPU to continue executing other instructions for the process while waiting for the transfer to complete.

  - There should be some other work to do. If there are no other tasks/work that can be done while waiting for data transfers to complete then there is no point in running a task asynchronously. In some cases, the "other work to do" similarly involve data transfers and could also be run async.

  - Concurrent tasks that use async programming should be independent from one another. Any dependencies, such as one task using the results of another, leads to synchronous programming.

In the fictitious news aggregator app we introduced earlier, async programming should be used for the task of downloading the HTML articles concurrently. This job meets our criteria for async candidacy as each download task is I/O bound and independent of the other. This is an example of client-side async programming. Another very common client-side usage of async programming is in user interfaces. The non-blocking property of async programming allows the UI to stay responsive while some form of data transfer occurs in the background.

Server side async programming has gained popularity with the introduction of the Twisted and Node.js (or IO.js) frameworks. For certain applications async web servers scale better than multithreaded servers by foregoing the performance penalty of context switching and thread provisioning, in favor of using a single thread and servicing requests in the time in which other requests are asynchronously waiting on I/O. For an excellent explanation of server-side asynchronous programming, see the following link [http://cs.brown.edu/courses/cs168/f12/handouts/async.pdf].

Other forms of concurrency

I have no experience with Reactive Programming (Rx) so I won't spend time discussing it here. Reactive programming is programming with asynchronous data streams. It allows you to treat a stream of events like a stream of data that can be observed and subsequently manipulated, filtered and transformed. For a quick, Java-centric guide to Rx, check out this blog post (https://www.bignerdranch.com/blog/what-is-functional-reactive-programming/). For a more expressive (read: longer), C#-influenced guide to Rx, check out this other blog post (https://gist.github.com/staltz/868e7e9bc2a7b8c1f754).

You Might Also Like

0 comments