Concurrent Programming in Java
Concurrent programming is a paradigm in software development that focuses on structuring a program so that multiple tasks or processes can make progress at the same time, improving performance and responsiveness. It allows a program to make better use of multi-core processors and to handle tasks concurrently rather than strictly sequentially. In this beginner’s guide, we’ll introduce you to the key concepts and principles of concurrent programming.
Why Concurrent Programming?
Concurrent programming is essential in modern computing for several reasons:
- Performance Improvement: Concurrent programs can utilize multiple CPU cores efficiently, which can lead to significant performance improvements for CPU-bound tasks.
- Responsiveness: For applications with user interfaces, concurrency ensures that the application remains responsive even when performing time-consuming operations in the background.
- Resource Utilization: Concurrent programs can make better use of system resources, such as CPU time, memory, and I/O devices.
- Parallelism: Some tasks naturally lend themselves to parallel execution, like rendering graphics, processing data, or handling multiple client requests in a server application.
Key Concepts in Concurrent Programming:
- Thread: A thread is the smallest unit of execution within a program. It represents an independent flow of control that can run concurrently with other threads. Most programming languages, including Java, C++, and Python, provide thread support.
- Process: A process is a self-contained program that runs independently and has its own memory space. Processes can have multiple threads, and inter-process communication (IPC) is required for them to communicate.
- Concurrency vs. Parallelism: These terms are often used interchangeably, but they have distinct meanings. Concurrency is about managing multiple tasks in overlapping time periods, while parallelism is about executing multiple tasks simultaneously.
- Synchronization: When multiple threads or processes access shared resources (like variables, data structures, or files), synchronization mechanisms are needed to ensure data consistency and avoid conflicts, such as race conditions.
- Race Condition: A race condition occurs when two or more threads or processes access shared data concurrently, leading to unpredictable and erroneous behavior. Proper synchronization can prevent race conditions.
- Mutex (Mutual Exclusion): A mutex is a synchronization primitive that allows only one thread or process to access a shared resource at a time. It helps prevent race conditions.
- Thread Safety: Code is considered thread-safe when it can be safely executed by multiple threads without causing data corruption or unexpected behavior. Design patterns and synchronization mechanisms are used to ensure thread safety.
- Deadlock: Deadlock occurs when two or more threads or processes are unable to proceed because each is waiting for the other to release a resource. Careful design and resource management are essential to avoid deadlocks.
- Thread Pool: A thread pool is a collection of pre-initialized threads that can be reused for executing tasks, which helps minimize the overhead of thread creation and destruction (see the sketch right after this list).
- Concurrency Control: In database systems, concurrency control mechanisms ensure that multiple transactions can access and modify the database concurrently without causing data inconsistencies.
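To make the thread-pool concept concrete, here is a minimal sketch using Java’s built-in ExecutorService. The class name ThreadPoolSketch, the task bodies, and the pool size of four are made up for illustration.
package com.tipsontech.demo;

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ThreadPoolSketch {
    public static void main(String[] args) throws InterruptedException {
        // a fixed pool of four worker threads that are created once and reused
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < 10; i++) {
            final int taskId = i;
            // tasks are queued and picked up by whichever worker thread is free
            pool.submit(() -> System.out.println("Task " + taskId + " ran on " + Thread.currentThread().getName()));
        }
        pool.shutdown();                            // stop accepting new tasks
        pool.awaitTermination(1, TimeUnit.MINUTES); // wait for the queued tasks to finish
    }
}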
Challenges in Concurrent Programming:
Concurrent programming can be challenging due to the following issues:
- Race Conditions: Identifying and mitigating race conditions is essential to ensure program correctness.
- Deadlocks: Avoiding deadlocks requires careful design and a thorough understanding of synchronization primitives.
- Complexity: Concurrent programs can be harder to understand, test, and debug than sequential programs.
- Performance Trade-offs: While concurrency can improve performance, it can also introduce overhead due to synchronization and context switching between threads.
- Testing: Testing concurrent programs requires specialized techniques and tools to uncover concurrency-related bugs.
In conclusion, concurrent programming is a powerful paradigm that enables efficient use of modern hardware, improves application responsiveness, and allows for parallel execution of tasks. However, it comes with challenges related to synchronization and coordination between threads or processes. Learning how to design, implement, and troubleshoot concurrent programs is an important skill for modern software developers.
Let’s look at a simple example with a counter and two threads that increment it. The program isn’t complicated: we have an object that holds a counter, increased with the increment method and read with the get method, plus two threads that each increment it in a loop.
package com.tipsontech.demo;
public class Counting {
public static void main(String[] args) throws InterruptedException {
class Counter {
int counter = 0;
public void increment() {
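// counter++ is not atomic: it is a read, a local increment, and a write back to the field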
counter++;
}
public int get() {
return counter;
}
}
final Counter counter = new Counter();
class CountingThread extends Thread {
public void run() {
for (int x = 0; x < 500000; x++) {
counter.increment();
}
}
}
CountingThread t1 = new CountingThread();
CountingThread t2 = new CountingThread();
t1.start();
t2.start();
t1.join();
t2.join();
System.out.println(counter.get());
}
}
When I run this program several times, I get different results. Here are the values from three executions on my laptop:
java Counting
884873
java Counting
959553
java Counting
522336
What is the reason for this unpredictable behavior? The program increments the counter in one place, in the increment method, which uses the statement counter++. If we look at the bytecode of that statement, we see that it consists of several steps (sketched in Java right after this list):
- Read counter value from memory
- Increase value locally
- Store counter value in memory
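Written out in plain Java rather than bytecode, the increment method is roughly equivalent to this sketch (the temp variable is just for illustration):
public void increment() {
    int temp = counter; // read the counter value from memory
    temp = temp + 1;    // increase the value locally
    counter = temp;     // store the counter value back in memory
}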
Now we can imagine what can go wrong in this sequence. If we have two threads that independently increase the counter then we could have this scenario:
- Counter value is 115
- First thread reads the value of the counter from the memory (115)
- First thread increases the local counter value (116)
- Second thread reads the value of the counter from the memory (115)
- Second thread increases the local counter value (116)
- Second thread saves the local counter value to the memory (116)
- First thread saves the local counter value to the memory (116)
- Value of the counter is 116
In this scenario, the two threads interleave so that the counter is increased by only 1, although it should be increased by 2 because each thread increments it once. How the threads interleave influences the result of the program. The reason for the unpredictability is that the interleaving is controlled by the operating system, not by the program, and it can be different on every execution. In this way, we have introduced accidental unpredictability (non-determinism) into the program.
To fix this accidental non-determinism, the program must take control of the interleaving. When one thread is executing the method that increments the counter, no other thread may enter that method until the first one leaves it. That way, we serialize access to the method.
package com.tipsontech.demo;
public class CountingFixed {
public static void main(String[] args) throws InterruptedException {
class Counter {
int counter = 0;
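// synchronized ensures that only one thread at a time can execute these methods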
public synchronized void increase() {
counter++;
}
public synchronized int get() {
return counter;
}
}
final Counter counter = new Counter();
class CountingThread extends Thread {
public void run() {
for (int i = 0; i < 500000; i++) {
counter.increase();
}
}
}
CountingThread thread1 = new CountingThread();
CountingThread thread2 = new CountingThread();
thread1.start();
thread2.start();
thread1.join();
thread2.join();
System.out.println("Java Counting");
System.out.println(counter.get());
}
}
Another solution is to use a counter that can be incremented atomically, meaning the operation cannot be split into smaller operations. That way, we don’t need synchronized blocks around the code that uses it. Java provides atomic data types in the java.util.concurrent.atomic package, and we’ll use AtomicInteger.
package com.tipsontech.demo;
import java.util.concurrent.atomic.AtomicInteger;
public class CountingBetter {
public static void main(String[] args) throws InterruptedException {
final AtomicInteger counter = new AtomicInteger(0);
class CountingThread extends Thread {
public void run() {
for (int i = 0; i < 500000; i++) {
counter.incrementAndGet();
}
}
}
CountingThread thread1 = new CountingThread();
CountingThread thread2 = new CountingThread();
thread1.start();
thread2.start();
thread1.join();
thread2.join();
System.out.println("Java Counting");
System.out.println(counter.get());
}
}
AtomicInteger has the operations we need, so we can use it instead of the Counter class. It is interesting to note that AtomicInteger’s methods do not use locking (they rely on compare-and-swap), so there is no possibility of deadlock, which simplifies the design of the program.
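Under the hood, such lock-free updates are built on a compare-and-swap operation. As a rough sketch of the idea (incrementAndGet already does something like this internally; the class and method names here are made up for illustration), an increment can be written with compareAndSet:
package com.tipsontech.demo;

import java.util.concurrent.atomic.AtomicInteger;

public class CasIncrementSketch {
    // retries until no other thread changed the counter between the read and the write
    static int increment(AtomicInteger counter) {
        int current;
        int next;
        do {
            current = counter.get(); // read the current value
            next = current + 1;      // compute the new value locally
        } while (!counter.compareAndSet(current, next)); // write only if the value is unchanged
        return next;
    }

    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger(0);
        increment(counter);
        System.out.println(counter.get()); // prints 1
    }
}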
Using the synchronized keyword to protect critical methods should resolve all problems, right? Let’s imagine that we have two accounts that can deposit, withdraw, and transfer to another account. What happens if, at the same time, we want to transfer money from one account to another and vice versa? Let’s look at an example.
package com.tipsontech.demo;
public class Deadlock {
public static void main(String[] args) throws InterruptedException {
class Account {
int balance = 100;
public Account(int balance) {
this.balance = balance;
}
public synchronized void deposit(int amount) {
balance += amount;
}
public synchronized boolean withdraw(int amount) {
if (balance >= amount) {
balance -= amount;
return true;
}
return false;
}
public synchronized boolean transfer(Account destination, int amount) {
if (balance >= amount) {
balance -= amount;
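// while still holding this account's lock, try to acquire the destination's lock as well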
synchronized (destination) {
destination.balance += amount;
}
return true;
}
return false;
}
public int getBalance() {
return balance;
}
}
final Account bob = new Account(200000);
final Account joe = new Account(300000);
class FirstTransfer extends Thread {
public void run() {
for (int i = 0; i < 100000; i++) {
bob.transfer(joe, 2);
}
}
}
class SecondTransfer extends Thread {
public void run() {
for (int i = 0; i < 100000; i++) {
joe.transfer(bob, 1);
}
}
}
FirstTransfer thread1 = new FirstTransfer();
SecondTransfer thread2 = new SecondTransfer();
thread1.start();
thread2.start();
thread1.join();
thread2.join();
System.out.println("Bob's balance: " + bob.getBalance());
System.out.println("Joe's balance: " + joe.getBalance());
}
}
When I run this program on my laptop, it usually gets stuck. Why does this happen? If we look closely, we can see that the transfer method is synchronized, so entering it acquires the lock on the source account, blocking all of its synchronized methods, and then the nested synchronized block acquires the lock on the destination account, blocking all of its synchronized methods as well.
Imagine the following scenario:
- First thread calls transfer from Bob’s account to Joe’s account
- Second thread calls transfer from Joe’s account to Bob’s account
- Second thread decreases the amount from Joe’s account
- Second thread goes to deposit the amount into Bob’s account, but must wait because the first thread holds the lock on Bob’s account.
- First thread decreases the amount from Bob’s account
- First thread goes to deposit the amount into Joe’s account, but must wait because the second thread holds the lock on Joe’s account.
In this scenario, one thread is waiting for the other to finish its transfer, and vice versa. They are stuck waiting on each other, and the program cannot continue. This is called deadlock. To avoid deadlock, it is necessary to lock the accounts in the same order in every thread. To fix the program, we’ll give each account a unique number so that we can always lock the accounts in the same order when transferring money.
package com.tipsontech.demo;
import java.util.concurrent.atomic.AtomicInteger;
public class DeadlockFixed {
public static void main(String[] args) throws InterruptedException {
final AtomicInteger counter = new AtomicInteger(0);
class Account {
int balance = 100;
int order;
public Account(int balance) {
this.balance = balance;
this.order = counter.getAndIncrement();
}
public synchronized void deposit(int amount) {
balance += amount;
}
public synchronized boolean withdraw(int amount) {
if (balance >= amount) {
balance -= amount;
return true;
}
return false;
}
public boolean transfer(Account destination, int amount) {
Account first;
Account second;
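// always lock the account with the smaller order number first, so every thread acquires the locks in the same order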
if (this.order < destination.order) {
first = this;
second = destination;
} else {
first = destination;
second = this;
}
synchronized (first) {
synchronized (second) {
if (balance >= amount) {
balance -= amount;
destination.balance += amount;
return true;
}
return false;
}
}
}
public synchronized int getBalance() {
return balance;
}
}
final Account bob = new Account(200000);
final Account joe = new Account(300000);
class FirstTransfer extends Thread {
public void run() {
for (int i = 0; i < 100000; i++) {
bob.transfer(joe, 2);
}
}
}
class SecondTransfer extends Thread {
public void run() {
for (int i = 0; i < 100000; i++) {
joe.transfer(bob, 1);
}
}
}
FirstTransfer thread1 = new FirstTransfer();
SecondTransfer thread2 = new SecondTransfer();
thread1.start();
thread2.start();
thread1.join();
thread2.join();
System.out.println("Bob's balance: " + bob.getBalance());
System.out.println("Joe's balance: " + joe.getBalance());
}
}
Such mistakes are nasty because of their unpredictability: they happen sometimes, but not always, and they are difficult to reproduce. When a program behaves unpredictably, the cause is usually concurrency that has introduced accidental non-determinism. To avoid accidental non-determinism, we should design the program up front to account for all possible interleavings of its threads.
Here is an example of a program with accidental non-determinism.
package com.tipsontech.demo;
public class NonDeterminism {
public static void main(String[] args) throws InterruptedException {
class Container {
public String value = "Empty";
}
final Container container = new Container();
class FastThread extends Thread {
public void run() {
container.value = "Fast";
}
}
class SlowThread extends Thread {
public void run() {
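// sleeping makes this thread usually write its value last, but the scheduler gives no guarantee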
try {
Thread.sleep(50);
} catch (Exception e) {
}
container.value = "Slow";
}
}
FastThread fast = new FastThread();
SlowThread slow = new SlowThread();
fast.start();
slow.start();
fast.join();
slow.join();
System.out.println(container.value);
}
}
This program contains accidental non-determinism: the last value written to the container is the one that gets displayed.
Slow
The slower thread usually writes its value last, so that value (Slow) is printed. But this need not be the case. What if the computer is simultaneously running another program that needs a lot of CPU time? There is no guarantee that the slower thread will write its value last, because the scheduling is controlled by the operating system, not by the program. We can have situations where a program works on one computer and behaves differently on another. Such concurrency errors are difficult to find, and they cause headaches for developers. For all these reasons, this concurrency model is very difficult to get right.
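If the order of the writes matters, the program itself has to enforce it instead of relying on timing. Here is one minimal sketch that reuses the container idea from above (the class name DeterministicOrder is made up for illustration): the second thread is started only after the first one has finished, so the printed value no longer depends on the scheduler.
package com.tipsontech.demo;

public class DeterministicOrder {
    public static void main(String[] args) throws InterruptedException {
        class Container {
            public String value = "Empty";
        }
        final Container container = new Container();

        Thread fast = new Thread(() -> container.value = "Fast");
        Thread slow = new Thread(() -> container.value = "Slow");

        fast.start();
        fast.join();  // wait until the fast thread has written its value
        slow.start(); // only then let the slow thread run
        slow.join();

        System.out.println(container.value); // always prints "Slow"
    }
}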