Tuesday, March 18, 2008

Thread Interaction

How can threads interact with one another to communicate about, among other things, their locking status? The Object class has three methods, wait(), notify(), and notifyAll(), that help threads communicate about the status of an event that the threads care about. For example, if one thread is a mail-delivery thread and one thread is a mail-processor thread, the mail-processor thread has to keep checking to see if there's any mail to process. Using the wait and notify mechanism, the mail-processor thread could check for mail, and if it doesn't find any it can say, "Hey, I'm not going to waste my time checking for mail every two seconds. I'm going to go hang out, and when the mail deliverer puts something in the mailbox, have him notify me so I can go back to runnable and do some work." In other words, using wait() and notify() lets one thread put itself into a "waiting room" until some other thread notifies it that there's a reason to come back out.

One key point to remember (and keep in mind for the exam) about wait/notify is this:

wait(), notify(), and notifyAll() must be called from within a synchronized context! A thread can't invoke a wait or notify method on an object unless it owns that object's lock.

Here we'll present an example of two threads that depend on each other to proceed with their execution, and we'll show how to use wait() and notify() to make them interact safely and at the proper moment.

Think of a computer-controlled machine that cuts pieces of fabric into different shapes and an application that allows users to specify the shape to cut. The current version of the application has one thread, which loops, first asking the user for instructions, and then directs the hardware to cut the requested shape:

public void run() {
    while (true) {
        // Get shape from user
        // Calculate machine steps from shape
        // Send steps to hardware
    }
}

This design is not optimal because the user can't do anything while the machine is busy, even when there are other shapes to define. We need to improve the situation.

A simple solution is to separate the processes into two different threads, one of them interacting with the user and another managing the hardware. The user thread sends the instructions to the hardware thread and then goes back to interacting with the user immediately. The hardware thread receives the instructions from the user thread and starts directing the machine immediately. Both threads use a common object to communicate, which holds the current design being processed.

The following pseudocode shows this design:

public void userLoop() {
    while (true) {
        // Get shape from user
        // Calculate machine steps from shape
        // Modify common object with new machine steps
    }
}

public void hardwareLoop() {
    while (true) {
        // Get steps from common object
        // Send steps to hardware
    }
}

The problem now is to get the hardware thread to process the machine steps as soon as they are available. Also, the user thread should not modify them until they have all been sent to the hardware. The solution is to use wait() and notify(), and also to synchronize some of the code.

The methods wait() and notify(), remember, are instance methods of Object. In the same way that every object has a lock, every object can have a list of threads that are waiting for a signal (a notification) from the object. A thread gets on this waiting list by executing the wait() method of the target object. From that moment, it doesn't execute any further instructions until the notify() method of the target object is called. If many threads are waiting on the same object, each call to notify() chooses only one of them (in no guaranteed order) to proceed with its execution. If there are no threads waiting, then no particular action is taken. Let's take a look at some real code that shows one thread waiting for another thread to notify it (take note, it is somewhat complex):

 1. class ThreadA {
 2.   public static void main(String [] args) {
 3.     ThreadB b = new ThreadB();
 4.     b.start();
 5.
 6.     synchronized(b) {
 7.       try {
 8.         System.out.println("Waiting for b to complete...");
 9.         b.wait();
10.       } catch (InterruptedException e) {}
11.       System.out.println("Total is: " + b.total);
12.     }
13.   }
14. }
15.
16. class ThreadB extends Thread {
17.   int total;
18.
19.   public void run() {
20.     synchronized(this) {
21.       for(int i=0;i<100;i++) {
22.         total += i;
23.       }
24.       notify();
25.     }
26.   }
27. }

This program contains two objects with threads: ThreadA contains the main thread and ThreadB has a thread that calculates the sum of all numbers from 0 through 99. As soon as line 4 calls the start() method, ThreadA will continue with the next line of code in its own class, which means it could get to line 11 before ThreadB has finished the calculation. To prevent this, we use the wait() method in line 9.

Notice in line 6 the code synchronizes itself with the object b—this is because in order to call wait() on the object, ThreadA must own a lock on b. For a thread to call wait() or notify(), the thread has to be the owner of the lock for that object. When the thread waits, it temporarily releases the lock for other threads to use, but it will need it again to continue execution. It's common to find code like this:

synchronized(anotherObject) { // this has the lock on anotherObject
    try {
        anotherObject.wait();
        // the thread releases the lock and waits
        // To continue, the thread needs the lock,
        // so it may be blocked until it gets it.
    } catch(InterruptedException e) {}
}

The preceding code waits until notify() is called on anotherObject.

synchronized(this) { notify(); }

This code notifies a single thread currently waiting on the this object. The lock can be acquired much earlier in the code, such as in the calling method. Note that if the thread calling wait() does not own the lock, it will throw an IllegalMonitorStateException. This exception is not a checked exception, so you don't have to catch it explicitly. You should always be clear whether a thread has the lock of an object in any given block of code.
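To make the lock-ownership rule concrete, here is a minimal sketch that calls notify() once without the lock and once inside a synchronized block. The class name MonitorOwnershipDemo and the mailbox object are made up for this illustration.

```java
// Minimal sketch: notify() with and without owning the object's lock.
// MonitorOwnershipDemo and "mailbox" are hypothetical names for this example.
class MonitorOwnershipDemo {

    // Returns true if notify() outside a synchronized block
    // throws IllegalMonitorStateException.
    static boolean notifyWithoutLockThrows() {
        Object mailbox = new Object();
        try {
            mailbox.notify(); // the current thread does NOT own mailbox's lock
            return false;
        } catch (IllegalMonitorStateException e) {
            return true;
        }
    }

    // Returns true if notify() inside synchronized(mailbox) completes normally.
    static boolean notifyWithLockSucceeds() {
        Object mailbox = new Object();
        synchronized (mailbox) {
            mailbox.notify(); // legal: the current thread owns the lock
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("without lock throws: " + notifyWithoutLockThrows());
        System.out.println("with lock succeeds: " + notifyWithLockSucceeds());
    }
}
```

Both lines should print true. The same rule applies to wait() and notifyAll().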

Notice in lines 7–10 there is a try/catch block around the wait() method. A waiting thread can be interrupted in the same way as a sleeping thread, so you have to take care of the exception:

try {
    wait();
} catch(InterruptedException e) {
    // Do something about it
}
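As a rough illustration of a waiting thread being interrupted, the sketch below starts a thread that waits on a lock and then interrupts it from the calling thread. The class name, the lock object, and the flag array are all assumptions made for the example. Restoring the interrupt flag in the join() handler is just one reasonable way of "doing something about it".

```java
// Minimal sketch: a thread blocked in wait() is woken by interrupt().
// InterruptedWaitDemo and its variables are hypothetical names.
class InterruptedWaitDemo {

    // Returns true if the waiting thread received an InterruptedException.
    static boolean waiterSeesInterrupt() {
        final Object lock = new Object();
        final boolean[] sawInterrupt = {false};

        Thread waiter = new Thread(() -> {
            synchronized (lock) {
                try {
                    // Nobody ever calls notify(), so the only way out
                    // of this loop is the InterruptedException.
                    while (true) {
                        lock.wait();
                    }
                } catch (InterruptedException e) {
                    sawInterrupt[0] = true;
                }
            }
        });

        waiter.start();
        // If the interrupt arrives before wait() is reached, wait() throws
        // as soon as it is called, so the outcome is the same either way.
        waiter.interrupt();
        try {
            waiter.join(); // join() makes the write to sawInterrupt visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sawInterrupt[0];
    }

    public static void main(String[] args) {
        System.out.println("waiter interrupted: " + waiterSeesInterrupt());
    }
}
```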

In the fabric example, the way to use these methods is to have the hardware thread wait for the shape to be available and the user thread notify after it has written the steps. The machine steps may comprise global steps, such as moving the required fabric to the cutting area, and a number of substeps, such as the direction and length of a cut. As an example, they could be:

int fabricRoll;
int cuttingSpeed;
Point startingPoint;
float[] directions;
float[] lengths;
// etc.

It is important that the user thread does not modify the machine steps while the hardware thread is using them, so this reading and writing should be synchronized.

The resulting code would look like this:

class Operator extends Thread {
    public void run() {
        while (true) {
            // Get shape from user
            synchronized(this) {
                // Calculate new machine steps from shape
                notify();
            }
        }
    }
}

class Machine extends Thread {
    Operator operator; // assume this gets initialized

    public void run() {
        while (true) {
            synchronized(operator) {
                try {
                    operator.wait();
                } catch(InterruptedException ie) {}
                // Send machine steps to hardware
            }
        }
    }
}

The machine thread, once started, will immediately go into the waiting state and will wait patiently until the operator sends the first notification. At that point it is the operator thread that owns the lock for the object, so the hardware thread gets stuck for a while. It's only after the operator thread abandons the synchronized block that the hardware thread can really start processing the machine steps.

While one shape is being processed by the hardware, the user may interact with the system and specify another shape to be cut. When the user is finished with the shape and it is time to cut it, the operator thread attempts to enter the synchronized block, maybe blocking until the machine thread has finished with the previous machine steps. When the machine thread has finished, it repeats the loop, going again to the waiting state (and therefore releasing the lock). Only then can the operator thread enter the synchronized block and overwrite the machine steps with the new ones.

Having two threads is definitely an improvement over having one, although in this implementation there is still a possibility of making the user wait. A further improvement would be to have many shapes in a queue, thereby reducing the possibility of requiring the user to wait for the hardware.

There is also a second form of wait() that accepts a number of milliseconds as a maximum time to wait. If the thread is not interrupted, it will continue normally whenever it is notified or the specified timeout has elapsed. This normal continuation consists of getting out of the waiting state, but to continue execution it will have to get the lock for the object:

synchronized(a) { // The thread gets the lock on 'a'
    a.wait(2000); // Thread releases the lock and waits for notify
    // only for a maximum of two seconds, then goes back to Runnable
    // The thread reacquires the lock
    // More instructions here
}
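A timed wait does not tell you why it returned, so in practice it is usually combined with a condition and a deadline. The following is a sketch under that assumption. TimedFlag, its flag field, and the set() and await() methods are invented for illustration and are not part of the Java API.

```java
import java.util.concurrent.TimeUnit;

// Minimal sketch: wait(timeout) combined with a condition and a deadline.
// TimedFlag, set() and await() are hypothetical names for this example.
class TimedFlag {
    private boolean flag;

    synchronized void set() {
        flag = true;
        notifyAll();
    }

    // Waits at most timeoutMillis for set() to be called.
    // Returns true if the flag was set, false if the time ran out.
    synchronized boolean await(long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        try {
            while (!flag) {
                long remainingMillis =
                        TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                if (remainingMillis <= 0) {
                    return false; // timed out (note: wait(0) would wait forever)
                }
                wait(remainingMillis); // releases the lock while waiting
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return flag;
        }
    }

    public static void main(String[] args) {
        TimedFlag mailArrived = new TimedFlag();
        System.out.println("set before timeout? " + mailArrived.await(100));
        mailArrived.set();
        System.out.println("set before timeout? " + mailArrived.await(100));
    }
}
```

The first call should report false after roughly 100 milliseconds, because nothing sets the flag. The second should report true straight away, because the condition already holds and the thread never needs to wait.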

When the wait() method is invoked on an object, the thread executing that code gives up its lock on the object immediately. However, when notify() is called, that doesn't mean the thread gives up its lock at that moment. If the thread is still completing synchronized code, the lock is not released until the thread moves out of synchronized code. So just because notify() is called doesn't mean the lock becomes available at that moment.
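The ordering described above can be observed with a small experiment. In this sketch (class and variable names are made up), the notifying thread records an event after calling notify() but before leaving its synchronized block. The waiting thread records its own event only once it has the lock back.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch: notify() does not release the lock; leaving the
// synchronized block does. All names here are hypothetical.
class NotifyKeepsLockDemo {

    static List<String> run() {
        final Object lock = new Object();
        final boolean[] ready = {false};
        final List<String> events = new ArrayList<String>(); // only touched while holding lock

        Thread waiter = new Thread(() -> {
            synchronized (lock) {
                try {
                    while (!ready[0]) {
                        lock.wait(); // gives up the lock while waiting
                    }
                    events.add("waiter resumed");
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        waiter.start();

        // Spin until the waiter is parked inside wait().
        while (waiter.getState() != Thread.State.WAITING) {
            Thread.yield();
        }

        synchronized (lock) {
            ready[0] = true;
            lock.notify();
            // Still inside the synchronized block: the waiter has been
            // notified but cannot reacquire the lock yet.
            events.add("notifier still holds lock");
        }

        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The notifier's event always comes first in the returned list, because the waiter cannot leave wait() until the notifier has left its synchronized block.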

Using notifyAll() When Many Threads May Be Waiting

In most scenarios, it's preferable to notify all of the threads that are waiting on a particular object. If so, you can use notifyAll() on the object to let all the threads rush out of the waiting area and back to runnable. This is especially important if you have several threads waiting on one object, but for different reasons, and you want to be sure that the right thread (along with all of the others) gets notified.

notifyAll(); // Will notify all waiting threads

All of the threads will be notified and start competing to get the lock. As the lock is used and released by each thread, all of them will get into action without a need for further notification.
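A common case of threads waiting on one object for different reasons is a small bounded buffer: producers wait for free space while consumers wait for an item. The sketch below is an illustration only. BoundedBuffer, put(), take(), and transfer() are hypothetical names, and each waiter rechecks its own condition after waking up, since notifyAll() wakes threads of both kinds.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Minimal sketch: producers and consumers wait on the same object for
// different reasons, so notifyAll() is used. All names are hypothetical.
class BoundedBuffer {
    private final Queue<Integer> items = new ArrayDeque<Integer>();
    private final int capacity;

    BoundedBuffer(int capacity) {
        this.capacity = capacity;
    }

    synchronized void put(int item) throws InterruptedException {
        while (items.size() == capacity) {
            wait(); // waiting for free space
        }
        items.add(item);
        notifyAll(); // wake everyone; each thread rechecks its own condition
    }

    synchronized int take() throws InterruptedException {
        while (items.isEmpty()) {
            wait(); // waiting for an item
        }
        int item = items.remove();
        notifyAll();
        return item;
    }

    // A producer thread puts 1..count; the caller takes them and returns the sum.
    static int transfer(int count) {
        BoundedBuffer buffer = new BoundedBuffer(2);
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= count; i++) {
                    buffer.put(i);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        int sum = 0;
        try {
            for (int i = 0; i < count; i++) {
                sum += buffer.take();
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("Sum is: " + transfer(100)); // 1 + 2 + ... + 100 = 5050
    }
}
```

With a single notify() in a design like this, the one thread that gets woken could be waiting for the other condition, which is why notifyAll() is the usual choice here.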

As we said earlier, an object can have many threads waiting on it, and using notify() will affect only one of them. Which one, exactly, is not specified and depends on the JVM implementation, so you should never rely on a particular thread being notified in preference to another.

When many threads might be waiting on the same object, notifyAll() is usually the safer choice. Let's take a look at this in some code. In this example, there is one class that performs a calculation and many readers that are waiting to receive the completed calculation. At any given moment many readers may be waiting.

 1. class Reader extends Thread {
 2.   Calculator c;
 3.
 4.   public Reader(Calculator calc) {
 5.     c = calc;
 6.   }
 7.
 8.   public void run() {
 9.     synchronized(c) {
10.       try {
11.         System.out.println("Waiting for calculation...");
12.         c.wait();
13.       } catch (InterruptedException e) {}
14.       System.out.println("Total is: " + c.total);
15.     }
16.   }
17.
18.   public static void main(String [] args) {
19.     Calculator calculator = new Calculator();
20.     new Reader(calculator).start();
21.     new Reader(calculator).start();
22.     new Reader(calculator).start();
23.     calculator.start();
24.   }
25. }
26.
27. class Calculator extends Thread {
28.   int total;
29.
30.   public void run() {
31.     synchronized(this) {
32.       for(int i=0;i<100;i++) {
33.         total += i;
34.       }
35.       notifyAll();
36.     }
37.   }
38. }

The program starts three threads that are all waiting to receive the finished calculation (lines 18-24), and then starts the calculator with its calculation. Note that if the run() method at line 30 used notify() instead of notifyAll(), only one reader would be notified instead of all the readers.

Using wait() in a Loop

Actually, both of the previous examples (Machine/Operator and Reader/Calculator) had a common problem. In each one, there was at least one thread calling wait(), and another thread calling notify() or notifyAll(). This works well enough as long as the waiting threads have actually started waiting before the other thread executes the notify() or notifyAll(). But what happens if, for example, the Calculator runs first and calls notifyAll() before the Readers have started waiting? This could happen, since we can't guarantee the order in which the different threads will execute. Unfortunately, when the Readers run, they just start waiting right away. They don't do anything to see if the event they're waiting for has already happened. So if the Calculator has already called notifyAll(), it's not going to call notifyAll() again, and the waiting Readers will keep waiting forever. This is probably not what the programmer wanted to happen.

Almost always, when you want to wait for something, you also need to be able to check if it has already happened. Generally the best way to solve this is to put in some sort of loop that checks a conditional expression, and only waits if the thing you're waiting for has not yet happened. Here's a modified, safer version of the earlier fabric-cutting machine example:

class Operator extends Thread {
    Machine machine; // assume this gets initialized

    public void run() {
        while (true) {
            Shape shape = getShapeFromUser();
            MachineInstructions job = calculateNewInstructionsFor(shape);
            machine.addJob(job);
        }
    }
}

The operator will still keep on looping forever, getting more shapes from users, calculating new instructions for those shapes, and sending them to the machine. But now the logic for notify() has been moved into the addJob() method in the Machine class:

import java.util.ArrayList;
import java.util.List;

class Machine extends Thread {
    // Typed list, so that jobs.remove(0) returns a MachineInstructions
    List<MachineInstructions> jobs = new ArrayList<MachineInstructions>();

    public void addJob(MachineInstructions job) {
        synchronized (jobs) {
            jobs.add(job);
            jobs.notify();
        }
    }

    public void run() {
        while (true) {
            synchronized (jobs) {
                // wait until at least one job is available
                while (jobs.isEmpty()) {
                    try {
                        jobs.wait();
                    } catch (InterruptedException ie) { }
                }
                // If we get here, we know that jobs is not empty
                MachineInstructions instructions = jobs.remove(0);
                // Send machine steps to hardware
            }
        }
    }
}

A machine keeps a list of the jobs it's scheduled to do. Whenever an operator has a new job, it calls the addJob() method, which adds the job to the list. Meanwhile the run() method just keeps looping, looking for any jobs on the list. If there are no jobs, it will start waiting. If it's notified, it will stop waiting and then recheck the loop condition: is the list still empty? In practice this double-check is probably not necessary, as the only time a notify() is ever sent is when a new job has been added to the list. However, it's a good idea to require the thread to recheck the isEmpty() condition whenever it's been woken up, because it's possible that a thread has accidentally sent an extra notify() that was not intended. There's also a possible situation called a spurious wakeup (sometimes described as a spontaneous wakeup): a thread may wake up even though no code has called notify() or notifyAll(). (At least, no code you know about has called these methods. Code in some other class may call them for reasons you just don't know, and the API documentation for Object.wait() explicitly allows a thread to wake up without being notified.) What this means is, when your thread wakes up from a wait(), you don't know for sure why it was awakened. By putting the wait() method in a while loop and re-checking the condition that represents what we were waiting for, we ensure that whatever the reason we woke up, we will re-enter the wait() if (and only if) the thing we were waiting for has not happened yet. In the Machine class, the thing we were waiting for is for the jobs list to not be empty. If it's empty, we wait, and if it's not, we don't.
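To see why the recheck matters, consider a stray notify() that arrives while the list is still empty. The sketch below is a cut-down, hypothetical stand-in for the Machine logic: jobs are plain strings and the consumer takes a single job. A notification is sent with no job, and only afterwards is a real job added.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch: a while loop around wait() survives a stray notify().
// StrayNotifyDemo and the string "jobs" are hypothetical stand-ins.
class StrayNotifyDemo {

    static String run() {
        final List<String> jobs = new ArrayList<String>();
        final String[] taken = {null};

        Thread machine = new Thread(() -> {
            synchronized (jobs) {
                try {
                    // With "if" instead of "while", a stray notify() could let
                    // the thread fall through to remove(0) on an empty list.
                    while (jobs.isEmpty()) {
                        jobs.wait();
                    }
                    taken[0] = jobs.remove(0);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        machine.start();

        // Spin until the machine thread is parked inside wait().
        while (machine.getState() != Thread.State.WAITING) {
            Thread.yield();
        }

        synchronized (jobs) {
            jobs.notify(); // stray notification: no job has been added
        }
        synchronized (jobs) {
            jobs.add("cut-square");
            jobs.notify(); // the real notification
        }

        try {
            machine.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return taken[0];
    }

    public static void main(String[] args) {
        System.out.println("Machine took: " + run());
    }
}
```

Whichever way the two notifications interleave with the waiting thread, the loop condition ensures that remove(0) runs only when the list really contains a job.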

Note also that both the run() method and the addJob() method synchronize on the same object: the jobs list. This is for two reasons. One is because we're calling wait() and notify() on this instance, so we need to synchronize in order to avoid an IllegalMonitorStateException. The other reason is that the data in the jobs list is changeable data stored in a field that is accessed by two different threads. We need to synchronize in order to access that changeable data safely. Fortunately, the same synchronized blocks that allow us to wait() and notify() also provide the required thread safety for our other access to changeable data. In fact this is a main reason why synchronization is required to use wait() and notify() in the first place: you almost always need to share some mutable data between threads at the same time, and that means you need synchronization. Notice that the synchronized block in addJob() is big enough to also include the call to jobs.add(job), which modifies shared data. And the synchronized block in run() is large enough to include the whole while loop, which includes the call to jobs.isEmpty(), which accesses shared data.

The moral here is that when you use wait() and notify() or notifyAll(), you should almost always also have a while loop around the wait() that checks a condition and forces continued waiting until the condition is met. And you should also make use of the required synchronization for the wait() and notify() calls, to also protect whatever other data you're sharing between threads. If you see code which fails to do this, there's usually something wrong with the code—even if you have a hard time seeing what exactly the problem is.
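The same advice can be applied to the Reader/Calculator example. The following is one possible sketch, not the original program: the done flag, the awaitTotal() method, and the GuardedCalculatorDemo harness are all assumptions added for illustration. Readers wait only while the calculation is unfinished, so it no longer matters whether the calculation runs before or after they start.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Minimal sketch: Reader/Calculator with a guarded wait.
// The "done" flag and all method names are hypothetical additions.
class GuardedCalculator {
    private int total;
    private boolean done; // the condition the readers are waiting for

    synchronized void calculate() {
        for (int i = 0; i < 100; i++) {
            total += i;
        }
        done = true;
        notifyAll(); // wake every reader that is already waiting
    }

    synchronized int awaitTotal() throws InterruptedException {
        while (!done) { // skip the wait entirely if the result is ready
            wait();
        }
        return total;
    }
}

class GuardedCalculatorDemo {

    // Starts readerCount readers; the calculation runs either before
    // the readers start or after, depending on calculateFirst.
    static List<Integer> readTotals(int readerCount, boolean calculateFirst) {
        GuardedCalculator calculator = new GuardedCalculator();
        List<Integer> totals = Collections.synchronizedList(new ArrayList<Integer>());
        if (calculateFirst) {
            calculator.calculate();
        }
        List<Thread> readers = new ArrayList<Thread>();
        for (int i = 0; i < readerCount; i++) {
            Thread reader = new Thread(() -> {
                try {
                    totals.add(calculator.awaitTotal());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            readers.add(reader);
            reader.start();
        }
        if (!calculateFirst) {
            calculator.calculate();
        }
        for (Thread reader : readers) {
            try {
                reader.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        System.out.println(readTotals(3, true));  // calculation finished first
        System.out.println(readTotals(3, false)); // readers started first
    }
}
```

In both orderings every reader should end up with 4950, the sum of 0 through 99, and none of them is left waiting forever.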

