Some shared-memory multiprocessors, such as the Alpha and Itanium processors, have relaxed memory access ordering. On a Java virtual machine running on such processors, multi-threaded programs that are not properly synchronized can fail unpredictably.
In this article, we describe some failure scenarios and the potential remedies.
In a shared-memory multiprocessor system, physical memory is shared among multiple processors. The processor specification must define how memory behaves when accesses from different processors are interleaved.
In traditional multiprocessor designs, memory accesses are typically ordered. If processor X writes 0 to one memory location and then writes 1 to a different memory location, the second write can complete no earlier than the first.
On some newer processors with relaxed memory access ordering, memory accesses are not ordered by default. If processor X writes 0 to one memory location and then writes 1 to a different memory location, either write may complete first.
More precisely, only accesses to the same memory location from the same processor are ordered. Accesses to different memory locations from the same processor are not ordered, and accesses from different processors are not ordered whether they target the same location or different locations.
| | Accessing same memory location | Accessing different memory locations |
|---|---|---|
| On same processor | ordered | not ordered |
| On different processors | not ordered | not ordered |
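The "different locations, not ordered" cells of the table can be probed with a small litmus test. The following is only a sketch: the class `ReorderLitmus`, its fields `data` and `flag`, and the method `countAnomalies` are hypothetical names introduced for illustration. A writer thread performs two writes to different locations, and a reader thread checks whether it can see the second write without the first.

```java
// Illustrative litmus test; the class, field, and method names are hypothetical.
class ReorderLitmus
{
    // Plain (non-volatile) fields: nothing orders accesses to them.
    static int data;
    static int flag;

    // Runs a writer/reader pair 'trials' times and counts how often the
    // reader saw the second write (flag) without the first (data).
    static int countAnomalies(int trials)
    {
        int anomalies = 0;
        for (int i = 0; i < trials; i++)
        {
            data = 0;
            flag = 0;
            final int[] seen = new int[2];
            Thread writer = new Thread(() -> {
                data = 1; // first write
                flag = 1; // second write, to a different location
            });
            Thread reader = new Thread(() -> {
                seen[0] = flag; // read the second location first
                seen[1] = data; // then read the first location
            });
            writer.start();
            reader.start();
            try
            {
                writer.join();
                reader.join();
            }
            catch (InterruptedException e)
            {
                throw new IllegalStateException(e);
            }
            if (seen[0] == 1 && seen[1] == 0)
            {
                anomalies++;
            }
        }
        return anomalies;
    }

    public static void main(String[] args)
    {
        int trials = 1000;
        System.out.println(countAnomalies(trials) + " anomalies in " + trials + " trials");
    }
}
```

On most machines the count will probably be zero, because thread start-up overhead hides the narrow timing window. A zero count does not show that the code is safe; a non-zero count is a permitted outcome wherever accesses to different locations are not ordered.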
Paul Jakubik's paper "Multiprocessor Safety and Java" discusses multiprocessor memory systems in considerable detail.
Why can programs fail unpredictably on these newer multiprocessor systems? Consider the following example.
```java
public class Boolean
{
    boolean v;
    public Boolean(boolean arg) { this.v = arg; }
    public boolean booleanValue() { return this.v; }
}

public class foo extends Thread
{
    Boolean b;

    // declaring this method as
    //     public synchronized boolean read()
    // would fix the race condition
    public boolean read()
    {
        if (b == null)
        {
            b = new Boolean(true); // lazy init
        }
        return b.booleanValue();
    }

    public void run()
    {
        read();
    }

    public static void main(String[] args)
    {
        foo obj = new foo();
        obj.start(); // spawn new thread in run()
        System.out.println(obj.read()); // 'false' can be printed
    }
}
```
The Java language guarantees that when an object is created, all of its fields are initialized to zero before any constructor code runs. For a boolean field, this corresponds to the value 'false'. In the Boolean constructor, the input argument value then overwrites the initial value of 'false'.
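The default-value guarantee can be observed directly. In this minimal sketch, the class `DefaultValuesDemo` and its field names are hypothetical; none of the fields has an initializer, and no constructor assigns to them.

```java
// Illustrative only: DefaultValuesDemo and its fields are hypothetical names.
class DefaultValuesDemo
{
    boolean flag; // no initializer, defaults to false
    int count;    // no initializer, defaults to 0
    Object ref;   // no initializer, defaults to null

    public static void main(String[] args)
    {
        DefaultValuesDemo d = new DefaultValuesDemo();
        System.out.println(d.flag);  // prints false
        System.out.println(d.count); // prints 0
        System.out.println(d.ref);   // prints null
    }
}
```

These default values are what another thread can observe if it reaches an object before the constructor's writes become visible to it.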
For an object to be accessible to multiple threads, the object reference must be written to a memory location visible to different threads. The above example does exactly that.
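Consider one possible execution sequence, assuming the two threads run on different processors. The new thread calls read(), finds b to be null, and creates a Boolean: the field v starts out as 'false', the constructor writes 'true' to v, and the reference to the new object is written to b. Because writes to different memory locations are not ordered, the write to b can become visible to the other processor before the write to v. The main thread then calls read(), sees a non-null b, skips the lazy initialization, and booleanValue() returns the stale 'false'.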
The red flag for unsafe code is one thread reading a memory location while a different thread writes to that same location, with no synchronization protecting either access.
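The comment in the first listing notes that declaring read() as synchronized would fix the race. A minimal sketch of that variant follows. The names `BoolBox` and `SafeLazyFlag` are hypothetical stand-ins for Boolean and foo, chosen to avoid clashing with java.lang.Boolean.

```java
// Illustrative only: BoolBox and SafeLazyFlag are hypothetical stand-ins
// for the Boolean and foo classes of the first listing.
class BoolBox
{
    private boolean v;
    BoolBox(boolean arg) { this.v = arg; }
    boolean booleanValue() { return this.v; }
}

class SafeLazyFlag extends Thread
{
    private BoolBox b;

    // Every access to b happens while holding this object's lock, so a
    // thread that sees a non-null b also sees the constructor's write to v.
    public synchronized boolean read()
    {
        if (b == null)
        {
            b = new BoolBox(true); // lazy init, now under the lock
        }
        return b.booleanValue();
    }

    public void run()
    {
        read();
    }

    public static void main(String[] args)
    {
        SafeLazyFlag obj = new SafeLazyFlag();
        obj.start(); // spawn new thread in run()
        System.out.println(obj.read()); // prints true
    }
}
```

Both the thread that initializes b and the thread that later reads it take the same lock, which removes the unsynchronized read/write pair described above.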
The next example shows that using a shared flag to communicate status between threads is unsafe when no locking is involved.
```java
public class person extends Thread
{
    boolean bankrupt;
    boolean overdrawn_flag;

    public void run()
    {
        boolean temp = bankrupt; // no signal yet, bankrupt should be false?
        overdrawn_flag = true; // signal overdrawn
        System.out.println(temp); // 'true' can be printed!
    }

    public static void main(String[] args)
    {
        person p = new person();
        p.start();
        while (p.overdrawn_flag == false) // wait for overdrawn
        {
            try { Thread.sleep(7); } catch (InterruptedException e) {}
        }
        p.bankrupt = true; // overdrawn, set bankrupt to true
    }
}
```
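Here is one possible execution sequence, again assuming the two threads run on different processors. In run(), the read of bankrupt and the write of overdrawn_flag touch different memory locations, so they are not ordered, and the write of overdrawn_flag can complete first. The main thread then sees overdrawn_flag set and writes 'true' to bankrupt. When the read of bankrupt in run() finally completes, it returns 'true', a value that was written only after the signal.
The correct way to implement the above example is to use locking in combination with wait() and notify() (or notifyAll()). The next listing shows the corrected code.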
```java
public class person extends Thread
{
    boolean bankrupt;
    boolean overdrawn_flag;

    public synchronized void run()
    {
        boolean temp = bankrupt; // no signal yet, bankrupt should be false
        overdrawn_flag = true; // signal overdrawn
        notifyAll(); // wake up waiters
        System.out.println(temp);
    }

    public static void main(String[] args)
    {
        person p = new person();
        p.start();
        synchronized (p)
        {
            while (p.overdrawn_flag == false) // wait for overdrawn
            {
                try { p.wait(); } catch (InterruptedException e) {}
            }
            p.bankrupt = true; // overdrawn, set bankrupt to true
        }
    }
}
```
In the first example program, only 'true' boolean values are ever used, and yet the program can print 'false'.
Worse yet, the safety of immutable objects such as Boolean and String has security implications. A bad boolean value can accidentally turn off a security check; a bad string value can leak information about other strings.
In the second example program, a flag is used to communicate between threads. Without any synchronization, a read issued before the signal can return a value that is written only after the signal.
These kinds of faults are difficult to test for, because the faulty behavior occurs only on multiprocessor systems, and only under certain memory interleavings.
There are many places in the system libraries and in customers' code bases where the code is not multiprocessor-safe. These bugs have not been exposed because the code has historically run on traditional systems.
The discussion about the Java memory model is still ongoing, and so far there is no certainty about which remedies will be adopted. A good deal of material is available at William Pugh's web site.
The following is a list of possible remedies.
James Gosling, Bill Joy, Guy Steele, "The Java Language Specification", Addison Wesley, 1996
David R. Butenhof, "Programming with POSIX Threads", Addison Wesley, 1997
Paul Jakubik, "Multiprocessor Safety and Java", 1999
William Pugh, "Fixing the Java Memory Model", Proceedings of the Java Grande Conference, June 12-14, 1999