Java and Shared Memory Multiprocessor Safety

Thomas Wang, July 2000
last update August 2000

Abstract

Some shared memory multiprocessors, such as those built on the Alpha and Itanium processors, have relaxed memory access ordering. On a Java virtual machine running on such processors, multi-threaded programs can fail unpredictably.

In this article, we describe some failure scenarios and the potential remedies.

Introduction

In a shared memory multiprocessor system, physical memory is shared among multiple processors. The processor specification must define the behavior of memory when accesses from different processors are interleaved.

In a traditional multiprocessor design, memory accesses are typically ordered. If processor X writes 0 to one memory location, then writes 1 to a different memory location, the second write can complete no earlier than the first write.

On some of the newer processors with relaxed memory access ordering, memory accesses are not ordered by default. If processor X writes 0 to one memory location, then writes 1 to a different memory location, either of the writes may complete first.

More precisely, accesses to the same memory location on the same processor are ordered. Accesses to different memory locations on the same processor are not ordered. Accesses to the same memory location on different processors are not ordered. Accesses to different memory locations on different processors are not ordered.

Memory Access Ordering Classification

                          Same memory location   Different memory locations
On same processor         ordered                not ordered
On different processors   not ordered            not ordered
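
The "different memory locations" case can be probed with a small experiment. The following is only a sketch: the class name OrderingLitmus and its fields are hypothetical, and it assumes a Java 8 or later compiler for the lambda syntax. One thread writes x and then y, while another reads y and then x. An outcome of "1,0" means the second write was observed without the first. Whether that outcome ever appears depends on the hardware, the JVM, and timing, so a run that never shows it proves nothing.

```java
import java.util.Map;
import java.util.TreeMap;

public class OrderingLitmus {
    // Plain (non-volatile) fields: nothing orders accesses to x relative to y.
    static int x;
    static int y;

    // Runs one trial and returns what the reader observed, formatted as "y,x".
    static String runOnce() {
        x = 0;
        y = 0;
        final int[] seen = new int[2];
        Thread writer = new Thread(() -> {
            x = 1; // first write
            y = 1; // second write, to a different memory location
        });
        Thread reader = new Thread(() -> {
            seen[0] = y; // read the second location first
            seen[1] = x; // then read the first location
        });
        writer.start();
        reader.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return seen[0] + "," + seen[1];
    }

    public static void main(String[] args) {
        // Tally the outcomes over many trials; "1,0" is the out-of-order one.
        Map<String, Integer> counts = new TreeMap<>();
        for (int i = 0; i < 2000; i++) {
            counts.merge(runOnce(), 1, Integer::sum);
        }
        System.out.println(counts);
    }
}
```

Starting fresh threads for every trial keeps the sketch short, but it also makes the two threads overlap only briefly, which lowers the chance of seeing the out-of-order outcome.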

Paul Jakubik has a good paper, "Multiprocessor Safety and Java", that discusses the multiprocessor memory system in great detail.

Problem Code

Why can a program fail unpredictably on the new multiprocessor systems? Let's look at this example.

public class Boolean
{
  boolean v;
  public Boolean(boolean arg) { this.v = arg; }
  public boolean booleanValue() { return this.v; }
}
  
public class foo extends Thread
{
  Boolean b;
  // declaring this method as
  // public synchronized boolean read()
  // would fix the race condition
  public boolean read()
  {
    if (b == null)
    {
      b = new Boolean(true); // lazy init
    }
    return b.booleanValue();
  }
  public void run()
  {
    read();
  }
  public static void main(String[] args)
  {
    foo obj = new foo();
    obj.start(); // spawn new thread in run()
    System.out.println(obj.read()); // 'false' can be printed
  }
}

The Java language guarantees that when an object is created, all of its fields are initialized to 0. For a boolean field, this corresponds to the value 'false'. In the Boolean constructor, the input argument value overwrites the initial value of 'false'.

For an object to be accessible to multiple threads, the object reference must be written to a memory location visible to different threads. The above example does exactly that.

Let's look at one possible execution sequence. For this run, assume the two threads are running on different processors.

  1. Spawned thread allocates a Boolean object with 'false' default value.
  2. Spawned thread writes the Boolean object's reference to variable 'b'. Remember that writes to different memory locations need not be ordered. Only accesses to the same memory location on the same processor are ordered.
  3. Main thread reads the value of 'b', which is the already created Boolean object.
  4. Main thread reads and prints the Boolean object's value as 'false'.
  5. Spawned thread writes 'true' to the Boolean object.

The red flag for bad code is one thread reading a memory location while a different thread writes to that same memory location, all without synchronization protection.
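
The comment in the listing above notes that declaring read() as synchronized would fix the race. A minimal sketch of that fix follows. The names SafeLazyFlag and Flag are hypothetical; Flag stands in for the article's Boolean class and is renamed only to avoid shadowing java.lang.Boolean.

```java
public class SafeLazyFlag {
    // Stand-in for the article's Boolean wrapper; the name is hypothetical.
    static class Flag {
        boolean v;
        Flag(boolean arg) { this.v = arg; }
        boolean booleanValue() { return this.v; }
    }

    private Flag b;

    // The lazy write of 'b' and every read of it happen while holding this
    // object's lock, so a thread that finds 'b' non-null also sees the
    // constructor's write to 'v'.
    public synchronized boolean read() {
        if (b == null) {
            b = new Flag(true); // lazy init, now under the lock
        }
        return b.booleanValue();
    }

    public static void main(String[] args) {
        final SafeLazyFlag obj = new SafeLazyFlag();
        Thread t = new Thread(() -> obj.read()); // spawned thread also calls read()
        t.start();
        System.out.println(obj.read()); // prints 'true'
    }
}
```

The cost is a lock acquisition on every call to read(), including calls made long after the field has been initialized.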

More Problem Code

This example shows that using a shared flag to communicate status between threads is unsafe when no locking is involved.

public class person extends Thread
{
  boolean bankrupt;
  boolean overdrawn_flag;
  public void run()
  {
    boolean temp = bankrupt; // no signal yet, bankrupt should be false?
    overdrawn_flag = true; // signal overdrawn
    System.out.println(temp); // 'true' can be printed!
  }
  public static void main(String[] args)
  {
    person p = new person();
    p.start();
    while (p.overdrawn_flag == false) // wait for overdrawn
    {
      try { Thread.sleep(7); } catch (InterruptedException e) {}
    }
    p.bankrupt = true; // overdrawn, set bankrupt to true
  }
}

Here is one possible execution sequence. For this run, assume the two threads are running on different processors.

  1. Spawned thread sets 'overdrawn' flag to true. Remember that read and write operations to different memory locations need not be ordered, so this write can complete before the earlier read of 'bankrupt'.
  2. Main thread detects 'overdrawn' flag as true.
  3. Main thread sets 'bankrupt' flag to true.
  4. Spawned thread reads 'bankrupt' flag as true.
  5. Spawned thread writes 'true' to terminal.

The correct way to implement the above example is to use locking, in combination with wait() and notifyAll(). The next listing shows the correct code.

public class person extends Thread
{
  boolean bankrupt;
  boolean overdrawn_flag;
  public synchronized void run()
  {
    boolean temp = bankrupt; // no signal yet, bankrupt should be false
    overdrawn_flag = true; // signal overdrawn
    notifyAll(); // wake up waiters
    System.out.println(temp);
  }
  public static void main(String[] args)
  {
    person p = new person();
    p.start();
    synchronized (p)
    {
      while (p.overdrawn_flag == false) // wait for overdrawn
      {
        try { p.wait(); } catch (InterruptedException e) {}
      }
      p.bankrupt = true; // overdrawn, set bankrupt to true
    }
  }
}
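
A lighter-weight variant uses the 'volatile' keyword, which appears in the list of possible remedies below. The sketch here rests on an assumption that goes beyond the memory model this article was written against: it relies on the revised Java memory model (Java 5 and later), under which accesses that precede a volatile write in program order cannot be moved after it. The class name VolatilePerson, the 'observed' field, and the runScenario() helper are hypothetical.

```java
public class VolatilePerson extends Thread {
    boolean bankrupt;
    // Assumption: revised memory model (Java 5+). Accesses that precede
    // a volatile write in program order cannot be moved after it.
    volatile boolean overdrawn_flag;
    boolean observed; // what run() read from 'bankrupt'

    public void run() {
        boolean temp = bankrupt;  // read completes before the signal
        overdrawn_flag = true;    // volatile write: signal overdrawn
        observed = temp;
        System.out.println(temp); // prints 'false'
    }

    // Runs the scenario once and returns the value that run() observed.
    static boolean runScenario() {
        VolatilePerson p = new VolatilePerson();
        p.start();
        while (!p.overdrawn_flag) { // volatile read: poll for overdrawn
            try { Thread.sleep(7); } catch (InterruptedException e) {}
        }
        p.bankrupt = true; // overdrawn, set bankrupt to true
        try {
            p.join(); // after join, 'observed' is safely visible here
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return p.observed;
    }

    public static void main(String[] args) {
        runScenario();
    }
}
```

Unlike the wait() and notifyAll() listing, this version still polls, so the main thread can react up to one sleep interval late.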

Consequences Of Reading A Bad Value

In the first example program, the program only ever assigns 'true' boolean values, and yet it can print 'false', which is counter-intuitive.

Worse yet, the safety of immutable objects such as Boolean and String has security implications. Bad boolean values can accidentally turn off security; bad string values can leak information about other strings.

In the second example program, a flag was used to communicate between threads. Without any synchronization, reads issued before the signal can return values that are written only after the signal.

It is difficult to test for these kinds of faults, because the faulty behavior occurs only on multiprocessor systems under certain memory interleavings.

There are many places in the system library, and in customers' code bases, where the code is not multiprocessor safe. These bugs have not been exposed because the code has historically run on traditional systems.

Possible Remedies

The discussion about the Java memory model is still ongoing. So far there is no certainty about which remedies will be adopted. There is a lot of material at William Pugh's web site.

The following is a list of possible remedies.

  1. Limit all Java threads to run on a single processor only. Performance is negatively impacted.
  2. Always use ordered memory read and write operations. Performance is negatively impacted.
  3. Emit a memory fence opcode after every call to a constructor. This fixes race conditions with constructors, but object field updates outside of constructors are not covered.
  4. Expose a Thread.memoryFence() method call externally. This may violate the spirit of platform neutrality.
  5. Use the 'final' keyword on the Boolean value field declaration. Under a proposed memory model, this would ensure final field updates are ordered. Need to hunt down and fix the source.
  6. Put 'synchronized' or 'volatile' keywords in appropriate places to remove race conditions. Need to hunt down and fix the source.
  7. Create a Java wrapper switch to run either in conservative memory mode, or aggressive memory mode.
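
Remedy 5 can be sketched as follows. This assumes final-field semantics of the kind the proposed memory model describes, which were later adopted in the revised Java memory model (Java 5 and later): a thread that obtains a reference to a fully constructed object sees the constructor's values for its final fields, even when the reference was published through a data race. The names FinalFlagDemo and ImmutableFlag are hypothetical; ImmutableFlag stands in for the article's Boolean class.

```java
public class FinalFlagDemo {
    // Stand-in for the article's Boolean class; the name is hypothetical.
    static final class ImmutableFlag {
        private final boolean v; // final: set once, inside the constructor
        ImmutableFlag(boolean arg) { this.v = arg; }
        boolean booleanValue() { return this.v; }
    }

    ImmutableFlag b; // shared reference, still published without a lock

    public boolean read() {
        ImmutableFlag local = b; // read the shared reference only once
        if (local == null) {
            local = new ImmutableFlag(true); // lazy init
            b = local; // racy publication; two threads may each create one
        }
        return local.booleanValue();
    }

    public static void main(String[] args) throws InterruptedException {
        final FinalFlagDemo obj = new FinalFlagDemo();
        Thread t = new Thread(() -> obj.read()); // spawned thread also calls read()
        t.start();
        System.out.println(obj.read()); // prints 'true'
        t.join();
    }
}
```

The race on 'b' itself remains, so two threads may each allocate their own ImmutableFlag. That is harmless here because the instances are interchangeable, but it would not be acceptable for an object with identity or mutable state.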

References

James Gosling, Bill Joy, Guy Steele, "The Java Language Specification", Addison-Wesley, 1996

David R. Butenhof, "Programming with POSIX Threads", Addison-Wesley, 1997

Paul Jakubik, "Multiprocessor Safety and Java", 1999

William Pugh, "Fixing the Java Memory Model", Proceedings of the Java Grande Conference, June 12-14, 1999