Contention problems can severely impact Java application performance. We will examine how to manage these problems in a lookup server written in Java.
When analyzing the performance of a Java program on a multi-CPU machine, we may encounter a situation where a contention problem throttles the performance. When this happens, we need to investigate the cause of the problem.
Often, the problem occurs when multiple threads try to simultaneously access a synchronized method or block. If each thread remains in the synchronized method or block for too long, the other threads will need to queue and wait. Code meant to execute in parallel instead executes serially, reducing throughput for the application.
In this article, we will use a look-up server to illustrate the concept.
public class LookupServer
{
static HashMap hashmap = new HashMap();
public synchronized static Result lookup(Key thekey)
{
Result myresult = (Result) hashmap.get(thekey);
if (myresult != null)
{
return myresult;
}
else if (thekey.isValidForLookup())
{
myresult = new Result(thekey);
hashmap.put(thekey, myresult);
return myresult;
}
else
throw new RuntimeException();
}
}
The look-up server performs a look-up task using a hash table. The input is a 'Key' Object, and the output is a 'Result' object. The entire method is synchronized. Using a Java profiler, our profiling trace shows that the lock delay time for LookupServer.lookup() is very high.
Typically, there are three general ways to deal with a contention problem:
Because it appears the code inside the synchronization region is already as fast as it can be, making the synchronization region faster is not applicable in this example.
In this approach, we create multiple internal servers each of which manages a subset of the resource. A method is required to separate the data evenly between the internal servers. We will use a hash function.
public class LookupServer
{
static final LookupServer[] serverArray = new LookupServer[16];
static
{
for (int indx = serverArray.length; --indx >= 0; )
{
serverArray[indx] = new LookupServer();
}
}
HashMap hashmap = new HashMap();
synchronized Result lookupInternal(Key thekey)
{
Result myresult = (Result) hashmap.get(thekey);
if (myresult != null)
{
return myresult;
}
else if (thekey.isValidForLookup())
{
myresult = new Result(thekey);
hashTable.put(thekey, myresult);
return myresult;
}
else
throw new RuntimeException();
}
public static Result lookup(Key thekey)
{
int h = thekey.hashCode();
int bucket = h & 15; // pick a server
return serverArray[bucket].lookupInternal(thekey);
}
}
In this approach, each individual server handles 1/16 of the total work-load. Of course, contention is still possible, if two separate look-up requests hash to the same server. This is why having multiple servers may be part of a solution, but not the entire solution.
Usually hash table accesses are fast. The reason that the look-up method shows high contention may not be due to the hash table access. It might be the case that the validation method or the result construction method consumes too much time. This would require breaking up the synchronized block into multiple subsections.
If validation is found to be consuming large amount of time, we can re-code the look-up method to exclude the validation method from the synchronized block.
public class LookupServer
{
static final LookupServer[] serverArray = new LookupServer[16];
static
{
for (int indx = serverArray.length; --indx >= 0; )
{
serverArray[indx] = new LookupServer();
}
}
HashMap hashmap = new HashMap();
Result lookupInternal(Key thekey)
{
Result myresult = null;
synchronized (this) // first sync region
{
myresult = (Result) hashmap.get(thekey);
}
if (myresult != null)
{
return myresult;
}
if (thekey.isValidForLookup())
{
synchronized (this) // second sync region
{
myresult = (Result) hashmap.get(thekey); // avoid re-init
if (myresult == null)
{
myresult = new Result(thekey);
hashmap.put(thekey, myresult);
}
}
return myresult;
}
else
throw new RuntimeException();
}
public static Result lookup(Key thekey)
{
int h = thekey.hashCode();
int bucket = h & 15; // pick a server
return serverArray[bucket].lookupInternal(thekey);
}
}
A tricky point occurs when we release the monitor. Another client may enter the second block and initialize the result and put it into the hash table. That means when we want to create the Result object, we have to check a second time to see if the Result object has already been added to the hash table.
We have excluded validation from being synchronized, however we have also increased the amount of synchronization required. Previously we need to acquire one monitor for a look-up request. Now we need to acquire the same monitor twice for a look-up request. This is not desirable. Examining how we deal with object creation is key to solving this problem.
What if the result creation consumes too much time? We cannot naively move it outside the synchronization region, because we may create a situation where multiple Result objects get created from concurrent clients. Depending on the specific implementation details, this will be both incorrect and hard to debug.
We can however, use a strategy called 'delayed object initialization'. The result object constructor does nothing, and the full result object construction is delayed to a later point.
public class Result
{
...
boolean isInited = false;
public Result() {} // does nothing significant
public synchronized init(Key thekey) // second phase init
{
if (! isInited)
{
// init tasks
if (! thekey.isValidForLookup())
throw new RuntimeException();
...
isInited = true;
}
}
}
public class LookupServer
{
static final LookupServer[] serverArray = new LookupServer[16];
static
{
for (int indx = serverArray.length; --indx >= 0; )
{
serverArray[indx] = new LookupServer();
}
}
HashMap hashmap = new HashMap();
Result lookupInternal(Key thekey)
{
Result myresult = null;
synchronized (this)
{
myresult = (Result) hashTable.get(thekey);
if (myresult == null)
{
myresult = new Result(); // first phase init (does nothing)
hashmap.put(thekey, myresult);
}
}
myresult.init(thekey); // second phase init
return myresult;
}
public static Result lookup(Key thekey)
{
int h = thekey.hashCode();
int bucket = h & 15; // pick a server
return serverArray[bucket].lookupInternal(thekey);
}
}
In the second phase initialization, we make sure the object is initialized once only. We also guarantee all use of the Result object must be preceded by a second phase initialization.
This approach allows us to keep the original set-up where the monitor protecting the hashmap only need to be acquired once per look-up.
What if we did all this, and thread contention remains? Try to increase the number of servers until the bottleneck disappears.
We have presented some strategies for removing thread contention. They can be used in application tuning efforts to boost multi-threaded program performance.
Thanks go to Joseph Coha, and Michael Yawn for their contributions.
"The Java Language Specification", James Gosling, Bill Joy, Guy Steele, Addison-Wesley (1996)