Wednesday, October 19, 2011

Threading stories: Why volatile matters

Many years ago, when I learned Java (in 2000), I was not very concerned about multithreading. In particular, I wasn't concerned about the volatile modifier. I never had problems without volatile, so I probably assumed it was not that relevant. I changed my mind abruptly when I first analyzed a weird behaviour that only occurred when the application was deployed to a server JVM. Today's JVMs do a lot of magic to optimize the runtime performance of server applications. In this post I show you an example to get familiar with the problems that arise in multithreaded applications when you don't understand how Java treats shared data.

This code snippet demonstrates why understanding volatile is important. Here is the code that you can use to play around. Notice that the expired field is declared volatile:
import java.util.Timer;
import java.util.TimerTask;

public class VolatileExample {

 private volatile boolean expired;
 private long counter = 0;
 private final Object mutex = new Object();

 // No @Override here: the class does not implement an interface declaring execute()
 public Object[] execute(Object... arguments) {
  synchronized (mutex) {
   expired = false;
   final Timer timer = new Timer();
   timer.schedule(new TimerTask() {
    public void run() {
     expired = true;
     System.out.println("Timer interrupted main thread ...");
     timer.cancel();
    }
   }, 1000);
   while (!expired) {
    counter++; // do some work
   }
   System.out.println("Main thread was interrupted by timer ...");
  }
  return new Object[] { counter, expired };
 }

 private class Worker implements Runnable {
  @Override
  public void run() {
   while (!Thread.currentThread().isInterrupted()) {
    execute();
   }
  }
 }

 public static void main(String[] args) throws InterruptedException {
  VolatileExample volatileExample = new VolatileExample();
  Thread thread1 = new Thread(volatileExample.new Worker(), "Worker-1");
  Thread thread2 = new Thread(volatileExample.new Worker(), "Worker-2");
  thread1.start();
  thread2.start();
  Thread.sleep(60000); // sleep() is static, so call it on the class
  thread1.interrupt();
  thread2.interrupt();
 }
}
Run this on the HotSpot VM with the -server option set. You'll get the following expected output:
Timer interrupted main thread ...
Main thread was interrupted by timer ...
Timer interrupted main thread ...
Main thread was interrupted by timer ...
Timer interrupted main thread ...
Main thread was interrupted by timer ...
Timer interrupted main thread ...
Main thread was interrupted by timer ...
Timer interrupted main thread ...
Main thread was interrupted by timer ...
Now remove the volatile modifier from the expired field and restart, again with the -server option set. You should get the following output:
Timer interrupted main thread ...
What happened? The timer thread sets the expired flag to true, but the worker thread running the loop never sees the change. This is exactly what volatile is all about: it guarantees that a write to the variable by one thread is visible to every later read of that variable by other threads. Without volatile, the Java Memory Model gives no such guarantee. The server compiler is then free to assume that nobody else changes expired while the loop runs, so it can read the flag once, keep the value in a register, and never look at main memory again. Notice that I cancel the timer when I set the expired variable to true, so the timer thread dies right after the run() method returns. The main memory heap may well be updated by then, but the worker thread continues to work on its 'cached' value.
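To isolate the effect from the rest of the example, here is a minimal sketch of the same stop-flag pattern with the volatile modifier in place. The class name VolatileFlagDemo and the method spinUntilExpired are made up for this illustration and are not part of the code above; the timer is a daemon here only so the JVM can exit cleanly.

```java
import java.util.Timer;
import java.util.TimerTask;

// Hypothetical demo class, a stripped-down variant of VolatileExample.
class VolatileFlagDemo {

    // volatile: a write by the timer thread is visible to later reads in the loop
    private volatile boolean expired;

    /** Spins until a timer thread sets the flag and returns the iteration count. */
    long spinUntilExpired(long delayMillis) {
        expired = false;
        long counter = 0;
        final Timer timer = new Timer(true); // daemon timer thread (assumption for the demo)
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                expired = true; // volatile write
                timer.cancel();
            }
        }, delayMillis);
        while (!expired) { // volatile read, performed again on every iteration
            counter++;
        }
        return counter;
    }

    public static void main(String[] args) {
        long iterations = new VolatileFlagDemo().spinUntilExpired(200);
        System.out.println("Loop ended after " + iterations + " iterations");
    }
}
```

If you delete the volatile keyword from this sketch, the loop is no longer guaranteed to end; whether it actually hangs depends on the JVM and its options.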

Next, restart the code again without the volatile modifier, but this time set the -client JVM option (which is the default mode on Windows). The result is the following:
Timer interrupted main thread ...
Main thread was interrupted by timer ...
Timer interrupted main thread ...
Main thread was interrupted by timer ...
Timer interrupted main thread ...
Main thread was interrupted by timer ...
Timer interrupted main thread ...
Main thread was interrupted by timer ...
Timer interrupted main thread ...
Main thread was interrupted by timer ...

In client mode the JVM obviously behaves differently and does not optimize as aggressively as in server mode. So even if you leave out the volatile modifier, you will not necessarily see an error during development. The JVM options influence how aggressively the JVM optimizes your code. Without volatile it is not guaranteed that changes made by the timer thread are visible to the worker thread. In this case everything still works in client mode, which shows that the behaviour of an incorrectly synchronized program depends on the JVM options set.
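Since the outcome depends on which VM variant runs the code, it helps to confirm what you are actually running. The following is a small sketch, assuming a HotSpot-style JVM: java.vm.name is a standard system property (on HotSpot it typically mentions "Client" or "Server"), while java.vm.info is not guaranteed to exist and may be null on other VMs. The class name VmModeCheck is invented for this example.

```java
// Hypothetical helper, not part of the original example.
class VmModeCheck {

    /** Returns a one-line description of the running VM. */
    static String describeVm() {
        String name = System.getProperty("java.vm.name"); // standard property
        String info = System.getProperty("java.vm.info"); // VM-specific, may be null
        return name + " (" + info + ")";
    }

    public static void main(String[] args) {
        System.out.println(describeVm());
    }
}
```

Run it with the same options you use for the application, for example with and without -server, and compare the output.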

7 comments:

  1. Hey Niklas,

    volatile also has other special capabilities. It enforces memory barriers, which guarantees that all changes made to memory will be reflected to the "main" heap when a volatile is written. This comes in very handy if you want to share whole object trees, not just a single field.
    There is a good video on sharing data in a multithreaded environment on Vimeo: http://vimeo.com/28763316

    Greetings, Johannes

  2. Hi Joshi, thx for the valuable comment. I will add an example to illustrate how volatile behaves on complex objects. I think my summary was misleading in that respect. Cheers, Niklas

  3. Joshi, what did you mean by saying it comes in "handy" when you use volatile on object trees?

  4. Say I'd like to share a POJO "Foobar" with the fields "foo" and "bar" across multiple threads and want to avoid synchronization locks, so I use volatile for the shared foobar field and set the value like this:

    foobar = new Foobar();
    foobar.setFoo("a");
    foobar.setBar("b");

    Now in this example the other thread will see the new Foobar object (the reference will be visible due to the volatile keyword). But both fields may still have their initial values, since only my reference is volatile and the fields of the object are not.
    How to deal with this: you could make both fields of Foobar volatile, too. But if you don't have access to the source (or with arrays), you can't.
    And this is where volatile comes in handy:

    Foobar tmp = new Foobar();
    tmp.setFoo("a");
    tmp.setBar("b");
    foobar = tmp;

    Now all changes made before the write to the (volatile) foobar variable will be flushed to the "main" heap and when foobar is read by the other thread, all changes will be visible to it.
    So you don't need to modify the Foobar-class. But now the ordering of assignments is important for correct behaviour and must not be "optimized" by a developer!

    I hope this helps to understand what I mean. I really recommend watching the video from JavaZone mentioned in my first comment.

  5. Ok, in my opinion this is what makes volatile anything other than "handy", 'cause you need to set all fields volatile that you want to share. This is what I was trying to explain in my initial conclusions list. I'm still not sure why you say it has "special capabilities" or it is "handy" for situations with complex objects. This sounded like I missed something and it made me curious. The behaviour for object references and primitives is exactly equivalent. However, I'll add an example that points out that it's not sufficient to only declare an object volatile if you'd like to share that object across threads.

  6. "...  'cause you need to set all fields volatile that you want to share."
    This is not true; in my second example the Foobar class got no volatile fields and was not touched. Maybe my example was too short. (And with arrays you can't even set all fields volatile, only the reference to the array.)

    "The behaviour for object references and primitives is exactly equivalent."
    That's right, I never meant anything else. What I wanted to show is that an assignment to a volatile variable also affects the visibility of non-volatile variables, hence you don't need 'all-fields-volatile-objects' if you want to share them. But you have to keep in mind to do the assignments in the correct order.

    Maybe "handy" was not the correct term. Maybe "needed" (in terms of arrays) fits better here.

  7. It's a brave statement to say "it's not true", 'cause also in your example it's the reassignment to a volatile field. Again, only volatile fields are guaranteed to be shared; if you reassign, then of course the complete object is flushed to the memory heap. Anything else wouldn't make sense. I think we're in agreement, apart from the fact that you present a 'workaround' to flush non-volatile variables and at the same time you state volatile has "special capabilities". In my eyes the only reason why you need a reassignment (in a specific order, and which doesn't help in all cases!!) is the limited capability of volatile and the fact that it only spans the object reference. Imagine a situation where volatile on the foobar variable would span all the Foobar fields; that would be handy in my opinion. Don't worry, your example wasn't too short, it just doesn't change the fact that if you want to share a variable you need to make it volatile.

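The publication idiom debated in comments 4 to 7 can be sketched as a complete program. This is only an illustration under assumptions: the Foobar class below is a guess at the POJO from the comments, and FoobarPublisher, publish, awaitPublished and runDemo are names invented here. The point it tries to show is the ordering: all plain field writes happen first, and the write to the volatile reference comes last.

```java
// Plain POJO without any volatile fields (assumed shape of the commenter's class).
class Foobar {
    private String foo;
    private String bar;
    void setFoo(String foo) { this.foo = foo; }
    void setBar(String bar) { this.bar = bar; }
    String getFoo() { return foo; }
    String getBar() { return bar; }
}

// Hypothetical holder that shares a Foobar through a single volatile reference.
class FoobarPublisher {

    private volatile Foobar foobar;

    void publish(String foo, String bar) {
        Foobar tmp = new Foobar();
        tmp.setFoo(foo); // plain writes first ...
        tmp.setBar(bar);
        foobar = tmp;    // ... volatile write last: this publishes the object
    }

    /** Busy-waits until another thread has published an object. */
    Foobar awaitPublished() {
        Foobar seen;
        while ((seen = foobar) == null) { // volatile read
            Thread.yield();
        }
        return seen;
    }

    /** Starts a reader thread, publishes ("a", "b") and returns what the reader saw. */
    static String runDemo() {
        final FoobarPublisher publisher = new FoobarPublisher();
        final String[] result = new String[1];
        Thread reader = new Thread(new Runnable() {
            @Override
            public void run() {
                Foobar seen = publisher.awaitPublished();
                result[0] = seen.getFoo() + "/" + seen.getBar();
            }
        }, "Reader");
        reader.start();
        publisher.publish("a", "b");
        try {
            reader.join(); // join() makes the reader's write to result[0] visible here
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

Because the reader only dereferences the object after a volatile read that observed the reference, the earlier plain writes to foo and bar are visible to it. If publish() assigned the volatile field first and called the setters afterwards, that guarantee would be gone.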