Статьи

Параллелизм Java: тупики скрытых потоков


Большинство Java-программистов знакомы с
концепцией
взаимоблокировки потоков Java . По сути, он включает 2 потока, ожидающих друг друга вечно Это условие часто является результатом проблем с упорядочиванием блокировок (синхронизированных) или ReentrantLock (чтение или запись).

Found one Java-level deadlock:
=============================
"pool-1-thread-2":
  waiting to lock monitor 0x0237ada4 (object 0x272200e8, a java.lang.Object),
  which is held by "pool-1-thread-1"
"pool-1-thread-1":
  waiting to lock monitor 0x0237aa64 (object 0x272200f0, a java.lang.Object),
  which is held by "pool-1-thread-2"

Хорошей новостью является то, что HotSpot JVM всегда может обнаружить это условие для вас … или это так?

Недавняя
проблема взаимоблокировки потоков, затрагивающая производственную среду Oracle Service Bus, вынудила нас вернуться к этой классической проблеме и выявить существование «скрытых» ситуаций тупиковых ситуаций.

В этой статье будет продемонстрировано и воспроизведено с помощью простой Java-программы очень специальное условие блокировки блокировки, которое не обнаружено в последней версии HotSpot JVM 1.7.
В конце статьи вы также найдете
видео, объясняющее пример программы Java и используемый метод устранения неполадок.
Место преступления

Я обычно хотел бы сравнить основные проблемы параллелизма Java с местом преступления, где вы играете главную роль следователя.
В этом контексте «преступление» — это фактический сбой в работе ИТ-среды вашего клиента. Ваша работа заключается в:
  • Соберите все доказательства, подсказки и факты (дамп потока, журналы, влияние на бизнес, показатели нагрузки …)
  • Опрос свидетелей и экспертов в области (группа поддержки, служба доставки, поставщик, клиент …)
The next step of your investigation is to analyze the collected information and establish a potential list of one or many “suspects” along with clear proofs. Eventually, you want to narrow it down to a primary suspect or root cause. Obviously the law “innocent until proven guilty” does not apply here, exactly the opposite.
Lack of evidence can prevent you to achieve the above goal. What you will see next is that the lack of deadlock detection by the Hotspot JVM does not necessary prove that you are not dealing with this problem.
The suspect
In this troubleshooting context, the “suspect” is defined as the application or middleware code with the following problematic execution pattern.
  • Usage of FLAT lock followed by the usage of ReentrantLock WRITE lock (execution path #1)
  • Usage of ReentrantLock READ lock followed by the usage of FLAT lock (execution path #2)
  • Concurrent execution performed by 2 Java threads but via a reversed execution order
The above lock-ordering deadlock criteria’s can be visualized as per below:
Now let’s replicate this problem via our sample Java program and look at the JVM thread dump output.
Sample Java program
This above deadlock conditions was first identified from our Oracle OSB problem case. We then re-created it via a simple Java program. You can
download the entire source code of our program
here.
The program is simply creating and firing 2 worker threads. Each of them execute a different execution path and attempt to acquire locks on shared objects but in different orders. We also created a deadlock detector thread for monitoring and logging purposes.
For now, find below the Java class implementing the 2 different execution paths.

 package org.ph.javaee.training8;

import java.util.concurrent.locks.ReentrantReadWriteLock;

/**
 * A simple thread task representation
 * @author Pierre-Hugues Charbonneau
 *
 */
public class Task {
      
       // Object used for FLAT lock
       private final Object sharedObject = new Object();
       // ReentrantReadWriteLock used for WRITE & READ locks
       private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
      
       /**
        *  Execution pattern #1
        */
       public void executeTask1() {
            
             // 1. Attempt to acquire a ReentrantReadWriteLock READ lock
             lock.readLock().lock();
            
             // Wait 2 seconds to simulate some work...
             try { Thread.sleep(2000);}catch (Throwable any) {}
            
             try {              
                    // 2. Attempt to acquire a Flat lock...
                    synchronized (sharedObject) {}
             }
             // Remove the READ lock
             finally {
                    lock.readLock().unlock();
             }           
            
             System.out.println("executeTask1() :: Work Done!");
       }
      
       /**
        *  Execution pattern #2
        */
       public void executeTask2() {
            
             // 1. Attempt to acquire a Flat lock
             synchronized (sharedObject) {                 
                   
                    // Wait 2 seconds to simulate some work...
                    try { Thread.sleep(2000);}catch (Throwable any) {}
                   
                    // 2. Attempt to acquire a WRITE lock                   
                    lock.writeLock().lock();
                   
                    try {
                           // Do nothing
                    }
                   
                    // Remove the WRITE lock
                    finally {
                           lock.writeLock().unlock();
                    }
             }
            
             System.out.println("executeTask2() :: Work Done!");
       }
      
       public ReentrantReadWriteLock getReentrantReadWriteLock() {
             return lock;
       }
}
As soon ad the deadlock situation was triggered, a JVM thread dump was generated using JVisualVM.
As you can see from the Java thread dump sample. The JVM did not detect this deadlock condition (e.g. no presence of Found one Java-level deadlock) but it is clear these 2 threads are in deadlock state.
Root cause: ReetrantLock READ lock 
behavior
The main explanation we found at this point is associated with the usage of the ReetrantLock READ lock. The read locks are normally
not designed to have a notion of ownership. Since there is not a record of which thread holds a read lock, this appears to prevent the HotSpot JVM deadlock detector logic to detect deadlock involving read locks.
Some improvements were implemented since then but we can see that the JVM still cannot detect this special deadlock scenario.
Now if we replace the read lock (execution pattern #2) in our program by a write lock, the JVM will finally detect the deadlock condition but why?
Found one Java-level deadlock:
=============================
«pool-1-thread-2»:
  waiting for ownable synchronizer 0x272239c0, (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync),
  which is held by «pool-1-thread-1»
«pool-1-thread-1»:
  waiting to lock monitor 0x025cad3c (object 0x272236d0, a java.lang.Object),
  which is held by «pool-1-thread-2»

 Found one Java-level deadlock:
=============================
"pool-1-thread-2":
  waiting for ownable synchronizer 0x272239c0, (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync),
  which is held by "pool-1-thread-1"
"pool-1-thread-1":
  waiting to lock monitor 0x025cad3c (object 0x272236d0, a java.lang.Object),
  which is held by "pool-1-thread-2"

Java stack information for the threads listed above:
===================================================
"pool-1-thread-2":
       at sun.misc.Unsafe.park(Native Method)
       - parking to wait for  <0x272239c0> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
       at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
       at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
       at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
       at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
       at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:945)
       at org.ph.javaee.training8.Task.executeTask2(Task.java:54)
       - locked <0x272236d0> (a java.lang.Object)
       at org.ph.javaee.training8.WorkerThread2.run(WorkerThread2.java:29)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
       at java.lang.Thread.run(Thread.java:722)
"pool-1-thread-1":
       at org.ph.javaee.training8.Task.executeTask1(Task.java:31)
       - waiting to lock <0x272236d0> (a java.lang.Object)
       at org.ph.javaee.training8.WorkerThread1.run(WorkerThread1.java:29)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
       at java.lang.Thread.run(Thread.java:722)
This is because write locks are tracked by the JVM similar to flat locks. This means the HotSpot JVM deadlock detector appears to be currently designed to detect:
  • Deadlock on Object monitors involving FLAT locks
  • Deadlock involving Locked ownable synchronizers associated with WRITE locks
The lack of read lock per-thread tracking appears to prevent deadlock detection for this scenario and significantly increase the troubleshooting complexity.
I suggest that you read
Doug Lea’s comments on this whole issue since concerns were raised back in 2005 regarding the possibility to add per-thread read-hold tracking due to some potential lock overhead.
Find below my troubleshooting recommendations if you suspect a hidden deadlock condition involving read locks:
  • Analyze closely the thread call stack trace, it may reveal some code potentially acquiring read locks and preventing other threads to acquire write locks.
  • If you are the owner of the code, keep track of the read lock count via the usage of the lock.getReadLockCount() method
I’m looking forward for your feedback, especially from individuals with experience on this type of deadlock involving read locks.
Finally, find below a video explaining such findings via the execution and monitoring of our sample Java program.