jstack - the missing manual

29 August 2011

The jstack utility is part of Hotspot JVM, it is a command line program that obtains the current stack trace of all java threads. Actually it is one of the strong points of java that you can always get a stack trace for all threads of a process and also display all currently held locks. With ‘unmanaged’ c programs it is not quite so easy, not at all.

So when you are running jstack on Linux (or any other Unix) then there are some details that are missing from the documentations

  • The jstack utility and JVM have to be from the same version of the JDK. If you have several versions of java installed then go figure.
  • The jstack utility and traced java process should run under the same user account. This needs some explanation

Let us trace the system calls for running jstack on process id 10015

strace -f  -oo /usr/java/jdk1.6.0/bin/jstack  -l  10015

grep -F 'connect(' o

11558 connect(6, {sa_family=AF_FILE, path="/tmp/.java_pid10015"}, 110) = 0
11558 connect(6, {sa_family=AF_FILE, path="/tmp/.java_pid10015"}, 110) = 0

Interestingly jstack uses a named pipe with the file name of /tmp/.java_pid in order to communicate with the JVM of the traced process. Communication via the named pipe goes through a simple string based protocol

11619 connect(6, {sa_family=AF_FILE, path="/tmp/.java_pid10015"}, 110) = 0
11619 write(6, "1", 1)                  = 1
11619 write(6, "\0", 1)                 = 1
11619 write(6, "threaddump", 10)        = 10
11619 write(6, "\0", 1)                 = 1
11619 write(6, "-l", 2)                 = 2
11619 write(6, "\0", 1)                 = 1
11619 write(6, "\0", 1)                 = 1
11619 write(6, "\0", 1)                 = 1

...

11619 read(6, "0", 1)                   = 1
11619 read(6, "\n", 1)                  = 1
11619 read(6, "2011-08-28 11:17:49\nFull thread "..., 128) = 128
11619 write(1, "2011-08-28 11:17:49\nFull thread "..., 128) = 128
11619 read(6, "95e5400 nid=0x2aef waiting on co"..., 128) = 128
11619 write(1, "95e5400 nid=0x2aef waiting on co"..., 128) = 128
11619 read(6, "ne\n\n\"DestroyJavaVM\" prio=10 tid="..., 128) = 128
11619 write(1, "ne\n\n\"DestroyJavaVM\" prio=10 tid="..., 128) = 128
11619 read(6, "Locked ownable synchronizers:\n\t-"..., 128) = 128
11619 write(1, "Locked ownable synchronizers:\n\t-"..., 128) = 128
11619 read(6, "va.lang.Thread.State: TIMED_WAIT"..., 128) = 128
11619 write(1, "va.lang.Thread.State: TIMED_WAIT"..., 128) = 128
11619 read(6, "va.util.concurrent.locks.Abstrac"..., 128) = 128

Aha, so jstack just sends a string command to the traced JVM, and the JVM returns all report lines over the same named pipe’s socket.

So while the program is running, we can look at the named pipe

stat /tmp/.java_pid10015

Size: 0               Blocks: 0          IO Block: 4096   socket
Device: fd00h/64768d    Inode: 14699683    Links: 1
Access: (0600/srw-------)  Uid: (  315/  mmoser)   Gid: (  231/     sps)
Access: 2011-08-28 11:17:07.000000000 +0300
Modify: 2011-08-28 11:13:20.000000000 +0300
Change: 2011-08-28 11:13:20.000000000 +0300

The access permissions say that only one account, the user account that created the java process, can access the named pipe. Even the root user can’t talk to this named pipe directly.

A script to automate usage of jstack

All this can be obtained from system information, just by knowing the process id number.

  • Can see who is running the process, the permissions and owner of the /proc/[pid]/ directory say so (the linux operating system maintains this directory per running process )
  • The /proc/[pid]/maps includes all shared memory ranges used by the process, so each shared library has it’s text/code segment here + full path of the executable that is mapped in. Each Hotspot java process loads a library called libjvm.so - so by finding the path name of libjvm.so we can infer the path of the used JDK version.

The script that can do all this is here It should be used as follows:

Usage: ./do-jstack <pid> [ -m -F ]

Description

The script performs the following steps
 - check if <pid> is that of a running process
 - check if <pid> is running from the same user account as current account
 - check if <pid> is that of a java process (running hostpot JVM)
 - Find jstack from the same JVM as that of the running process.
 - run jstack

The first command line parameter is the process id of the java process to monitor,
after the first option come additional options passed as is to jstack (optional) 

Script that continuously monitors a given java process

A similar script runs jstack at fixed time intervals , this can be used to track locking problems, etc. etc.

It also prints the CPU usage of each java thread, the only other program that knows how to display the CPU usage along the stack trace is “visual vm”: http://visualvm.java.net/ This feature is very useful if you want to detect threads that got stuck in infinite loops for example, those will hang around the same location in the code, but will consume high percentage of CPU time.

It also shows Virtual Memory size + Resident set size for the whole process.

Usage: do-jstack-loop -p <pid> [ -d <delay time> ] [ -o <log file> ]

Description

The script runs jstack in a loop with a period of <delay time> (default value 10 seconds). 

Usage:
    -p <pid>     process to monitor, the process should host the hotspot JVM.
    -d <delay>   repeat jstack with this delay (in seconds)
    -o <logfile> Logs all output to the <logfile>

The script also performs the following steps
 - check is <pid> is that of a running process
 - check of <pid> is running from the same user account as current account
 - check is <pid> is that of a java process (running hostpot JVM)
 - Find jstack from the same JVM as from running process.
 - run jstack with that version of jvm.

For each JAVA thread it also shows the Operating system thread id (LWP id) and CPU Usage (in %) of that thread.
This information can be used to debug applications that got into infinite loops.

The report looks as follows

------------------------------------------------------------------------------
UID        PID  PPID  C    SZ   RSS PSR STIME TTY          TIME CMD
mmoser   14431  2014 32 70914 11468   0 08:51 pts/0    00:01:54 java -cp . test

2011-08-29 08:57:44
Full thread dump Java HotSpot(TM) Client VM (20.1-b02 mixed mode, sharing):

=== OS Thread Id: 14449 CPU Usage: 3%  ===
"Thread-9" prio=10 tid=0xb4636c00 nid=0x3871 waiting on condition [0xb3fd2000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at test$1.run(test.java:19)

   Locked ownable synchronizers:
    - None

=== OS Thread Id: 14448 CPU Usage: 3%  ===
"Thread-8" prio=10 tid=0xb4635400 nid=0x3870 waiting on condition [0xb4024000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at test$1.run(test.java:19)

   Locked ownable synchronizers:
    - None

------------------------------------------------------------------------------
UID        PID  PPID  C    SZ   RSS PSR STIME TTY          TIME CMD
mmoser   14431  2014 32 70914 11468   0 08:51 pts/0    00:01:55 java -cp . test

2011-08-29 08:57:46
Full thread dump Java HotSpot(TM) Client VM (20.1-b02 mixed mode, sharing):

=== OS Thread Id: 15241 CPU Usage: 0%  ===
"Attach Listener" daemon prio=10 tid=0x08bc3c00 nid=0x3b89 waiting on condition [0x00000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
    - None

Log useful data for any process in loop (not just java)

The following script loops and prints some information about a given process, the gathered information can help to determine if there is a leak of some sort.

./run_top -p <process id> [ -d <delay> ] [ -o <log file> ]

Usage:
    -p <pid>     process id to monitor
    -d <delay>   Optional parameter - delay between measurements in seconds. 
         The default value is 10 seconds
    -o <logfile> write output to log file.

Monitors a process continuously and prints the following statistics into a file
  - Size of resident set size, i.e. the size of physical memory in use
  - Size of Virtual memory, i.e. the size of address space reserved for use
  - Number of threads
  - Number of sockets/files/pipes.

The report looks as follows:

29/8/2011 9:32:17
    Resident set size:   11380 kB
    Virtual memory size: 281232 kB
    Number of threads:   13
    Files:       4
    Sockets:         1
    pipes:           0
29/8/2011 9:32:27
    Resident set size:   11380 kB
    Virtual memory size: 281232 kB
    Number of threads:   13
    Files:       4
    Sockets:         1
    pipes:           0
29/8/2011 9:32:37
    Resident set size:   11380 kB
    Virtual memory size: 281232 kB
    Number of threads:   13
    Files:       4
    Sockets:         1
    pipes:           0
29/8/2011 9:32:47
    Resident set size:   11380 kB
    Virtual memory size: 281232 kB
    Number of threads:   13
    Files:       4
    Sockets:         1
    pipes:           0