What do you do when your Java application consumes 100% of the CPU? Turns out you can easily find the problematic thread(s) using built-in UNIX and JDK tools. No profilers or agents required.
For the purpose of testing we'll use this simple program:
For the purpose of testing we'll use this simple program:
public class Main { public static void main(String[] args) { new Thread(new Idle(), "Idle").start(); new Thread(new Busy(), "Busy").start(); } } class Idle implements Runnable { @Override public void run() { try { TimeUnit.HOURS.sleep(1); } catch (InterruptedException e) { } } } class Busy implements Runnable { @Override public void run() { while(true) { "Foo".matches("F.*"); } } }As you can see, it starts two threads.
Idle
is not consuming any CPU (remember, sleeping threads consume memory, but not CPU) while Busy
eats the whole core as regular expression parsing and executing is a surprisingly complex process. Let's run this program and forget about it. How can we quickly find out that Busy
is the problematic piece of our software? First of all we use top to find out the process id (PID
) of the java
process consuming most of the CPU. This is quite straightforward:$ top -n1 | grep -m1 javaThis will display the first line of
top
output containing "java
" sentence:22614 tomek 20 0 1360m 734m 31m S 6 24.3 7:36.59 javaThe first column is the PID, let's extract it. Unfortunately it turned out that
top
uses ANSI escape codes for colors - invisible characters that are breaking tools like grep
and cut
. Luckily I found a perl script to remove these characters and was finally able to extract the PID of java
process exhausting my CPU:$ top -n1 | grep -m1 java | perl -pe 's/\e\[?.*?[\@-~] ?//g' | cut -f1 -d' 'The
cut -f1 -d' '
invocation simply takes the first value out of space-separated columns:22614Now when we now the problematic JVM PID, we can use
top -H
to find problematic Linux threads. The -H
option prints a list of all threads as opposed to processes, the PID column now represents the internal Linux thread ID:$ top -n1 -H | grep -m1 java $ top -n1 -H | grep -m1 java | perl -pe 's/\e\[?.*?[\@-~] ?//g' | cut -f1 -d' 'The output is surprisingly similar, but the first value is now the thread ID:
25938 tomek 20 0 1360m 748m 31m S 2 24.8 0:15.15 java 25938So we have a process ID of our busy JVM and Linux thread ID (most likely from that process) consuming our CPU. Here comes the best part: if you look at jstack output (available in JDK), each thread has some mysterious ID printed next to its name:
"Busy" prio=10 tid=0x7f3bf800 nid=0x6552 runnable [0x7f25c000] java.lang.Thread.State: RUNNABLE at java.util.regex.Pattern$Node.study(Pattern.java:3010)That's right, the
nid=0x645a
parameter is the same as thread ID printed by top -H
. Of course to not make it too simple, top
uses decimal notation while jstack
prints in hex. Again there is a simple solution, printf "%x":$ printf "%x" 25938 6552Let's wrap all we have now into a script and combine the results:
#!/bin/bash PID=$(top -n1 | grep -m1 java | perl -pe 's/\e\[?.*?[\@-~] ?//g' | cut -f1 -d' ') NID=$(printf "%x" $(top -n1 -H | grep -m1 java | perl -pe 's/\e\[?.*?[\@-~] ?//g' | cut -f1 -d' ')) jstack $PID | grep -A500 $NID | grep -m1 "^$" -B 500
PID
holds the java
PID and NID
holds the thread ID, most likely from that JVM. The last line simply dumps the JVM stack trace of the given PID and filters out (using grep
) the thread which has matching nid
. Guess what, it works:$ ./profile.sh "Busy" prio=10 tid=0x7f3bf800 nid=0x6552 runnable [0x7f25c000] java.lang.Thread.State: RUNNABLE at java.util.regex.Pattern$Node.study(Pattern.java:3010) at java.util.regex.Pattern$Curly.study(Pattern.java:3854) at java.util.regex.Pattern$CharProperty.study(Pattern.java:3355) at java.util.regex.Pattern$Start.<init>(Pattern.java:3044) at java.util.regex.Pattern.compile(Pattern.java:1480) at java.util.regex.Pattern.<init>(Pattern.java:1133) at java.util.regex.Pattern.compile(Pattern.java:823) at java.util.regex.Pattern.matches(Pattern.java:928) at java.lang.String.matches(String.java:2090) at com.blogspot.nurkiewicz.Busy.run(Main.java:27) at java.lang.Thread.run(Thread.java:662)Running the script multiple times (or with
watch
, see below) will capture Busy
thread in different places, but almost always inside regular expression parsing - which is our problematic piece!Multiple threads
In case your application has multiple CPU-hungry threads, you can usewatch -n1 ./profile.sh
command to run the script every second and get semi real-time stack dumps, most likely from different threads. Testing with the following program:new Thread(new Idle(), "Idle").start(); new Thread(new Busy(), "Busy-1").start(); new Thread(new Busy(), "Busy-2").start();you'll see stack traces either of
Busy-1
or of Busy-2
threads (in different places inside Pattern
class), but never Idle
.
I have just one word to say WOW, you made id surprisingly simple with your example. I usually prefer prstat if running on Solaris.
ReplyDeleteprstat -L -p PID will print thread level details.
Javin
What is Synchronization in Java - 20 points
Super! I thought this AWK script http://blog.tsunanet.net/2010/08/jpmp-javas-poor-mans-profiler.html was good, but yours is even better.
ReplyDeleteNice article which inspired a variation on this theme, as a simpler script with less dependencies (and still looking into cutting one more dependency): http://ileriseviye.org/blog/programlama/variation-on-a-theme-which-java-thread-consumes-the-most-of-your-cpu/
ReplyDeleteI love your solution, much cleaner and concise. I didn't know ps can be used as well to obtain CPU statistics. Keep working on this script, I'll link to your version in my article.
DeleteGreat post. I'd rather use a "kill -3" (dump full thread stack, but you'll find every needed information, including the thread we're looking at) instead of jstack.
ReplyDeleteThis comment has been removed by a blog administrator.
ReplyDeleteThis comment has been removed by a blog administrator.
ReplyDeletesome simple version:
ReplyDelete#!/bin/sh
PID=$(top -b -n1 | grep -m1 java | cut -f1 -d' ')
NID=$(printf '%x' $(top -b -n1 -H | grep -m1 java | cut -f1 -d' '))
jstack $PID | sed -e '/./{H;$!d;}' -e 'x;/'$NID'/!d;'
Very good article Tomasz,
ReplyDeleteProper thread dump & thread per CPU analysis is always a critical skill to have when supporting and developing Java / Java EE application. I also have a Java High CPU guide article providing similar approach for other OS such as AIX & Windows.
Thanks.
P-H