Tuesday, March 25, 2008

Thread Dump

If an application seems stuck, or is running out of resources, a thread dump will reveal the state of the server.

Java thread dumps are a vital tool for server debugging. Because servlets are intrinsically multithreaded, it is easy to create deadlocks without realizing it, or to have runaway threads that consume resources and cause OutOfMemoryError failures. That's especially true once you add third-party software such as databases, EJB, and CORBA ORBs.
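
To make the deadlock case concrete, here is a minimal sketch of two threads that take the same pair of locks in opposite order. It is an illustration only, not Resin code: the class name, lock names, and thread names are all hypothetical. The JVM's own ThreadMXBean is used to confirm that the threads are stuck.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class DeadlockSketch {
    // Two hypothetical locks standing in for any pair of shared resources.
    private static final Object LOCK_A = new Object();
    private static final Object LOCK_B = new Object();

    /** Starts two daemon threads that acquire the locks in opposite order. */
    static void startDeadlock() {
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);
        Thread ab = new Thread(() -> lockInOrder(LOCK_A, LOCK_B, bothHoldFirstLock), "worker-ab");
        Thread ba = new Thread(() -> lockInOrder(LOCK_B, LOCK_A, bothHoldFirstLock), "worker-ba");
        ab.setDaemon(true); // daemon threads do not keep the JVM alive
        ba.setDaemon(true);
        ab.start();
        ba.start();
    }

    private static void lockInOrder(Object first, Object second, CountDownLatch latch) {
        synchronized (first) {
            latch.countDown();
            try {
                latch.await(); // wait until the other thread holds its first lock
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (second) {
                // Never reached: the other thread holds 'second' and waits for 'first'.
            }
        }
    }

    /** Polls for monitor deadlocks; returns the number of threads found, or 0 on timeout. */
    static int countDeadlockedThreads(long timeoutMillis) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            long[] ids = threads.findMonitorDeadlockedThreads();
            if (ids != null) {
                return ids.length;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return 0;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        startDeadlock();
        System.out.println("deadlocked threads: " + countDeadlockedThreads(5000));
    }
}
```

In a thread dump taken while this runs, both worker threads would show up blocked inside lockInOrder, each waiting for the monitor the other one holds.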

Thread dump by sending a signal

On Windows:
  1. Press Ctrl+Break. The thread dump is generated and displayed in the command window.
  2. Scroll back in the command window until you reach the beginning of the dump, marked "Full thread dump:".


On Unix, sending kill -QUIT to the JVM process produces a thread dump. If Resin is running in a console window, Ctrl-\ sends a QUIT signal and produces a thread dump.

Thread dump if signalling doesn't work

You can get a thread dump without signalling the process by starting the JVM with some extra arguments that allow a debugger to attach. You can then attach with the debugger at any time to get a thread dump. This technique works on all operating systems.

Here are some step-by-step instructions:

  • Start Resin with some extra arguments that allow a debugger to attach:
(prompt) -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5432
  • Wait until you believe the application is in a state of deadlock or there are runaway threads.
  • In another terminal (window), use jdb to connect to the running instance of Resin:
$JAVA_HOME/bin/jdb -connect com.sun.jdi.SocketAttach:hostname=localhost,port=5432

jdb will show something like:

Set uncaught java.lang.Throwable
Set deferred uncaught java.lang.Throwable
Initializing jdb ...
>
  • Use the "suspend" command and then the "where all" command to get a thread dump:
> suspend

All threads suspended.
> where all

tcpConnection-6862-3:

[1] java.lang.Object.wait (native method)
[2] com.caucho.server.TcpServer.accept (TcpServer.java:650)
[3] com.caucho.server.TcpConnection.accept (TcpConnection.java:208)
[4] com.caucho.server.TcpConnection.run (TcpConnection.java:131)
[5] java.lang.Thread.run (Thread.java:536)

tcpConnection-543-2:

[1] java.lang.Object.wait (native method)
[2] com.caucho.server.TcpServer.accept (TcpServer.java:650)
[3] com.caucho.server.TcpConnection.accept (TcpConnection.java:208)
[4] com.caucho.server.TcpConnection.run (TcpConnection.java:131)
[5] java.lang.Thread.run (Thread.java:536)

...
  • Use the "resume" command to resume the process:
> resume

Unix users (and Cygwin users on Windows) will recognize the opportunity to make a script:

#!/bin/sh
# printf is used instead of "echo -e", which is not portable under /bin/sh
printf 'suspend\nwhere all\nresume\nquit\n' | "$JAVA_HOME/bin/jdb" -connect \
com.sun.jdi.SocketAttach:hostname=localhost,port=5432

There appears to be no overhead or performance penalty in starting the JVM with the options that allow a debugger to attach.

Understanding the thread dump

Whichever method you use, you'll eventually get a trace that looks something like the following (each JDK's format is slightly different):
Full thread dump:

"tcpConnection-8080-2" daemon waiting on monitor [0xbddff000..0xbddff8c4]
at java.lang.Object.wait(Native Method)
at com.caucho.server.TcpServer.accept(TcpServer.java:525)
at com.caucho.server.TcpConnection.accept(TcpConnection.java:190)
at com.caucho.server.TcpConnection.run(TcpConnection.java:136)
at java.lang.Thread.run(Thread.java:484)

"tcpConnection-8080-1" daemon waiting on monitor [0xbdfff000..0xbdfff8c4]
at java.lang.Object.wait(Native Method)
at com.caucho.server.TcpServer.accept(TcpServer.java:525)
at com.caucho.server.TcpConnection.accept(TcpConnection.java:190)
at com.caucho.server.TcpConnection.run(TcpConnection.java:136)
at java.lang.Thread.run(Thread.java:484)

"tcpConnection-8080-0" daemon waiting on monitor [0xbe1ff000..0xbe1ff8c4]
at java.lang.Object.wait(Native Method)
at com.caucho.server.TcpServer.accept(TcpServer.java:525)
at com.caucho.server.TcpConnection.accept(TcpConnection.java:190)
at com.caucho.server.TcpConnection.run(TcpConnection.java:136)
at java.lang.Thread.run(Thread.java:484)

"tcp-accept-8080" runnable [0xbe7ff000..0xbe7ff8c4]
at java.net.PlainSocketImpl.socketAccept(Native Method)
at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:413)
at java.net.ServerSocket.implAccept(ServerSocket.java:243)
at java.net.ServerSocket.accept(ServerSocket.java:222)
at com.caucho.server.TcpServer.run(TcpServer.java:415)
at java.lang.Thread.run(Thread.java:484)

"resin-cron" daemon waiting on monitor [0xbe9ff000..0xbe9ff8c4]
at java.lang.Thread.sleep(Native Method)
at com.caucho.util.Cron$CronThread.run(Cron.java:195)

"resin-alarm" daemon waiting on monitor [0xbebff000..0xbebff8c4]
at java.lang.Thread.sleep(Native Method)
at com.caucho.util.Alarm$AlarmThread.run(Alarm.java:268)

"Signal Dispatcher" runnable [0..0]

"Finalizer" daemon waiting on monitor [0xbf3ff000..0xbf3ff8c4]
at java.lang.Object.wait(Native Method)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:108)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:123)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:162)

"Reference Handler" daemon waiting on monitor [0xbf5ff000..0xbf5ff8c4]
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:420)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:110)

"main" waiting on monitor [0xbfffd000..0xbfffd210]
at java.lang.Thread.sleep(Native Method)
at com.caucho.server.http.ResinServer.waitForExit(ResinServer.java:674)
at com.caucho.server.http.ResinServer.main(ResinServer.java:821)
at com.caucho.server.http.HttpServer.main(HttpServer.java:95)
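
To show where the pieces of such a listing come from, here is a rough sketch that builds a similar report from inside the JVM with Thread.getAllStackTraces(). It is an assumption-laden illustration, not how the JVM or Resin produces the dump: the class name ThreadDumpSketch is hypothetical, and the layout only loosely imitates the real format (it prints the Thread.State name instead of the monitor text and address range).

```java
import java.util.Map;

public class ThreadDumpSketch {
    /** Formats every live thread in a layout loosely modelled on a JVM thread dump. */
    static String dump() {
        StringBuilder out = new StringBuilder("Full thread dump:\n\n");
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            // Header line: quoted thread name, daemon flag, current state.
            out.append('"').append(thread.getName()).append('"');
            if (thread.isDaemon()) {
                out.append(" daemon");
            }
            out.append(' ').append(thread.getState()).append('\n');
            // One "at ..." line per stack frame, innermost call first.
            for (StackTraceElement frame : entry.getValue()) {
                out.append("at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

Each block has the same parts as the dump above: a header naming the thread, followed by its call stack with the innermost call on top.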

Each thread is named. Here are some of the common names:

Thread Name            Description
tcp-accept-8080        Resin thread listening for new connections on port 8080.
tcpConnection-8080-3   Resin servlet thread handling a connection from port 8080.
resin-cron             Resin's run-at thread.
resin-alarm            Resin's alarm thread.

There should be one tcp-accept-xxx thread for each http and srun port that Resin is listening on. The tcp-accept-xxx thread should almost always be in socketAccept.

There should be several tcpConnection-xxx-n threads. Each of these is a servlet thread. On a busy server, they can appear anywhere in your code. If several appear at the same location, you likely have a deadlock or at least a slow lock. Idle threads are in tcpAccept, httpRequest, or runnerRequest (for keepalive threads).

For deadlocks, look at the "waiting on monitor" threads and any case where lots of threads are stuck at the same location.
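
The "lots of threads stuck at the same location" check can be sketched in code. The example below is a minimal sketch, assuming that grouping threads by their topmost stack frame is a good enough stand-in for "same location". The class and method names are hypothetical. A high count is only a hint, since idle threads also share a location, as the tcpConnection threads waiting in TcpServer.accept do in the dump above.

```java
import java.util.Collection;
import java.util.Map;
import java.util.TreeMap;

public class HotSpotSketch {
    /** Counts how many stacks share each topmost frame; empty stacks are skipped. */
    static Map<String, Integer> countByTopFrame(Collection<StackTraceElement[]> stacks) {
        Map<String, Integer> counts = new TreeMap<>();
        for (StackTraceElement[] stack : stacks) {
            if (stack.length == 0) {
                continue; // some threads report no Java frames
            }
            counts.merge(stack[0].toString(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Apply the count to the live threads of this JVM.
        Map<String, Integer> counts = countByTopFrame(Thread.getAllStackTraces().values());
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            System.out.println(entry.getValue() + " thread(s) at " + entry.getKey());
        }
    }
}
```

Because the topmost frame is often a generic call such as Object.wait, looking one or two frames further down usually tells you which part of your own code the threads are waiting in.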
