Using incorrect JVM Garbage Collector

Bug #541520 reported by Aaron J. Zirbes
This bug affects 2 people
Affects: tomcat6 (Ubuntu)
Status: Fix Released
Importance: Medium
Assigned to: Thierry Carrez

Bug Description

Binary package hint: tomcat6

The default garbage collector for Tomcat should be the Concurrent Mark-Sweep (CMS) collector, as it is the recommended GC for web application servers.

The default garbage collector doesn't guarantee quick response times, and often causes hangs during garbage collection.

As Tomcat 6 is a web application server, it should use the CMS GC.

Here is the diff to fix it:

diff /etc/default/tomcat6.original /etc/default/tomcat6
17a18,20
> # Use a concurrent garbage collector for improved response time
> JAVA_OPTS="$JAVA_OPTS -XX:+UseConcMarkSweepGC"
>

To do this, the following option can be added to `/etc/default/tomcat6`:

...

# Arguments to pass to the Java virtual machine (JVM).
#JAVA_OPTS="-Djava.awt.headless=true -Xmx128M"

# Use a concurrent garbage collector for improved response time
JAVA_OPTS="$JAVA_OPTS -XX:+UseConcMarkSweepGC"

...

Reference:
http://java.sun.com/javase/technologies/hotspot/gc/memorymanagement_whitepaper.pdf

Additional Material:
http://java.sun.com/javase/technologies/hotspot/gc/index.jsp

Thanks for making Ubuntu Server enterprise ready!

Tags: patch
Aaron J. Zirbes (ajz) wrote :
tags: added: patch
Chuck Short (zulcss) wrote :

Which version is this for?

Regards
chuck

Changed in tomcat6 (Ubuntu):
importance: Undecided → Low
status: New → Incomplete
Thierry Carrez (ttx) wrote :

I think this is valid for any 6.0.x versions. I asked Jason Brittain for advice on how much this is desirable in the default install.

Changed in tomcat6 (Ubuntu):
status: Incomplete → Confirmed
Jason Brittain (jason-brittain) wrote :

My short opinion is: Yes, it does make sense to use the CMS GC for Tomcat by default.

Aaron: Thanks very much for the link to Sun's Hotspot memory management white paper, and for the suggestion to use CMS by default. I have seen quite a few production Tomcat environment config files over the years, each one specifying to use a different set of the GC startup switches, and I often wondered which one made the most sense. The white paper you gave us the link for has text that specifically says that the CMS GC is particularly well suited for web servers, or similar situations where lowest response time is very important. Until now, I had not seen much credible documentation that specifically said that.

From the perspective of looking at this problem over a period of years, the right answer actually changes because the JVM's features change. Right now, with the features of the current Sun JDKs and OpenJDKs, the CMS GC appears to be the right GC algorithm to use for a Tomcat JVM because CMS is well supported: it's been in the code for quite some time (since the 1.4.1 release, according to the docs), and I have not heard any complaints about its stability.

Of course we're only talking about the package defaults and general recommendations, and it is up to the user to carry out some performance testing to find what works best for their webapp(s), with their own performance goals in mind.

For both the Sun Hotspot JDK and OpenJDK running a Tomcat JVM, I recommend using -XX:+UseConcMarkSweepGC as the default, and in the case where Tomcat is running on a single CPU chip that has either a single or dual core inside, I also recommend using -XX:+CMSIncrementalMode. But, incremental mode probably shouldn't be the default in the Tomcat package's defaults file because we don't know what hardware the user is going to run it on. We can add the additional option in there, commented out, however:

# When using the CMS garbage collector, enable this option if you run Tomcat
# on a machine with exactly one CPU chip that contains one or two cores.
#JAVA_OPTS="$JAVA_OPTS -XX:+CMSIncrementalMode"
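
The rule of thumb above (CMS by default, plus incremental mode on a single chip with one or two cores) could be sketched in shell roughly as follows. This is only an illustration: `gc_flags_for_cores` is a hypothetical helper, and it takes the core count as an argument rather than detecting the number of chips, which the rule strictly depends on.

```shell
#!/bin/sh
# Hypothetical helper: print suggested GC flags for a given core count.
# Assumes a single CPU chip; detecting the chip count is left out.
gc_flags_for_cores() {
    cores="$1"
    # CMS is the suggested default for a Tomcat JVM in this discussion.
    flags="-XX:+UseConcMarkSweepGC"
    # With only one or two cores, incremental mode is also suggested.
    if [ "$cores" -le 2 ]; then
        flags="$flags -XX:+CMSIncrementalMode"
    fi
    printf '%s\n' "$flags"
}
```

For example, `JAVA_OPTS="$JAVA_OPTS $(gc_flags_for_cores 2)"` would append both switches, while a core count of 4 would append only `-XX:+UseConcMarkSweepGC`.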

Also, this GC setting should be valid for Tomcat 5.5.x and Tomcat 6.0.x running on any HotSpot JVM (including OpenJDK) version 1.4.1 or higher, though I'd suggest using it only on version 1.5.0 or higher. I believe that means it should go into both the tomcat55 and tomcat6 packages.

Some other JVMs, including recent GNU gij and Apache Harmony, accept these JVM arguments, and at least these switches don't cause any errors. But I don't know whether they have any positive or negative effect there.

Thierry Carrez (ttx)
Changed in tomcat6 (Ubuntu):
assignee: nobody → Thierry Carrez (ttx)
importance: Low → Medium
status: Confirmed → In Progress
Thierry Carrez (ttx) wrote :

Here is the patch I'll apply as soon as maven-repo-helper gets synced, just in case Debian wants to apply it before :)

Jason Brittain (jason-brittain) wrote :

Thierry:

There is part of this patch's change that may cause authbind to fail:

-# You may pass JVM startup parameters to Java here.
-#JAVA_OPTS="-Djava.awt.headless=true -Djava.net.preferIPv4Stack=true -Xmx128m"
+# You may pass JVM startup parameters to Java here. If unset, the default
+# options (-Djava.awt.headless=true -Xmx128m) will be used.
+#JAVA_OPTS="-Djava.awt.headless=true -Xmx128m"

Removing -Djava.net.preferIPv4Stack=true means that the JVM's network stack might default to IPv6, and in that case authbind will (seemingly mysteriously) fail. Authbind works only with IPv4. I think few users use IPv6 today. It might eventually get popular, and if it does, it would be a good idea if we're not always making software that defaults to IPv4. But, I think the default should be IPv4, especially when certain JDK bugs are encountered only when the JDK's startup switches do not specify IPv4. So, I'm proposing we add back the -Djava.net.preferIPv4Stack=true startup switch.

Thierry Carrez (ttx) wrote :

It should not break because authbind is special-cased in the init script:

Init script sets:
JAVA_OPTS="-Djava.awt.headless=true -Xmx128m"
Then sources the defaults file, which may or may not override JAVA_OPTS
Then if authbind is set, it does:
JAVA_OPTS="$JAVA_OPTS -Djava.net.preferIPv4Stack=true"

So -Djava.net.preferIPv4Stack=true will always be added if AUTHBIND=yes.
My change in the defaults file is just to reflect the JAVA_OPTS default value in the init script, as a starting point for someone that wants to override it.
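
The ordering described above can be sketched in shell as follows. This is a simplified illustration, not the actual `/etc/init.d/tomcat6`: the function name `compute_java_opts` is hypothetical, and the `AUTHBIND="no"` initial value is an assumption made so the sketch is self-contained.

```shell
#!/bin/sh
# Hypothetical sketch of how the init script arrives at JAVA_OPTS.
compute_java_opts() {
    defaults_file="$1"
    # 1. The init script sets its built-in default first.
    JAVA_OPTS="-Djava.awt.headless=true -Xmx128m"
    AUTHBIND="no"   # assumed initial value, for illustration only
    # 2. The defaults file is sourced and may override JAVA_OPTS or AUTHBIND.
    if [ -r "$defaults_file" ]; then
        . "$defaults_file"
    fi
    # 3. authbind works only with IPv4, so the IPv4 switch is appended last.
    if [ "$AUTHBIND" = "yes" ]; then
        JAVA_OPTS="$JAVA_OPTS -Djava.net.preferIPv4Stack=true"
    fi
    printf '%s\n' "$JAVA_OPTS"
}
```

Because the IPv4 switch is appended after the defaults file is sourced, it is present whenever `AUTHBIND=yes`, regardless of what the defaults file puts in `JAVA_OPTS`.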

Jason Brittain (jason-brittain) wrote :

Ahh, yes. That's true. I now remember that I made sure to enable that switch whenever
authbind is enabled. But I also seem to remember some past issues that were fixed by
setting -Djava.net.preferIPv4Stack=true by default, though I can't find any of those in the
ASF Tomcat bugzilla at the moment. It might be safe to remove it in that case.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package tomcat6 - 6.0.24-2ubuntu1

---------------
tomcat6 (6.0.24-2ubuntu1) lucid; urgency=low

  [ Thierry Carrez ]
  * Uploading what 6.0.24-5 should be (upload is blocked in Debian due to
    current infrastructure issues), in order to meet Beta2Freeze.

  [ Niels Thykier ]
  * Added optimised garbage collection options to tomcat6's default options.
    Thanks to Aaron J. Zirbes and Thierry Carrez for research and the patch.
    (Closes: LP: #541520)
  * Updated the changelog to mention closed CVE's in the 6.0.24-1 release.
  * Applied patch from Arto Jantunen fixing an issue with cleaning up the
    pid-file. (Closes: #574084)

  [ Ludovic Claude ]
  * debian/tomcat6.postrm: fix removal of Tomcat (Closes: #567548)
  * Set UTF-8 as default character encoding - Patch by Thomas Koch
    (Closes: #573539)
  * Set the major, minor and build versions when calling Ant
    (Closes: LP: #495505)
  * Rebuild with a more recent version of maven-repo-helper which puts
    the javax jars at the correct location in the Maven repository.
    Fixes several FTBFS in other packages.
 -- Thierry Carrez <email address hidden> Wed, 31 Mar 2010 10:47:51 +0200

Changed in tomcat6 (Ubuntu):
status: In Progress → Fix Released
Gabriel Nell (gabriel-nell) wrote :

Folks -- sorry for chiming in here after the train has apparently left the station :)

I've been working on optimizing my Tomcat configuration for the past couple of days, and this optimization has mostly centered around the garbage collector. I'm concerned that specifying the CMS collector is classic premature optimization. If I had absorbed this change, my application's performance would have suddenly gotten worse. But let's not talk about my specific case.

To excerpt from the same whitepaper referenced above (section 6, first paragraph):

"Thus, the initial recommendation for selecting and configuring a garbage collector is to do nothing! That is, do not specify usage of a particular garbage collector, etc. Let the system make automatic choices based on the platform and operating system on which your application is running."

As this implies, the Hotspot JVM will actually select a collector for you based on the system resources. Further, Sun's GC tuning document found at

http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html#available_collectors.selecting

advises that heap size should actually be the first knob you adjust before switching collectors.

I'm all for choosing smart defaults. But I think in this case the advice from the guys who make Java is pretty clear. Let the JVM decide, and if it's not good enough, only then optimize for the metrics you care about, and do not begin by choosing a specific collector.

Jason Brittain (jason-brittain) wrote :

Gabriel: It's certainly alright to still discuss this. It's a
complex issue, with (I think) only very rough solutions.

I agree that the JVM authors are saying that if you cannot know
in advance very much about how the Java program is going to be
using the JVM, and if you cannot know about the environment in
which the JVM runs (how much memory hardware is installed, what
kind of CPU or CPUs the JVM is running on, etc) then we should
not make assumptions that can easily be incorrect. The JVM could
inspect some things at startup time (before the program has a
chance to begin running), and can make an educated guess about
which GC algorithm to select based only on the hardware it finds.

But, we actually do know some things that the JVM can't know
before the Java program begins to run. We know that the
application is a web server -- this has very large implications.
The JVM cannot guess something this significant when it is
auto-selecting a GC algorithm during JVM startup. It is true
that not every webapp works the same, so to get the best
performance for your own webapp you have no choice but to
hand-tune the GC. It has always been that way, and still is,
regardless of whether we specify a default GC algorithm or not.
We also chose a default max heap size, which is pretty low (maybe
we could afford a somewhat larger max heap now?). But we know
exactly what it is set to, by default, which also tells us more
about the hardware configuration before startup. If that default
max heap size (still the most coarse tuning knob) is not what the
webapp should have, which will often be the case, then the system
administrator must edit the defaults file and change it, and
while they're in there they can certainly change the GC algorithm
default if they'd like to. About not knowing which CPU(s) are in
the machine: most CPUs are now multicore, and much of the worry
about single core versus multicore seems to have gone away
because even with a single chip you still have multiple
processors. About the only assumption we're making here is that
we're not trying to choose defaults for old machines. We're
assuming at least single chip, multicore.

As I said in my first comment: Of course we're only talking about
the package defaults and general recommendations, and it is up to
the user to carry out some performance testing to find what works
best for their webapp(s), with their own performance goals in
mind.

Gabriel Nell (gabriel-nell) wrote :

The point that "we know something the JVM can't know", e.g., that this JVM is running a web server, is fair. I'm still not convinced that specifying the CMS collector is the right application of this knowledge, and I'm new to Ubuntu (from a contributor standpoint, anyway) so I don't know what the project goals are for the default configuration. But for now let me move down what I hope is a fruitful path. It sounds like (and please correct me if I'm wrong) we want to choose settings that:

1) at a minimum, "work" (e.g., after running apt-get, Tomcat actually boots up and serves requests)
2) are reasonable defaults for the sort of hardware we expect people to run web servers on

Currently our defaults are:

-Xmx128M -XX:+UseConcMarkSweepGC

That satisfies #1, but I still think it is not great for #2. Maybe, as you alluded, we'd do better by not specifying the maximum heap size? The rationale was that 64MB is not usually enough, and that made sense under JDK 1.4, when this was the default heap size. However, it looks like in 1.5 and later the heap selection was changed to be based on the amount of available RAM:

***
initial heap size:
Larger of 1/64th of the machine's physical memory on the machine or some reasonable minimum. Before J2SE 5.0, the default initial heap size was a reasonable minimum, which varies by platform. You can override this default using the -Xms command-line option.

maximum heap size:
Smaller of 1/4th of the physical memory or 1GB. Before J2SE 5.0, the default maximum heap size was 64MB. You can override this default using the -Xmx command-line option.
***

So if we don't give any heap parameters to the JVM:

- The heap will be at least 128MB (satisfy goal #1) on any machine with 512MB or more of RAM
- The JVM is free to choose a heap much larger than 128MB, based on the available memory (satisfy goal #2)

For this reason I'd suggest we remove the "-Xmx128M" portion from the command line.
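
The arithmetic behind the 512MB figure can be checked with a small shell sketch of the maximum-heap rule exactly as quoted above (smaller of 1/4 of physical memory or 1GB). `default_max_heap_mb` is a hypothetical helper name; the real JVM ergonomics may depend on other factors not covered here.

```shell
#!/bin/sh
# Hypothetical helper: default max heap in MB for a given amount of
# physical memory in MB, following the quoted rule:
# smaller of 1/4 of physical memory or 1GB (1024MB).
default_max_heap_mb() {
    phys_mb="$1"
    quarter=$(( phys_mb / 4 ))
    if [ "$quarter" -gt 1024 ]; then
        echo 1024
    else
        echo "$quarter"
    fi
}
```

Under this rule, `default_max_heap_mb 512` gives 128, matching the current `-Xmx128M` setting, while a machine with 256MB of RAM would get only 64.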

References:
http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html#par_gc.ergonomics.default_size
http://java.sun.com/javase/6/docs/technotes/guides/vm/gc-ergonomics.html
http://java.sun.com/docs/hotspot/gc5.0/ergo5.html

Aaron J. Zirbes (ajz) wrote : Re: [Bug 541520] Re: Using incorrect JVM Garbage Collector

I'm OK with moving the "-Xmx128M" parameter out onto one of the comment
lines, as this is a commonly known and understood parameter. Most Google
(Bing?) searches will advise you to adjust this parameter when you
run into memory problems.

If we want to ensure that we don't assign "UseConcMarkSweepGC" on
single-core processors, the installer script could check for this by
doing something sneaky like...

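# Count the logical processors listed in /proc/cpuinfo
CORE_COUNT=$(grep -c '^processor[[:space:]]*:' /proc/cpuinfo)
# Only enable CMS when more than one processor is available
if [ "$CORE_COUNT" -gt 1 ]; then
    JAVA_OPTS="$JAVA_OPTS -XX:+UseConcMarkSweepGC"
fi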

As I've never written post-install scripts I'm not sure how hard this
would be, but it's a thought.

Theoretically it could be added to /etc/init.d/tomcat6...

(P.S. Don't trust my BASH syntax, I didn't test it. Treat it as
pseudo-code.)

--
Aaron

Thierry Carrez (ttx) wrote :

This bug being closed, could we move this (interesting) discussion to a new bug report, something like "further refining of Tomcat's JVM default parameters"? I'm all open to improvement in that area, but I'm far from being a Tomcat memory expert or someone who knows how it's used by most people, so your input is invaluable.

Gabriel Nell (gabriel-nell) wrote :

Done (#568823). As I say I'm still not convinced about the CMS collector as a default, but I'll follow up later if I become more passionate about it. :)
