Java Garbage Collectors

We explored garbage collection and its basic concepts in our previous post. One thing to understand straight away is that the JVM does not have a single type of garbage collector. It has four different ones, each with its own advantages and disadvantages. Which one to use ultimately depends on your application. The four collectors do have one feature in common: they are generational, which means they split the managed heap into segments based on object age (a young generation and an old generation). Now let us examine each of the collector types one by one.
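Before going through the collectors, it can help to see which ones your own JVM is actually running. The following is a minimal sketch using the standard java.lang.management API; the class name GcInfo and the helper collectorNames() are made up for this illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class GcInfo {
    // Returns the names of the collectors reported by the running JVM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // One bean is typically reported per generation (young and old).
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time(ms)=" + gc.getCollectionTime());
        }
    }
}
```

Run it with different collector flags, such as -XX:+UseSerialGC or -XX:+UseG1GC, and the reported names change accordingly. The exact names depend on the JVM vendor and version.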

Serial Collector

The simplest one, mainly for single-threaded environments (for example, 32-bit Windows) or small heaps. It freezes all application threads while it performs garbage collection, and it does the work on a single thread. That makes it suitable mainly for standalone applications; it is generally a poor fit for server-side environments, where long pauses are not acceptable. It can be turned on with the -XX:+UseSerialGC JVM argument.

Parallel (Throughput) Collector

The default collector for the JVM as of Java 8. Its biggest advantage is that it uses multiple threads to scan through the heap and compact it. The downside is that whenever it performs a minor or full collection, it stops all application threads, pausing the application. It is a good choice for applications that can tolerate pauses and where you want to keep the collector's overall CPU overhead low.

CMS Collector

The Concurrent Mark Sweep (CMS) collector, as the name indicates, uses multiple threads that run concurrently with the application to “mark” the heap for unused objects that can then be recycled (“sweep”). It goes into Stop the World mode in only two cases. The first is the initial marking of roots, that is, objects in the old generation reachable from thread entry points. The second is when the application has changed the heap state while the algorithm was running concurrently, forcing it to go back and do some final touches to make sure the right objects are marked.

How does this work?

Mark and sweep has two parts. As the name suggests, the mark phase identifies which objects are still in use, so that everything else can be deleted. Every new object starts with its mark status set to false (0). During marking, every reachable object has its mark status set to true (1). The algorithm does this with a depth-first search: every object is treated as a node, and all nodes (objects) reachable from it are visited in turn, starting from the roots. This continues until all reachable nodes have been visited.

In the sweep phase, all objects whose mark value is still false are deleted from the heap. The mark value of every surviving (reachable) object is then reset to false, so that the process can be run again in the next cycle. This kind of collector is also called a tracing garbage collector, because it traces out the entire set of objects that are directly or indirectly reachable.
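The two phases can be sketched in a few lines of Java. This is a toy model of the idea only, not how CMS is implemented inside the JVM: the Node class, the list standing in for the heap and the method names are all invented for the illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

class MarkSweepSketch {
    // A toy heap object: a mark bit plus outgoing references.
    static class Node {
        final String name;
        boolean marked = false;                    // new objects start unmarked
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Mark phase: iterative depth-first search starting from the roots.
    static void mark(List<Node> roots) {
        Deque<Node> stack = new ArrayDeque<>(roots);
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            if (n.marked) continue;                // already visited
            n.marked = true;
            for (Node r : n.refs) stack.push(r);
        }
    }

    // Sweep phase: remove unmarked objects and reset the mark on survivors.
    static int sweep(List<Node> heap) {
        int freed = 0;
        for (Iterator<Node> it = heap.iterator(); it.hasNext(); ) {
            Node n = it.next();
            if (n.marked) {
                n.marked = false;                  // ready for the next cycle
            } else {
                it.remove();
                freed++;
            }
        }
        return freed;
    }

    // root -> a -> b is reachable; c and d reference each other but are not.
    static int demo() {
        Node root = new Node("root"), a = new Node("a"), b = new Node("b");
        Node c = new Node("c"), d = new Node("d");
        root.refs.add(a);
        a.refs.add(b);
        c.refs.add(d);
        d.refs.add(c);
        List<Node> heap = new ArrayList<>(Arrays.asList(root, a, b, c, d));
        mark(Collections.singletonList(root));
        return sweep(heap);
    }

    public static void main(String[] args) {
        System.out.println("freed " + demo() + " objects"); // freed 2 objects
    }
}
```

Note that c and d are collected even though they point at each other. Tracing from the roots handles such cycles naturally, since neither object is ever reached.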

The major problem with CMS is that it can encounter a promotion failure, where a race condition occurs between collecting the young and old generations. What happens is that the collector has failed to free up enough space in the old generation to promote objects from the young generation. It then has to create that space first, which results in the very Stop the World pause it was meant to avoid. To reduce the chance of this, either increase the size of the old generation (or of the entire heap) or allocate more background threads to the collector.

CMS also uses more CPU than the parallel collector, since its scanning and collection threads run alongside the application in order to keep it responsive. It suits server applications that have to run constantly and cannot afford long pauses. It is enabled with the -XX:+UseConcMarkSweepGC flag; -XX:+UseParNewGC only selects the parallel young generation collector that is used alongside it.

G1 Collector

Introduced in Java 7, G1 was designed to better support heaps larger than 4 GB. It divides the heap into many regions and uses multiple background threads to scan them. It begins by collecting the regions that contain the most garbage, which accounts for its name, G1 (Garbage First). It can be enabled with the -XX:+UseG1GC flag.
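The "garbage first" ordering can be illustrated with a small sketch. This is a simplified model with hypothetical names (Region, garbageBytes, pickRegions); the real collector also weighs how expensive each region is to collect against its pause-time goal, which is left out here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class GarbageFirstSketch {
    // Hypothetical region: an id plus the amount of garbage found in it.
    static class Region {
        final int id;
        final long garbageBytes;
        Region(int id, long garbageBytes) {
            this.id = id;
            this.garbageBytes = garbageBytes;
        }
    }

    // Returns the ids of the `count` regions holding the most garbage.
    static List<Integer> pickRegions(List<Region> regions, int count) {
        List<Region> sorted = new ArrayList<>(regions);
        sorted.sort(Comparator.comparingLong((Region r) -> r.garbageBytes).reversed());
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < Math.min(count, sorted.size()); i++) {
            ids.add(sorted.get(i).id);
        }
        return ids;
    }

    static List<Integer> demo() {
        List<Region> regions = Arrays.asList(
                new Region(0, 100), new Region(1, 900),
                new Region(2, 400), new Region(3, 50));
        return pickRegions(regions, 2);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [1, 2]
    }
}
```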

There is a chance that the heap fills up before the background threads have finished scanning for unused objects. In that case we hit a Stop the World scenario, with the collector stopping the application until the scanning is done. Another advantage of G1 is that it compacts the heap on the go, whereas the CMS collector does this only during a full Stop the World collection.

G1 Collector String deduplication

Since strings, along with their internal char[] arrays, take up much of the heap, a new optimization was added in Java 8 (update 20). It enables the G1 collector to identify strings that are duplicated across the heap and make them point to the same internal char[] array, so that multiple copies of the same string content are avoided. It can be enabled with the -XX:+UseStringDeduplication JVM argument.
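The idea of sharing one backing array between equal strings can be sketched as follows. This is a toy model, not the JVM's implementation: ToyString is an invented stand-in for java.lang.String, and the lookup table is just a HashMap. In the real JVM the sharing happens inside the collector and is invisible to application code.

```java
import java.util.HashMap;
import java.util.Map;

class DedupSketch {
    // Toy stand-in for java.lang.String: a wrapper around a char[].
    static class ToyString {
        char[] value;
        ToyString(String s) { value = s.toCharArray(); }
    }

    // Makes ToyStrings with equal contents share a single backing array.
    // Returns how many wrappers were re-pointed at an existing array.
    static int deduplicate(ToyString[] strings) {
        Map<String, char[]> canonical = new HashMap<>();
        int replaced = 0;
        for (ToyString s : strings) {
            String key = new String(s.value);
            char[] shared = canonical.putIfAbsent(key, s.value);
            if (shared != null && shared != s.value) {
                s.value = shared;      // the old array becomes garbage
                replaced++;
            }
        }
        return replaced;
    }

    static int demo() {
        ToyString[] strings = {
            new ToyString("user"), new ToyString("admin"), new ToyString("user")
        };
        int replaced = deduplicate(strings);
        // Both "user" wrappers must now point at the same array.
        if (strings[0].value != strings[2].value) {
            throw new IllegalStateException("arrays were not shared");
        }
        return replaced;
    }

    public static void main(String[] args) {
        System.out.println("re-pointed " + demo() + " string(s)"); // re-pointed 1 string(s)
    }
}
```

The string objects themselves stay separate; only the character data behind them is shared, which is why the optimization is safe for code that compares references.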

Another improvement in Java 8 is the removal of the PermGen part of the heap. This space was meant for class metadata, static variables and interned strings. For larger applications it was imperative to tune this portion of the heap, and more often than not it still resulted in an OutOfMemoryError. Class metadata now lives in Metaspace, which is allocated from native memory and grows automatically by default, so the JVM itself takes care of sizing that previously had to be done by hand.
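On Java 8 and later HotSpot JVMs, class metadata is kept in a native-memory area called Metaspace in place of PermGen. A quick way to see this on your own JVM is to list its memory pools. The sketch below uses the standard java.lang.management API; the class name MemoryPools and the helper poolNames() are made up for this illustration, and the pool names you see depend on the JVM and the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

class MemoryPools {
    // Returns the names of all memory pools the running JVM exposes.
    static List<String> poolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            names.add(pool.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();   // may be null for an invalid pool
            long usedKb = (usage == null) ? -1 : usage.getUsed() / 1024;
            System.out.println(pool.getType() + "  " + pool.getName()
                    + "  used=" + usedKb + " KB");
        }
    }
}
```

On a Java 7 HotSpot JVM the same program would typically list a permanent generation pool instead of Metaspace.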



About Ratnakar Sadasyula

I am a 40-year-old blogger with a passion for movies, music, books, quizzing and politics. A techie by profession and a writer at heart, seeking to write my own book one day.