Monday, March 17, 2008

Garbage Collection

How Does the Garbage Collector Work?
You just can't be sure. You might hear that the garbage collector uses a mark and sweep algorithm, and for any given Java implementation that might be true, but the Java specification doesn't guarantee any particular implementation. You might hear that the garbage collector uses reference counting; once again maybe yes maybe no. The important concept to understand is when does an object become eligible for garbage collection? In a nutshell, every Java program has from one to many threads. Each thread has its own little execution stack. Normally, you (the programmer) cause at least one thread to run in a Java program, the one with the main() method at the bottom of the stack. However, there are many really cool reasons to launch additional threads from your initial thread. In addition to having its own little execution stack, each thread has its own life cycle. For now, all we need to know is that threads can be alive or dead. With this background information, we can now say with stunning clarity and resolve that an object is eligible for garbage collection when no live thread can access it. (Note: Due to the vagaries of the String constant pool, the exam focuses its garbage collection questions on non-String objects, and so our garbage collection discussions apply to only non-String objects too.)

Based on that definition, the garbage collector does some magical, unknown operations, and when it discovers an object that can't be reached by any live thread, it will consider that object as eligible for deletion, and it might even delete it at some point. (You guessed it; it also might not ever delete it.) When we talk about reaching an object, we're really talking about having a reachable reference variable that refers to the object in question. If our Java program has a reference variable that refers to an object, and that reference variable is available to a live thread, then that object is considered reachable. We'll talk more about how objects can become unreachable in the following section.

Can a Java application run out of memory? Yes. The garbage collection system attempts to remove objects from memory when they are not used. However, if you maintain too many live objects (objects referenced from other live objects), the system can run out of memory. Garbage collection cannot ensure that there is enough memory, only that the memory that is available will be managed as efficiently as possible.

Writing Code that Explicitly Makes Objects Eligible for Collection
Nulling a Reference
The first way to remove a reference to an object is to set the reference variable that refers to the object to null. Examine the following code:

1. public class GarbageTruck {
2. public static void main(String [] args) {
3. StringBuffer sb = new StringBuffer("hello");
4. System.out.println(sb);
5. // The StringBuffer object is not eligible for collection
6. sb = null;
7. // Now the StringBuffer object is eligible for collection
8. }
9. }

The StringBuffer object with the value hello is assigned to the reference variable sb in the third line. To make the object eligible (for GC), we set the reference variable sb to null, which removes the single reference that existed to the StringBuffer object. Once line 6 has run, our happy little hello StringBuffer object is doomed, eligible for garbage collection.

Reassigning a Reference Variable
We can also decouple a reference variable from an object by setting the reference variable to refer to another object. Examine the following code:

class GarbageTruck {
public static void main(String [] args) {
StringBuffer s1 = new StringBuffer("hello");
StringBuffer s2 = new StringBuffer("goodbye");
System.out.println(s1);
// At this point the StringBuffer "hello" is not eligible
s1 = s2; // Redirects s1 to refer to the "goodbye" object
// Now the StringBuffer "hello" is eligible for collection
}
}

Objects that are created in a method also need to be considered. When a method is invoked, any local variables created exist only for the duration of the method. Once the method has returned, the objects created in the method are eligible for garbage collection. There is an obvious exception, however. If an object is returned from the method, its reference might be assigned to a reference variable in the method that called it; hence, it will not be eligible for collection. Examine the following code:

import java.util.Date;
public class GarbageFactory {
public static void main(String [] args) {
Date d = getDate();
doComplicatedStuff();
System.out.println("d = " + d) ;
}

public static Date getDate() {
Date d2 = new Date();
StringBuffer now = new StringBuffer(d2.toString());
System.out.println(now);
return d2;
}
}

In the preceding example, we created a method called getDate() that returns a Date object. This method creates two objects: a Date and a StringBuffer containing the date information. Since the method returns the Date object, it will not be eligible for collection even after the method has completed. The StringBuffer object, though, will be eligible, even though we didn't explicitly set the now variable to null.

Isolating a Reference
There is another way in which objects can become eligible for garbage collection, even if they still have valid references! We call this scenario "islands of isolation."

A simple example is a class that has an instance variable that is a reference variable to another instance of the same class. Now imagine that two such instances exist and that they refer to each other. If all other references to these two objects are removed, then even though each object still has a valid reference, there will be no way for any live thread to access either object. When the garbage collector runs, it can usually discover any such islands of objects and remove them. As you can imagine, such islands can become quite large, theoretically containing hundreds of objects. Examine the following code:


public class Island {
Island i ;
public static void main(String [] args) {

Island i2 = new Island();
Island i3 = new Island();
Island i4 = new Island();

i2.i = i3 // i2 refers to i3
i3.i = i4 // i3 refers to i4
i4.i = i2 // i4 refers to i2

i2 = null
i3 = null
i4 = null

// do complicated, memory intensive stuff
}
}

When the code reaches // do complicated, the three Island objects (previously known as i2,i3, and i4) have instance variables so that they refer to each other, but their links to the outside world (i2, i3, and i4) have been nulled. These three objects are eligible for garbage collection.

This covers everything you will need to know about making objects eligible for garbage collection. Study Figure below to reinforce the concepts of objects without references and islands of isolation.




Forcing Garbage Collection
The first thing that should be mentioned here is that, contrary to this section's title, garbage collection cannot be forced. However, Java provides some methods that allow you to request that the JVM perform garbage collection. For example, if you are about to perform some time-sensitive operations, you probably want to minimize the chances of a delay caused by garbage collection. But you must remember that the methods that Java provides are requests, and not demands; the virtual machine will do its best to do what you ask, but there is no guarantee that it will comply.

In reality, it is possible only to suggest to the JVM that it perform garbage collection. However, there are no guarantees the JVM will actually remove all of the unused objects from memory (even if garbage collection is run). It is essential that you understand this concept.

The garbage collection routines that Java provides are members of the Runtime class. The Runtime class is a special class that has a single object (a Singleton) for each main program. The Runtime object provides a mechanism for communicating directly with the virtual machine. To get the Runtime instance, you can use the method Runtime.getRuntime(), which returns the Singleton. Once you have the Singleton you can invoke the garbage collector using the gc() method. Alternatively, you can call the same method on the System class, which has static methods that can do the work of obtaining the Singleton for you. The simplest way to ask for garbage collection (remember—just a request) is
  System.gc();

Theoretically, after calling System.gc(), you will have as much free memory as possible. We say theoretically because this routine does not always work that way. First, your JVM may not have implemented this routine; the language specification allows this routine to do nothing at all. Second, another thread might grab lots of memory right after you run the garbage collector.

This is not to say that System.gc() is a useless method—it's much better than nothing. You just can't rely on System.gc() to free up enough memory so that you don't have to worry about running out of memory.

Now that we are somewhat familiar with how this works, let's do a little experiment to see if we can see the effects of garbage collection. The following program lets us know how much total memory the JVM has available to it and how much free memory it has. It then creates 10,000 Date objects. After this, it tells us how much memory is left and then calls the garbage collector (which, if it decides to run, should halt the program until all unused objects are removed). The final free memory result should indicate whether it has run. Let's look at the program:

1. import java.util.Date;
2. public class CheckGC {
3. public static void main(String [] args) {
4. Runtime rt = Runtime.getRuntime();
5. System.out.println("Total JVM memory: "+ rt.totalMemory());
6. System.out.println("Before Memory = "+ rt.freeMemory());
7. Date d = null;
8. for(int i = 0; i<10000; i-n-) {
9 . d = new Date () ;
10. d = null;
11. }
12. System.out.println("After Memory = "+ rt.freeMemory());
13. rt.gc(); // an alternate to System.gc()
14. System.out.println("After GC Memory = "+ rt.freeMemory());
15. }
16. }

Now, let's run the program and check the results:
Total JVM memory: 1048568
Before Memory = 703008
After Memory = 458048
After GC Memory = 818272

As we can see, the JVM actually did decide to garbage collect (i.e., delete) the eligible objects. In the preceding example, we suggested to the JVM to perform garbage collection with 458,048 bytes of memory remaining, and it honored our request. This program has only one user thread running, so there was nothing else going on when we called rt.gc(). Keep in mind that the behavior when gc() is called may be different for different JVMs, so there is no guarantee that the unused objects will be removed from memory. About the only thing you can guarantee is that if you are running very low on memory,

Tricky Little finalize() Gotcha's
There are a couple of concepts concerning finalize() that you need to remember.

For any given object, finalize () will be called only once (at most) by the garbage collector.

Calling finalize() can actually result in saving an object from deletion.

Let's look into these statements a little further. First of all, remember that any code that you can put into a normal method you can put into finalize(). For example, in the finalize() method you could write code that passes a reference to the object in question back to another object, effectively uneligiblizing the object for garbage collection. If at some point later on this same object becomes eligible for garbage collection again, the garbage collector can still process this object and delete it. The garbage collector, however, will remember that, for this object, finalize() already ran, and it will not run finalize() again.

No comments:

Post a Comment