Garbage Collector

The garbage collector (GC) checks the heap for objects that are no longer used by the application. If such objects exist, the GC removes them from the heap.

Process of finding unused objects

Every application maintains a set of roots. Roots are like pointers to objects on the heap. All global and static object pointers are considered application roots, as is any local variable on a thread's stack. This list of roots is maintained by the JIT compiler and the CLR and is made available to the GC.

As far as the GC is concerned, there are different kinds of references to an object:

1. Strong Reference – If a strong reference to an object exists, the object is considered in use and is not collected during the next garbage collection.

2. Weak Reference – This also refers to an object, but it does not keep the object alive: the object is guaranteed to live only until the next garbage collection, after which it may be gone. A weak reference therefore works like a cache for the object.
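The difference between the two kinds of reference can be illustrated outside .NET. The following is a minimal sketch in Python using its standard weakref module, offered as an analogy rather than as .NET code. The class name Report is hypothetical, and CPython reclaims the object as soon as the last strong reference disappears, whereas the CLR would reclaim it only at a later collection. In .NET the corresponding type is System.WeakReference.

```python
import gc
import weakref


class Report:
    """Hypothetical stand-in for an object that is large but cheap to rebuild."""

    def __init__(self, name):
        self.name = name


strong = Report("quarterly")            # strong reference: keeps the object alive
weak = weakref.ref(strong)              # weak reference: does not keep it alive

alive_while_strong = weak() is strong   # target is still reachable via `strong`

del strong                              # drop the only strong reference
gc.collect()                            # request a collection (CPython has already freed it)

alive_after_drop = weak() is not None   # the weak reference now returns None
```

Code that reads through a weak reference always has to handle the case where the target is gone, typically by rebuilding the object.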

When the GC starts running, it treats every object as garbage, assuming that none of the objects on the heap are accessible. It then walks the list of application roots and builds a graph of accessible objects. An object is marked accessible if it is reachable directly from an application root or indirectly through another accessible object. For each application, the GC maintains this graph of references to track the objects the application still uses.

Using this approach, the GC builds a list of live objects and then walks the heap looking for objects that are not in that list. It treats those objects as garbage and compacts the memory to remove the holes left by the unreferenced (dead) objects. It uses the memcpy function to move objects from one memory location to another and updates the application roots, along with any pointers between objects, to point to the new locations.

If there is a live reference to an object, the object is said to be strongly rooted. .NET also has the concept of a weak reference. Any object can be held through a weak reference, which tells the GC that we still want to access the object but that it may collect the object if a garbage collection runs. Weak references are generally used for very large objects that are easy to create but costly to keep in memory.
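The walk from the roots can be sketched as a graph traversal. This is a toy model in Python, not the CLR's implementation: the heap is assumed to be a dictionary mapping a hypothetical object name to the names it references, and the function name mark_reachable is made up for illustration.

```python
def mark_reachable(heap, roots):
    """Return the set of object names reachable from the roots.

    heap  -- dict mapping an object name to the names it references (toy model)
    roots -- names held by globals, statics, and stack locals
    """
    live = set()
    pending = list(roots)
    while pending:
        obj = pending.pop()
        if obj in live or obj not in heap:
            continue                  # already visited, or not a heap object
        live.add(obj)
        pending.extend(heap[obj])     # follow outgoing references
    return live


# A -> B -> C is rooted; D and E reference each other but nothing roots them.
heap = {
    "A": ["B"],
    "B": ["C"],
    "C": [],
    "D": ["E"],
    "E": ["D"],
}
roots = ["A"]

live = mark_reachable(heap, roots)
garbage = sorted(set(heap) - live)    # everything never reached from a root
```

Because liveness is decided by reachability from the roots, D and E end up as garbage even though they still reference each other.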

GC Sequence

These steps occur during each garbage collection:

1. Execution Engine Suspension – The EE is suspended until all managed threads have reached a point in their code execution deemed “safe”

2. Mark – Objects that are reachable from a root are marked as live; objects that are not reachable from any root are garbage

3. Plan – The GC creates a budget for each generation being collected and then determines the amount of fragmentation that will exist in the managed heap as a result of the collection

4. Sweep – Deletes all objects identified as garbage

5. Compact – Moves all non-pinned objects that survived the collection to the lower end of the heap

6. Execution Engine Restart – Restart the execution of managed threads
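The sweep and compact steps above can be sketched on a toy heap. This is a simplified model in Python under stated assumptions, not what the CLR actually does: the Block type, its fields, and the function sweep_and_compact are all hypothetical, the mark phase is assumed to have already set the marked flag, and addresses are plain integers.

```python
from dataclasses import dataclass


@dataclass
class Block:
    name: str
    address: int
    size: int
    marked: bool          # set by the mark phase: True means reachable
    pinned: bool = False  # pinned objects must not move


def sweep_and_compact(blocks):
    """Drop unmarked blocks and slide movable survivors toward address 0.

    Returns (survivors, forwarding), where forwarding maps each survivor's
    old address to its new one so that roots and references can be updated.
    """
    survivors = []
    forwarding = {}
    next_free = 0
    for block in sorted(blocks, key=lambda b: b.address):
        if not block.marked:
            continue                      # sweep: garbage is simply not kept
        new_address = block.address if block.pinned else next_free
        forwarding[block.address] = new_address
        survivors.append(
            Block(block.name, new_address, block.size, True, block.pinned)
        )
        next_free = new_address + block.size
    return survivors, forwarding


heap = [
    Block("a", 0, 16, marked=True),
    Block("b", 16, 32, marked=False),
    Block("c", 48, 16, marked=True),
    Block("d", 64, 16, marked=False),
    Block("e", 80, 8, marked=True, pinned=True),
    Block("f", 88, 24, marked=True),
]

survivors, forwarding = sweep_and_compact(heap)
layout = [(b.name, b.address) for b in survivors]
```

In this model the pinned block e stays at address 80, so a gap remains below it after compaction. This illustrates how pinned objects can leave the heap fragmented.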

To improve performance, the GC uses several optimizations, such as the large object heap and generations. Objects of 85,000 bytes or more are allocated on the large object heap. Moving large objects in memory is costly, so the GC maintains a separate heap for them, which it does not compact by default.

The GC also organizes objects into generations. Whenever a new object is to be allocated and the managed heap doesn’t have enough memory for it, a garbage collection is performed. Every newly allocated object starts in Gen 0. Objects that survive a Gen 0 collection are moved to Gen 1, and objects that survive a Gen 1 collection are moved to Gen 2. The GC assumes that a new object will have a short lifetime and that an old object will have a longer one. Whenever more memory is required, the GC first tries to reclaim it from Gen 0; if a Gen 0 collection can’t free enough memory, a Gen 1 or even a Gen 2 collection is performed.
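The promotion rules can be sketched with a toy model. The Python class below is hypothetical and greatly simplified: it tracks only object names, takes the set of live objects as an input instead of computing it, and leaves the large object heap untouched (in the CLR, large objects are collected together with Gen 2).

```python
LOH_THRESHOLD = 85_000  # bytes; objects of this size or more go to the large object heap


class ToyHeap:
    """Toy model of generations; not the CLR's actual data structures."""

    def __init__(self):
        self.generations = [[], [], []]   # Gen 0, Gen 1, Gen 2
        self.large_objects = []

    def allocate(self, name, size):
        if size >= LOH_THRESHOLD:
            self.large_objects.append(name)
        else:
            self.generations[0].append(name)   # new small objects start in Gen 0

    def collect(self, max_gen, live):
        """Collect Gen 0..max_gen; survivors move up one generation (Gen 2 stays)."""
        for gen in range(max_gen, -1, -1):     # oldest first, so each object moves once
            survivors = [o for o in self.generations[gen] if o in live]
            self.generations[gen] = []
            self.generations[min(gen + 1, 2)].extend(survivors)


heap = ToyHeap()
heap.allocate("config", 200)
heap.allocate("temp", 64)
heap.allocate("image", 100_000)        # large enough for the large object heap

heap.collect(0, live={"config"})       # "temp" dies young; "config" moves to Gen 1
heap.allocate("request", 128)
heap.collect(1, live={"config"})       # "config" moves to Gen 2; "request" dies in Gen 0
```

Most of the work happens in Gen 0, where short-lived objects such as temp and request are reclaimed without touching the older generations.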

 
