Summary Day-2

What we discussed today

JVM architecture

  1. Java memory management: the JVM manages memory automatically, so the user does not have to manage it explicitly. It does this with the Garbage Collector. Garbage collection is the process of automatically reclaiming memory that is no longer used at runtime. In other words, it is a way to destroy unused objects. The JVM divides memory into 3 parts

    1. Stack -- All variables scoped exclusively to a method are stored here, including method parameters and local variables (for objects, only the reference lives on the stack). They are cleared automatically when the method finishes executing, by popping the stack entry off in LIFO order.
    2. Heap -- This has nothing to do with the heap data structure. Any object created using the new operator is stored here. This memory is shared across all threads and is cleared by the Garbage Collector.
    3. PermGen -- This is a special memory area, separate from the main heap area, that stores metadata about classes and methods as well as static items, i.e. anything defined with the static keyword. The string pool was also part of this space up to Java 6 and was moved to the main heap in Java 7, as the space in here is limited. Apart from the above, this space also stored JIT information and bytecode. Because PermGen had a fixed maximum size, out-of-memory errors were pretty common in Java 7 and below. In Java 8 PermGen was removed and replaced with Metaspace, which lives in native memory rather than in the Java heap. Class metadata in Metaspace is still garbage collected, and by default Metaspace can grow automatically based on need, which was not possible with PermGen.
    4. It is possible to tune both PermGen and Metaspace by using the following parameters. Note that Metaspace parameters only work with Java 8 and above and PermGen parameters only work with Java 7 and below.
      1. PermGen parameters: -XX:PermSize=200m sets the initial PermGen size to 200MB and -XX:MaxPermSize=300m sets the maximum PermGen size to 300MB.
      2. Metaspace parameters are as follows
        1. -XX:MetaspaceSize and -XX:MaxMetaspaceSize -- MetaspaceSize sets the size of class metadata that triggers a garbage collection the first time it is exceeded, and MaxMetaspaceSize sets the Metaspace upper bound (unlimited by default).
        2. -XX:MinMetaspaceFreeRatio -- the minimum percentage of class metadata capacity free after a garbage collection
        3. -XX:MaxMetaspaceFreeRatio -- the maximum percentage of class metadata capacity free after a garbage collection, used to avoid a reduction in the amount of space
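
One way to look at these memory areas on a running JVM is to list its memory pools. The sketch below uses the standard java.lang.management API; the class name MemoryPoolsSketch is made up for illustration, and the pool names it prints depend on the JVM version and the garbage collector in use. On HotSpot with Java 8 or later, Metaspace typically shows up as a non-heap pool.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, named only for this example.
class MemoryPoolsSketch {

    // Returns one line per memory pool: the pool type as reported by the JVM
    // (heap or non-heap) followed by the pool name.
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            lines.add(pool.getType() + " : " + pool.getName());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Output varies by JVM and collector, so no expected output is shown.
        describePools().forEach(System.out::println);
    }
}
```

Running it once with the default collector and once with a flag such as -XX:+UseSerialGC is a quick way to see that pool names change with the collector.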
  2. Garbage collection in Oracle's HotSpot JVM is based on the mark-and-sweep algorithm, which works as follows. Reference

    1. Before we understand the garbage collection process let's see different sections of the heap.

      1. Heap is divided into 3 parts.
        1. Young Generation
        2. Old Generation / Tenured Generation
        3. Permanent Generation (PermGen), in Java 7 and below
      2. Young Generation is further divided into 3 parts
        1. Eden Space
        2. Survivor Space
          1. Survivor space 0 (S0)
          2. Survivor space 1 (S1)
    2. When a new object is created it is first stored in the Eden space. When the garbage collector starts a minor cycle, it marks all objects in the young generation that are still referenced; whatever is left unmarked is garbage. Unreferenced objects in Eden are then cleared and the objects that still have references are moved to S0. In the next minor cycle unreferenced objects are cleared as usual, but all referenced objects in S0 and Eden are moved to S1. In the cycle after that, unreferenced objects are cleared again and the referenced ones are moved from S1 and Eden back to S0, so the two survivor spaces keep swapping roles. As this process keeps happening, each surviving object has a counter attached that tracks how many cycles it has survived.

    3. Once the counter reaches a certain threshold, the object is moved from the young generation to the old (tenured) generation.

    4. A major cycle cleans up the tenured generation. It runs less frequently because it is slow: it involves a lot of long-lived objects that probably don't need cleaning as they are still in use, so a lot of the work is wasted. Keeping those objects out of the young generation is what keeps minor cycles fast, since they track far fewer objects.

    5. Keeping GC cycles fast is important because GC is stop-the-world: the application pauses while GC is running so that memory integrity stays intact.
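
The minor-cycle bookkeeping described above can be sketched as a toy simulation. This is not how HotSpot is implemented: objects are just names, reachability is passed in as a set, and the class name, method names and the threshold of 3 are all assumptions made for illustration. It needs Java 9 or later because of Set.of.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of the young generation; every name here is hypothetical.
class YoungGenSketch {
    // Assumed promotion threshold, chosen only to keep the demo short.
    static final int TENURING_THRESHOLD = 3;

    final List<String> eden = new ArrayList<>();
    List<String> fromSurvivor = new ArrayList<>(); // survivor space currently holding objects
    List<String> toSurvivor = new ArrayList<>();   // survivor space kept empty between cycles
    final List<String> oldGen = new ArrayList<>();
    final Map<String, Integer> age = new HashMap<>(); // minor cycles survived per object

    // New objects always start in Eden with age 0.
    void allocate(String name) {
        eden.add(name);
        age.put(name, 0);
    }

    // One minor cycle: drop unreachable objects, copy survivors to the empty
    // survivor space (or promote them), then swap the two survivor spaces.
    void minorGc(Set<String> reachable) {
        List<String> candidates = new ArrayList<>(eden);
        candidates.addAll(fromSurvivor);
        for (String obj : candidates) {
            if (!reachable.contains(obj)) {
                age.remove(obj); // unreferenced: reclaimed
                continue;
            }
            int survived = age.merge(obj, 1, Integer::sum);
            if (survived >= TENURING_THRESHOLD) {
                oldGen.add(obj); // promoted to the old generation
            } else {
                toSurvivor.add(obj);
            }
        }
        eden.clear();
        fromSurvivor.clear();
        List<String> swap = fromSurvivor;
        fromSurvivor = toSurvivor;
        toSurvivor = swap;
    }

    public static void main(String[] args) {
        YoungGenSketch heap = new YoungGenSketch();
        heap.allocate("cache"); // stays referenced
        heap.allocate("temp");  // becomes garbage immediately
        for (int cycle = 1; cycle <= 3; cycle++) {
            heap.minorGc(Set.of("cache"));
            System.out.println("cycle " + cycle + ": survivor=" + heap.fromSurvivor
                    + " old=" + heap.oldGen);
        }
    }
}
```

In this sketch "temp" disappears in the first cycle, while "cache" bounces between the survivor spaces until its counter reaches the threshold and it is promoted.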

    6. Now that we understand how GC works, let's formally see what the mark-and-sweep algorithm is and the different steps involved in it

      1. Mark -- In this phase all the objects which are reachable from the root objects are marked as live. Root objects are objects which are directly referenced by the program. For example, all the objects which are referenced by local variables in the stack are marked as live. All the objects which are not marked as live are considered as garbage.
      2. Sweep -- In this phase all the objects which are not marked as live are deleted from the memory and the memory is reclaimed.
      3. Compaction -- In this phase all the live objects are moved to one end of the memory and the free space is moved to the other end of the memory. This is done to avoid fragmentation of memory.
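
The three phases above can be sketched on a tiny hand-built object graph. This is a minimal illustration, assuming the heap is a plain array of slots and each object keeps an explicit list of references; the class and method names are hypothetical and real collectors work on raw memory rather than Java objects.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy mark / sweep / compact over an array of slots; all names are hypothetical.
class MarkSweepCompactSketch {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>(); // outgoing references
        boolean marked;
        Obj(String name) { this.name = name; }
    }

    // Mark: everything reachable from the roots is flagged as live.
    static void mark(List<Obj> roots) {
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj current = pending.pop();
            if (current.marked) continue; // already visited, also handles cycles
            current.marked = true;
            pending.addAll(current.refs);
        }
    }

    // Sweep: unmarked slots are freed; marks are reset for the next cycle.
    static void sweep(Obj[] heap) {
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] == null) continue;
            if (heap[i].marked) {
                heap[i].marked = false;
            } else {
                heap[i] = null;
            }
        }
    }

    // Compact: live objects slide to the front, free slots end up at the back.
    static void compact(Obj[] heap) {
        int next = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null) {
                Obj live = heap[i];
                heap[i] = null;
                heap[next++] = live;
            }
        }
    }

    // Renders the heap as one character per slot, '_' meaning free.
    static String layout(Obj[] heap) {
        StringBuilder sb = new StringBuilder();
        for (Obj slot : heap) {
            sb.append(slot == null ? "_" : slot.name);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c"), d = new Obj("d");
        a.refs.add(b); // a -> b, and a is a root
        d.refs.add(a); // d points at a live object but nothing points at d
        Obj[] heap = {c, a, d, b, null};
        mark(List.of(a));
        sweep(heap);
        System.out.println(layout(heap)); // prints "_a_b_"
        compact(heap);
        System.out.println(layout(heap)); // prints "ab___"
    }
}
```

Note that d is collected even though it holds a reference to a live object: what matters is whether an object can be reached from a root, not whether it references something.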
  3. There are 4 types of Garbage Collectors in HotSpot JVM

    1. Serial GC -- This is the simplest GC, which uses only one thread to do all the work. This GC is best suited for single-core machines. This GC is used by default when we run java with the -client option.
    2. Parallel GC -- This GC uses multiple threads to do the work. This GC is best suited for multi-core machines. This GC is used by default when we run java with the -server option.
    3. Concurrent Mark Sweep (CMS) GC -- The Concurrent Mark Sweep (CMS) collector is designed for applications that prefer shorter garbage collection pauses and that can afford to share processor resources with the garbage collector while the application is running. Typically applications that have a relatively large set of long-lived data (a large tenured generation) and run on machines with two or more processors tend to benefit from the use of this collector. However, this collector should be considered for any application with a low pause time requirement. The CMS collector is enabled with the command-line option -XX:+UseConcMarkSweepGC. Reference
    4. Garbage First (G1) GC -- The Garbage-First (G1) garbage collector is a server-style garbage collector, targeted for multiprocessor machines with large memories. It attempts to meet garbage collection (GC) pause time goals with high probability while achieving high throughput. Whole-heap operations, such as global marking, are performed concurrently with the application threads. This prevents interruptions proportional to heap or live-data size. Reference
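
To check which collector a JVM is actually running with, the standard java.lang.management API can list the active collector beans. The sketch below is only an illustration: the class name is made up, and the bean names it prints differ between collectors and JVM versions. A collector can usually be selected explicitly with flags such as -XX:+UseSerialGC, -XX:+UseParallelGC or -XX:+UseG1GC (the CMS flag is given above, although recent JDKs no longer ship CMS).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, named only for this example.
class ActiveCollectorsSketch {

    // Returns the name of each active collector with its collection count so far.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName() + " (collections so far: " + gc.getCollectionCount() + ")");
        }
        return names;
    }

    public static void main(String[] args) {
        // Output depends on the JVM and its flags, so none is shown here.
        collectorNames().forEach(System.out::println);
    }
}
```

Running the same class with different -XX flags shows the young and old generation collectors changing names.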
  4. APIs are used to expose functionality of an application to other applications or users. For example, if you want to use google maps in your application you can use google maps API to do so. Similarly if you want to use facebook login in your application you can use facebook login API to do so.

    1. So in general Application Programming Interfaces, or APIs, are nothing but tools/libraries provided by a company or an author so that others can use their application or library. In a general sense, any external function or method exposed either as an SDK/library or over a network using a protocol such as HTTP or RPC is called an API.
    2. REST APIs are HTTP-based APIs that typically use JSON for data interchange
    3. SOAP APIs use XML for data interchange and are usually served over HTTP
    4. Further reading
      1. IBM - What's an API
      2. AWS - What's an API
  5. HTTP Structure -- There are 2 main parts to using HTTP: first we make a request to fetch/modify an HTTP resource, then the server replies with a response.

    1. Both the HTTP request and the HTTP response have 2 parts: headers and body.
    2. The request header section starts with a request line giving the HTTP method, the target path and the HTTP protocol version, followed by additional metadata describing content type, content length, encoding, etc.
    3. The response header section starts with a status line giving the HTTP protocol version and the HTTP status code, followed by additional metadata describing content type, content length, encoding, etc.
    REQUEST 
        HEADERS
        BODY
    
    RESPONSE
        HEADERS
        BODY - JSON => REST / XML => SOAP
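
The layout above can be made concrete by splitting a raw HTTP message into its parts. This is a minimal sketch, assuming a well-formed HTTP/1.1 text message with CRLF line endings; the class name and the sample request (host api.example.com, path /users) are made up for illustration, and a real application would rely on an HTTP library instead of parsing by hand.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical parser for illustration only; not a complete HTTP implementation.
class HttpMessageSketch {
    final String startLine;                                    // request line or status line
    final Map<String, String> headers = new LinkedHashMap<>(); // header name -> value
    final String body;                                         // everything after the blank line

    HttpMessageSketch(String raw) {
        // A blank line (CRLF CRLF) separates the header section from the body.
        int split = raw.indexOf("\r\n\r\n");
        String head = split >= 0 ? raw.substring(0, split) : raw;
        body = split >= 0 ? raw.substring(split + 4) : "";

        String[] lines = head.split("\r\n");
        startLine = lines[0];
        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                headers.put(lines[i].substring(0, colon).trim(),
                            lines[i].substring(colon + 1).trim());
            }
        }
    }

    public static void main(String[] args) {
        String raw = "POST /users HTTP/1.1\r\n"
                + "Host: api.example.com\r\n"
                + "Content-Type: application/json\r\n"
                + "\r\n"
                + "{\"name\":\"Ada\"}";
        HttpMessageSketch request = new HttpMessageSketch(raw);
        System.out.println(request.startLine); // prints "POST /users HTTP/1.1"
        System.out.println(request.headers);
        System.out.println(request.body);      // prints {"name":"Ada"}
    }
}
```

The same class also parses a response, where the start line would be a status line such as "HTTP/1.1 200 OK".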
    
    1. Common HTTP Methods are
    GET -> GET/READ
    POST -> CREATE/SEND
    PUT -> UPDATE/REPLACE
    DELETE -> DELETE
    
    1. Common HTTP Status Codes are
    HTTP
    2XX - SUCCESS
    3XX - REDIRECT
    4XX - CLIENT ERROR
    5XX - SERVER ERROR
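
Since the class of a status code is determined by its first digit, the grouping above can be expressed as a small function. This is a sketch with a made-up class name; the 1XX (informational) range is not in the list above and is included only for completeness.

```java
// Hypothetical helper that maps a status code to its class by the first digit.
class HttpStatusSketch {

    static String classify(int code) {
        switch (code / 100) {
            case 1: return "INFORMATIONAL"; // not listed above, added for completeness
            case 2: return "SUCCESS";
            case 3: return "REDIRECT";
            case 4: return "CLIENT ERROR";
            case 5: return "SERVER ERROR";
            default: return "UNKNOWN";
        }
    }

    public static void main(String[] args) {
        System.out.println(200 + " -> " + classify(200)); // prints "200 -> SUCCESS"
        System.out.println(404 + " -> " + classify(404)); // prints "404 -> CLIENT ERROR"
        System.out.println(503 + " -> " + classify(503)); // prints "503 -> SERVER ERROR"
    }
}
```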
    

Things I misunderstood or got wrong

  1. 1.3 PermGen -- see item 1.3 above, along with the PermGen and Metaspace tuning parameters in item 1.4.
  2. 2 Re-read garbage collection: I made it a bit clearer than what we discussed in the morning. Plus I've found new images and docs that make understanding easier. Yay!!!
  3. 3 There are 4 types of Garbage Collectors in HotSpot JVM (Serial, Parallel, CMS and G1) -- see item 3 above.

What I missed

  1. What's curl? -- curl is a tool used to make network requests over various protocols, including HTTP. So chances are that any networking task we can do with a browser or a programming language can also be done with curl
  2. Link between API and HTTP -- APIs are like functions; when they are exposed over a network such as the internet, we can use HTTP to access them
  3. How do you use Web APIs? -- We access them by using HTTP client libraries provided in various languages
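
As an example of such a client library, Java 11 and later ship java.net.http.HttpClient. The sketch below builds a GET request and shows how it would be sent. The URL https://api.example.com/users/42 is a hypothetical endpoint used only for illustration, so main only prints the request instead of sending it; the class and method names are also made up.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical example class; requires Java 11 or later.
class WebApiCallSketch {
    // Hypothetical endpoint, not a real service.
    static final String EXAMPLE_URL = "https://api.example.com/users/42";

    // Builds a GET request that asks the server for a JSON response.
    static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder()
                .uri(URI.create(url))
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    // Sends the request and returns "<status code> <body>".
    // Not called from main because the example URL does not exist.
    static String fetch(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response =
                client.send(buildRequest(url), HttpResponse.BodyHandlers.ofString());
        return response.statusCode() + " " + response.body();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest(EXAMPLE_URL);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

The roughly equivalent curl call would be: curl -H "Accept: application/json" followed by the URL.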