How Java Works? A Basic Introduction to the Java Virtual Machine

Rashmin Mudunkotuwa
6 min read · Aug 12, 2023

Photo by Mike Kenneally on Unsplash

The Java Virtual Machine (JVM) is the runtime on which programs written in the Java programming language execute. When running or developing Java programs, it is not essential to understand how a program executes or what goes on inside the JVM, so for most developers the JVM is a kind of magic box that just gets the job done. But knowing more about the very thing that drives Java and other related languages can be very beneficial to a programmer.

I’ve recently been digging into the internals of the JVM and thought of sharing my ideas about how Java works and what goes on inside the Java Virtual Machine.

Java Virtual Machine

The Java Virtual Machine (JVM) is an abstract machine that executes a particular type of code: bytecode. You can think of it as an intermediary between our code and the computer hardware: it takes our code as input, processes it, and executes it on the hardware, giving the developer the intended result.

Bytecode

Bytecode is the instruction set the JVM understands. It's an intermediate representation of a Java program, created by compiling Java source code (using javac) and stored in .class files. It gets its name because each opcode (operation code) is a single byte in size. At run time, bytecode can be interpreted by the JVM or compiled into machine code and run directly on the hardware.

Compilation

The first step in running a Java program is compiling it. If you have a single Java file, you can trigger the compilation using the provided command line tool javac:

javac HelloWorld.java

[Figure: Java compilation]

The above command compiles the given Java file and creates files of the type .class, which contain the bytecode. If there are any errors in the source code, the compilation fails with a compile-time error.
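For reference, here is a minimal sketch of what a HelloWorld.java file could look like. The article does not show the file, so the contents below (including the hypothetical greeting helper) are only an assumption for illustration.

```java
// HelloWorld.java (hypothetical example file, not taken from the article)
class HelloWorld {
    // Kept as a separate method so the text can be checked without capturing stdout.
    static String greeting() {
        return "Hello, World!";
    }

    // Entry point that the JVM looks for when we run "java HelloWorld".
    public static void main(String[] args) {
        System.out.println(greeting());
    }
}
```

Compiling this file with javac HelloWorld.java should produce a HelloWorld.class file next to it.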

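You can look inside the created class file using the provided tool javap. By default it lists the public, protected, and package-private members of the class; adding the -c flag (javap -c HelloWorld) also prints the bytecode instructions of each method.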

javap HelloWorld.class

Execution

After creating the .class file by compilation, we can start an instance of the JVM using the java command. This triggers an execution path with multiple complex steps, which ultimately results in our code being executed.

java HelloWorld

The first thing that happens is that the JVM locates our .class file and loads it into the JVM memory areas. This initial step is performed by the JVM classloaders.

What is Classloading?

In very abstract terms, classloading scans the provided .class file and loads its contents into the memory areas of the JVM. The execution engine can then refer to that stored data and carry on executing our code.

There are three built-in classloaders in the JVM, namely:

  1. Bootstrap Classloader
  2. Extension Classloader (replaced by the Platform Classloader since Java 9)
  3. Application Classloader

The Bootstrap Classloader's duty is to load the core Java classes which are essential for a Java program to run. In earlier Java versions, these core classes were included in the file rt.jar, located in the jre/lib directory, but since Java 9 the contents of rt.jar have been divided into modules.

The Extension Classloader's duty is to load the classes inside the lib/ext directory, which could include any extensions we use inside our code. Since Java 9 this extension mechanism has been removed, and a Platform Classloader loads the platform classes that are not handled by the Bootstrap Classloader.

The Application Classloader is the most used of the three and is responsible for loading user-defined classes. It scans the classpath of our program and loads the classes found there.
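The classloader hierarchy described above can be observed from inside a running program. The sketch below is only an illustration (the class name ClassLoaderDemo and the describe helper are made up for this example). It relies on the standard Class.getClassLoader() and ClassLoader.getParent() methods, where a null result stands for the Bootstrap Classloader.

```java
class ClassLoaderDemo {
    // The Java API represents the bootstrap classloader as null.
    static String describe(ClassLoader loader) {
        return loader == null ? "bootstrap" : loader.getClass().getName();
    }

    public static void main(String[] args) {
        // Core classes such as String are loaded by the bootstrap classloader.
        System.out.println("String -> " + describe(String.class.getClassLoader()));

        // Walk from the loader of this class up through its parents.
        ClassLoader loader = ClassLoaderDemo.class.getClassLoader();
        while (loader != null) {
            System.out.println("chain  -> " + describe(loader));
            loader = loader.getParent();
        }
        // The chain ends at the bootstrap classloader, which has no Java object.
        System.out.println("chain  -> bootstrap");
    }
}
```

The exact loader class names printed depend on the JVM and Java version, so treat the output as informational rather than fixed.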

Classloading Process

There are two main steps in the classloading process.

  1. Loading
  2. Linking

Loading

In the loading process, the classloader reads the binary representation of the class, i.e. the .class file, and creates a representation of it inside the JVM’s runtime memory. This per-class data is stored in the Method Area of the JVM memory, and a java.lang.Class object is created to represent the loaded class.

Linking

After the loading process, linking starts. There are 3 steps in linking.

  1. Verification: Making sure the class file is correct, i.e. verifying that the class conforms to the Java Virtual Machine specification.
  2. Preparation: Allocating memory for the static fields and assigning default values (not initial values!) to them.
  3. Resolving: Resolving the (symbolic) references inside the class file.

[Figure: compilation, linking and loading]
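The difference between the default values assigned during preparation and the initial values written in the source can be made visible with a small trick. The sketch below is only an illustration, and the class and member names are made up for this example. It reads a static field through a method before that field's own initializer has run.

```java
class PreparationDemo {
    // Static initializers run in textual order, so this line executes
    // before 'count' has been assigned its initial value of 10.
    static int seenEarly = peek();
    static int count = 10;

    static int peek() {
        return count; // at this point still the default value from preparation
    }

    public static void main(String[] args) {
        System.out.println("seenEarly = " + seenEarly); // prints "seenEarly = 0"
        System.out.println("count = " + count);         // prints "count = 10"
    }
}
```

Here seenEarly captures 0, the default value of an int, while count ends up holding its initial value 10 once initialization has finished.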

Resolving

In the resolving phase of linking, the symbolic references in the constant_pool table are resolved. The constant pool is a structure inside the .class file (and its in-memory representation), similar to a symbol table, which lists the classes, fields, and methods that the Java class refers to. In the class file, references to other classes are denoted symbolically, without a concrete memory address to refer to. Resolving looks up the JVM memory and replaces those symbolic references with concrete ones. If a referenced class has not been loaded yet, this triggers the loading and linking of that class as well, which can turn into a recursive process of loading and linking.

After the loading and linking of the bytecode, the class is stored in the JVM memory (which I will talk about in an upcoming section) and is ready for initialization.

Initialization

Initialization of a class is triggered when the class is first actively used in the code, e.g. by creating an instance with the new keyword, referencing a static field, or invoking a static method, or when it is the initial class specified when executing the program (e.g. the Main class).

In the initialization phase, the static blocks are executed and the static variables are assigned their initial provided values.
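To see that initialization is triggered by the first use of a class rather than at program start, a rough sketch like the following can help. The names InitializationDemo, Lazy, and LOG are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class InitializationDemo {
    // Records the order in which things happen.
    static final List<String> LOG = new ArrayList<>();

    static class Lazy {
        static int value = 42;           // non-constant static field
        static {
            LOG.add("Lazy initialized"); // static block runs during initialization
        }
    }

    static List<String> run() {
        LOG.add("before first use");
        int v = Lazy.value;              // first use of Lazy triggers its initialization
        LOG.add("value = " + v);
        return LOG;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

On a fresh JVM the entry "Lazy initialized" appears after "before first use", which suggests the static block ran only when Lazy.value was first read. Note that value is deliberately not declared final: a compile-time constant would be inlined by the compiler and would not trigger initialization.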

Runtime Memory Areas

In the above sections I’ve mentioned storing class file data in the JVM memory multiple times. Where exactly is this data stored as a result of loading, linking, and initializing? The answer is the Runtime Memory Areas.

The JVM Runtime Memory Area is a designated memory space divided into multiple sections, which store execution-related and class-related data.

I have listed the major areas of the Runtime Memory Area below.

Method Area

The Method Area is the part of the runtime memory which stores class-related data. The Runtime Constant Pool, field metadata, class metadata, method metadata, and the bytecode itself are stored inside the Method Area. It is shared between all the threads.

Program Counter (PC)

The program counter is a small memory area which stores the address of the currently executing instruction, which is essential information for the execution of a Java program. Each thread has its own PC.

Heap

Stores all the instances of classes and arrays. Shared between all the threads.

JVM Stack

Holds local variables and partial results. Contains Stack Frames. Each thread has its own JVM Stack.
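The split between the shared heap and the per-thread stacks can be sketched as follows. This is only an illustration with made-up names: two threads each keep their own local counter, while both update a single AtomicInteger object that lives on the heap.

```java
import java.util.concurrent.atomic.AtomicInteger;

class HeapVsStackDemo {
    static int run() {
        // One object on the heap, visible to both threads.
        AtomicInteger shared = new AtomicInteger();

        Runnable task = () -> {
            int local = 0; // lives in the executing thread's own stack frame
            for (int i = 0; i < 1000; i++) {
                local++;
            }
            shared.addAndGet(local); // publish the private result to the shared object
        };

        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared.get();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 2000
    }
}
```

Each thread counts to 1000 in its own local variable without interfering with the other, and the shared heap object ends up with the sum of both.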

Stack Frame

A new frame is created inside the stack when a method is invoked, and it is destroyed when that method invocation completes. It stores the local variables and partial results of that method. If another method is invoked from inside that method, a new stack frame is created for the newly invoked method. Only one frame is active at a time in a given thread.
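One rough way to watch frames being pushed is to count the entries of a stack trace at different call depths. The sketch below is an illustration only (StackFrameDemo and framesAt are hypothetical names), and it assumes the stack trace reports one entry per active method invocation, which is the usual behaviour for shallow call chains.

```java
class StackFrameDemo {
    // Recurses 'remaining' more times, then reports how many frames are on the stack.
    static int framesAt(int remaining) {
        if (remaining == 0) {
            return new Throwable().getStackTrace().length;
        }
        return framesAt(remaining - 1); // each call adds one more frame
    }

    public static void main(String[] args) {
        int base = framesAt(0);
        int deeper = framesAt(3);
        System.out.println("extra frames: " + (deeper - base)); // prints "extra frames: 3"
    }
}
```

The absolute numbers depend on how the program was started, so only the difference between the two measurements is meaningful.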

Execution

In the above sections I talked briefly about how Java source code is compiled and loaded into the JVM runtime memory areas.

Let's talk about how that data is executed.

This part of the process is handled by the Execution Engine of the JVM, which consists of two main parts. (The Execution Engine includes other components as well, but I will not talk about them in this article.)

  1. Interpreter
  2. JIT (Just in Time) Compiler

Java is a hybrid interpreted and compiled language, which means Java code is both interpreted and compiled. In a nutshell, when the class file code begins executing, the bytecode is interpreted directly by the JVM interpreter, with no compilation to machine code. The main reason for this is startup speed: there is no need to wait for the code to be compiled before executing it.

While interpreting, the JVM execution engine identifies the warm and hot parts of our code, i.e. the code blocks which execute frequently and are worth optimizing. These warm and hot code blocks are compiled to machine code by the JIT compiler, and once the compilation is done the execution engine switches from interpreting the bytecode to executing the compiled code.

There are multiple levels to this compilation process, which is called tiered compilation. I will not go into details about the execution engine or tiered compilation in this article.
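A small program with an obviously hot method makes this behaviour easier to observe. The following is only a sketch with made-up names, and the call counts are arbitrary. Whether and when the method gets compiled is decided by the JVM.

```java
class HotLoopDemo {
    // A small method that is called often enough to become a candidate for JIT compilation.
    static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += (long) i * i;
        }
        return sum;
    }

    public static void main(String[] args) {
        long total = 0;
        // Calling the method many times makes it "hot" from the JVM's point of view.
        for (int round = 0; round < 20_000; round++) {
            total += sumOfSquares(1_000);
        }
        System.out.println(total);
    }
}
```

Assuming a HotSpot-based JVM, running it as java -XX:+PrintCompilation HotLoopDemo should list methods as they get compiled, and java -Xint HotLoopDemo runs the same program in interpreter-only mode for comparison. Both flags are HotSpot options and may not be available on other JVM implementations.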

This was a brief attempt to demystify the execution of Java code. I will go into further detail about individual components of the JVM in a later article. Hope you gained something from this. Please do comment if you have any feedback. Cheers!

Want to Connect?

- Medium
- LinkedIn
- Twitter(X)
- Threads

Find me everywhere @rashm1n.

Resources

- https://docs.oracle.com/javase/specs/jvms/se17/html/index.html

Written by Rashmin Mudunkotuwa

Software Engineer | Interested in Cloud Computing, Microservices, API Development, and Software as a whole.
