How compiling code works

In pre-.NET applications such as those written in VB6, the compiler converts the source code directly into native code, also called machine code (.dll, .exe), which the operating system loads and the processor executes. The problem with this approach is that the result is not portable: the binary runs only on the processor architecture and operating system the compiler targeted, so the same file cannot be run on a different platform.

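Strictly speaking, the machine instructions themselves are specific to a processor architecture rather than to an operating system; the compiled file is additionally tied to an operating system through its executable format and the system APIs it calls.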

In .NET applications, the compiler converts the source code to Intermediate Language (IL). The processor cannot execute IL directly, so it has to be compiled again into native code for the machine it is running on. This second step is done by the Common Language Runtime (CLR).

The CLR contains another compiler, the Just-In-Time (JIT) compiler, which compiles IL code into native code for the current processor and operating system. By default this compilation happens every time the application runs, and the result is not saved to disk. That is why .NET applications can start a little slower than pre-.NET applications: on each run, the JIT compiler has to compile the IL code to native code before that code can execute.

VB6 Program Execution Pipeline

Code -> VB6 Compiler -> Machine Code (.dll or .exe)  

.NET Program Execution Pipeline

Code -> C#, VB Compiler -> Intermediate Language (IL) [.dll, .exe] -> CLR (JIT Compiler) -> Machine Code  
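
One way to see that a compiled .NET assembly really holds IL rather than machine code is to read a method's IL bytes back through reflection. The following is a minimal sketch; the class IlInspector and the method Add are hypothetical names chosen for illustration, and the exact bytes printed depend on the compiler and build configuration.

```csharp
using System;
using System.Reflection;

public static class IlInspector
{
    // Hypothetical sample method; the C# compiler stores its body as IL.
    public static int Add(int a, int b)
    {
        return a + b;
    }

    // Returns the raw IL bytes stored in the assembly for the named public method.
    public static byte[] GetIl(string methodName)
    {
        MethodInfo method = typeof(IlInspector).GetMethod(methodName);
        MethodBody body = method.GetMethodBody();
        return body.GetILAsByteArray();
    }

    public static void Main()
    {
        byte[] il = GetIl(nameof(Add));
        Console.WriteLine("Add is stored as " + il.Length + " bytes of IL:");
        Console.WriteLine(BitConverter.ToString(il));
    }
}
```

The bytes shown are IL opcodes, not processor instructions; the JIT compiler translates them to machine code only when the program runs.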

Machine code

Machine code is a set of instructions in machine language that the CPU can execute directly. A programmer typically writes a program in a high-level programming language such as C, C++ or Java. These languages use English-like keywords and structure, which makes programs easier for the programmer to read and understand. However, the CPU cannot execute such source code as written. Therefore, the source code has to be converted to machine code, and a compiler or an interpreter bridges this gap.

Bytecode

Bytecode is an intermediate code produced by compiling the source code. It is not executed by the CPU directly but by a virtual machine, which interprets the bytecode or converts it into machine code.

Java is the best-known example of a bytecode-based platform. When Java source code is compiled, the Java compiler converts it into bytecode, which is executable by the Java Virtual Machine (JVM). The JVM converts the bytecode into machine code. In other words, any platform that has a JVM can execute Java bytecode.

The equivalent of bytecode in .NET languages is IL code. A .NET compiler converts source code to IL, and then the JIT compiler converts IL to machine code. In Java, the compiler converts source code to bytecode, and the JVM converts bytecode to machine code.

Unmanaged Code and Managed Code

Unmanaged code refers to code written in a programming language such as C or C++ that is compiled directly into machine code. It contrasts with managed code, which is written in C#, VB.NET, Java, or similar, and is executed under the control of a runtime (such as the .NET CLR or the Java Virtual Machine) that sits between the program and the processor.

The main difference is that managed code “manages” resources (mostly memory allocation) for you by employing garbage collection and by keeping references to objects opaque. Unmanaged code requires you to allocate and deallocate memory manually, which can cause memory leaks (when you forget to deallocate) and crashes such as segmentation faults (when you use memory after deallocating it). Unmanaged also usually implies there are no runtime checks for common errors such as null pointer dereferences or out-of-bounds array access.
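
The runtime checks mentioned above can be observed directly. In this small sketch (the class and method names are hypothetical), the CLR turns an out-of-bounds array access and a null dereference into exceptions that the program can catch, instead of reading arbitrary memory.

```csharp
using System;

public static class RuntimeChecks
{
    // Reads one element past the end of an array; the CLR checks the index and throws.
    public static string ReadOutOfBounds()
    {
        int[] numbers = { 1, 2, 3 };
        int index = numbers.Length; // one past the last valid index
        try
        {
            return "read " + numbers[index];
        }
        catch (IndexOutOfRangeException)
        {
            return "IndexOutOfRangeException";
        }
    }

    // Dereferences a null reference; the CLR throws instead of touching invalid memory.
    public static string UseNullReference()
    {
        string text = null;
        try
        {
            return "length " + text.Length;
        }
        catch (NullReferenceException)
        {
            return "NullReferenceException";
        }
    }

    public static void Main()
    {
        Console.WriteLine(ReadOutOfBounds());  // prints "IndexOutOfRangeException"
        Console.WriteLine(UseNullReference()); // prints "NullReferenceException"
    }
}
```

In unmanaged C or C++, the equivalent mistakes are undefined behavior and may crash the process or silently corrupt data.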

The term "unmanaged resource" is usually used to describe something not directly under the control of the garbage collector. For example, if you open a connection to a database server, this will use resources on the server (for maintaining the connection) and possibly other non-.NET resources on the client machine if the provider isn't written entirely in managed code.

This is why, for something like a database connection, it's recommended that you write your code within a using statement:

using (var connection = new SqlConnection("connection_string_here"))
{
    // Code to use connection here
}

This ensures that Dispose() is called on the connection object when the block is left, even if an exception is thrown, so that any unmanaged resources are cleaned up.
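
To check this behavior without a real database, the sketch below uses a hypothetical FakeConnection class in place of SqlConnection. It throws inside the using block and then reports whether Dispose() was still called.

```csharp
using System;

// Hypothetical resource wrapper standing in for something like a connection.
public sealed class FakeConnection : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose()
    {
        // A real implementation would release its unmanaged handle here.
        IsDisposed = true;
    }
}

public static class UsingDemo
{
    // Throws inside the using block and reports whether Dispose() still ran.
    public static bool DisposedAfterException()
    {
        FakeConnection connection = new FakeConnection();
        try
        {
            using (connection)
            {
                throw new InvalidOperationException("simulated failure");
            }
        }
        catch (InvalidOperationException)
        {
            // Swallowed so that the result can be returned to the caller.
        }
        return connection.IsDisposed;
    }

    public static void Main()
    {
        Console.WriteLine(DisposedAfterException()); // prints "True"
    }
}
```

The compiler expands a using statement into a try/finally block whose finally part calls Dispose(), which is why the cleanup runs on both the normal and the exceptional path.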

Managed Code = C#, VB.NET, Java
Unmanaged Code = C, C++

Ahead-Of-Time vs Just-In-Time Compiler

A compiler is a program that converts a program in language X to a program in language Y. The language Y can be anything (native machine code, intermediate code/bytecode, some other language Z, or the same language itself). A compiler is not necessarily a program that converts a program in language X to machine code. For example, the compiler csc.exe compiles high-level C# code to CIL.
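
As a toy illustration of "language X to language Y", the sketch below translates a postfix arithmetic expression (language X) into a list of IL-style stack instructions (language Y). The ToyCompiler class is hypothetical and far simpler than a real compiler; only the mnemonics (ldc.i4, add, sub, mul, div) are borrowed from CIL.

```csharp
using System;
using System.Collections.Generic;

public static class ToyCompiler
{
    // Translates a space-separated postfix expression such as "2 3 + 4 *"
    // into IL-style stack instructions, one per token.
    public static List<string> Compile(string postfix)
    {
        List<string> output = new List<string>();
        string[] tokens = postfix.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string token in tokens)
        {
            switch (token)
            {
                case "+": output.Add("add"); break;
                case "-": output.Add("sub"); break;
                case "*": output.Add("mul"); break;
                case "/": output.Add("div"); break;
                default:
                    int value;
                    if (!int.TryParse(token, out value))
                    {
                        throw new FormatException("Unexpected token: " + token);
                    }
                    // Push an integer constant onto the evaluation stack.
                    output.Add("ldc.i4 " + value);
                    break;
            }
        }
        return output;
    }

    public static void Main()
    {
        foreach (string instruction in Compile("2 3 + 4 *"))
        {
            Console.WriteLine(instruction);
        }
    }
}
```

For the input "2 3 + 4 *" this prints ldc.i4 2, ldc.i4 3, add, ldc.i4 4 and mul, one per line. Nothing here produces machine code, yet it is still a compilation step in the sense described above.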

Common Intermediate Language (CIL), formerly called Microsoft Intermediate Language (MSIL) and often shortened to Intermediate Language (IL), is the intermediate code generated by compiling C# code. It has to be compiled again into native code for a specific platform before it can be executed.

The "time" in Ahead-Of-Time and Just-In-Time refers to the time at which the program runs. With an Ahead-Of-Time compiler, compilation happens before the program is run, usually as a build step. With a Just-In-Time compiler, compilation keeps happening while the program is running.

To put it in a C#/.NET perspective, the CIL is generated by MSBuild using csc.exe without any consideration of whether it will later be compiled by a JIT or an AOT compiler. The difference lies in when the second compilation happens. An AOT compiler compiles entire assemblies (in CIL, or language X) into native machine code (language Y) before the program is run. The JIT compiler compiles individual methods (in CIL, or language X) into native machine code (language Y) at run time, when each method is first called.
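
Because the JIT compiler normally works method by method on first call, the runtime also offers a way to request compilation of a single method earlier. The sketch below uses RuntimeHelpers.PrepareMethod for this; the JitDemo class and the Square method are hypothetical, and how much work the call actually does depends on the runtime version and its settings.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

public static class JitDemo
{
    // Hypothetical sample method.
    public static int Square(int x)
    {
        return x * x;
    }

    // Asks the runtime to prepare (in practice, JIT-compile) the named public
    // method now rather than waiting for its first call.
    public static void PreJit(string methodName)
    {
        MethodInfo method = typeof(JitDemo).GetMethod(methodName);
        RuntimeHelpers.PrepareMethod(method.MethodHandle);
    }

    public static void Main()
    {
        PreJit(nameof(Square));
        Console.WriteLine(Square(7)); // prints "49"
    }
}
```

The program behaves the same with or without the PreJit call; only the moment at which Square is compiled to native code changes.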

Points To Remember:

  • IL is also called MSIL or CIL. Code that is compiled to IL and run by the CLR is called managed code.
  • Assemblies have an extension of .dll or .exe depending on the type of application.
  • .NET assemblies contain IL, whereas pre-.NET binaries contain native code (machine code).
  • .NET application execution consists of 2 steps:
    1. Compilation = Source Code to IL
    2. Execution or JIT Compilation = IL to Platform specific native code
  • CLR provides several benefits. Garbage Collection is one of them.
  • C#, VB and F# compilers generate only managed code (IL), whereas C++ can generate both managed code (IL) and unmanaged code (native code).
  • By default, the JIT-compiled native code is not stored permanently anywhere; after we close the program, it is thrown away. When we execute the program again, the native code is generated again.