Published: 07 Sep 2009
By: Granville Barnett

In the final part of our series looking at CIL we explore the two instruction sets that CIL comprises.

Introduction

In this, the concluding part of our series looking at CIL, we look at both the base and object model instruction sets. Together, the two make up the CIL instruction set. The reader should note that a fair number of instructions are very rarely used, and that some instructions are semantically equivalent to one another.

The outline for this part of the CIL series is as follows: first we introduce a notation for depicting stack transitions; we then briefly discuss stack semantics; next we describe the two instruction sets; and finally we disassemble a C# program to drive discussion of the two instruction sets that CIL comprises.

Stack Transitions

When we discuss the semantics of CIL instructions it is important that we can describe the state of the stack before and after any given CIL instruction. Some instructions do not affect the state of the stack at all (i.e., they neither pop anything from nor push anything onto the stack); others do.

To assist us in visualizing the varying state of the evaluation stack we will use the following textual notation:

…, value1, value2 → …, result

For every instruction we discuss we will include a stack transition description. The left side of the → denotes the state of the evaluation stack before the CIL instruction is executed; the right side is the new state of the stack after it has been executed. For example, if a CIL instruction pushes some result onto the evaluation stack, that result appears on the right-hand side of the →. The top-of-stack value is the rightmost value on each side of the →: value2 is the top-of-stack value before the instruction executes, and result is the top-of-stack value afterwards.
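To make the notation concrete, here is a minimal C# sketch that models the evaluation stack with a Stack<int> and applies an add-like transition to it. The class and method names are made up for illustration; the real evaluation stack is managed by the runtime, not by user code.

```csharp
using System;
using System.Collections.Generic;

public static class StackTransitionDemo
{
    // Models an add-like transition: two values are popped, one result is pushed.
    public static void Add(Stack<int> evaluationStack)
    {
        int value2 = evaluationStack.Pop(); // top-of-stack (rightmost) value
        int value1 = evaluationStack.Pop(); // the value beneath it
        evaluationStack.Push(value1 + value2); // result is the new top of stack
    }

    public static void Main()
    {
        var stack = new Stack<int>();
        stack.Push(10); // value1
        stack.Push(32); // value2

        Add(stack);

        Console.WriteLine(stack.Peek()); // prints 42
        Console.WriteLine(stack.Count);  // prints 1
    }
}
```

Anything that was below value1 before the call would be left untouched, which is what the leading … in the notation expresses.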

Note:

This notation is used in the ECMA-335 specification of the CLI.

Before we proceed we must briefly cover the semantics of value and reference types. A concise sentence from the ECMA specification of the CLI will serve us for this article: “In contrast to reference types, value types are not accessed by using a reference, but are stored directly in the location of that type.”

When we deal with a reference type we first need a managed reference on the stack; we then reach the data we want by dereferencing that reference. In contrast, value types are passed by value: the value itself, rather than a reference to it, is what gets copied.
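The difference is easy to observe from C#. The following sketch uses two made-up types, PointValue (a struct) and PointReference (a class), to show that assigning a value type copies the value while assigning a reference type copies only the reference.

```csharp
using System;

public struct PointValue { public int X; }      // value type
public class PointReference { public int X; }   // reference type

public static class TypeSemanticsDemo
{
    public static int CopyValueThenMutate()
    {
        var a = new PointValue { X = 1 };
        var b = a;  // copies the whole value
        b.X = 2;    // changes only the copy
        return a.X;
    }

    public static int CopyReferenceThenMutate()
    {
        var a = new PointReference { X = 1 };
        var b = a;  // copies only the reference; both names refer to one object
        b.X = 2;    // changes the shared object
        return a.X;
    }

    public static void Main()
    {
        Console.WriteLine(CopyValueThenMutate());     // prints 1
        Console.WriteLine(CopyReferenceThenMutate()); // prints 2
    }
}
```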

CIL Instruction Sets

As was mentioned in the first part of this series, CIL comprises two instruction sets:

  • Base instructions; and
  • Object model instructions.

The base instruction set provides the very core low-level instructions that are not dependent on the object model employed. They form a Turing-complete set of instructions. In total there are 67 base instructions. Some examples of instructions within the base instruction set include:

  • stloc: pops a value off the stack and stores it in the local variables list.
  • add: adds two values together and pushes the result onto the stack.
  • ldarg: load an argument from the arguments list and push its value onto the stack.
  • ldarga: load an argument from the arguments list and push the address of the argument onto the stack.
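As an illustration only, the sketch below uses the System.Reflection.Emit API (which this series does not otherwise rely on) to build a tiny method out of the base instructions ldarg, add, stloc and ldloc, and then runs it. The names BaseInstructionDemo and BuildAdder are hypothetical.

```csharp
using System;
using System.Reflection.Emit;

public static class BaseInstructionDemo
{
    // Builds a static method equivalent to: int Add(int a, int b) { int t = a + b; return t; }
    public static Func<int, int, int> BuildAdder()
    {
        var method = new DynamicMethod("Add", typeof(int), new[] { typeof(int), typeof(int) });
        ILGenerator il = method.GetILGenerator();

        il.DeclareLocal(typeof(int));  // slot 0 of the locals list

        il.Emit(OpCodes.Ldarg_0);      // push the first argument
        il.Emit(OpCodes.Ldarg_1);      // push the second argument
        il.Emit(OpCodes.Add);          // pop both, push their sum
        il.Emit(OpCodes.Stloc_0);      // pop the sum into local slot 0
        il.Emit(OpCodes.Ldloc_0);      // push the local's value back onto the stack
        il.Emit(OpCodes.Ret);          // return the value on top of the stack

        return (Func<int, int, int>)method.CreateDelegate(typeof(Func<int, int, int>));
    }

    public static void Main()
    {
        Func<int, int, int> add = BuildAdder();
        Console.WriteLine(add(2, 3)); // prints 5
    }
}
```

The stloc/ldloc pair is redundant here; it is included only to show the local variables list in use.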

The object model instructions are designed to work with the common type system. They allow one to load and store information from and to fields, create a new object, manipulate arrays, initialize the state of an object, and so on. Examples of object model instructions include:

  • newobj: allocates space for an uninitialized object and calls its constructor.
  • ldfld: push the value of a given field onto the stack.
  • stelem: stores a value at the array index provided.
  • ldstr: pushes a string object onto the stack.
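A comparable sketch for the object model instructions, again using System.Reflection.Emit purely for illustration: it emits ldstr, newobj and callvirt to construct a StringBuilder from a string literal and call ToString on it. The literal "hello" and the names ObjectModelDemo and Build are arbitrary choices.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;
using System.Text;

public static class ObjectModelDemo
{
    // Builds a method equivalent to: string MakeText() { return new StringBuilder("hello").ToString(); }
    public static Func<string> Build()
    {
        ConstructorInfo ctor = typeof(StringBuilder).GetConstructor(new[] { typeof(string) });
        MethodInfo toString = typeof(StringBuilder).GetMethod("ToString", Type.EmptyTypes);

        var method = new DynamicMethod("MakeText", typeof(string), Type.EmptyTypes);
        ILGenerator il = method.GetILGenerator();

        il.Emit(OpCodes.Ldstr, "hello");      // push a string object
        il.Emit(OpCodes.Newobj, ctor);        // pop the string, push a new StringBuilder
        il.Emit(OpCodes.Callvirt, toString);  // pop the StringBuilder, push the returned string
        il.Emit(OpCodes.Ret);

        return (Func<string>)method.CreateDelegate(typeof(Func<string>));
    }

    public static void Main()
    {
        Console.WriteLine(Build()()); // prints hello
    }
}
```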

For a full list of the base and object model instruction sets please see Partition III: CIL Instruction Set in the ECMA-335 CLI specification.

When you disassemble a C# program you will see a healthy mixture of both base and object model instructions, as we will see in the next section.

Disassembling an Example Program

We will use the following C# program to discuss some of the base and object model instructions. Whether a CIL instruction falls within the base or the object model instruction set is irrelevant to the discussion, so we do not distinguish between them. The reader is politely referred to the ECMA-335 specification to see which instruction set a particular instruction falls within.

Listing 1: Types
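The listing itself is not reproduced here. Going only by the description in the sections that follow (a Person type in the Part3 namespace with m_firstName and m_lastName fields, a two-string constructor, a ToString method built on string.Format, and a Main method with a single local p passed to Console.WriteLine), a rough reconstruction might look like the sketch below. The string literals, the format string, the parameter names and the Program class name are placeholders, not the author's.

```csharp
using System;

namespace Part3
{
    public class Person
    {
        private string m_firstName;
        private string m_lastName;

        public Person(string firstName, string lastName)
        {
            m_firstName = firstName;
            m_lastName = lastName;
        }

        public override string ToString()
        {
            // Placeholder format string; the original is not known.
            return string.Format("{0} {1}", m_firstName, m_lastName);
        }
    }

    public class Program
    {
        public static void Main()
        {
            // Placeholder names; the original literals are not known.
            Person p = new Person("Ada", "Lovelace");
            Console.WriteLine(p); // binds to the WriteLine(object) overload
        }
    }
}
```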

If you use your disassembly tool of choice to inspect the CIL for the previous C# code example you should see something similar to the following:

Listing 2: CIL Dump of Type Example

We will not cover what was explained in part 2 of this series, so the reader should already be familiar with most of the CIL dump. Let us start by looking at the CIL within the Main method.

Listing 3: CIL for Main method

The first thing to note is that we initialize a single entry of type Part3.Person within our locals list. If you flick back to the C# program you will see that the Main method declares a single variable named p, which is lexically scoped to the Main method. The variable p is the only thing stored in the Main method's locals list. Proceeding further, we push two strings onto the stack using the ldstr instruction. The stack transition diagram for the ldstr instruction is as follows:

… → …, string

As you can see, the ldstr instruction simply pushes a string onto the stack; it does not require any other object or value to be on the stack prior to its execution.

Continuing on, the next instruction we see is newobj. The stack transition for newobj is as follows:

…, arg1, …, argN → …, obj

Unlike ldstr, newobj expects the correct number of arguments for the specified constructor to be on the stack.

For example, in this case we call the constructor that expects two string arguments, so the newobj instruction pops those two string arguments off the stack and passes them to our constructor. Note that before the constructor is called, newobj initializes all the fields of the Person type to null or 0 (null for reference types and 0 for value types; the m_firstName and m_lastName fields are therefore initialized to null). Once the constructor has returned, the reference to the initialized object is pushed onto the stack.

Progressing further through the Main method we see that we store the object reference into slot 0 of the Main method's locals list.

The stack transition for the stloc instruction is as follows:

…, value → …

The value on the stack is stored into the index specified, in this case index 0 of the locals list denoted by stloc.0.

Now that our object reference is stored, we need it again a few instructions later, when we call the System.Console.WriteLine method with p as its argument. The call to WriteLine expects an object reference on the stack. To set the stack up for the method call, we load the p variable from index 0 of the locals list, which pushes its object reference onto the stack; the subsequent method call then pops that object reference off the stack to use as its argument.

The stack transition for ldloc is as follows:

… → …, value

The stack transition for the call instruction is as follows:

…, arg1, arg2, …, argN → …, retVal

The call instruction calls the method denoted by the method descriptor, which in this case is the static method System.Console.WriteLine that takes a single object parameter, has a return type of void, and is defined in mscorlib. The retVal in the stack transition is optional: because WriteLine has a return type of void, the method call does not push any value onto the evaluation stack.
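For contrast with the void case, the following sketch emits a call to a static method that does return a value, so retVal is pushed after the arguments are popped; it also repeats the stloc.0/ldloc.0 pattern seen in Main. It uses System.Reflection.Emit and string.Concat purely as a convenient stand-in; neither appears in the article's listings, and the names CallDemo and Build are hypothetical.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class CallDemo
{
    // Builds a method equivalent to: string Greet() { string s = string.Concat("Hello, ", "CIL"); return s; }
    public static Func<string> Build()
    {
        MethodInfo concat = typeof(string).GetMethod(
            "Concat", new[] { typeof(string), typeof(string) });

        var method = new DynamicMethod("Greet", typeof(string), Type.EmptyTypes);
        ILGenerator il = method.GetILGenerator();

        il.DeclareLocal(typeof(string));   // slot 0 of the locals list

        il.Emit(OpCodes.Ldstr, "Hello, "); // push arg1
        il.Emit(OpCodes.Ldstr, "CIL");     // push arg2
        il.Emit(OpCodes.Call, concat);     // pop arg1 and arg2, push retVal
        il.Emit(OpCodes.Stloc_0);          // pop retVal into local slot 0
        il.Emit(OpCodes.Ldloc_0);          // push the local's value again
        il.Emit(OpCodes.Ret);

        return (Func<string>)method.CreateDelegate(typeof(Func<string>));
    }

    public static void Main()
    {
        Console.WriteLine(Build()()); // prints Hello, CIL
    }
}
```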

Now that we have discussed some of the interesting CIL instructions emitted for the Main method we will proceed to look at how the Person type is described using CIL.

Listing 4: Person Type

In the previous part of this series we covered .field, as well as constructors. However, the first few instructions within the constructor are quite interesting.

The first instruction is very important: we load an object reference to this object onto the stack. Before any instance method is invoked, an object reference to this must first be pushed onto the evaluation stack. With the this object reference on the stack, we then call the constructor of our base class, object. For instance methods the arguments are ordered as follows:

thisPtr, arg1, …, argN

With that in mind, we then store the two string arguments into their respective fields.

Remember that argument 0 is the this reference, so ldarg.1 loads the first string parameter of the constructor and ldarg.2 loads the second.
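The argument layout can be imitated with a static dynamic method whose first parameter plays the role of this. In the sketch below, NameHolder is a made-up stand-in for Person with public fields (so that no visibility tricks are needed), and the emitted body stores arguments 1 and 2 into its fields with stfld. It is an approximation of the pattern, not the constructor from the listing.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public class NameHolder
{
    public string First;
    public string Last;
}

public static class StoreFieldDemo
{
    // Builds a method equivalent to: void SetNames(NameHolder self, string first, string last)
    //                                { self.First = first; self.Last = last; }
    public static Action<NameHolder, string, string> Build()
    {
        FieldInfo first = typeof(NameHolder).GetField("First");
        FieldInfo last = typeof(NameHolder).GetField("Last");

        var method = new DynamicMethod(
            "SetNames", typeof(void),
            new[] { typeof(NameHolder), typeof(string), typeof(string) });
        ILGenerator il = method.GetILGenerator();

        il.Emit(OpCodes.Ldarg_0);      // push the object reference (argument 0)
        il.Emit(OpCodes.Ldarg_1);      // push the first string
        il.Emit(OpCodes.Stfld, first); // pop both, store the string into First

        il.Emit(OpCodes.Ldarg_0);      // push the object reference again
        il.Emit(OpCodes.Ldarg_2);      // push the second string
        il.Emit(OpCodes.Stfld, last);  // pop both, store the string into Last

        il.Emit(OpCodes.Ret);

        return (Action<NameHolder, string, string>)method.CreateDelegate(
            typeof(Action<NameHolder, string, string>));
    }

    public static void Main()
    {
        var holder = new NameHolder();
        Build()(holder, "Ada", "Lovelace");
        Console.WriteLine(holder.First); // prints Ada
        Console.WriteLine(holder.Last);  // prints Lovelace
    }
}
```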

Finally, the next instruction of note is ldfld, in the ToString method.

Just as in the previous example, the first argument is the object reference to this. With the this object reference pushed onto the stack, ldfld pops it and pushes the value of its m_firstName field; the same is then done for the m_lastName field. Finally, we invoke the string.Format method, which takes the three items on the evaluation stack: our format string (see the ldstr instruction), followed by the values of m_firstName and m_lastName respectively. Because string.Format returns a string, there is a string on the evaluation stack when ToString returns to its caller; that string is what the WriteLine call made from the Main method ultimately writes out.
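The same sequence can be sketched with System.Reflection.Emit. As before, NameHolder is a hypothetical stand-in for Person with public fields, the format string "{0} {1}" is a placeholder, and the dynamic method is static with argument 0 standing in for this.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public class NameHolder
{
    public string First;
    public string Last;
}

public static class LoadFieldDemo
{
    // Builds a method equivalent to: string Describe(NameHolder self)
    //                                { return string.Format("{0} {1}", self.First, self.Last); }
    public static Func<NameHolder, string> Build()
    {
        FieldInfo first = typeof(NameHolder).GetField("First");
        FieldInfo last = typeof(NameHolder).GetField("Last");
        MethodInfo format = typeof(string).GetMethod(
            "Format", new[] { typeof(string), typeof(object), typeof(object) });

        var method = new DynamicMethod("Describe", typeof(string), new[] { typeof(NameHolder) });
        ILGenerator il = method.GetILGenerator();

        il.Emit(OpCodes.Ldstr, "{0} {1}"); // push the format string
        il.Emit(OpCodes.Ldarg_0);          // push the object reference
        il.Emit(OpCodes.Ldfld, first);     // pop the reference, push the value of First
        il.Emit(OpCodes.Ldarg_0);          // push the object reference again
        il.Emit(OpCodes.Ldfld, last);      // pop the reference, push the value of Last
        il.Emit(OpCodes.Call, format);     // pop three items, push the formatted string
        il.Emit(OpCodes.Ret);              // return the string on top of the stack

        return (Func<NameHolder, string>)method.CreateDelegate(typeof(Func<NameHolder, string>));
    }

    public static void Main()
    {
        var holder = new NameHolder { First = "Ada", Last = "Lovelace" };
        Console.WriteLine(Build()(holder)); // prints Ada Lovelace
    }
}
```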

Summary

In this part of the series covering CIL we looked at the two instruction sets that CIL comprises, and used a sample C# program and its emitted CIL code to discuss the semantics of instructions from both the base and object model instruction sets.

About Granville Barnett

This author has published 32 articles on DotNetSlackers.
