
2016-01-16 - In IL: Variables and Types

In IL

Variables matter in any program because they determine where information is stored. Types matter just as much because they determine the format of that information. To start this investigation, we are going to look at how and where information is stored in an IL program. The following C# program declares a set of variables and then writes the string representation of each one to the screen.

Program.cs
using System;
using System.Collections;

namespace Variables
{
    class Program
    {
        static void Main(string[] args)
        {
            bool b = true;
            char c = 'c';
            float f = float.MaxValue;
            double d = double.MaxValue;
            sbyte sb = sbyte.MaxValue;
            short sh = short.MaxValue;
            int i = int.MaxValue;
            long l = long.MaxValue;
            byte ub = byte.MaxValue;
            ushort ush = ushort.MaxValue;
            uint ui = uint.MaxValue;
            ulong ul = ulong.MaxValue;
            decimal dl = decimal.MaxValue;
            object o = new object();
            string s = "s";
            ArrayList al = new ArrayList();

            Console.WriteLine(b);
            Console.WriteLine(c);
            Console.WriteLine(o);
            Console.WriteLine(s);
            Console.WriteLine(f);
            Console.WriteLine(d);
            Console.WriteLine(sb);
            Console.WriteLine(sh);
            Console.WriteLine(i);
            Console.WriteLine(l);
            Console.WriteLine(ub);
            Console.WriteLine(ush);
            Console.WriteLine(ui);
            Console.WriteLine(ul);
            Console.WriteLine(dl);
            Console.WriteLine(al);
        }
    }
}

If you compile this program and then look at the generated IL code using ildasm or ILSpy, you will see something like the following. (Note that you may need to compile it in debug mode to stop the compiler from optimizing some variables away.)

Main
.method private hidebysig static void Main(string[] args) cil managed
{
  .entrypoint
  // Code size 230 (0xe6)
  .maxstack 6
  .locals init ([0] bool b,
           [1] char c,
           [2] float32 f,
           [3] float64 d,
           [4] int8 sb,
           [5] int16 sh,
           [6] int32 i,
           [7] int64 l,
           [8] uint8 ub,
           [9] uint16 ush,
           [10] uint32 ui,
           [11] uint64 ul,
           [12] valuetype [mscorlib]System.Decimal dl,
           [13] object o,
           [14] string s,
           [15] class [mscorlib]System.Collections.ArrayList al)
  IL_0000: nop
  IL_0001: ldc.i4.1
  IL_0002: stloc.0
  IL_0003: ldc.i4.s 99
  IL_0005: stloc.1
  IL_0006: ldc.r4 3.4028235e+038
  IL_000b: stloc.2
  IL_000c: ldc.r8 1.7976931348623157e+308
  IL_0015: stloc.3
  IL_0016: ldc.i4.s 127
  IL_0018: stloc.s sb
  IL_001a: ldc.i4 0x7fff
  IL_001f: stloc.s sh
  IL_0021: ldc.i4 0x7fffffff
  IL_0026: stloc.s i
  IL_0028: ldc.i8 0x7fffffffffffffff
  IL_0031: stloc.s l
  IL_0033: ldc.i4 0xff
  IL_0038: stloc.s ub
  IL_003a: ldc.i4 0xffff
  IL_003f: stloc.s ush
  IL_0041: ldc.i4.m1
  IL_0042: stloc.s ui
  IL_0044: ldc.i4.m1
  IL_0045: conv.i8
  IL_0046: stloc.s ul
  IL_0048: ldloca.s dl
  IL_004a: ldc.i4.m1
  IL_004b: ldc.i4.m1
  IL_004c: ldc.i4.m1
  IL_004d: ldc.i4.0
  IL_004e: ldc.i4.0
  IL_004f: call instance void [mscorlib]System.Decimal::.ctor(int32,
                                                                int32,
                                                                int32,
                                                                bool,
                                                                uint8)
  IL_0054: newobj instance void [mscorlib]System.Object::.ctor()
  IL_0059: stloc.s o
  IL_005b: ldstr "s"
  IL_0060: stloc.s s
  IL_0062: newobj instance void [mscorlib]System.Collections.ArrayList::.ctor()
  IL_0067: stloc.s al
  IL_0069: ldloc.0
  IL_006a: call void [mscorlib]System.Console::WriteLine(bool)
  IL_006f: nop
  IL_0070: ldloc.1
  IL_0071: call void [mscorlib]System.Console::WriteLine(char)
  IL_0076: nop
  IL_0077: ldloc.s o
  IL_0079: call void [mscorlib]System.Console::WriteLine(object)
  IL_007e: nop
  IL_007f: ldloc.s s
  IL_0081: call void [mscorlib]System.Console::WriteLine(string)
  IL_0086: nop
  IL_0087: ldloc.2
  IL_0088: call void [mscorlib]System.Console::WriteLine(float32)
  IL_008d: nop
  IL_008e: ldloc.3
  IL_008f: call void [mscorlib]System.Console::WriteLine(float64)
  IL_0094: nop
  IL_0095: ldloc.s sb
  IL_0097: call void [mscorlib]System.Console::WriteLine(int32)
  IL_009c: nop
  IL_009d: ldloc.s sh
  IL_009f: call void [mscorlib]System.Console::WriteLine(int32)
  IL_00a4: nop
  IL_00a5: ldloc.s i
  IL_00a7: call void [mscorlib]System.Console::WriteLine(int32)
  IL_00ac: nop
  IL_00ad: ldloc.s l
  IL_00af: call void [mscorlib]System.Console::WriteLine(int64)
  IL_00b4: nop
  IL_00b5: ldloc.s ub
  IL_00b7: call void [mscorlib]System.Console::WriteLine(int32)
  IL_00bc: nop
  IL_00bd: ldloc.s ush
  IL_00bf: call void [mscorlib]System.Console::WriteLine(int32)
  IL_00c4: nop
  IL_00c5: ldloc.s ui
  IL_00c7: call void [mscorlib]System.Console::WriteLine(uint32)
  IL_00cc: nop
  IL_00cd: ldloc.s ul
  IL_00cf: call void [mscorlib]System.Console::WriteLine(uint64)
  IL_00d4: nop
  IL_00d5: ldloc.s dl
  IL_00d7: call void [mscorlib]System.Console::WriteLine(valuetype [mscorlib]System.Decimal)
  IL_00dc: nop
  IL_00dd: ldloc.s al
  IL_00df: call void [mscorlib]System.Console::WriteLine(object)
  IL_00e4: nop
  IL_00e5: ret
} // end of method Program::Main

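There is a lot there, but for now we are only interested in how variables are declared, so we are going to ignore most of it. The .locals init block near the top of the method, running from “[0] bool b” to the ArrayList declaration at index 15, contains what we are interested in.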

Main Variables
.locals init ([0] bool b,
         [1] char c,
         [2] float32 f,
         [3] float64 d,
         [4] int8 sb,
         [5] int16 sh,
         [6] int32 i,
         [7] int64 l,
         [8] uint8 ub,
         [9] uint16 ush,
         [10] uint32 ui,
         [11] uint64 ul,
         [12] valuetype [mscorlib]System.Decimal dl,
         [13] object o,
         [14] string s,
         [15] class [mscorlib]System.Collections.ArrayList al)

.locals is an IL directive that is used to declare the local variables of a method. The init keyword indicates that these variables should be initialized to their default values (zeroed) before the method body runs. The declaration of local variables is very similar to how arrays are declared in many languages. Each entry starts with a number in square brackets that denotes the variable’s index. Next comes a type identifier which denotes the type of the variable. Finally there is the optional name of the variable. IL instructions reference variables either by their index or by their name, which gets translated to the index when the program is assembled.
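The same index and type information can be read back at run time through reflection. The following is a minimal sketch, not part of the original program: LocalsDemo and Sample are hypothetical names, and the exact locals you see depend on the compiler and on whether optimizations are enabled. It uses MethodBody.InitLocals and MethodBody.LocalVariables from System.Reflection.

```csharp
using System;
using System.Collections;
using System.Reflection;

static class LocalsDemo
{
    // Hypothetical sample method; its locals mirror a few from the article.
    static void Sample()
    {
        bool b = true;
        int i = int.MaxValue;
        ArrayList al = new ArrayList();
        Console.WriteLine(b);
        Console.WriteLine(i);
        Console.WriteLine(al);
    }

    // Returns the method body metadata that corresponds to the .locals directive.
    public static MethodBody GetSampleBody()
    {
        MethodInfo method = typeof(LocalsDemo).GetMethod(
            "Sample", BindingFlags.NonPublic | BindingFlags.Static);
        return method.GetMethodBody();
    }

    static void Main()
    {
        MethodBody body = GetSampleBody();

        // Corresponds to the "init" keyword on the .locals directive.
        Console.WriteLine("init locals: " + body.InitLocals);

        foreach (LocalVariableInfo local in body.LocalVariables)
        {
            // Reflection exposes only the index and the type of each local.
            Console.WriteLine("[" + local.LocalIndex + "] " + local.LocalType);
        }
    }
}
```

Note that the listing prints no variable names. Reflection does not expose them, which fits the point above that names are optional and that instructions ultimately work with indexes.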

If you look at the rest of the method you can see these indexes being used. For example, at offset IL_0069 we see “ldloc.0”, followed at IL_006a by “call void [mscorlib]System.Console::WriteLine(bool)”. In the C# code, the first call to Console.WriteLine is passed a bool argument when we go to print the b variable. Looking in the locals list we see “[0] bool b”, which we know is the declaration of the b variable with index 0 and type bool. So “ldloc.0” is likely doing something with the b variable so that it can be printed by the call to Console.WriteLine.
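One way to test that guess is to emit a tiny method by hand and run it. The sketch below uses System.Reflection.Emit to build the equivalent of "bool b = true; return b;" with the same ldc.i4.1, stloc.0 and ldloc.0 instructions that appear in the listing. LdlocDemo and "LoadLocalZero" are made-up names for illustration.

```csharp
using System;
using System.Reflection.Emit;

static class LdlocDemo
{
    // Builds a tiny method equivalent to: bool b = true; return b;
    public static Func<bool> Build()
    {
        var method = new DynamicMethod("LoadLocalZero", typeof(bool), Type.EmptyTypes);
        ILGenerator il = method.GetILGenerator();

        il.DeclareLocal(typeof(bool));  // the first declared local gets index 0
        il.Emit(OpCodes.Ldc_I4_1);      // push the constant 1 (true) onto the stack
        il.Emit(OpCodes.Stloc_0);       // pop it into local 0
        il.Emit(OpCodes.Ldloc_0);       // push the value of local 0 back onto the stack
        il.Emit(OpCodes.Ret);           // return the value on top of the stack

        return (Func<bool>)method.CreateDelegate(typeof(Func<bool>));
    }

    static void Main()
    {
        Console.WriteLine(Build()());   // prints True
    }
}
```

Because the value returned is whatever ldloc.0 left on the stack, this suggests that ldloc.0 loads local 0 so that the next instruction (a call in the article's listing, ret here) can consume it.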

Since the .locals directive contains the names of the variables, it is fairly easy to map C# types to IL types: bool is bool, char is char, float32 is float, int64 is long, etc. These are built-in types that are inherently understood by the virtual machine. Some are more complicated though, like the C# decimal type declared as “valuetype [mscorlib]System.Decimal” and ArrayList declared as “class [mscorlib]System.Collections.ArrayList”. These are user-defined types, in the sense that they are not built into the virtual machine but are described in assemblies that the virtual machine can access to look up their definition (these two happen to ship with the framework). Square brackets surround the name of the assembly in which the type is defined. Following the square brackets is the fully-qualified name of the type, which includes both the name of the type and the namespace in which it is defined.
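The mapping between C# keywords and the underlying types can be checked from C# itself. The sketch below is illustrative only: TypeNamesDemo and Describe are hypothetical names, and the assembly name printed depends on the runtime (the listing in this article shows mscorlib; other runtimes may report a different core assembly).

```csharp
using System;
using System.Collections;

static class TypeNamesDemo
{
    // Formats a type in the style of an IL type reference:
    // [assembly]Namespace.TypeName
    public static string Describe(Type type)
    {
        return "[" + type.Assembly.GetName().Name + "]" + type.FullName;
    }

    static void Main()
    {
        // C# keywords are aliases for types in the System namespace.
        Console.WriteLine(typeof(float) == typeof(Single));  // True
        Console.WriteLine(typeof(long) == typeof(Int64));    // True

        // The assembly part of the output varies between runtimes.
        Console.WriteLine(Describe(typeof(decimal)));
        Console.WriteLine(Describe(typeof(ArrayList)));
    }
}
```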

The types that start with “valuetype” are user-defined value types. A value type variable directly contains the data of that variable. The types that start with “class” are user-defined reference types. A reference type variable contains a reference to the location where the data of that variable is stored. A value type variable can be converted to a reference type variable through a process known as boxing. This involves copying the variable’s data to another location and then creating a reference type variable that references that location. Every value type has a corresponding boxed reference type. The reverse is not generally the case.
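The copying step of boxing is easy to observe from C#. This is a small sketch with a hypothetical BoxingDemo class: it boxes an int, changes the original variable, and shows that the boxed copy is unaffected.

```csharp
using System;

static class BoxingDemo
{
    // Boxes a value, changes the original, and returns what the box still holds.
    public static int BoxThenMutate()
    {
        int i = 42;
        object boxed = i;   // boxing: the value is copied into a new object
        i = 100;            // changes only the original variable
        return (int)boxed;  // unboxing: copies the value back out of the box
    }

    static void Main()
    {
        Console.WriteLine(BoxThenMutate());  // prints 42

        object o = 1;
        Console.WriteLine(o.GetType());      // prints System.Int32
    }
}
```

The boxed object still reports its type as System.Int32, but it is reached through a reference rather than stored directly in the variable.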

The built-in types can also be classified as either value or reference types. The following table describes some of the built-in types and specifies what kind of type each one is.

Built-in Type   Description                    Kind
bool            True/false value               Value type
char            Unicode 16-bit char            Value type
float32         IEC 60559:1989 32-bit float    Value type
float64         IEC 60559:1989 64-bit float    Value type
int8            Signed 8-bit integer           Value type
int16           Signed 16-bit integer          Value type
int32           Signed 32-bit integer          Value type
int64           Signed 64-bit integer          Value type
uint8           Unsigned 8-bit integer         Value type
uint16          Unsigned 16-bit integer        Value type
uint32          Unsigned 32-bit integer        Value type
uint64          Unsigned 64-bit integer        Value type
object          Base of all types              Reference type
string          Unicode string                 Reference type
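The Kind column can be cross-checked from C# using Type.IsValueType, and the bit widths using sizeof. The sketch below uses the C# keywords that correspond to the IL names in the table; BuiltInKindsDemo and Kind are hypothetical names.

```csharp
using System;

static class BuiltInKindsDemo
{
    // Classifies a type using the same two labels as the table.
    public static string Kind(Type type)
    {
        return type.IsValueType ? "Value type" : "Reference type";
    }

    static void Main()
    {
        Type[] types =
        {
            typeof(bool), typeof(char), typeof(float), typeof(double),
            typeof(sbyte), typeof(short), typeof(int), typeof(long),
            typeof(byte), typeof(ushort), typeof(uint), typeof(ulong),
            typeof(object), typeof(string)
        };

        foreach (Type type in types)
        {
            Console.WriteLine(type.FullName + ": " + Kind(type));
        }

        // sizeof reports bytes, so multiply by 8 to compare with the table.
        Console.WriteLine(sizeof(sbyte) * 8);  // prints 8
        Console.WriteLine(sizeof(short) * 8);  // prints 16
        Console.WriteLine(sizeof(int) * 8);    // prints 32
        Console.WriteLine(sizeof(long) * 8);   // prints 64
    }
}
```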

Next time we will see how IL variables work with a Visual Basic .NET program.
