2016-02-13 - In IL: Variables in Visual Basic .NET

Last time we looked at how variables were defined in the IL generated by compiling a C# program. Now we will do the same with a Visual Basic .NET (VB) program and see what changes. Let’s start with a VB program that does the same thing as the C# program.

Module1.vb

Module Module1
    Sub Main()
        Dim b As Boolean = True
        Dim c As Char = "c"c
        Dim f As Single = Single.MaxValue
        Dim d As Double = Double.MaxValue
        Dim sb As SByte = SByte.MaxValue
        Dim sh As Short = Short.MaxValue
        Dim i As Integer = Integer.MaxValue
        Dim l As Long = Long.MaxValue
        Dim ub As Byte = Byte.MaxValue
        Dim ush As UShort = UShort.MaxValue
        Dim ui As UInteger = UInteger.MaxValue
        Dim ul As ULong = ULong.MaxValue
        Dim dl As Decimal = Decimal.MaxValue
        Dim o As Object = New Object()
        Dim s As String = "s"
        Dim al As ArrayList = New ArrayList()

        Console.WriteLine(b)
        Console.WriteLine(c)
        Console.WriteLine(o)
        Console.WriteLine(s)
        Console.WriteLine(f)
        Console.WriteLine(d)
        Console.WriteLine(sb)
        Console.WriteLine(sh)
        Console.WriteLine(i)
        Console.WriteLine(l)
        Console.WriteLine(ub)
        Console.WriteLine(ush)
        Console.WriteLine(ui)
        Console.WriteLine(ul)
        Console.WriteLine(dl)
        Console.WriteLine(al)
    End Sub
End Module

This program is functionally the same as the C# version. It declares a bunch of variables and then prints their string representation to the screen. There are a few syntax differences, though, such as how variables are declared, the lack of semicolons and curly braces, and having to mark a string literal as a character with the c suffix. So let’s see what the compiled IL looks like.

Main

.method public static void Main() cil managed
{
  .entrypoint
  .custom instance void [mscorlib]System.STAThreadAttribute::.ctor() = ( 01 00 00 00 )
  // Code size 240 (0xf0)
  .maxstack 6
  .locals init ([0] bool b,
           [1] char c,
           [2] float32 f,
           [3] float64 d,
           [4] int8 sb,
           [5] int16 sh,
           [6] int32 i,
           [7] int64 l,
           [8] uint8 ub,
           [9] uint16 ush,
           [10] uint32 ui,
           [11] uint64 ul,
           [12] valuetype [mscorlib]System.Decimal dl,
           [13] object o,
           [14] string s,
           [15] class [mscorlib]System.Collections.ArrayList al)
  IL_0000: nop
  IL_0001: ldc.i4.1
  IL_0002: stloc.0
  IL_0003: ldc.i4.s 99
  IL_0005: stloc.1
  IL_0006: ldc.r4 3.4028235e+038
  IL_000b: stloc.2
  IL_000c: ldc.r8 1.7976931348623157e+308
  IL_0015: stloc.3
  IL_0016: ldc.i4.s 127
  IL_0018: stloc.s sb
  IL_001a: ldc.i4 0x7fff
  IL_001f: stloc.s sh
  IL_0021: ldc.i4 0x7fffffff
  IL_0026: stloc.s i
  IL_0028: ldc.i8 0x7fffffffffffffff
  IL_0031: stloc.s l
  IL_0033: ldc.i4 0xff
  IL_0038: stloc.s ub
  IL_003a: ldc.i4 0xffff
  IL_003f: stloc.s ush
  IL_0041: ldc.i4.m1
  IL_0042: stloc.s ui
  IL_0044: ldc.i4.m1
  IL_0045: conv.i8
  IL_0046: stloc.s ul
  IL_0048: ldloca.s dl
  IL_004a: ldc.i4.m1
  IL_004b: ldc.i4.m1
  IL_004c: ldc.i4.m1
  IL_004d: ldc.i4.0
  IL_004e: ldc.i4.0
  IL_004f: call instance void [mscorlib]System.Decimal::.ctor(int32,
                                                              int32,
                                                              int32,
                                                              bool,
                                                              uint8)
  IL_0054: newobj instance void [mscorlib]System.Object::.ctor()
  IL_0059: call object [mscorlib]System.Runtime.CompilerServices.RuntimeHelpers::GetObjectValue(object)
  IL_005e: stloc.s o
  IL_0060: ldstr "s"
  IL_0065: stloc.s s
  IL_0067: newobj instance void [mscorlib]System.Collections.ArrayList::.ctor()
  IL_006c: stloc.s al
  IL_006e: ldloc.0
  IL_006f: call void [mscorlib]System.Console::WriteLine(bool)
  IL_0074: nop
  IL_0075: ldloc.1
  IL_0076: call void [mscorlib]System.Console::WriteLine(char)
  IL_007b: nop
  IL_007c: ldloc.s o
  IL_007e: call object [mscorlib]System.Runtime.CompilerServices.RuntimeHelpers::GetObjectValue(object)
  IL_0083: call void [mscorlib]System.Console::WriteLine(object)
  IL_0088: nop
  IL_0089: ldloc.s s
  IL_008b: call void [mscorlib]System.Console::WriteLine(string)
  IL_0090: nop
  IL_0091: ldloc.2
  IL_0092: call void [mscorlib]System.Console::WriteLine(float32)
  IL_0097: nop
  IL_0098: ldloc.3
  IL_0099: call void [mscorlib]System.Console::WriteLine(float64)
  IL_009e: nop
  IL_009f: ldloc.s sb
  IL_00a1: call void [mscorlib]System.Console::WriteLine(int32)
  IL_00a6: nop
  IL_00a7: ldloc.s sh
  IL_00a9: call void [mscorlib]System.Console::WriteLine(int32)
  IL_00ae: nop
  IL_00af: ldloc.s i
  IL_00b1: call void [mscorlib]System.Console::WriteLine(int32)
  IL_00b6: nop
  IL_00b7: ldloc.s l
  IL_00b9: call void [mscorlib]System.Console::WriteLine(int64)
  IL_00be: nop
  IL_00bf: ldloc.s ub
  IL_00c1: call void [mscorlib]System.Console::WriteLine(int32)
  IL_00c6: nop
  IL_00c7: ldloc.s ush
  IL_00c9: call void [mscorlib]System.Console::WriteLine(int32)
  IL_00ce: nop
  IL_00cf: ldloc.s ui
  IL_00d1: call void [mscorlib]System.Console::WriteLine(uint32)
  IL_00d6: nop
  IL_00d7: ldloc.s ul
  IL_00d9: call void [mscorlib]System.Console::WriteLine(uint64)
  IL_00de: nop
  IL_00df: ldloc.s dl
  IL_00e1: call void [mscorlib]System.Console::WriteLine(valuetype [mscorlib]System.Decimal)
  IL_00e6: nop
  IL_00e7: ldloc.s al
  IL_00e9: call void [mscorlib]System.Console::WriteLine(object)
  IL_00ee: nop
  IL_00ef: ret
} // end of method Module1::Main

That looks almost identical to the compiled C# program from last time, except for the added STAThreadAttribute line and the RuntimeHelpers::GetObjectValue calls around the Object variable. But how can that be? These are two completely different programs in two completely different languages. Well, it’s because they aren’t completely different programs. The actual operations being performed are identical, so it makes sense that the IL generated would be nearly the same.

IL captures the semantics of the code used to generate it, not the syntax. The code to perform a specific operation could be completely different in two different languages, but if those operations are meant to do the same thing then the IL generated will be similar. In the same way, different languages could have features that look the same but work very differently, which would generate different IL. The set of IL features is typically larger than the requirements of any single language, to allow for a wide variety of languages targeting the CLI.

Next time we will learn about stacks and operations.

2016-01-31 - C

The C programming language is really enjoyable to work with. It’s a nice simple language that does very little to get in your way but that also makes it a very error-prone language.

In C there are a limited number of types and ways of interacting with those types. It is possible to define structs, which wrap multiple bits of information, and functions, which wrap a series of operations, but this is very limited compared to the object-oriented aspects of other languages. That being said, it is possible to build these more complex systems out of what is provided. This is one of C’s greatest strengths. It provides a small, easy-to-learn base that is powerful enough to support large complex systems being built on top of it.
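As a rough illustration of building something object-like out of structs and functions, here is a minimal sketch. The Shape type, its fields, and the helper functions are all made up for this example; the function pointer stored in each struct plays the role that a virtual method would play in an object-oriented language.

#include <stddef.h>

/* Hypothetical "object": some data plus a function pointer that acts on it. */
typedef struct Shape {
    double width;
    double height;
    double (*area)(const struct Shape *self); /* chosen per kind of shape */
} Shape;

static double rect_area(const Shape *self) { return self->width * self->height; }
static double tri_area(const Shape *self) { return 0.5 * self->width * self->height; }

/* "Constructors" that pair the data with the matching behaviour. */
Shape make_rect(double w, double h) { Shape s = { w, h, rect_area }; return s; }
Shape make_triangle(double w, double h) { Shape s = { w, h, tri_area }; return s; }

/* Works on any mix of shapes without knowing which kind each one is. */
double total_area(const Shape *shapes, size_t count) {
    double total = 0.0;
    for (size_t i = 0; i < count; i++) {
        total += shapes[i].area(&shapes[i]);
    }
    return total;
}

Nothing in the language enforces that area is ever set, which is exactly the kind of thing the next paragraph is about.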

This does have some potential for issues though. Because the complex systems have to be built from scratch there’s a large chance that mistakes can be made if the people building the complex system are not very careful. C is also not the kind of language that will catch these mistakes for you. Thus as you build the complex systems you also need to design things to catch and prevent mistakes.
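One way to design things to catch mistakes is to put the checks C does not do for you into the functions that touch the data. The sketch below is only an example of the idea; IntBuffer, BUFFER_CAPACITY and both functions are hypothetical names invented here.

#include <stdbool.h>
#include <stddef.h>

#define BUFFER_CAPACITY 4

typedef struct {
    int items[BUFFER_CAPACITY];
    size_t count; /* number of items currently stored */
} IntBuffer;

/* Appends a value; returns false instead of writing past the end of the array. */
bool buffer_push(IntBuffer *buf, int value) {
    if (buf == NULL || buf->count >= BUFFER_CAPACITY) {
        return false;
    }
    buf->items[buf->count++] = value;
    return true;
}

/* Reads an element; returns false for an index that was never written. */
bool buffer_get(const IntBuffer *buf, size_t index, int *out) {
    if (buf == NULL || out == NULL || index >= buf->count) {
        return false;
    }
    *out = buf->items[index];
    return true;
}

The caller still has to check the return value, so this only moves the responsibility around, but it turns a silent memory error into something that can be handled.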

This simplicity and lack of validation comes from C’s heritage as a language for making operating systems. If you are building the base system for controlling the computer, it’s hard to rely on the things that those systems usually provide. C is basically just an abstraction of assembly and as such it provides the same freedom to have the computer do whatever you want it to do. It was meant to make writing these low-level systems easier without limiting the programmer’s options.

I would describe C as a good experimental language. It’s very easy to quickly make a program that does something and in doing so gain a better understanding of how the computer works. It’s not something you want to build complicated programs out of unless you really know what you are doing and need the freedom it provides.

2016-01-16 - In IL: Variables and Types

Variables are important to any program as they determine where information gets stored. Types are equally important as they determine the format of that information. So to start this investigation off we are going to look at how and where information is stored in an IL program. The following C# program declares a bunch of variables and then writes the string representation of those variables to the screen.

Program.cs

using System;
using System.Collections;

namespace Variables
{
    class Program
    {
        static void Main(string[] args)
        {
            bool b = true;
            char c = 'c';
            float f = float.MaxValue;
            double d = double.MaxValue;
            sbyte sb = sbyte.MaxValue;
            short sh = short.MaxValue;
            int i = int.MaxValue;
            long l = long.MaxValue;
            byte ub = byte.MaxValue;
            ushort ush = ushort.MaxValue;
            uint ui = uint.MaxValue;
            ulong ul = ulong.MaxValue;
            decimal dl = decimal.MaxValue;
            object o = new object();
            string s = "s";
            ArrayList al = new ArrayList();

            Console.WriteLine(b);
            Console.WriteLine(c);
            Console.WriteLine(o);
            Console.WriteLine(s);
            Console.WriteLine(f);
            Console.WriteLine(d);
            Console.WriteLine(sb);
            Console.WriteLine(sh);
            Console.WriteLine(i);
            Console.WriteLine(l);
            Console.WriteLine(ub);
            Console.WriteLine(ush);
            Console.WriteLine(ui);
            Console.WriteLine(ul);
            Console.WriteLine(dl);
            Console.WriteLine(al);
        }
    }
}

If you compile this program and then look at the generated IL code using ILDASM or ILSpy, you will see something like the following. Note that you may need to compile it in debug mode to stop the compiler from optimizing out some variables.

Main

.method private hidebysig static void Main(string[] args) cil managed
{
  .entrypoint
  // Code size 230 (0xe6)
  .maxstack 6
  .locals init ([0] bool b,
           [1] char c,
           [2] float32 f,
           [3] float64 d,
           [4] int8 sb,
           [5] int16 sh,
           [6] int32 i,
           [7] int64 l,
           [8] uint8 ub,
           [9] uint16 ush,
           [10] uint32 ui,
           [11] uint64 ul,
           [12] valuetype [mscorlib]System.Decimal dl,
           [13] object o,
           [14] string s,
           [15] class [mscorlib]System.Collections.ArrayList al)
  IL_0000: nop
  IL_0001: ldc.i4.1
  IL_0002: stloc.0
  IL_0003: ldc.i4.s 99
  IL_0005: stloc.1
  IL_0006: ldc.r4 3.4028235e+038
  IL_000b: stloc.2
  IL_000c: ldc.r8 1.7976931348623157e+308
  IL_0015: stloc.3
  IL_0016: ldc.i4.s 127
  IL_0018: stloc.s sb
  IL_001a: ldc.i4 0x7fff
  IL_001f: stloc.s sh
  IL_0021: ldc.i4 0x7fffffff
  IL_0026: stloc.s i
  IL_0028: ldc.i8 0x7fffffffffffffff
  IL_0031: stloc.s l
  IL_0033: ldc.i4 0xff
  IL_0038: stloc.s ub
  IL_003a: ldc.i4 0xffff
  IL_003f: stloc.s ush
  IL_0041: ldc.i4.m1
  IL_0042: stloc.s ui
  IL_0044: ldc.i4.m1
  IL_0045: conv.i8
  IL_0046: stloc.s ul
  IL_0048: ldloca.s dl
  IL_004a: ldc.i4.m1
  IL_004b: ldc.i4.m1
  IL_004c: ldc.i4.m1
  IL_004d: ldc.i4.0
  IL_004e: ldc.i4.0
  IL_004f: call instance void [mscorlib]System.Decimal::.ctor(int32,
                                                              int32,
                                                              int32,
                                                              bool,
                                                              uint8)
  IL_0054: newobj instance void [mscorlib]System.Object::.ctor()
  IL_0059: stloc.s o
  IL_005b: ldstr "s"
  IL_0060: stloc.s s
  IL_0062: newobj instance void [mscorlib]System.Collections.ArrayList::.ctor()
  IL_0067: stloc.s al
  IL_0069: ldloc.0
  IL_006a: call void [mscorlib]System.Console::WriteLine(bool)
  IL_006f: nop
  IL_0070: ldloc.1
  IL_0071: call void [mscorlib]System.Console::WriteLine(char)
  IL_0076: nop
  IL_0077: ldloc.s o
  IL_0079: call void [mscorlib]System.Console::WriteLine(object)
  IL_007e: nop
  IL_007f: ldloc.s s
  IL_0081: call void [mscorlib]System.Console::WriteLine(string)
  IL_0086: nop
  IL_0087: ldloc.2
  IL_0088: call void [mscorlib]System.Console::WriteLine(float32)
  IL_008d: nop
  IL_008e: ldloc.3
  IL_008f: call void [mscorlib]System.Console::WriteLine(float64)
  IL_0094: nop
  IL_0095: ldloc.s sb
  IL_0097: call void [mscorlib]System.Console::WriteLine(int32)
  IL_009c: nop
  IL_009d: ldloc.s sh
  IL_009f: call void [mscorlib]System.Console::WriteLine(int32)
  IL_00a4: nop
  IL_00a5: ldloc.s i
  IL_00a7: call void [mscorlib]System.Console::WriteLine(int32)
  IL_00ac: nop
  IL_00ad: ldloc.s l
  IL_00af: call void [mscorlib]System.Console::WriteLine(int64)
  IL_00b4: nop
  IL_00b5: ldloc.s ub
  IL_00b7: call void [mscorlib]System.Console::WriteLine(int32)
  IL_00bc: nop
  IL_00bd: ldloc.s ush
  IL_00bf: call void [mscorlib]System.Console::WriteLine(int32)
  IL_00c4: nop
  IL_00c5: ldloc.s ui
  IL_00c7: call void [mscorlib]System.Console::WriteLine(uint32)
  IL_00cc: nop
  IL_00cd: ldloc.s ul
  IL_00cf: call void [mscorlib]System.Console::WriteLine(uint64)
  IL_00d4: nop
  IL_00d5: ldloc.s dl
  IL_00d7: call void [mscorlib]System.Console::WriteLine(valuetype [mscorlib]System.Decimal)
  IL_00dc: nop
  IL_00dd: ldloc.s al
  IL_00df: call void [mscorlib]System.Console::WriteLine(object)
  IL_00e4: nop
  IL_00e5: ret
} // end of method Program::Main

Now there is a lot there, but we are just interested in how variables are declared, so we are going to ignore most of it for now. The .locals init directive at the top of the method contains what we are interested in.

Main Variables

.locals init ([0] bool b,
         [1] char c,
         [2] float32 f,
         [3] float64 d,
         [4] int8 sb,
         [5] int16 sh,
         [6] int32 i,
         [7] int64 l,
         [8] uint8 ub,
         [9] uint16 ush,
         [10] uint32 ui,
         [11] uint64 ul,
         [12] valuetype [mscorlib]System.Decimal dl,
         [13] object o,
         [14] string s,
         [15] class [mscorlib]System.Collections.ArrayList al)

.locals is an IL directive that is used to declare the variables of a method. The init keyword indicates that these variables should be initialized to their default values. The declaration of local variables is very similar to how arrays are declared in many languages. It starts with a number in square brackets that denotes the variable’s index. Next there’s a type identifier which denotes the type of the variable. Finally there is the optional name of the variable. IL instructions reference variables either by their index or their name which gets translated to the index when the program is assembled.
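To make the index and name relationship concrete, here is a toy model in C of the first few entries of that table. It is only an analogy: LocalSlot, LOCALS and local_index are names invented for this sketch, and nothing here reflects how a real assembler stores its tables. It just shows the kind of name-to-index lookup described above.

#include <string.h>

/* One entry of a hypothetical local-variable table: [index] type name */
typedef struct {
    int index;
    const char *type;
    const char *name;
} LocalSlot;

/* The first four locals from the listing above. */
static const LocalSlot LOCALS[] = {
    { 0, "bool",    "b" },
    { 1, "char",    "c" },
    { 2, "float32", "f" },
    { 3, "float64", "d" },
};
static const int LOCAL_COUNT = sizeof(LOCALS) / sizeof(LOCALS[0]);

/* Resolves a variable name to its slot index, or -1 if there is no such local. */
int local_index(const char *name) {
    for (int i = 0; i < LOCAL_COUNT; i++) {
        if (strcmp(LOCALS[i].name, name) == 0) {
            return LOCALS[i].index;
        }
    }
    return -1;
}

In this model an instruction written against the name d would end up referring to slot 3, and the name itself would no longer be needed.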

If you look at the rest of the method you can see these indexes being used. For example, at IL_0069 we see “ldloc.0” followed by “call void [mscorlib]System.Console::WriteLine(bool)”. In the C# code we call Console.WriteLine and pass it a bool argument when we go to print the b variable. Looking in the locals array we see “[0] bool b” as the first entry, which we know is the declaration of the b variable with index 0 and type bool. So “ldloc.0” is likely doing something with the b variable so that it can be printed by a call to Console.WriteLine.

Since the .locals directive contains the name of the variable, it is fairly easy to map C# types to IL types. bool is bool, char is char, float is float32, long is int64, etc. These types are built-in types that are inherently understood by the virtual machine. Some are more complicated though, like the C# decimal type declared as “valuetype [mscorlib]System.Decimal” and ArrayList declared as “class [mscorlib]System.Collections.ArrayList”. These are user defined types described in assemblies that can be accessed by the virtual machine to look up their definition. Square brackets surround the name of the assembly in which the type is defined. Following the square brackets is the fully-qualified name of the type, which includes both the name of the type and the namespace in which it is defined.

The types that start with “valuetype” are user defined value types. A value type variable directly contains the data of that variable. The types that start with “class” are user defined reference types. Reference types contain references to a location where the data of that variable is stored. A value type variable can be converted to a reference type variable through a process known as boxing. This involves copying the variable’s data to another location and then creating a reference type variable that references that location. Every value type has a corresponding reference type. The reverse is not generally the case.
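The copy-then-reference idea behind boxing can be sketched in C as a loose analogy. Box, box_value, unbox_value and box_free are hypothetical names for this example only. A real boxed object in the runtime also carries type information and is managed by the garbage collector, neither of which is modelled here.

#include <stdlib.h>
#include <string.h>

/* Hypothetical "box": a heap allocation holding a copy of some value. */
typedef struct {
    size_t size; /* number of bytes stored in data */
    void *data;  /* heap copy of the original value */
} Box;

/* Copies size bytes from value to a new location and returns a reference to it. */
Box *box_value(const void *value, size_t size) {
    Box *box = malloc(sizeof *box);
    if (box == NULL) {
        return NULL;
    }
    box->data = malloc(size);
    if (box->data == NULL) {
        free(box);
        return NULL;
    }
    memcpy(box->data, value, size);
    box->size = size;
    return box;
}

/* Copies the boxed bytes back out into a plain variable. */
void unbox_value(const Box *box, void *out) {
    memcpy(out, box->data, box->size);
}

void box_free(Box *box) {
    if (box != NULL) {
        free(box->data);
        free(box);
    }
}

Because the box holds a copy, changing the original variable afterwards does not change what the box refers to.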

The built-in types can also be classified as either value or reference types. The following table describes some of the built in types and specifies what kind of type they are.

Built-in Type	Description	Kind
bool	True/false value	Value type
char	Unicode 16-bit char.	Value type
float32	IEC 60559:1989 32-bit float	Value type
float64	IEC 60559:1989 64-bit float	Value type
int8	Signed 8-bit integer	Value type
int16	Signed 16-bit integer	Value type
int32	Signed 32-bit integer	Value type
int64	Signed 64-bit integer	Value type
uint8	Unsigned 8-bit integer	Value type
uint16	Unsigned 16-bit integer	Value type
uint32	Unsigned 32-bit integer	Value type
uint64	Unsigned 64-bit integer	Value type
object	Base of all types	Reference Type
string	Unicode string	Reference Type
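The integer rows of this table line up with the fixed-width types in C’s stdint.h, which gives a way to check some of the constants in the IL listing. The sketch below is an illustration only, with made-up function names. It mimics what the listing does for the uint32 and uint64 maximums, where it loads -1 with ldc.i4.m1 rather than a large positive constant.

#include <stdint.h>

/* ldc.i4.m1 pushes the 32-bit value -1; stored in a uint32 slot, the same bits read as the maximum. */
uint32_t load_minus_one_as_uint32(void) {
    int32_t pushed = -1;     /* ldc.i4.m1 */
    return (uint32_t)pushed; /* stloc into a uint32 local */
}

/* ldc.i4.m1 followed by conv.i8 sign-extends -1 to 64 bits before the store. */
uint64_t load_minus_one_as_uint64(void) {
    int32_t pushed = -1;               /* ldc.i4.m1 */
    int64_t widened = (int64_t)pushed; /* conv.i8 */
    return (uint64_t)widened;          /* stloc into a uint64 local */
}

Both functions return the all-ones bit pattern, which is UINT32_MAX and UINT64_MAX respectively. The other constants in the listing (127, 0x7fff, 0x7fffffff, 0x7fffffffffffffff, 0xff, 0xffff) are the values of INT8_MAX, INT16_MAX, INT32_MAX, INT64_MAX, UINT8_MAX and UINT16_MAX.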

Next time we will see how IL variables work with a Visual Basic .NET program.

2015-11-22 - SQL

Structured Query Language (SQL) is a family of languages used for interacting with databases. It was originally developed by IBM in the 1970s and standardized in 1986. SQL allows users to retrieve, create and modify data in a database. These queries are commonly made against a database engine that transforms the query into a series of actions that are required to complete the requested operation.

One of the best things about SQL is its query syntax. For simple operations these queries are very natural sounding, which makes them easy to work with and understand. For example, to get column1 and column2 from table1 where column1 is equal to 2 you would write this.

SELECT
    column1,
    column2
FROM table1
WHERE column1 = 2

If you wanted the results to be ordered by column2 you would change it to this.

SELECT
    column1,
    column2
FROM table1
WHERE column1 = 2
ORDER BY column2

This makes SQL a very approachable language. Simple operations are simple to perform and it’s only when you start wanting to do more complex things that the queries start becoming more complicated.

The main reason SQL can have such simple syntax is because it’s a declarative language. This means that instead of telling the database engine how to do something, you declare what you want to happen. It’s the database engine’s job to figure out how to do it. In a lot of languages doing something a bit weird feels like cheating, like you are doing something you shouldn’t be doing. In SQL it feels really good when you get a weird query to work exactly how you want it to. I think this is largely because you aren’t saying how it should be done, so you don’t feel like you are forcing it to do something. Instead you are just finding the best way to describe what you want to happen.
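For contrast, here is roughly what the ordered query might look like if you had to spell out the steps yourself. This is a sketch in C under some assumptions: the Row struct and select_where_ordered are invented for the example, the table is just an array in memory, and the strategy (scan everything, then sort) is only one naive option. A real database engine is free to pick a different plan, for instance by using an index.

#include <stddef.h>
#include <stdlib.h>

typedef struct {
    int column1;
    int column2;
} Row;

/* qsort comparator: ascending by column2 (the ORDER BY step). */
static int compare_column2(const void *a, const void *b) {
    const Row *ra = a;
    const Row *rb = b;
    return (ra->column2 > rb->column2) - (ra->column2 < rb->column2);
}

/*
 * Copies rows with column1 == wanted into out (the WHERE step), sorts them,
 * and returns how many rows matched. out must have room for count rows.
 */
size_t select_where_ordered(const Row *table, size_t count, int wanted, Row *out) {
    size_t matched = 0;
    for (size_t i = 0; i < count; i++) {
        if (table[i].column1 == wanted) {
            out[matched++] = table[i];
        }
    }
    qsort(out, matched, sizeof(Row), compare_column2);
    return matched;
}

Every decision here (loop order, where the results go, which sort to use) is something the SQL version leaves to the engine.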

SQL is a standard but is still somewhat un-standardized. All SQL languages are derived from the standard and have a lot in common but they aren’t all the same. The main differences come from the addition of procedural elements which are required for writing scripts that make decisions or perform operations multiple times. These procedural elements weren’t a part of the original standard so they were implemented differently by individual database systems. Microsoft uses Transact-SQL (T-SQL) for their SQL Server and Oracle uses Procedural Language/Structured Query Language (PL/SQL) for their database products. A standard for these procedural elements was created as SQL/Persistent Stored Modules (SQL/PSM) which some databases, such as MySQL, base their versions on. There are also slight syntax and data-type changes between implementations. This means that going from one database system to another does require some new knowledge. A lot of the basics are the same but you need to be aware of the actual implementation differences.

I personally like SQL. It’s a very quirky language but in a lot of ways that makes it more enjoyable to work with. You can do basic things without a lot of knowledge but you need to really understand the particular database you are working with to do really advanced things well. The more time you put into understanding the quirks the more endearing the language is.
