2015-11-08 - In IL: Introduction

The Common Intermediate Language (CIL), also known as Microsoft Intermediate Language (MSIL) or usually just Intermediate Language (IL), is the language of the virtual machine component of the Common Language Infrastructure (CLI) standard. The .NET Framework and Mono are implementations of this standard.

The virtual machine is a lot like a computer but implemented in software. It executes IL programs by compiling them into native code. That native code is the language of the computer that the program is running on and represents the operations that it understands. The virtual machine also handles a lot of tasks that would normally need to be done by the program itself, such as allocating and freeing memory. This simplifies programming and also provides isolation between the program and the actual computer.

IL provides an intermediate step between the higher level languages and the native code of the computer. It provides classes and functions, which are common in high level languages, while at the same time having a syntax and structure more like native code. This makes it familiar to people approaching it from either the higher level language side or the native code side, so both can work with it easily. It also separates responsibilities: the language designers only need to focus on the compiler that produces IL, while the designers of the virtual machine implementation focus on translating the IL into native code. Any improvements to a compiler benefit all the implementations it supports. Any improvements to the virtual machine benefit all the languages that target it.

In this series of posts I want to investigate how languages that target the .NET Framework (C#, Visual Basic .NET) compile into IL code. The idea is to learn a little bit about IL, how the compiler works, and how .NET programs actually run. This information should be useful when designing or debugging programs and may even lead to some ideas about how to build a compiler.

Should be fun.

2015-10-18 - Character Encoding: Conclusion

I actually don’t spend a lot of time dealing with character encodings. It comes up a little bit when dealing with files but even then it’s just a matter of selecting the right encoding. So then why did I spend the time to investigate and write about character encodings?

Partially it’s because it’s a good thing to be aware of. It’s useful to understand what selecting the encoding is doing. Also the problem of how to store non-numeric data comes up a lot in programming and character encodings serve as a good example of that problem.

That being said, the main thing is how it shows that the best solution to a problem can change over time. The designers of ASCII decided that saving space was more important than the number of characters, so they limited themselves to 7 bits. As time went on this situation changed; space became cheaper while the desire for more characters grew. This led to Extended ASCII and Unicode with their 8 bit, 16 bit, and 32 bit characters. Then things partially flipped around again. With the internet and data being sent all over the world, space savings became important again, but people didn’t want to lose their extra characters. This led to the UTF formats, which went for space savings under common circumstances at the cost of added complexity.
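To put some numbers on that trade-off, here is a small sketch that counts how many bytes the same text needs in UTF-8, UTF-16, and UTF-32. The class and method names are made up for this example; the Encoding calls are the standard ones from System.Text.

```csharp
using System;
using System.Text;

public static class EncodingSizes
{
    // Returns the number of bytes needed to store the text in each encoding.
    // GetByteCount does not include a byte order mark.
    public static (int Utf8, int Utf16, int Utf32) ByteCounts(string text)
    {
        return (
            Encoding.UTF8.GetByteCount(text),
            Encoding.Unicode.GetByteCount(text), // Encoding.Unicode is UTF-16
            Encoding.UTF32.GetByteCount(text));
    }
}
```

For plain English text such as "hello", UTF-8 needs one byte per character while UTF-16 needs two and UTF-32 needs four. For text like "\u65e5\u672c" (two Japanese characters) UTF-8 needs three bytes per character, which is more than UTF-16 uses. That is the "space savings under common circumstances" part.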

ASCII and Extended ASCII don’t fit current needs because the limited character set and the need for code pages complicate sharing information around the world. Similarly, UTF-8 wouldn’t have worked for early computers and Teletype machines because the variable width characters would have made it excessively complicated to implement on the hardware of the time.

I find problems that don’t have singular answers to be the most interesting: problems where the specifics of the situation affect the requirements, and multiple solutions can work simultaneously in different situations. All of these encodings are in use today. Some are used more than others, but the introduction of newer encodings hasn’t destroyed those that came before them. One of the goals of UTF-8 was to be compatible with ASCII for this reason.
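The ASCII compatibility is easy to check for yourself. This is only a sketch, with a helper name I made up, that compares the bytes the two encodings produce for the same string.

```csharp
using System.Linq;
using System.Text;

public static class AsciiCompatibility
{
    // True when the text encodes to exactly the same bytes in ASCII and UTF-8.
    public static bool SameBytes(string text)
    {
        byte[] ascii = Encoding.ASCII.GetBytes(text);
        byte[] utf8 = Encoding.UTF8.GetBytes(text);
        return ascii.SequenceEqual(utf8);
    }
}
```

Text that only uses ASCII characters, such as "Hello, world!", comes out identical. Anything outside that range, such as "caf\u00e9", does not, because ASCII has no way to represent the extra character.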

I’d also like to point out that this is not an exhaustive list. There are a bunch of other character encodings. Even within Unicode there are other transformations and variations of transformations. These are just the most common ones that I know of and have encountered.

2015-10-04 - Seriously Fun TV

I really like TV. I watch a fair bit of it, and from all that watching I’ve found that there’s a recipe to the shows that I enjoy. The best TV shows are those that start with a large scoop of drama and then add in a few spoonfuls of comedy on top of it.

I like stories and characters. Those are the things I find interesting about TV shows. So I tend to prefer dramas because they usually have better stories and characters. The problem is that shows that are pure drama tend to be very dark and depressing. They are just about bad things happening to the main characters and their trying to deal with things before the next bad things happen. I personally don’t like shows that make me depressed.

That’s where the comedy comes in. A little bit of silliness helps to break up the drama in the story and make the characters more likable. The comedy keeps things from being too depressing. The show can’t just be a pure comedy though because then you lose the stories and characters. They just become a series of jokes and as funny as they may be they aren’t really interesting.

So my favourite shows are those that have good stories and characters but aren’t too serious.

2015-08-09 - Unity Coroutines

I’ve been looking into the Unity game engine lately. While doing so I came across the concept of a “Coroutine”. In Unity most things are done in an update function in a class attached to an object, which gets called every frame. This is good for a lot of things but not for actions that should occur over a period of time. Coroutines give the ability to insert delays between operations so that you can better control when updates occur. The Unity manual says that a coroutine is “a function declared with a return type of IEnumerator and with the yield return statement included somewhere in the body. […] To set a coroutine running, you need to use the StartCoroutine function:” The example they give is something similar to this.

Unity Coroutine Function

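IEnumerator Fade() {
    for (float f = 1f; f >= 0; f -= 0.1f) {
        // Copy the colour, change its alpha, and assign it back.
        Color c = GetComponent<Renderer>().material.color;
        c.a = f;
        GetComponent<Renderer>().material.color = c;
        // Pause here and resume roughly a tenth of a second later.
        yield return new WaitForSeconds(.1f);
    }
}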

And the coroutine would be started like this.

Start Coroutine Function

StartCoroutine(Fade());

Now I find the requirement that a coroutine has to be a function with yield returns to be very interesting. For one thing, the caller of a function doesn’t generally know what the function is doing. For another, the coroutine function is being called and its return value is passed to StartCoroutine, not the function itself. If the documentation says it then it must be correct though. Let’s see if the compiled version of the function gives us any clues as to how StartCoroutine is detecting that the called function contains yield returns.
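To make the "return value, not the function" point concrete, here is a small sketch in plain C# with no Unity involved. All of the names are invented for the example. It records the order in which things happen when an iterator function is called and then stepped by hand.

```csharp
using System.Collections;
using System.Collections.Generic;

public static class IteratorDemo
{
    // Records what happens so the order of events can be inspected.
    private static readonly List<string> Log = new List<string>();

    // An ordinary C# iterator function, standing in for a coroutine.
    public static IEnumerator Count()
    {
        Log.Add("body started");
        yield return 1;
        Log.Add("after first yield");
        yield return 2;
    }

    public static List<string> Run()
    {
        Log.Clear();
        IEnumerator e = Count();   // only creates an object, the body has not run
        Log.Add("got enumerator");
        while (e.MoveNext())       // each call runs the body up to the next yield
        {
            Log.Add("yielded " + e.Current);
        }
        return new List<string>(Log);
    }
}
```

Calling IteratorDemo.Run() gives "got enumerator" before "body started". So whoever receives the result of Count() is only ever holding an IEnumerator, and it is the calls to MoveNext() that actually run the code.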

Compiled Coroutine Function

.method private hidebysig
    instance class [mscorlib]System.Collections.IEnumerator Fade () cil managed
{
    // Method begins at RVA 0x1b89c
    // Code size 20 (0x14)
    .maxstack 2
    .locals init (
        [0] class Fader/'<Fade>d__0',
        [1] class [mscorlib]System.Collections.IEnumerator
    )

    IL_0000: ldc.i4.0
    IL_0001: newobj instance void Fader/'<Fade>d__0'::.ctor(int32)
    IL_0006: stloc.0
    IL_0007: ldloc.0
    IL_0008: ldarg.0
    IL_0009: stfld class Fader Fader/'<Fade>d__0'::'<>4__this'
    IL_000e: ldloc.0
    IL_000f: stloc.1
    IL_0010: br.s IL_0012
    IL_0012: ldloc.1
    IL_0013: ret
} // end of method Fader::Fade


This is Common Intermediate Language (CIL) code. It is what C# usually gets compiled into. It’s basically an assembly language for a virtual machine. When the program runs, it starts up an instance of the virtual machine, which reads the IL and generates actual native code. IL is useful for us because it is somewhat readable and gives us clues as to what is actually going to happen when the code is run.

Looking at the function we notice some strange things. The compiled function doesn’t have any yield returns. It doesn’t even look like the code to change colour is there. All it does is create and return a Fader/'<Fade>d__0' object. Well let’s go look at that class.

Compiled Coroutine Class

.class nested private auto ansi sealed beforefieldinit '<Fade>d__0'
    extends [mscorlib]System.Object
    implements class [mscorlib]System.Collections.Generic.IEnumerator`1<object>,
               [mscorlib]System.Collections.IEnumerator,
               [mscorlib]System.IDisposable
{
    .custom instance void [mscorlib]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor() = (
        01 00 00 00
    )

    // Fields
    .field private object '<>2__current'
    .field private int32 '<>1__state'
    .field public class Fader '<>4__this'
    .field public float32 '<f>5__1'
    .field public valuetype [UnityEngine]UnityEngine.Color '<c>5__2'

    // Methods
    .method private final hidebysig newslot virtual
        instance bool MoveNext () cil managed
    {
        .override method instance bool [mscorlib]System.Collections.IEnumerator::MoveNext()
        // Method begins at RVA 0x1b770
        // Code size 212 (0xd4)
        .maxstack 3
        .locals init (
            [0] bool CS$1$0000,
            [1] int32 CS$4$0001,
            [2] bool CS$4$0002
        )

        IL_0000: ldarg.0
        IL_0001: ldfld int32 Fader/'<Fade>d__0'::'<>1__state'
        IL_0006: stloc.1
        //int loc1 = <>1__state;
        IL_0007: ldloc.1
        IL_0008: switch (IL_001c, IL_0017)
        //switch(loc1)
        IL_0015: br.s IL_001e
        IL_0017: br IL_009c
        IL_001c: br.s IL_0023
        IL_001e: br IL_00ce
        IL_0023: ldarg.0
        IL_0024: ldc.i4.m1
        IL_0025: stfld int32 Fader/'<Fade>d__0'::'<>1__state'
        IL_002a: nop
        IL_002b: ldarg.0
        IL_002c: ldc.r4 1
        IL_0031: stfld float32 Fader/'<Fade>d__0'::'<f>5__1'
        IL_0036: br.s IL_00b6
        IL_0038: nop
        IL_0039: ldarg.0
        IL_003a: ldarg.0
        IL_003b: ldfld class Fader Fader/'<Fade>d__0'::'<>4__this'
        IL_0040: call instance !!0 [UnityEngine]UnityEngine.Component::GetComponent<class [UnityEngine]UnityEngine.Renderer>()
        IL_0045: callvirt instance class [UnityEngine]UnityEngine.Material [UnityEngine]UnityEngine.Renderer::get_material()
        IL_004a: callvirt instance valuetype [UnityEngine]UnityEngine.Color [UnityEngine]UnityEngine.Material::get_color()
        IL_004f: stfld valuetype [UnityEngine]UnityEngine.Color Fader/'<Fade>d__0'::'<c>5__2'
        IL_0054: ldarg.0
        IL_0055: ldflda valuetype [UnityEngine]UnityEngine.Color Fader/'<Fade>d__0'::'<c>5__2'
        IL_005a: ldarg.0
        IL_005b: ldfld float32 Fader/'<Fade>d__0'::'<f>5__1'
        IL_0060: stfld float32 [UnityEngine]UnityEngine.Color::a
        IL_0065: ldarg.0
        IL_0066: ldfld class Fader Fader/'<Fade>d__0'::'<>4__this'
        IL_006b: call instance !!0 [UnityEngine]UnityEngine.Component::GetComponent<class [UnityEngine]UnityEngine.Renderer>()
        IL_0070: callvirt instance class [UnityEngine]UnityEngine.Material [UnityEngine]UnityEngine.Renderer::get_material()
        IL_0075: ldarg.0
        IL_0076: ldfld valuetype [UnityEngine]UnityEngine.Color Fader/'<Fade>d__0'::'<c>5__2'
        IL_007b: callvirt instance void [UnityEngine]UnityEngine.Material::set_color(valuetype [UnityEngine]UnityEngine.Color)
        IL_0080: nop
        IL_0081: ldarg.0
        IL_0082: ldc.r4 0.1
        IL_0087: newobj instance void [UnityEngine]UnityEngine.WaitForSeconds::.ctor(float32)
        IL_008c: stfld object Fader/'<Fade>d__0'::'<>2__current'
        IL_0091: ldarg.0
        IL_0092: ldc.i4.1
        IL_0093: stfld int32 Fader/'<Fade>d__0'::'<>1__state'
        IL_0098: ldc.i4.1
        IL_0099: stloc.0
        IL_009a: br.s IL_00d2
        IL_009c: ldarg.0
        IL_009d: ldc.i4.m1
        IL_009e: stfld int32 Fader/'<Fade>d__0'::'<>1__state'
        IL_00a3: nop
        IL_00a4: ldarg.0
        IL_00a5: dup
        IL_00a6: ldfld float32 Fader/'<Fade>d__0'::'<f>5__1'
        IL_00ab: ldc.r4 0.1
        IL_00b0: sub
        IL_00b1: stfld float32 Fader/'<Fade>d__0'::'<f>5__1'
        IL_00b6: ldarg.0
        IL_00b7: ldfld float32 Fader/'<Fade>d__0'::'<f>5__1'
        IL_00bc: ldc.r4 0.0
        IL_00c1: clt.un
        IL_00c3: ldc.i4.0
        IL_00c4: ceq
        IL_00c6: stloc.2
        IL_00c7: ldloc.2
        IL_00c8: brtrue IL_0038
        IL_00cd: nop
        IL_00ce: ldc.i4.0
        IL_00cf: stloc.0
        IL_00d0: br.s IL_00d2
        IL_00d2: ldloc.0
        IL_00d3: ret
    } // end of method '<Fade>d__0'::MoveNext

    .method private final hidebysig specialname newslot virtual
        instance object 'System.Collections.Generic.IEnumerator<System.Object>.get_Current' () cil managed
    {
        .custom instance void [mscorlib]System.Diagnostics.DebuggerHiddenAttribute::.ctor() = (
            01 00 00 00
        )
        .override method instance !0 class [mscorlib]System.Collections.Generic.IEnumerator`1<object>::get_Current()
        // Method begins at RVA 0x1b850
        // Code size 11 (0xb)
        .maxstack 1
        .locals init (
            [0] object
        )

        IL_0000: ldarg.0
        IL_0001: ldfld object Fader/'<Fade>d__0'::'<>2__current'
        IL_0006: stloc.0
        IL_0007: br.s IL_0009
        IL_0009: ldloc.0
        IL_000a: ret
        //return <>2__current;
    } // end of method '<Fade>d__0'::'System.Collections.Generic.IEnumerator<System.Object>.get_Current'

    .method private final hidebysig newslot virtual
        instance void System.Collections.IEnumerator.Reset () cil managed
    {
        .custom instance void [mscorlib]System.Diagnostics.DebuggerHiddenAttribute::.ctor() = (
            01 00 00 00
        )
        .override method instance void [mscorlib]System.Collections.IEnumerator::Reset()
        // Method begins at RVA 0x1b867
        // Code size 6 (0x6)
        .maxstack 8

        IL_0000: newobj instance void [mscorlib]System.NotSupportedException::.ctor()
        IL_0005: throw
        //throw new NotSupportedException();
    } // end of method '<Fade>d__0'::System.Collections.IEnumerator.Reset

    .method private final hidebysig newslot virtual
        instance void System.IDisposable.Dispose () cil managed
    {
        .override method instance void [mscorlib]System.IDisposable::Dispose()
        // Method begins at RVA 0x1b86e
        // Code size 2 (0x2)
        .maxstack 8

        IL_0000: nop
        IL_0001: ret
    } // end of method '<Fade>d__0'::System.IDisposable.Dispose

    .method private final hidebysig specialname newslot virtual
        instance object System.Collections.IEnumerator.get_Current () cil managed
    {
        .custom instance void [mscorlib]System.Diagnostics.DebuggerHiddenAttribute::.ctor() = (
            01 00 00 00
        )
        .override method instance object [mscorlib]System.Collections.IEnumerator::get_Current()
        // Method begins at RVA 0x1b874
        // Code size 11 (0xb)
        .maxstack 1
        .locals init (
            [0] object
        )

        IL_0000: ldarg.0
        IL_0001: ldfld object Fader/'<Fade>d__0'::'<>2__current'
        IL_0006: stloc.0
        IL_0007: br.s IL_0009
        IL_0009: ldloc.0
        IL_000a: ret
    } // end of method '<Fade>d__0'::System.Collections.IEnumerator.get_Current

    .method public hidebysig specialname rtspecialname
        instance void .ctor (
            int32 '<>1__state'
        ) cil managed
    {
        .custom instance void [mscorlib]System.Diagnostics.DebuggerHiddenAttribute::.ctor() = (
            01 00 00 00
        )
        // Method begins at RVA 0x1b88b
        // Code size 14 (0xe)
        .maxstack 8

        IL_0000: ldarg.0
        IL_0001: call instance void [mscorlib]System.Object::.ctor()
        IL_0006: ldarg.0
        IL_0007: ldarg.1
        IL_0008: stfld int32 Fader/'<Fade>d__0'::'<>1__state'
        IL_000d: ret
    } // end of method '<Fade>d__0'::.ctor

    // Properties
    .property instance object 'System.Collections.Generic.IEnumerator<System.Object>.Current'()
    {
        .get instance object Fader/'<Fade>d__0'::'System.Collections.Generic.IEnumerator<System.Object>.get_Current'()
    }
    .property instance object System.Collections.IEnumerator.Current()
    {
        .get instance object Fader/'<Fade>d__0'::System.Collections.IEnumerator.get_Current()
    }
} // end of class <Fade>d__0

Wow, that’s a lot of stuff. It appears to be a class implementing the IEnumerator interface with the colour change logic in its MoveNext() function. So where did all this code come from? The compiler put it in there. It took the original function and generated a class based on the return type. The body of the function got transformed into a state driven enumerator, with the Current property being set where the yield returns used to be. So if the yield return code gets compiled into an IEnumerator class, then how can StartCoroutine require a function that contains yield returns? Well, clearly it can’t. If you took that compiled class and rewrote it in C# it would look something like this.

Unity Coroutine Class

using System;
using System.Collections;
using UnityEngine;

public class FaderEnumerator : IEnumerator {
    private object _current;
    private int _state;
    public Fader _this;
    public float _f;
    public Color _c;

    public object Current {
        get {
            return _current;
        }
    }

    public FaderEnumerator(int state)
    {
        _state = state;
    }

    public bool MoveNext ()
    {
        switch (_state) {
        case 0:
            // First call: run the loop initialiser.
            _state = -1;
            _f = 1;
            break;
        case 1:
            // Resuming after a yield: run the loop increment.
            _state = -1;
            _f -= 0.1f;
            break;
        default:
            // Any other state means there is nothing left to do.
            return false;
        }
        if(_f >= 0)
        {
            _c = _this.GetComponent<Renderer>().material.color;
            _c.a = _f;
            _this.GetComponent<Renderer>().material.color = _c;
            // This is where the yield return used to be.
            _current = new WaitForSeconds(.1f);
            _state = 1;
            return true;
        }
        return false;
    }

    public void Reset ()
    {
        throw new NotSupportedException();
    }
}

To start this coroutine you would do this.

Start Coroutine Class

var faderEnumerator = new FaderEnumerator(0);
faderEnumerator._this = this;
StartCoroutine(faderEnumerator);

These two snippets are roughly equivalent to the original two.
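I don’t know how StartCoroutine is implemented internally, but it helps me to imagine something like the sketch below. Everything in it is hypothetical: MiniScheduler, WaitTicks, and SchedulerDemo are names I invented, and this is not Unity’s actual code. The point is only that a scheduler like this needs nothing more than an IEnumerator, however it was produced.

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical stand-in for a yield instruction such as WaitForSeconds.
public sealed class WaitTicks
{
    public readonly int Ticks;
    public WaitTicks(int ticks) { Ticks = ticks; }
}

// Hypothetical scheduler, only a guess at the general shape of the idea.
public sealed class MiniScheduler
{
    private sealed class Entry
    {
        public IEnumerator Routine;
        public int Sleep; // ticks left before the routine is resumed
    }

    private readonly List<Entry> _entries = new List<Entry>();

    public int ActiveCount { get { return _entries.Count; } }

    // Accepts any IEnumerator, compiler generated or written by hand.
    public void Start(IEnumerator routine)
    {
        _entries.Add(new Entry { Routine = routine });
    }

    // Called once per "frame".
    public void Tick()
    {
        for (int i = _entries.Count - 1; i >= 0; i--)
        {
            Entry entry = _entries[i];
            if (entry.Sleep > 0) { entry.Sleep--; continue; }

            if (!entry.Routine.MoveNext())
            {
                _entries.RemoveAt(i); // the routine has finished
                continue;
            }

            // Whatever was yielded decides how long to wait.
            WaitTicks wait = entry.Routine.Current as WaitTicks;
            if (wait != null) entry.Sleep = wait.Ticks;
        }
    }
}

public static class SchedulerDemo
{
    // A small iterator function to feed into the scheduler.
    public static IEnumerator Steps(List<int> log)
    {
        log.Add(1);
        yield return new WaitTicks(2);
        log.Add(2);
    }
}
```

Starting SchedulerDemo.Steps(log) and calling Tick() once runs the first part of the routine. The next two ticks are skipped because of the wait, and the fourth tick finishes it. A hand-written IEnumerator class would be driven in exactly the same way.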

So is the documentation wrong? Well, no. The documentation says a coroutine is a function with yield returns because Unity decides what “coroutine” means in its engine and can define it however it wishes. At the same time it isn't being entirely honest though. Clearly you can have a coroutine that isn't a function containing yield returns, so it can't really be said that's a requirement.

There are a few reasons I bring this up. Firstly, I believe it’s important to understand what abstractions are doing when you use them. The more you understand about what your code is doing, the easier it is to debug. The second reason is that the IEnumerator class version is probably more reusable. Since you are working with a whole class there are more options for controlling how it operates. You could create a generic class that could apply similar logic to many different types of things.
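As a rough idea of what such a reusable class could look like, here is a sketch of a hand-written enumerator that counts a value down and hands each step to whatever action you give it. CountdownEnumerator and its parameters are my own invention, and it deliberately avoids any Unity types so it can be tried anywhere.

```csharp
using System;
using System.Collections;

// Hypothetical reusable enumerator: counts a value down from 'start' to 'end'
// and passes each value to a caller-supplied action.
public sealed class CountdownEnumerator : IEnumerator
{
    private readonly float _start;
    private readonly float _end;
    private readonly float _step;
    private readonly Action<float> _apply;
    private readonly object _wait; // yielded after every step
    private float _value;
    private bool _started;

    public CountdownEnumerator(float start, float end, float step,
                               Action<float> apply, object wait)
    {
        _start = start;
        _end = end;
        _step = step;
        _apply = apply;
        _wait = wait;
    }

    public object Current { get; private set; }

    public bool MoveNext()
    {
        // The first call initialises the value, later calls decrement it.
        if (!_started) { _value = _start; _started = true; }
        else { _value -= _step; }

        if (_value < _end) return false;

        _apply(_value);
        Current = _wait;
        return true;
    }

    public void Reset()
    {
        _started = false;
        Current = null;
    }
}
```

In Unity I would expect the fade to become something like passing an action that sets the alpha and a WaitForSeconds as the wait object, then handing the enumerator to StartCoroutine. The same class could then drive a volume fade or a countdown timer just by swapping the action.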

Finally, because the compiler is generating code, you need to be careful what you do with yield returns. The compiler is smart and it’s going to do its best to do what you are telling it to do, but it’s not perfect. There are probably some scenarios where the compiler will refuse to compile the function or, worse, it will compile but not do exactly what you want it to do. Because you can’t see the code it generates, you won’t be able to see exactly what it’s doing.

Also I find it fascinating to look at IL.
