2015-10-18 - Character Encoding: Conclusion

I actually don’t spend a lot of time dealing with character encodings. It comes up a little bit when dealing with files but even then it’s just a matter of selecting the right encoding. So then why did I spend the time to investigate and write about character encodings?

Partially it’s because it’s a good thing to be aware of. It’s useful to understand what selecting the encoding is doing. Also the problem of how to store non-numeric data comes up a lot in programming and character encodings serve as a good example of that problem.

That being said, the main thing is how it shows that the best solution to a problem can change over time. The designers of ASCII decided that saving space was more important than the number of characters, so they limited themselves to 7 bits. As time went on this situation changed; space became cheaper while the desire for more characters grew. This led to Extended ASCII and Unicode with their 8 bit, 16 bit, and 32 bit characters. Then things partially flipped around again. With the internet and data being sent all over the world, space savings became important again, but people didn’t want to lose their extra characters. This led to the UTF formats, which went for space savings under common circumstances at the cost of added complexity.

ASCII and Extended ASCII don’t fit current needs because the limited character set and the need for code pages complicate sharing information around the world. Similarly, UTF-8 wouldn’t have worked for early computers and Teletype machines because the variable width characters would have made it excessively complicated to implement on the hardware of the time.

I find problems that don’t have singular answers to be the most interesting: problems where the specifics of the situation impact the requirements, and multiple solutions can work simultaneously in different situations. All of these encodings are in use today. Some are used more than others, but the introduction of newer encodings hasn’t destroyed those that came before them. One of the goals of UTF-8 was to be compatible with ASCII for this reason.
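As a rough illustration of that trade-off, here is a small C# sketch using the standard System.Text.Encoding classes. The class and method names (EncodingSizes, Utf8Bytes and so on) are made up for this example. It counts how many bytes the same text needs in UTF-8, UTF-16, and UTF-32, and checks that plain ASCII text comes out byte-for-byte identical in UTF-8.

```csharp
using System;
using System.Linq;
using System.Text;

public static class EncodingSizes
{
    // Number of bytes needed to store the text in each encoding (no byte order mark).
    public static int Utf8Bytes(string text) => Encoding.UTF8.GetByteCount(text);
    public static int Utf16Bytes(string text) => Encoding.Unicode.GetByteCount(text);
    public static int Utf32Bytes(string text) => Encoding.UTF32.GetByteCount(text);

    // True when the ASCII and UTF-8 encodings of the text are byte-for-byte identical.
    public static bool SameInAsciiAndUtf8(string text) =>
        Encoding.ASCII.GetBytes(text).SequenceEqual(Encoding.UTF8.GetBytes(text));

    public static void Main()
    {
        // "A" is plain ASCII, "\u00e9" is e with an acute accent, "\u20ac" is the euro sign.
        foreach (var text in new[] { "A", "\u00e9", "\u20ac" })
        {
            Console.WriteLine(
                $"U+{(int)text[0]:X4}: UTF-8 {Utf8Bytes(text)}, " +
                $"UTF-16 {Utf16Bytes(text)}, UTF-32 {Utf32Bytes(text)} bytes");
        }
        Console.WriteLine(SameInAsciiAndUtf8("Hello"));
    }
}
```

The ASCII letter costs one byte in UTF-8 and four in UTF-32, while the euro sign costs three bytes in UTF-8. That is the “space savings under common circumstances at the cost of added complexity” trade in miniature.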

I’d also like to point out that this is not an exhaustive list. There are a bunch of other character encodings. Even within Unicode there are other transformations and variations of transformations. These are just the most common ones that I know of and have encountered.

2015-10-04 - Seriously Fun TV

I really like TV. I watch a fair bit of it, and from all that watching I’ve found that there’s a recipe to the shows that I enjoy. The best TV shows are those that start with a large scoop of drama and then add in a few spoonfuls of comedy on top of it.

I like stories and characters. Those are the things I find interesting about TV shows. So I tend to prefer dramas because they usually have better stories and characters. The problem is that shows that are pure drama tend to be very dark and depressing. They are just about bad things happening to the main characters and their trying to deal with things before the next bad things happen. I personally don’t like shows that make me depressed.

That’s where the comedy comes in. A little bit of silliness helps to break up the drama in the story and make the characters more likable. The comedy keeps things from being too depressing. The show can’t just be a pure comedy though because then you lose the stories and characters. They just become a series of jokes and as funny as they may be they aren’t really interesting.

So my favourite shows are those that have good stories and characters but aren’t too serious.

2015-08-09 - Unity Coroutines

I’ve been looking into the Unity game engine lately. While doing so I came across the concept of a “Coroutine”. In Unity most things are done in an update function, in a class attached to an object, which gets called every frame. This is good for a lot of things but not for actions that should occur over a period of time. Coroutines give the ability to insert delays between operations so that you can better control when updates occur. The Unity manual says that a coroutine is “a function declared with a return type of IEnumerator and with the yield return statement included somewhere in the body. […] To set a coroutine running, you need to use the StartCoroutine function:” The example they give is something similar to this.

Unity Coroutine Function
IEnumerator Fade() {
    for (float f = 1f; f >= 0; f -= 0.1f) {
        Color c = GetComponent<Renderer>().material.color;
        c.a = f;
        GetComponent<Renderer>().material.color = c;
        yield return new WaitForSeconds(.1f);
    }
}

And the coroutine would be started like this.

Start Coroutine Function
StartCoroutine(Fade());

Now I find the requirement that a coroutine has to be a function with yield returns to be very interesting. For one thing, the caller of a function doesn’t generally know what the function is doing. For another, the function is called first and its return value is what gets passed to StartCoroutine, not the function itself. If the documentation says it then it must be correct though. Let’s see if the compiled version of the function gives us any clues as to how StartCoroutine is detecting that the called function contains yield returns.

Compiled Coroutine Function
.method private hidebysig
instance class [mscorlib]System.Collections.IEnumerator Fade () cil managed
{
// Method begins at RVA 0x1b89c
// Code size 20 (0x14)
.maxstack 2
.locals init (
[0] class Fader/'<Fade>d__0',
[1] class [mscorlib]System.Collections.IEnumerator
)
IL_0000: ldc.i4.0
IL_0001: newobj instance void Fader/'<Fade>d__0'::.ctor(int32)
IL_0006: stloc.0
IL_0007: ldloc.0
IL_0008: ldarg.0
IL_0009: stfld class Fader Fader/'<Fade>d__0'::'<>4__this'
IL_000e: ldloc.0
IL_000f: stloc.1
IL_0010: br.s IL_0012
IL_0012: ldloc.1
IL_0013: ret
} // end of method Fader::Fade

This is Common Intermediate Language (CIL) code. It is what C# usually gets compiled into. It’s basically an assembly language for a virtual machine. When the program runs, it starts up an instance of the virtual machine, which translates the IL into actual native code. IL is useful for us because it is somewhat readable and gives us clues as to what is actually going to happen when the code is run.

Looking at the function we notice some strange things. The compiled function doesn’t have any yield returns. It doesn’t even look like the code to change colour is there. All it does is create and return a Fader/'<Fade>d__0' object. Well let’s go look at that class.

Compiled Coroutine Class
.class nested private auto ansi sealed beforefieldinit '<Fade>d__0'
extends [mscorlib]System.Object
implements class [mscorlib]System.Collections.Generic.IEnumerator`1<object>,
[mscorlib]System.Collections.IEnumerator,
[mscorlib]System.IDisposable
{
.custom instance void [mscorlib]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor() = (
01 00 00 00
)
// Fields
.field private object '<>2__current'
.field private int32 '<>1__state'
.field public class Fader '<>4__this'
.field public float32 '<f>5__1'
.field public valuetype [UnityEngine]UnityEngine.Color '<c>5__2'
// Methods
.method private final hidebysig newslot virtual
instance bool MoveNext () cil managed
{
.override method instance bool [mscorlib]System.Collections.IEnumerator::MoveNext()
// Method begins at RVA 0x1b770
// Code size 212 (0xd4)
.maxstack 3
.locals init (
[0] bool CS$1$0000,
[1] int32 CS$4$0001,
[2] bool CS$4$0002
)
IL_0000: ldarg.0
IL_0001: ldfld int32 Fader/'<Fade>d__0'::'<>1__state'
IL_0006: stloc.1
//int loc1 = <>1__state;
IL_0007: ldloc.1
IL_0008: switch (IL_001c, IL_0017)
//switch(loc1)
IL_0015: br.s IL_001e
IL_0017: br IL_009c
IL_001c: br.s IL_0023
IL_001e: br IL_00ce
IL_0023: ldarg.0
IL_0024: ldc.i4.m1
IL_0025: stfld int32 Fader/'<Fade>d__0'::'<>1__state'
IL_002a: nop
IL_002b: ldarg.0
IL_002c: ldc.r4 1
IL_0031: stfld float32 Fader/'<Fade>d__0'::'<f>5__1'
IL_0036: br.s IL_00b6
IL_0038: nop
IL_0039: ldarg.0
IL_003a: ldarg.0
IL_003b: ldfld class Fader Fader/'<Fade>d__0'::'<>4__this'
IL_0040: call instance !!0 [UnityEngine]UnityEngine.Component::GetComponent<class [UnityEngine]UnityEngine.Renderer>()
IL_0045: callvirt instance class [UnityEngine]UnityEngine.Material [UnityEngine]UnityEngine.Renderer::get_material()
IL_004a: callvirt instance valuetype [UnityEngine]UnityEngine.Color [UnityEngine]UnityEngine.Material::get_color()
IL_004f: stfld valuetype [UnityEngine]UnityEngine.Color Fader/'<Fade>d__0'::'<c>5__2'
IL_0054: ldarg.0
IL_0055: ldflda valuetype [UnityEngine]UnityEngine.Color Fader/'<Fade>d__0'::'<c>5__2'
IL_005a: ldarg.0
IL_005b: ldfld float32 Fader/'<Fade>d__0'::'<f>5__1'
IL_0060: stfld float32 [UnityEngine]UnityEngine.Color::a
IL_0065: ldarg.0
IL_0066: ldfld class Fader Fader/'<Fade>d__0'::'<>4__this'
IL_006b: call instance !!0 [UnityEngine]UnityEngine.Component::GetComponent<class [UnityEngine]UnityEngine.Renderer>()
IL_0070: callvirt instance class [UnityEngine]UnityEngine.Material [UnityEngine]UnityEngine.Renderer::get_material()
IL_0075: ldarg.0
IL_0076: ldfld valuetype [UnityEngine]UnityEngine.Color Fader/'<Fade>d__0'::'<c>5__2'
IL_007b: callvirt instance void [UnityEngine]UnityEngine.Material::set_color(valuetype [UnityEngine]UnityEngine.Color)
IL_0080: nop
IL_0081: ldarg.0
IL_0082: ldc.r4 0.1
IL_0087: newobj instance void [UnityEngine]UnityEngine.WaitForSeconds::.ctor(float32)
IL_008c: stfld object Fader/'<Fade>d__0'::'<>2__current'
IL_0091: ldarg.0
IL_0092: ldc.i4.1
IL_0093: stfld int32 Fader/'<Fade>d__0'::'<>1__state'
IL_0098: ldc.i4.1
IL_0099: stloc.0
IL_009a: br.s IL_00d2
IL_009c: ldarg.0
IL_009d: ldc.i4.m1
IL_009e: stfld int32 Fader/'<Fade>d__0'::'<>1__state'
IL_00a3: nop
IL_00a4: ldarg.0
IL_00a5: dup
IL_00a6: ldfld float32 Fader/'<Fade>d__0'::'<f>5__1'
IL_00ab: ldc.r4 0.1
IL_00b0: sub
IL_00b1: stfld float32 Fader/'<Fade>d__0'::'<f>5__1'
IL_00b6: ldarg.0
IL_00b7: ldfld float32 Fader/'<Fade>d__0'::'<f>5__1'
IL_00bc: ldc.r4 0.0
IL_00c1: clt.un
IL_00c3: ldc.i4.0
IL_00c4: ceq
IL_00c6: stloc.2
IL_00c7: ldloc.2
IL_00c8: brtrue IL_0038
IL_00cd: nop
IL_00ce: ldc.i4.0
IL_00cf: stloc.0
IL_00d0: br.s IL_00d2
IL_00d2: ldloc.0
IL_00d3: ret
} // end of method '<Fade>d__0'::MoveNext
.method private final hidebysig specialname newslot virtual
instance object 'System.Collections.Generic.IEnumerator<System.Object>.get_Current' () cil managed
{
.custom instance void [mscorlib]System.Diagnostics.DebuggerHiddenAttribute::.ctor() = (
01 00 00 00
)
.override method instance !0 class [mscorlib]System.Collections.Generic.IEnumerator`1<object>::get_Current()
// Method begins at RVA 0x1b850
// Code size 11 (0xb)
.maxstack 1
.locals init (
[0] object
)
IL_0000: ldarg.0
IL_0001: ldfld object Fader/'<Fade>d__0'::'<>2__current'
IL_0006: stloc.0
IL_0007: br.s IL_0009
IL_0009: ldloc.0
IL_000a: ret
//return <>2__current;
} // end of method '<Fade>d__0'::'System.Collections.Generic.IEnumerator<System.Object>.get_Current'
.method private final hidebysig newslot virtual
instance void System.Collections.IEnumerator.Reset () cil managed
{
.custom instance void [mscorlib]System.Diagnostics.DebuggerHiddenAttribute::.ctor() = (
01 00 00 00
)
.override method instance void [mscorlib]System.Collections.IEnumerator::Reset()
// Method begins at RVA 0x1b867
// Code size 6 (0x6)
.maxstack 8
IL_0000: newobj instance void [mscorlib]System.NotSupportedException::.ctor()
IL_0005: throw
//throw new NotSupportedException();
} // end of method '<Fade>d__0'::System.Collections.IEnumerator.Reset
.method private final hidebysig newslot virtual
instance void System.IDisposable.Dispose () cil managed
{
.override method instance void [mscorlib]System.IDisposable::Dispose()
// Method begins at RVA 0x1b86e
// Code size 2 (0x2)
.maxstack 8
IL_0000: nop
IL_0001: ret
} // end of method '<Fade>d__0'::System.IDisposable.Dispose
.method private final hidebysig specialname newslot virtual
instance object System.Collections.IEnumerator.get_Current () cil managed
{
.custom instance void [mscorlib]System.Diagnostics.DebuggerHiddenAttribute::.ctor() = (
01 00 00 00
)
.override method instance object [mscorlib]System.Collections.IEnumerator::get_Current()
// Method begins at RVA 0x1b874
// Code size 11 (0xb)
.maxstack 1
.locals init (
[0] object
)
IL_0000: ldarg.0
IL_0001: ldfld object Fader/'<Fade>d__0'::'<>2__current'
IL_0006: stloc.0
IL_0007: br.s IL_0009
IL_0009: ldloc.0
IL_000a: ret
} // end of method '<Fade>d__0'::System.Collections.IEnumerator.get_Current
.method public hidebysig specialname rtspecialname
instance void .ctor (
int32 '<>1__state'
) cil managed
{
.custom instance void [mscorlib]System.Diagnostics.DebuggerHiddenAttribute::.ctor() = (
01 00 00 00
)
// Method begins at RVA 0x1b88b
// Code size 14 (0xe)
.maxstack 8
IL_0000: ldarg.0
IL_0001: call instance void [mscorlib]System.Object::.ctor()
IL_0006: ldarg.0
IL_0007: ldarg.1
IL_0008: stfld int32 Fader/'<Fade>d__0'::'<>1__state'
IL_000d: ret
} // end of method '<Fade>d__0'::.ctor
// Properties
.property instance object 'System.Collections.Generic.IEnumerator<System.Object>.Current'()
{
.get instance object Fader/'<Fade>d__0'::'System.Collections.Generic.IEnumerator<System.Object>.get_Current'()
}
.property instance object System.Collections.IEnumerator.Current()
{
.get instance object Fader/'<Fade>d__0'::System.Collections.IEnumerator.get_Current()
}
} // end of class <Fade>d__0

Wow, that’s a lot of stuff. It appears to be a class implementing the IEnumerator interface with the colour change logic in its MoveNext() function. So where did all this code come from? The compiler put it in there. It took the original function and generated a class based on the return type. The body of the function got transformed into a state-driven enumerator, with the Current property being set instead of having yield returns. So if the yield return code gets compiled into an IEnumerator class, then how can StartCoroutine require a function that contains yield returns? Well, clearly it can’t. If you took that compiled class and rewrote it in C# it would look something like this.

Unity Coroutine Class
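using System;
using System.Collections;
using UnityEngine;

// Hand-written equivalent of the compiler-generated '<Fade>d__0' class.
public class FaderEnumerator : IEnumerator {
    private object _current;   // last value handed out by "yield return"
    private int _state;        // 0 = not started, 1 = resuming after a yield, -1 = running or finished
    public Fader _this;        // the Fader instance the original method belonged to
    public float _f;           // the loop variable f, hoisted into a field
    public Color _c;           // the local variable c, hoisted into a field

    public object Current {
        get {
            return _current;
        }
    }

    public FaderEnumerator(int state)
    {
        _state = state;
    }

    public bool MoveNext ()
    {
        switch (_state) {
        case 0:
            // First call: run the loop initializer.
            _state = -1;
            _f = 1;
            break;
        case 1:
            // Resuming after a yield: run the loop increment.
            _state = -1;
            _f -= 0.1f;
            break;
        default:
            // Any other state ends the enumeration, as in the IL.
            return false;
        }
        if(_f >= 0)
        {
            _c = _this.GetComponent<Renderer>().material.color;
            _c.a = _f;
            _this.GetComponent<Renderer>().material.color = _c;
            _current = new WaitForSeconds(.1f);
            _state = 1;
            return true;
        }
        return false;
    }

    public void Reset ()
    {
        throw new NotSupportedException();
    }
}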

To start this coroutine you would do this.

Start Coroutine Class
var faderEnumerator = new FaderEnumerator(0);
faderEnumerator._this = this;
StartCoroutine(faderEnumerator);

These two snippets are roughly equivalent to the original two.
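One way to picture why both versions work is that whatever drives a coroutine only ever receives an IEnumerator. The sketch below is a toy stand-in, not Unity’s actual StartCoroutine. The names TinyRunner, Run, Countdown, and CountdownWithYield are invented for this example, and it ignores timing entirely. It just calls MoveNext() until it returns false and collects each Current value, treating a compiler-generated iterator and a hand-written enumerator exactly the same.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hand-written enumerator: counts down from a starting value to 1.
public class Countdown : IEnumerator
{
    private int _next;
    public object Current { get; private set; }

    public Countdown(int start) { _next = start; }

    public bool MoveNext()
    {
        if (_next <= 0) return false;
        Current = _next--;
        return true;
    }

    public void Reset() { throw new NotSupportedException(); }
}

public static class TinyRunner
{
    // Compiler-generated enumerator: the same countdown written with yield return.
    public static IEnumerator CountdownWithYield(int start)
    {
        for (int i = start; i > 0; i--)
        {
            yield return i;
        }
    }

    // Hypothetical stand-in for a scheduler: drives any IEnumerator to completion.
    public static List<object> Run(IEnumerator routine)
    {
        var yielded = new List<object>();
        while (routine.MoveNext())
        {
            yielded.Add(routine.Current);
        }
        return yielded;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Run(CountdownWithYield(3))));
        Console.WriteLine(string.Join(",", Run(new Countdown(3))));
    }
}
```

Run has no way to tell which of the two was written with yield return; all it can see is the interface.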

So is the documentation wrong? Well no, the documentation says a coroutine is a function with yield returns because they made up the concept of coroutines and can define them however they wish. At the same time, it isn’t being entirely honest. Clearly you can have a coroutine that isn’t a function containing yield returns, so it can’t really be said that’s a requirement.

There are a few reasons I bring this up. Firstly, I believe it’s important to understand what abstractions are doing when you use them. The more you understand about what your code is doing, the easier it is to debug. The second reason is that the IEnumerator class case is probably more reusable. Since you are working with a whole class there are more options for controlling how it operates. You could create a generic class that could apply similar logic to many different types of things.
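As a sketch of what such a reusable class might look like, here is one possible shape written in plain C# so it runs without Unity. Everything in it (SteppedValue, the apply callback, the makeWait factory) is hypothetical and made up for this example. It steps a value from a start down to an end, hands each value to a callback, and yields whatever object the factory produces.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Reusable enumerator: steps a value from start down to end (inclusive)
// and hands each value to a callback before yielding.
public class SteppedValue : IEnumerator
{
    private readonly float _end;
    private readonly float _step;
    private readonly Action<float> _apply;    // what to do with each value
    private readonly Func<object> _makeWait;  // what to yield after each step
    private float _value;

    public object Current { get; private set; }

    public SteppedValue(float start, float end, float step,
                        Action<float> apply, Func<object> makeWait)
    {
        _value = start;
        _end = end;
        _step = step;
        _apply = apply;
        _makeWait = makeWait;
    }

    public bool MoveNext()
    {
        if (_value < _end) return false;
        _apply(_value);
        Current = _makeWait();
        _value -= _step;
        return true;
    }

    public void Reset() { throw new NotSupportedException(); }
}

public static class SteppedValueDemo
{
    // Drives the enumerator by hand and records every value the callback received.
    public static List<float> Collect(float start, float end, float step)
    {
        var seen = new List<float>();
        var routine = new SteppedValue(start, end, step, seen.Add, () => null);
        while (routine.MoveNext()) { }
        return seen;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" ", Collect(1f, 0f, 0.25f)));
    }
}
```

In Unity, the fade from earlier would presumably become something like StartCoroutine(new SteppedValue(1f, 0f, 0.1f, SetAlpha, () => new WaitForSeconds(.1f))), where SetAlpha is a hypothetical method that writes the alpha value to the material. The same class could then drive a volume fade or a scale change just by passing a different callback.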

Finally, because the compiler is generating code you need to be careful what you do with yield returns. The compiler is smart and it’s going to do its best to do what you are telling it to do, but it’s not perfect. There are probably some scenarios where the compiler will refuse to compile the function or, worse, it will compile but not do exactly what you want it to do. Because you don’t normally see the code it generates, you won’t be able to see exactly what it’s doing.
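One case of “compiles but doesn’t do what it looks like” follows directly from the generated class: since the whole method body moves into MoveNext(), nothing in it runs when the method is called, not even an argument check on the first line. The sketch below shows this in plain C#; DeferredDemo and its methods are names invented for this example.

```csharp
using System;
using System.Collections;

public static class DeferredDemo
{
    // Iterator method: the argument check looks like it runs on the call,
    // but it is part of the body and so only runs on the first MoveNext().
    public static IEnumerator Repeat(int count)
    {
        if (count < 0) throw new ArgumentOutOfRangeException(nameof(count));
        for (int i = 0; i < count; i++)
        {
            yield return i;
        }
    }

    // True if simply calling Repeat(-1) throws.
    public static bool ThrowsOnCall()
    {
        try { Repeat(-1); return false; }
        catch (ArgumentOutOfRangeException) { return true; }
    }

    // True if the exception only appears once the enumerator is advanced.
    public static bool ThrowsOnFirstMoveNext()
    {
        IEnumerator routine = Repeat(-1);
        try { routine.MoveNext(); return false; }
        catch (ArgumentOutOfRangeException) { return true; }
    }

    public static void Main()
    {
        Console.WriteLine("Throws on call: " + ThrowsOnCall());
        Console.WriteLine("Throws on first MoveNext: " + ThrowsOnFirstMoveNext());
    }
}
```

The call itself only constructs the state object, just like the compiled Fade() above, so the bad argument goes unnoticed until something advances the enumerator.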

Also I find it fascinating to look at IL.

2015-07-25 - The Correct Date Format

Occasionally you will hear the British and the Americans fighting over what is the correct date format. The Americans think it’s MM/DD/YYYY and the British think it’s DD/MM/YYYY, but they are both wrong because the correct format is YYYY-MM-DD.

One reason is ambiguity. Without some additional context it’s hard to tell if 12/01/2012 is December 1st, 2012 or the 12th of January, 2012. This ambiguity doesn’t exist for YYYY-MM-DD because there’s no YYYY-DD-MM in common usage. This means that 2012-12-01 is always December 1st, 2012 and there’s no chance of reading the date wrong.
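To make the ambiguity concrete, here is a small C# sketch that parses the same text under each convention using DateTime.ParseExact. The class name DateAmbiguity and the helper ParseAs are made up for this example.

```csharp
using System;
using System.Globalization;

public static class DateAmbiguity
{
    // Parses text using one explicit format, independent of the machine's locale.
    public static DateTime ParseAs(string text, string format) =>
        DateTime.ParseExact(text, format, CultureInfo.InvariantCulture);

    public static void Main()
    {
        var american = ParseAs("12/01/2012", "MM/dd/yyyy");
        var british = ParseAs("12/01/2012", "dd/MM/yyyy");
        var iso = ParseAs("2012-12-01", "yyyy-MM-dd");

        // The same text gives two different days depending on who reads it.
        Console.WriteLine(american.ToString("d MMMM yyyy", CultureInfo.InvariantCulture));
        Console.WriteLine(british.ToString("d MMMM yyyy", CultureInfo.InvariantCulture));
        Console.WriteLine(iso.ToString("d MMMM yyyy", CultureInfo.InvariantCulture));
    }
}
```

Both slash formats accept the text without complaint, which is the problem: nothing in the string says which reading was intended.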

The second reason is sorting. When dates are sorted as text they are ordered by the first part of the date first. This means that MM/DD/YYYY sorts by month, DD/MM/YYYY sorts by day, and YYYY-MM-DD sorts by year. Ideally you want dates sorted chronologically which means it’s better to have the largest part first. YYYY-MM-DD does this while the other formats will mix dates up.

Consider the dates April 2nd 2012, April 3rd 2012, June 15th 2012, April 17th 2013, and June 2nd 2013. The following table shows these dates sorted according to the various date formats.

YYYY-MM-DD    MM/DD/YYYY    DD/MM/YYYY
2012-04-02    04/02/2012    02/04/2012
2012-04-03    04/03/2012    02/06/2013
2012-06-15    04/17/2013    03/04/2012
2013-04-17    06/02/2013    15/06/2012
2013-06-02    06/15/2012    17/04/2013

With YYYY-MM-DD format everything is in order. All the 2012 dates come before the 2013 dates, and all the April dates come before the June dates within the same year. With MM/DD/YYYY we have 2012 dates before and after the 2013 dates. With DD/MM/YYYY we have 2013 dates before 2012 dates and June dates before April dates.
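The table can be reproduced with a few lines of C#. This is just a sketch: DateSorting and SortAsText are names invented for this example. It formats the five dates with each pattern and then sorts the resulting strings as plain text.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class DateSorting
{
    public static readonly DateTime[] Dates =
    {
        new DateTime(2012, 4, 2),
        new DateTime(2012, 4, 3),
        new DateTime(2012, 6, 15),
        new DateTime(2013, 4, 17),
        new DateTime(2013, 6, 2),
    };

    // Formats every date with the given pattern, then sorts the results as plain text.
    public static List<string> SortAsText(string format) =>
        Dates.Select(d => d.ToString(format, CultureInfo.InvariantCulture))
             .OrderBy(s => s, StringComparer.Ordinal)
             .ToList();

    public static void Main()
    {
        foreach (var format in new[] { "yyyy-MM-dd", "MM/dd/yyyy", "dd/MM/yyyy" })
        {
            Console.WriteLine(format + ": " + string.Join(" ", SortAsText(format)));
        }
    }
}
```

Only the yyyy-MM-dd column comes back in chronological order, because text sorting compares the leftmost characters first and that is where the year sits.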

Also I personally think the dashes look better.