
2019-06-15 - Abstraction is Magic

There's a saying in scientific circles about standing on the shoulders of giants. The idea is that what people work on today builds on the knowledge gained by those who came before them. If everyone started from scratch we'd never make any progress. Programming is the same way, but our currency of progress is abstraction.

Abstraction is the idea of hiding details so that it's easier to focus on the bigger picture. When programming first started everything was done in machine code, with the programmer telling the processor the exact actions to perform. This allowed absolute control and the possibility of extremely efficient programs, but it also required the programmer to be very aware of the intricacies of the processor, and it took a lot of work to develop a complete application. The invention of compilers allowed the details of the processor to be abstracted so that the programmer could focus more on the specifics of the application they wanted to create. The developers of the compiler still needed to know about the processor, but their work allowed others to focus on bigger issues. Successive generations of programming languages and advances in operating systems and frameworks allow even more abstraction.

But abstraction is a double-edged sword. It helps you to focus on the bigger picture and ignore the small details until there's a problem with those small details. It's really nice that the operating system has a mechanism for creating a dialog box until there's an issue creating that dialog box and the OS won't tell you what it is or how to fix it. When you are writing machine code there's never a situation where the processor does something you didn't explicitly tell it to do. The more abstractions you have, the more things are going on behind the scenes that you are not aware of. You also have less control over how exactly things work. When you are doing everything yourself it's easy to optimize operations to be very efficient for your specific case. Abstractions need to be general enough to meet a variety of needs, so they may end up doing things that aren't required for your specific scenario.

I think the important thing here is that abstractions are required for programming to advance but we can't lose sight of what those abstractions are doing. You need to understand your abstractions to some degree if you are going to be successful at using them. This is the main thing that drives me to learn about assembly language, intermediate code, and compilers. I will likely never do anything with those concepts professionally but knowing them helps me work with them as abstractions.

2019-04-27 - In IL: Summing Arrays

Today we are going to see some of the instructions we looked at last time in action. Let's start by looking at a simple program that sums the values in an array.

Program.cs
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace CsArray1
{
    class Program
    {
        static void Main(string[] args)
        {
            int[] array = new int[5];
            for (int i = 0; i < array.Length; i++)
            {
                array[i] = i;
            }
            int sum = 0;
            for (int i = 0; i < array.Length; i++)
            {
                sum += array[i];
            }
            Console.WriteLine(sum);
        }
    }
}

This program creates a 1-dimensional array, fills that array with values, and then sums up those values. Now let's look at the compiled version.

Main
.method private hidebysig static void Main(string[] args) cil managed
{
.entrypoint
// Code size 54 (0x36)
.maxstack 3
.locals init ([0] int32[] 'array',
[1] int32 sum,
[2] int32 i,
[3] int32 V_3)
IL_0000: ldc.i4.5
IL_0001: newarr [mscorlib]System.Int32
IL_0006: stloc.0
IL_0007: ldc.i4.0
IL_0008: stloc.2
IL_0009: br.s IL_0013
IL_000b: ldloc.0
IL_000c: ldloc.2
IL_000d: ldloc.2
IL_000e: stelem.i4
IL_000f: ldloc.2
IL_0010: ldc.i4.1
IL_0011: add
IL_0012: stloc.2
IL_0013: ldloc.2
IL_0014: ldloc.0
IL_0015: ldlen
IL_0016: conv.i4
IL_0017: blt.s IL_000b
IL_0019: ldc.i4.0
IL_001a: stloc.1
IL_001b: ldc.i4.0
IL_001c: stloc.3
IL_001d: br.s IL_0029
IL_001f: ldloc.1
IL_0020: ldloc.0
IL_0021: ldloc.3
IL_0022: ldelem.i4
IL_0023: add
IL_0024: stloc.1
IL_0025: ldloc.3
IL_0026: ldc.i4.1
IL_0027: add
IL_0028: stloc.3
IL_0029: ldloc.3
IL_002a: ldloc.0
IL_002b: ldlen
IL_002c: conv.i4
IL_002d: blt.s IL_001f
IL_002f: ldloc.1
IL_0030: call void [mscorlib]System.Console::WriteLine(int32)
IL_0035: ret
} // end of method Program::Main

The looping sequence should be very familiar to you by now. You can see it initialize the looping variable, test the variable, perform the loop operations, and increment the variable. You also see some of the instructions we talked about last time such as newarr, stelem, ldlen, and ldelem.
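
To make that layout easier to see, here is a rough sketch of the second loop written in C# with explicit labels and gotos. This is not what the compiler produces, just a hand-written approximation of the branch order in the IL above. The class, method, and label names are made up for illustration.

```csharp
using System;

static class LoopShape
{
    // Sums an array using the same branch layout as the compiled loop:
    // initialize, jump to the test, then body and increment, then test again.
    public static int SumWithGotos(int[] array)
    {
        int sum = 0;
        int i = 0;                 // ldc.i4.0, stloc: initialize the loop variable
        goto Test;                 // br.s: jump straight to the test
    Body:
        sum += array[i];           // loop body (ldelem.i4, add, stloc)
        i = i + 1;                 // ldc.i4.1, add, stloc: increment
    Test:
        if (i < array.Length)      // ldlen, conv.i4, blt.s: branch back to the body
            goto Body;
        return sum;
    }

    static void Main()
    {
        Console.WriteLine(SumWithGotos(new[] { 0, 1, 2, 3, 4 }));   // prints 10
    }
}
```

The test sits at the bottom, so each pass through the loop needs only one conditional branch, and the single unconditional branch runs once on the way in.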

Now let's look at another example.

Program.cs
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace CsArray2
{
    class Program
    {
        static void Main(string[] args)
        {
            int[,] array = new int[5,10];
            for (int i = 0; i < array.GetLength(0); i++)
            {
                for (int j = 0; j < array.GetLength(1); j++)
                {
                    array[i,j] = i * j;
                }
            }
            int sum = 0;
            for (int i = 0; i < array.GetLength(0); i++)
            {
                for (int j = 0; j < array.GetLength(1); j++)
                {
                    sum += array[i,j];
                }
            }
            Console.WriteLine(sum);
        }
    }
}

This time we are doing basically the same thing except with a 2-dimensional array. This means that we have nested loops for each part, elements are accessed using two indexes, and we have to use the GetLength() method so that we can indicate which dimension we want the length of. Now let's look at the compiled version of this.

Main
.method private hidebysig static void Main(string[] args) cil managed
{
.entrypoint
// Code size 122 (0x7a)
.maxstack 5
.locals init ([0] int32[0...,0...] 'array',
[1] int32 sum,
[2] int32 i,
[3] int32 j,
[4] int32 V_4,
[5] int32 V_5)
IL_0000: ldc.i4.5
IL_0001: ldc.i4.s 10
IL_0003: newobj instance void int32[0...,0...]::.ctor(int32, int32)
IL_0008: stloc.0
IL_0009: ldc.i4.0
IL_000a: stloc.2
IL_000b: br.s IL_002e
IL_000d: ldc.i4.0
IL_000e: stloc.3
IL_000f: br.s IL_0020
IL_0011: ldloc.0
IL_0012: ldloc.2
IL_0013: ldloc.3
IL_0014: ldloc.2
IL_0015: ldloc.3
IL_0016: mul
IL_0017: call instance void int32[0...,0...]::Set(int32, int32, int32)
IL_001c: ldloc.3
IL_001d: ldc.i4.1
IL_001e: add
IL_001f: stloc.3
IL_0020: ldloc.3
IL_0021: ldloc.0
IL_0022: ldc.i4.1
IL_0023: callvirt instance int32 [mscorlib]System.Array::GetLength(int32)
IL_0028: blt.s IL_0011
IL_002a: ldloc.2
IL_002b: ldc.i4.1
IL_002c: add
IL_002d: stloc.2
IL_002e: ldloc.2
IL_002f: ldloc.0
IL_0030: ldc.i4.0
IL_0031: callvirt instance int32 [mscorlib]System.Array::GetLength(int32)
IL_0036: blt.s IL_000d
IL_0038: ldc.i4.0
IL_0039: stloc.1
IL_003a: ldc.i4.0
IL_003b: stloc.s V_4
IL_003d: br.s IL_0068
IL_003f: ldc.i4.0
IL_0040: stloc.s V_5
IL_0042: br.s IL_0057
IL_0044: ldloc.1
IL_0045: ldloc.0
IL_0046: ldloc.s V_4
IL_0048: ldloc.s V_5
IL_004a: call instance int32 int32[0...,0...]::Get(int32, int32)
IL_004f: add
IL_0050: stloc.1
IL_0051: ldloc.s V_5
IL_0053: ldc.i4.1
IL_0054: add
IL_0055: stloc.s V_5
IL_0057: ldloc.s V_5
IL_0059: ldloc.0
IL_005a: ldc.i4.1
IL_005b: callvirt instance int32 [mscorlib]System.Array::GetLength(int32)
IL_0060: blt.s IL_0044
IL_0062: ldloc.s V_4
IL_0064: ldc.i4.1
IL_0065: add
IL_0066: stloc.s V_4
IL_0068: ldloc.s V_4
IL_006a: ldloc.0
IL_006b: ldc.i4.0
IL_006c: callvirt instance int32 [mscorlib]System.Array::GetLength(int32)
IL_0071: blt.s IL_003f
IL_0073: ldloc.1
IL_0074: call void [mscorlib]System.Console::WriteLine(int32)
IL_0079: ret
} // end of method Program::Main

We have the same looping sequence as before, except this time there are multiple sequences nested inside each other. The big difference here is that we don't see any of the array instructions we talked about last time. Instead we see method calls and calls to constructors. This is because we have a 2-dimensional array. As mentioned last time, IL special-cases 1-dimensional arrays that start at 0, and the array instructions we looked at last time are only used for those special arrays. When we move to 2 dimensions we lose the instructions and have to fall back to method calls.
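
For comparison, here is a sketch of the same computation using a jagged array (an array of arrays) instead of a true 2-dimensional array. Each row is an ordinary 1-dimensional array starting at 0, so the compiler should be able to go back to the array instructions rather than calling Get, Set, and GetLength. I haven't included the compiled output here, so treat that as an expectation to check with a disassembler. The class and method names are made up.

```csharp
using System;

static class JaggedSum
{
    // Fills a rows x cols jagged array with i * j and returns the sum.
    public static int Sum(int rows, int cols)
    {
        int[][] array = new int[rows][];       // outer array holds references to rows
        for (int i = 0; i < array.Length; i++)
        {
            array[i] = new int[cols];          // each row is its own 1-dimensional array
            for (int j = 0; j < array[i].Length; j++)
            {
                array[i][j] = i * j;
            }
        }

        int sum = 0;
        for (int i = 0; i < array.Length; i++)
        {
            for (int j = 0; j < array[i].Length; j++)
            {
                sum += array[i][j];
            }
        }
        return sum;
    }

    static void Main()
    {
        Console.WriteLine(Sum(5, 10));   // prints 450, the same total as the int[5,10] program
    }
}
```

The trade-off is that a jagged array is several separate objects rather than one block of memory, and nothing forces the rows to be the same length.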

Speaking of instructions, next time we're going to look at some more basic instructions which either haven't come up yet or were missed.

2019-04-06 - Chicken or the Egg

People have long debated which came first, the chicken or the egg? Since genetic variation and mutations arise from the fertilization process the egg must have come first. Two proto-chickens got together and they produced an egg from which a chicken hatched. The hatching process doesn't change the animal inside of the egg and so a chicken must come from a chicken egg which was laid by proto-chicken parents.

That being said it does raise a nomenclature question. Is a chicken egg a chicken egg because it contains a chicken or because it was laid by a chicken? An unfertilized chicken egg is still considered to be a chicken egg even though it doesn't contain a chicken. This means that the egg from which the first chicken hatched was not a chicken egg because it was laid by a proto-chicken and so the chicken came first. Later that chicken laid chicken eggs.

At the same time, changes in animal populations occur over long periods of time and are usually the result of environmental changes or some other external effect. There was probably never a specific first chicken. Something happened which changed the factors that influenced the survival of proto-chickens, making different characteristics more ideal. Eventually that led to a population that was different enough from past generations to be considered a different species, and so neither the chicken nor the egg came first. They appeared at the same time.

It's all just a matter of perspective.

2019-02-03 - In IL: Array Instructions

An array is a series of elements laid out contiguously in memory with each element being accessible using its index value. The two key properties of an array are its bounds and its dimensions. Bounds indicate the lowest and highest possible index values while dimensions indicate how many index values are required. For example, a two-dimensional array could be used as a table with one index representing the rows and the other index representing the columns. The bounds would indicate how many rows and columns the table has. IL allows arrays with multiple dimensions and various bounds but treats one-dimensional arrays with a 0 for the lower bound as special. These arrays are referred to as vectors and there are special IL instructions for working with these types of arrays.
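
Bounds and dimensions can be inspected from C# through the System.Array class. The sketch below prints them for a vector, a two-dimensional array, and a one-dimensional array whose lower bound is 1, created with Array.CreateInstance. The Describe helper and the class name are made up for illustration.

```csharp
using System;

static class ArrayShape
{
    // Returns the number of dimensions and the bounds of each one.
    public static string Describe(Array a)
    {
        string s = "rank " + a.Rank + ":";
        for (int d = 0; d < a.Rank; d++)
        {
            s += " [" + a.GetLowerBound(d) + ".." + a.GetUpperBound(d) + "]";
        }
        return s;
    }

    static void Main()
    {
        int[] vector = new int[5];
        int[,] table = new int[5, 10];
        // One dimension, five elements, lower bound of 1 instead of 0.
        Array oneBased = Array.CreateInstance(typeof(int), new[] { 5 }, new[] { 1 });

        Console.WriteLine(Describe(vector));     // rank 1: [0..4]
        Console.WriteLine(Describe(table));      // rank 2: [0..4] [0..9]
        Console.WriteLine(Describe(oneBased));   // rank 1: [1..5]
    }
}
```

Only the first of these is a vector. The last one has a single dimension, but its lower bound is not 0, so it doesn't qualify.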

newarr (NEW ARRay)

Pops an integer off of the stack and creates a new array able to contain that many elements.

Instruction      Description         Binary Format
newarr <type>    Create new array    0x8D <T>

ldlen (LoaD LENgth)

Pops an array off of the stack and pushes the length of the array onto the stack.

Instruction    Description        Binary Format
ldlen          Length of array    0x8E

ldelema (LoaD ELEMent Address)

Pops an index value and an array off of the stack and pushes the address of the element of the array at the specified index onto the stack.

Instruction       Description                               Binary Format
ldelema <type>    load element address of specified type    0x8F <T>
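
One place this shows up in C# is when an array element is passed by reference, since the callee needs the address of the element rather than a copy of its value. The sketch below is a small made-up example. The comment about ldelema describes what I would expect the compiler to emit, so it's worth confirming in a disassembler.

```csharp
using System;

static class ElementAddress
{
    // Adds one to whatever storage location it is given.
    public static void Increment(ref int slot)
    {
        slot = slot + 1;
    }

    public static int[] Demo()
    {
        int[] array = new int[3];
        // "ref array[1]" needs the address of the element, which is the job of ldelema.
        Increment(ref array[1]);
        Increment(ref array[1]);
        return array;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(",", Demo()));   // prints 0,2,0
    }
}
```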

ldelem (LoaD ELEMent)

Pops an index value and an array off of the stack and pushes the element of the array at the specified index onto the stack.

Instruction      Description                                                     Binary Format
ldelem.i1        load 8 bit integer element                                      0x90
ldelem.u1        load 8 bit unsigned integer element                             0x91
ldelem.i2        load 16 bit integer element                                     0x92
ldelem.u2        load 16 bit unsigned integer element                            0x93
ldelem.i4        load 32 bit integer element                                     0x94
ldelem.u4        load 32 bit unsigned integer element                            0x95
ldelem.i8        load 64 bit integer element                                     0x96
ldelem.u8        load 64 bit unsigned integer element (alias for ldelem.i8)      0x96
ldelem.i         load native integer element                                     0x97
ldelem.r4        load 32 bit floating point element                              0x98
ldelem.r8        load 64 bit floating point element                              0x99
ldelem.ref       load object element                                             0x9A
ldelem <type>    load element of specified type                                  0xA3 <T>

stelem (STore ELEMent)

Pops a value, an index value, and an array off of the stack and sets the element of the array at the specified index to the value.

Instruction      Description                          Binary Format
stelem.i         set native integer element           0x9B
stelem.i1        set 8 bit integer element            0x9C
stelem.i2        set 16 bit integer element           0x9D
stelem.i4        set 32 bit integer element           0x9E
stelem.i8        set 64 bit integer element           0x9F
stelem.r4        set 32 bit floating point element    0xA0
stelem.r8        set 64 bit floating point element    0xA1
stelem.ref       set object element                   0xA2
stelem <type>    set element of specified type        0xA4 <T>
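
To try these instructions without writing a whole IL file, one option is to emit them at runtime with System.Reflection.Emit. The sketch below builds a small method that creates an array, stores a value, reads it back, and adds the array's length. The class name, the method name "VectorDemo", and the value 42 are all made up for illustration.

```csharp
using System;
using System.Reflection.Emit;

static class EmitVector
{
    // Builds a method roughly equivalent to:
    //   int[] a = new int[n]; a[0] = 42; return a[0] + a.Length;
    public static Func<int, int> Build()
    {
        var method = new DynamicMethod("VectorDemo", typeof(int), new[] { typeof(int) });
        ILGenerator il = method.GetILGenerator();
        il.DeclareLocal(typeof(int[]));          // local 0 holds the array

        il.Emit(OpCodes.Ldarg_0);                // push n
        il.Emit(OpCodes.Newarr, typeof(int));    // pop n, push a new int[n]
        il.Emit(OpCodes.Stloc_0);

        il.Emit(OpCodes.Ldloc_0);                // push array
        il.Emit(OpCodes.Ldc_I4_0);               // push index 0
        il.Emit(OpCodes.Ldc_I4, 42);             // push value
        il.Emit(OpCodes.Stelem_I4);              // pop all three, store the value

        il.Emit(OpCodes.Ldloc_0);                // push array
        il.Emit(OpCodes.Ldc_I4_0);               // push index 0
        il.Emit(OpCodes.Ldelem_I4);              // pop both, push the element

        il.Emit(OpCodes.Ldloc_0);                // push array
        il.Emit(OpCodes.Ldlen);                  // pop array, push its length
        il.Emit(OpCodes.Conv_I4);                // length is a native int, convert to int32
        il.Emit(OpCodes.Add);
        il.Emit(OpCodes.Ret);

        return (Func<int, int>)method.CreateDelegate(typeof(Func<int, int>));
    }

    static void Main()
    {
        Console.WriteLine(Build()(5));   // prints 47 (42 + 5)
    }
}
```

Passing 0 for n would make the store fail with an IndexOutOfRangeException, since the array instructions still do bounds checking.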

Next time we'll look at some programs which use arrays.
