2021-03-20 - Adventures in Partitions
No, this post isn't about Poland
I've always been fascinated by the history of programming, and to that end I recently bought myself an old computer. I installed Windows 98 SE, Windows NT 4.0, Windows 3.1 and OS/2 2.1 on it, along with a variety of programming packages. Previously I was using virtual machines to host these OSs, but I found the virtual screen difficult to read and the virtualization program has compatibility issues with Windows updates. Having dedicated hardware means things can run full screen and I shouldn't have to deal with updates.
It's been an interesting experience setting up all these OSs. For one thing, I learned things I didn't expect to learn, and for another, things I expected to be problems weren't. My primary concern with setting up this system was drivers. I envisioned days spent trying to get things to work and googling obscure error messages, but that hasn't really been the case. For the most part things just worked and I was able to find drivers for the things I wanted. Dell had driver downloads for both Windows 98 and Windows NT 4.0, and I even found USB mass storage drivers for both of those as well. I also found a tool that patches the SVGA driver for Windows 3.1 so that you can run it at a resolution above 640x480. I am still missing some drivers. For example, Windows 98 SE can't read NTFS partitions, but Windows NT 4.0 can read FAT32 and both can connect to the network and read USB sticks, so that's not a huge issue. When I get around to working with OS/2 I want to try and figure out how to get it to read the CD drive and display at a higher resolution, but those issues don't stop it from working.
What I did have a problem with is hard drive partitions.
Firstly, do you know how computers boot from a hard drive? Well, it turns out that it's a three-step process. First you have the Master Boot Record (MBR), which sits at the start of the hard drive. The computer executes this section first; it loads the partition information and passes execution off to some other bit of code. The actual operations performed depend on the MBR installed. A basic one will just find the active partition and execute its Volume Boot Record (VBR), while a more advanced one will switch over to a boot manager program. The VBR works the same way as the MBR, but for a single partition, and it is more OS specific. Finally, the VBR locates, loads and starts the actual OS.
The other thing I learned about was how partitions are defined and how the computer requests data from them. It turns out that the MBR has space for four partition slots, which are stored after the startup code. These partition slots contain information about where the partition is on the disk, how big it is, and what kind of partition it is. This limits the maximum number of primary partitions on a disk to four. You can have extended partitions, which are basically partitions containing other partitions, but those caused me issues so I never used them. Newer hard drive setups replace the MBR with something more expandable (the GUID Partition Table, or GPT), but that's not really relevant to this old computer.
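To make the four partition slots a bit more concrete, here is a minimal Python sketch of reading them out of a 512-byte MBR sector. It assumes the classic layout: the table starts at byte 446, each slot is 16 bytes, and the sector ends with the 0x55 0xAA signature. The function names, the dictionary keys and the sample sector are all made up for illustration; this is not what any particular partition manager actually does.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446             # the start-up code occupies bytes 0..445
ENTRY_SIZE = 16                # four 16-byte slots follow the start-up code
BOOT_SIGNATURE = b"\x55\xaa"   # last two bytes of a valid MBR sector


def parse_partition_table(sector):
    """Return the non-empty partition slots found in a raw MBR sector."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("an MBR is exactly one 512-byte sector")
    if sector[510:512] != BOOT_SIGNATURE:
        raise ValueError("missing 0x55AA boot signature")

    partitions = []
    for slot in range(4):
        start = TABLE_OFFSET + slot * ENTRY_SIZE
        raw = sector[start:start + ENTRY_SIZE]
        # status, CHS of first sector, type, CHS of last sector,
        # starting LBA (4 bytes), size in sectors (4 bytes)
        status, _chs_first, ptype, _chs_last, lba_start, num_sectors = \
            struct.unpack("<B3sB3sII", raw)
        if ptype == 0x00:      # type 0 marks an unused slot
            continue
        partitions.append({
            "slot": slot,
            "active": status == 0x80,
            "type": ptype,
            "lba_start": lba_start,
            "size_bytes": num_sectors * SECTOR_SIZE,
        })
    return partitions


def build_sample_mbr():
    """Build a hypothetical MBR holding one active 2 GiB FAT16 partition."""
    sector = bytearray(SECTOR_SIZE)
    entry = struct.pack("<B3sB3sII", 0x80, b"\x00" * 3, 0x06,
                        b"\x00" * 3, 63, 4194304)
    sector[TABLE_OFFSET:TABLE_OFFSET + ENTRY_SIZE] = entry
    sector[510:512] = BOOT_SIGNATURE
    return bytes(sector)


if __name__ == "__main__":
    for part in parse_partition_table(build_sample_mbr()):
        print(part)
```

The size field is a 4-byte value counted in sectors, which matters once we get to how large a partition can be.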
Now on to accessing data. Originally data was accessed on a hard drive using cylinder-head-sector (CHS) addressing. Hard drives are made up of a stack of platters, and CHS forms a kind of 3D coordinate system for locating data on these platters. The Head value is a vertical coordinate and selects which platter, and which side of the platter, to get data from. Head is the term for the component that reads the data from the platter, so by selecting which head to use you select which platter surface to read from. The Cylinder or Track value is a radial value which indicates a ring on the platter to get data from. The Sector value is an angular value which indicates which section of the ring the data is in. This system was used because early hard drives were rather simple, so the computer had to tell them exactly where to find the data it wanted. As hard drives got more advanced, and specifically as they got more built-in controller logic, this scheme became less necessary. CHS was eventually replaced by Logical Block Addressing (LBA), which accesses data on a hard drive using a single numerical index and leaves it up to the hard drive itself to figure out where that block of data actually is.
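The relationship between the two schemes can be sketched with the usual conversion formula, LBA = (C x heads + H) x sectors_per_track + (S - 1). The Python below is only an illustration: the function names are mine, and the geometry used in the example (16 heads, 63 sectors per track) is an assumed translated geometry, not something read from my drive.

```python
def chs_to_lba(cylinder, head, sector, heads_per_cylinder, sectors_per_track):
    """Convert a CHS address to a logical block address."""
    if not 0 <= head < heads_per_cylinder:
        raise ValueError("head out of range")
    if not 1 <= sector <= sectors_per_track:
        raise ValueError("sector numbering starts at 1")
    return ((cylinder * heads_per_cylinder + head) * sectors_per_track
            + (sector - 1))


def lba_to_chs(lba, heads_per_cylinder, sectors_per_track):
    """Convert a logical block address back to a (C, H, S) tuple."""
    cylinder, remainder = divmod(lba, heads_per_cylinder * sectors_per_track)
    head, sector_index = divmod(remainder, sectors_per_track)
    return cylinder, head, sector_index + 1   # sectors are 1-based


if __name__ == "__main__":
    HEADS, SECTORS = 16, 63   # assumed geometry, for illustration only
    print(chs_to_lba(0, 0, 1, HEADS, SECTORS))   # first sector on the disk
    print(chs_to_lba(1, 0, 1, HEADS, SECTORS))   # first sector of cylinder 1
    print(lba_to_chs(1008, HEADS, SECTORS))
```

Note the "- 1" on the sector value: cylinders and heads count from 0 but sectors count from 1.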
This is important because the format used for encoding these addresses determines how large a hard drive you can access. The original IBM BIOS implementation of CHS had 10 bits for cylinder, 8 bits for head, and 6 bits for sector. With 512-byte sectors this gives 8064 MiB (63 sectors x 1024 cylinders x 256 heads x 512 bytes) of addressable space. There are only 63 sectors in a track, rather than 64, because sector numbering starts at 1. This was replaced by 28-bit LBA, which allows for 268,435,456 sectors or 128 GiB, and later by 48-bit LBA, which supports up to 128 PiB. There is one more wrinkle though: the MBR only has 4 bytes to store the size of a partition. If we are using 28-bit LBA that's fine, but with 48-bit LBA we lose 16 bits, which limits the maximum number of sectors in a partition to 4,294,967,296, or 2 TiB.
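The arithmetic behind those limits is easy to check. This is just a quick Python sketch assuming 512-byte sectors throughout; the helper names are mine.

```python
SECTOR_BYTES = 512
MIB, GIB, TIB, PIB = 1024 ** 2, 1024 ** 3, 1024 ** 4, 1024 ** 5


def chs_limit_bytes(cylinder_bits=10, head_bits=8, sector_bits=6):
    """Addressable bytes for a CHS scheme with the given field widths."""
    cylinders = 2 ** cylinder_bits
    heads = 2 ** head_bits
    sectors = 2 ** sector_bits - 1   # sector 0 is not used, so 63 not 64
    return cylinders * heads * sectors * SECTOR_BYTES


def lba_limit_bytes(address_bits):
    """Addressable bytes for an LBA scheme with the given address width."""
    return 2 ** address_bits * SECTOR_BYTES


if __name__ == "__main__":
    print("BIOS CHS:     ", chs_limit_bytes() // MIB, "MiB")
    print("28-bit LBA:   ", lba_limit_bytes(28) // GIB, "GiB")
    print("48-bit LBA:   ", lba_limit_bytes(48) // PIB, "PiB")
    print("MBR partition:", lba_limit_bytes(32) // TIB, "TiB")
```

The last line treats the MBR's 4-byte size field as a 32-bit sector count, which is where the 2 TiB per-partition ceiling comes from.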
The hard drive installed in the computer is 232 GiB (250 GB), but the BIOS and the partition manager I was using only see it as 128 GiB, likely because they are using 28-bit LBA. FDISK for Windows 98 reported the drive as only being 65,535 MiB, but that's likely because it's using a 16-bit value somewhere. Windows NT 4.0 reported the drive as being 8064 MiB, likely because it was using CHS. The other problem with the Windows NT 4.0 setup program is that it can only create 4 GiB NTFS partitions, because it first creates them as oversized FAT16 partitions for some reason. The OS itself can create larger partitions, but those have to be created after you have it installed. There's also apparently a bug where the main NT OS files have to be within the first 8064 MiB of the drive or the loader can't find them. DOS and Windows 3.1 were surprisingly easy to set up. The FAT16 implementation they use only supports partitions of up to 2 GiB, so I created a partition of that size and they happily installed into it. I tried the same for OS/2, but it saw the partition as only being 32 MiB for some reason and got really confused about the other partitions. I ended up having to let it create its own 32 MiB partition and then expanded it to 2 GiB afterwards. It seems to be okay with that.
But now I can program in C, C++, QuickBasic, Visual Basic, ASP, Pascal and Assembly, so that's nice.
2021-02-20 - In IL: Assemblies
So far we've mostly been looking at instructions. Instructions form the smallest part of a program, but you can't execute a random IL instruction on its own. To see how instructions fit together we need to pull back and start looking at things from the outside in. To start with, we are going to look at assemblies.
A .NET program can be thought of as a collection of assemblies. Assemblies are individual files, either executable (.exe) files or library (.dll) files, that each contain a collection of types, methods, and data. We'll get to all that in a bit but first let's look at the Assembly information contained within an executable. To do this we're going to go back to part 5 and take a closer look at the compiled code. To refresh your memory here's the C# program from that part.
Now we're going to compile this program and then look at the decompiled file, but instead of looking at the contents of the main method we're going to look at the information added before the class is declared.
We have two assembly declarations here. The extern declaration is used to indicate a referenced assembly. In this case the program references mscorlib, which is where all the basic types and methods are declared. The second declaration describes the assembly we built. You can see that the assembly directive contains a bunch of attributes that describe the assembly itself, such as its name, the version of .NET it's built for, and its own version. Some of these are set based on the build options of the project and some are based on the values in AssemblyInfo.cs.
Finally we have a module declaration. Assemblies are built from a collection of modules, which can be thought of as files, although these don't seem to map exactly to source files. It's likely that Visual Studio does some work to combine all the source files before actually building the assembly. There are also some other directives, such as the .subsystem directive, which says whether this is a graphical application or a console application. These describe how the assembly was built and how it's meant to be run.
Now there are a lot of things that could be talked about with assemblies, but I'm going to hold off on that for now as they aren't directly connected to the code we write. I might come back and explore the options more in the future.
Next time we will start looking at class declarations.
2021-01-23 - DataTypes: Introduction
This series is going to look at how data is stored in a computer. We're going to start with simple things like numbers and references but eventually we will work up to more complicated data structures like arrays and queues.
My goal with the first part of the series is to get a solid grasp on how computers store simple values and touch a bit on how operations on those values are performed.
The second part will involve some more code and examples of simple versions of more complex combinations of values. These combinations of values form the basis for many of the containers that we use as programmers so I want to cement the understanding of the basics.
2021-01-19 - Well, it's Been a Year
Literally it's been over a year since I've posted on here. This year I will write more than I did last year. Since I only posted two things last year that shouldn't be hard.
Writing Goals for 2021
- A DataTypes series starting with basic types like integers and floating point numbers but extending into more complicated things like linked-lists
- Continuing the In IL series
- Adding/Updating writing posts on my various projects
- Adding more How-To's and References
Project Goals for 2021
- Convert Comics and Pictures to use the database backend
- Update PenguinMixer program to convert Comics and Picture flat-files to database files
- Create a program for managing Comics and Pictures as well as syncing local testing environment with production data