2021-05-16 - Abstraction is Magic
In computing terms, an abstraction means treating something like a magical box. You put certain things into the box and get certain things out of it, but how the box actually works is irrelevant to you. Typically this magic box is a system, program or library provided by someone else that solves some general problem of computing or provides functionality that makes developing an application easier. Abstractions are the basis for a lot of the advancements in computing because they let us spend more time on the problems that are specific to the thing we are trying to make. To see how this works I’m going to go through several levels of abstraction and show how each one allows us to create more complicated programs.
The zeroth level of abstraction is hardware. The first computers were all hand assembled using components specifically built for those computers. This meant that every computer was unique and a lot of time was spent designing and building each one. Eventually companies started to mass-produce components such as CPUs, memory managers, disk interfaces and video controllers. When you can build a computer from off-the-shelf chips instead of specifically designed components, it is a lot easier and faster to get a working computer. The trade-off is that you have less control over the specification and characteristics of each individual component.
The first level of abstraction is machine language. Machine language programs are made up of a series of binary values that tell the computer what to do. A 120 might tell the computer to move a value from one register to another and a 195 might tell it to jump to an instruction at a specific address. Now that we are using mass-produced chips we no longer have to worry about how these codes control the computer, and instead we can focus on what we want the computer to do. The trade-off is that we are limited to the operations that were implemented in the chip we are using.
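To make the idea of numeric instructions concrete, here is a minimal sketch of a toy machine in Python. Only the values 120 and 195 come from the text above. The halt code 118, the two-byte address that follows a jump (low byte first), the register names and the `run` function are all assumptions made up for this illustration, not a description of any particular chip.

```python
# Toy machine for illustration only. The opcode values 120 and 195 come from
# the article; the halt code and the address format are assumptions.
MOV_A_B = 120  # copy register B into register A
JMP = 195      # jump to the address held in the next two bytes (low byte first)
HLT = 118      # stop the machine (hypothetical value for this sketch)


def run(program, registers):
    """Execute raw bytes on the toy machine and return the final registers."""
    regs = dict(registers)
    pc = 0  # program counter: index of the next instruction byte
    while pc < len(program):
        opcode = program[pc]
        if opcode == MOV_A_B:
            regs["A"] = regs["B"]
            pc += 1
        elif opcode == JMP:
            # Rebuild the 16-bit target address from its two bytes.
            pc = program[pc + 1] | (program[pc + 2] << 8)
        elif opcode == HLT:
            break
        else:
            raise ValueError(f"unknown opcode {opcode} at address {pc}")
    return regs


# Jump over the filler byte at address 3, copy B into A, then halt.
program = bytes([195, 4, 0, 0, 120, 118])
result = run(program, {"A": 0, "B": 42})
print(result)  # {'A': 42, 'B': 42}
```

The program itself is nothing but six numbers. Everything that gives those numbers meaning lives inside the box, which here is the `run` function and on a real computer is the chip.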
The second level of abstraction is assembly language. These are the textual mnemonics used to represent the machine language instructions available. Instead of a 120 we now have a MOV A,B instruction and instead of 195 we now have a JMP instruction. An assembler program provided to us takes this text and converts it into the binary machine language that the computer understands. Assembly languages usually also have directives and labels, which let the programmer tell the assembler what they want without having to set things up by hand. Some assemblers can even perform optimizations that improve performance or memory usage. Now we don’t have to memorize the binary value of each instruction or figure out which addresses we want to use. Instead we can focus more on what we want the program to do, which is easier to describe using mnemonics. The trade-off is that we have less control over the set of operations that the computer actually executes.
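The translation an assembler performs can be sketched in a few lines of Python. This is a hypothetical two-instruction assembler, not a real tool: the opcode values 120 and 195 are the ones used in the text, while the `assemble` function, the `;` comment syntax, the label syntax and the two-byte jump address (low byte first) are assumptions for the sake of the example.

```python
# Hypothetical two-instruction assembler. The opcode values come from the
# article; the syntax and the address format are assumptions.
OPCODES = {"MOV A,B": (120, 1), "JMP": (195, 3)}  # mnemonic -> (opcode, size in bytes)


def assemble(source):
    """Turn assembly text into machine code bytes using two passes."""
    # Strip comments and blank lines.
    lines = [line.split(";")[0].strip() for line in source.splitlines()]
    lines = [line for line in lines if line]

    # Pass 1: work out the address of every label.
    labels = {}
    address = 0
    for line in lines:
        if line.endswith(":"):
            labels[line[:-1]] = address
        else:
            mnemonic = "JMP" if line.startswith("JMP ") else line
            address += OPCODES[mnemonic][1]

    # Pass 2: emit the bytes, replacing label names with their addresses.
    output = bytearray()
    for line in lines:
        if line.endswith(":"):
            continue
        if line.startswith("JMP "):
            target = labels[line[4:].strip()]
            output += bytes([OPCODES["JMP"][0], target & 0xFF, target >> 8])
        else:
            output.append(OPCODES[line][0])
    return bytes(output)


source = """
start:
    MOV A,B      ; copy B into A
    JMP start    ; loop forever
"""
machine_code = assemble(source)
print(list(machine_code))  # [120, 195, 0, 0]
```

The programmer writes `JMP start` and never has to know that `start` ended up at address 0. That bookkeeping is exactly the part the assembler takes over.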
The third level of abstraction is compiled or interpreted languages. These are more advanced programming languages which don’t try to represent the actual machine language operations available. Instead of MOV instructions we have variable assignment and instead of JMP instructions we have conditional statements. The compiler or interpreter takes the text you wrote and does the hard work of turning it into the machine instructions that the computer can actually execute. Now we don’t have to know anything about the instruction set of the computer we are running on, or sometimes even which computer we are running on. Instead we can focus more on what we want the program to do, which is a lot easier to describe using the keywords of the higher level languages. The trade-off is that we have even less control over what operations the computer is executing.
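As a small illustration, the Python function below uses only assignment and a conditional. The standard `dis` module can then list the lower-level instructions the interpreter generated for it. Note that this is bytecode for Python's own virtual machine rather than real machine code, and the exact instruction names differ between Python versions, so treat the listing as a peek inside the box rather than something to rely on. The function name `larger` is just an example.

```python
import dis


def larger(a, b):
    # Variable assignment and a conditional, with no registers or addresses in sight.
    result = a
    if b > a:
        result = b
    return result


# Ask the interpreter which lower-level instructions it generated for us.
instructions = [ins.opname for ins in dis.get_instructions(larger)]

print(larger(3, 7))  # 7
print(instructions)  # names vary by Python version
```

We wrote four lines about values and a comparison. The loads, stores and jumps that actually carry them out were chosen for us.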
The fourth level of abstraction is frameworks and libraries. These are collections of code which have been written to handle things like UI or database operations for us. Instead of drawing a dialog box using box characters we just tell the framework that we want a dialog and where we want it, and it draws it for us. Someone wrote the framework to have a dialog function in it and we simply have to call that function. Now we don’t have to worry about how to draw dialogs or connect to databases. Instead we can focus more on what we want to do with the dialogs and the database. The trade-off here is that we don’t have any control over how these dialogs are implemented or what functionality they provide.
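For the database side, a short sketch using Python's built-in `sqlite3` library shows how little we have to know. The table name `notes` and its columns are made up for this example, and the database lives in memory so nothing is written to disk.

```python
import sqlite3

# Open a throwaway in-memory database; the library handles the storage format.
connection = sqlite3.connect(":memory:")

# The table and column names here are hypothetical, chosen for the example.
connection.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
connection.execute("INSERT INTO notes (body) VALUES (?)", ("abstraction is magic",))

rows = connection.execute("SELECT id, body FROM notes").fetchall()
connection.close()

print(rows)  # [(1, 'abstraction is magic')]
```

Nothing in this code says how rows are laid out, indexed or searched. We say what we want stored and what we want back, and the library decides the rest, which is also why we have no say in how it does it.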
The further up we go, the more we get to focus on our specific problem. A program is about getting information from a source and then doing something useful with that information. The less we need to focus on the operations of getting, displaying or storing information, the more we can focus on what we specifically want to do with it. The trade-off with abstraction is efficiency. If you do everything from scratch you can develop a solution that is extremely efficient at doing what you want it to do, but it will likely require a lot of work to complete. On the other hand, using abstractions you can be more efficient at designing and implementing your solution because you are simply putting together magical boxes created by other people, with a little bit of customization on top for your particular needs. The solution won’t be quite as optimized, because the magical boxes need to support a variety of situations which don’t all apply to you, but it will be a lot quicker to design and implement.
As computers have gotten more powerful the need for efficiency has gone down. We no longer need to fit our programs in 1 MiB of memory or less so using a large library that we only need a small part of is less of a problem. The more abstractions we can use the less we need to worry about and the more we can focus on what we are trying to do.
Of course the downside of abstractions shows up when things don’t work the way you want them to. Then the magical box concept becomes a pain, because you need to know how the box works and hopefully change it to better suit your situation.