2014-05-03 - Character Encoding: ASCII

Computers only really understand numbers. They take numbers as input, perform operations based on numbers, and produce numbers as output. This creates a problem because people don’t really understand numbers. Computers were developed primarily so that people wouldn’t have to deal with numbers. People tend to work better using a wide range of characters of which numbers are only a small subset. This means that there is a need for a way of encoding characters as numbers so that computers and people can understand each other.

Enter the American Standard Code for Information Interchange (often abbreviated as ASCII). ASCII is a seven-bit character encoding scheme created in the 1960s and inspired by earlier Teletype encoding schemes. It has since become very popular for use with computers. ASCII groups related characters together and orders them in a meaningful fashion where appropriate, which makes certain operations very simple and aids in sorting. For example, the digit '0' is encoded in ASCII with a value of 48 and the digits '1' to '9' occupy values 49 to 57, so for a character known to be a digit the numeric value can be determined by subtracting 48. The lowercase 'a' has a value of 97 while the uppercase 'A' has a value of 65. Since both lowercase and uppercase letters are ordered in the same manner and without breaks, converting between them is just the addition or subtraction of 32.
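The arithmetic described above can be sketched in a few lines of Python. This is only an illustration; the function names (digit_value, to_upper, to_lower) are made up for this example and are not part of any standard.

```python
def digit_value(ch):
    """Return the numeric value of a single ASCII digit character."""
    if len(ch) != 1 or not "0" <= ch <= "9":
        raise ValueError("expected a single ASCII digit")
    # '0' is 48, so subtracting 48 maps '0'..'9' onto 0..9.
    return ord(ch) - 48


def to_upper(ch):
    """Convert an ASCII lowercase letter to uppercase; leave anything else alone."""
    if len(ch) == 1 and "a" <= ch <= "z":
        # 'a' is 97 and 'A' is 65, a difference of 32.
        return chr(ord(ch) - 32)
    return ch


def to_lower(ch):
    """Convert an ASCII uppercase letter to lowercase; leave anything else alone."""
    if len(ch) == 1 and "A" <= ch <= "Z":
        return chr(ord(ch) + 32)
    return ch
```

The range checks matter: subtracting 32 from a character that is not a lowercase letter would produce an unrelated character rather than an uppercase one.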

Decimal | Hex | Binary | Description
0-31, 127 | 0x00-0x1F, 0x7F | 000 0000-001 1111, 111 1111 | Control characters: This includes some formatting characters like backspace, tab, and line feed as well as some obsolete Teletype characters like Bell.
32-47, 58-64, 91-96, 123-126 | 0x20-0x2F, 0x3A-0x40, 0x5B-0x60, 0x7B-0x7E | 010 0000-010 1111, 011 1010-100 0000, 101 1011-110 0000, 111 1011-111 1110 | Symbols: This includes punctuation, the space, and other symbols such as the ampersand, brackets, and slashes.
48-57 | 0x30-0x39 | 011 0000-011 1001 | Digits: '0' to '9'
65-90 | 0x41-0x5A | 100 0001-101 1010 | Uppercase Letters: 'A' to 'Z'
97-122 | 0x61-0x7A | 110 0001-111 1010 | Lowercase Letters: 'a' to 'z'
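As a rough sketch, the ranges in the table translate directly into a small lookup function. The name classify and the category labels it returns are just choices made for this example.

```python
def classify(code):
    """Return the table category for a seven-bit ASCII code (0 to 127)."""
    if not 0 <= code <= 127:
        raise ValueError("ASCII codes run from 0 to 127")
    if code <= 31 or code == 127:
        return "control"
    if 48 <= code <= 57:
        return "digit"
    if 65 <= code <= 90:
        return "uppercase"
    if 97 <= code <= 122:
        return "lowercase"
    # Everything left over (32-47, 58-64, 91-96, 123-126) is a symbol.
    return "symbol"
```

For instance, classify(ord("&")) falls through every letter and digit check and lands in the symbol group, matching the second row of the table.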

As the name implies ASCII was created primarily for use with English (American) computers. It allows for all the standard American English characters but has no room for accented or non-Latin characters. This means that extensions are required for non-American and non-English use. Most character encoding development since ASCII has been to find a way to support these additional characters.

2014-03-14 - Reading

I need to read more, my backlog is growing. I’ve got 3 books on the go; although, a couple of them are programming books so “reading” them means coding the examples. I’ve got 5 or 6 books I need to start reading. There are also 3 or 4 books I want to buy.

I need to get on that.

2014-03-07 - SubVersion

SVN (SubVersioN) is a version control system. The main part of SVN is the repository, which is a collection of versioned files and directories. You set up an SVN server by creating a repository and adding files to it. You then check out a working copy of those files and make changes to them. As you make changes you periodically commit them back to the server. The repository stores files as a collection of changes, which allows you to view not only the latest version but any version of the files. The more check-ins you make, the more history your files have. As you are working on something you can undo certain changes. You can see when a change was made and possibly why it was made if a message was added with the commit. If you change how something works but later decide you’d like to have some of that functionality back, you can grab a copy of the file from before those changes. This is one of the reasons I recently set up an SVN server locally and added this website and a bunch of my programming projects to it.

One of the other reasons I set up the SVN server was to make it easier to keep copies of the same file in sync. SVN allows you to set an “Externals” property (svn:externals) on a folder, which tells it that it should copy files or folders from another part of the repository into that folder. If you commit changes to the original, those changes will get propagated to the copy when you update, and the same thing happens if you change the copy. This is very helpful to me as I want to have copies of files from other projects as part of the website. Similarly, as I’m working on the command line version of the code viewer I can set up an “Externals” property on the project folder to copy the types folder from the website. This means that the website code viewer and the command line code viewer are always working from the same files.

It’s also very helpful for organization and backup. All my projects are in their own folders in the repository and they are set up to have similar folder structures. I can also have several working copies of the projects so that there are backups of all my files without the need for manually syncing them.

2014-02-21 - Hex Editing

Hex editors are programs that allow you to modify the binary contents of a file. These editors usually represent this data as hexadecimal digits, hence the name. You can do very powerful things with hex editors but they can also break things quite spectacularly. Generally, the process of editing the binary bits of a file involves a lot of trial and error because it’s hard to tell what each part represents; however, some file formats contain plain text strings which can be easily changed. That’s what I used to do.

I used to be fairly involved with the G-Mod community and one of the groups in that community was The Skinners. The Skinners were people that spent their time modifying texture files. They would take an existing model and re-skin it to be something new. The Skinners had a problem though: modifying a texture didn't create a new model, it just changed the original. That was where hex editing came in. With a hex editor it was possible to make a copy of the model and then modify the copy to be unique from the original. You first had to change the model’s internal reference to point to the copy. Next you had to modify the texture paths to point to the new ones created by the Skinners.

I tried a little bit of skinning and modelling but the hex editing was what I really enjoyed. One of the draws for me was the word puzzle. With hex editors it is problematic to change the length of the file. This meant that you had to think of a way to create new path names that were unique and descriptive but with the same number of characters as the original. The other reason I enjoyed hex editing was it meant playing around with files on a very low level. Hex editing was probably the start of my path towards becoming a programmer. Playing around with the bits and bytes, changing how the computer perceived a file. It was fun.
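The same-length constraint can be illustrated with a short Python sketch. This is not the tool I used back then, just a minimal example of the idea; the function name replace_same_length and the sample path in the usage note are invented for illustration.

```python
def replace_same_length(data, old, new):
    """Replace a byte string inside binary data without changing its length."""
    if len(old) != len(new):
        # A different length would shift every byte after the edit,
        # which is what tends to break a binary file.
        raise ValueError("replacement must be the same length as the original")
    if old not in data:
        raise ValueError("original string not found in data")
    patched = data.replace(old, new)
    # Every match was swapped for an equal-length string, so the size is unchanged.
    assert len(patched) == len(data)
    return patched
```

Given the bytes of a file read in binary mode, replace_same_length(data, b"alyx", b"blue") would turn an embedded path like models/alyx/skin.vtf into models/blue/skin.vtf while leaving every other byte at the same offset.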