
2015-05-03 - Branching Storylines

I am a big fan of stories in games. Some of my favourite games are so because of their stories. That’s why it bugs me that there’s a trend in games towards having very open stories. Games where the player is given a lot of control over what happens. They can choose to go here or to not go there. They can choose to save the village or to burn it down. The player is partially responsible for writing the story and that makes it difficult to have a really strong story.

The problem is that these choices create alternatives. A character could be dead in one version of the story but alive in another. The player could be evil in one story and a saint in another. It’s difficult to have a strong story when everything in it is so fluid. Now I do understand the reasoning for this. Players like to feel that their actions have impact. That they’re doing something and not just following along.

Personally I would prefer it if the choices were limited and fairly obvious. That way the player still has an impact on the story but at the same time there aren’t too many alternatives. That makes it easier to polish the story and make sure it’s solid while still allowing the player to go through all the possible stories.

That still leaves the problem of endings though. It’s difficult to make a sequel to a game with multiple endings without picking one or just making up a new one that’s a combination of all of them. As I said it’s hard to have a strong story when everything is so fluid.

2015-04-19 - Character Encoding: Endianness

Data is stored on computers as a series of bytes, and the order in which these bytes are saved depends on the endianness of the system. Big endian systems store the most significant byte (MSB) first while little endian systems store the least significant byte (LSB) first. For example, consider the number 305,419,896, which is 0x12345678 in hex. Every pair of hex digits is a byte, so in a big endian system the byte 0x12 would be saved first while in a little endian system 0x78 would be saved first.

Endianness      Low Address        High Address
Big Endian      0x12  0x34  0x56  0x78
Little Endian   0x78  0x56  0x34  0x12
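
The table can be reproduced with a few lines of Python. This is only a small sketch using the standard `struct` module; the variable names are my own and the 32 bit unsigned format is an assumption that fits the example value.

```python
import struct

value = 0x12345678  # 305,419,896

# ">I" packs an unsigned 32 bit integer big endian, "<I" little endian.
big = struct.pack(">I", value)
little = struct.pack("<I", value)

print(big.hex(" "))     # 12 34 56 78
print(little.hex(" "))  # 78 56 34 12
```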

If the same system saves and loads the data then everything’s fine. If the value is saved on a little endian system but read on a big endian system, the reader would get the incorrect value of 2,018,915,346 (0x78563412). The same would happen going from a big endian system to a little endian system.
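
The mismatch is easy to simulate on a single machine. The sketch below uses Python's built in `int.to_bytes` and `int.from_bytes`; the four byte width is an assumption chosen to match the example.

```python
value = 305_419_896  # 0x12345678

# Written the way a little endian system would store it.
saved = value.to_bytes(4, byteorder="little")

# Read back with the wrong and then the right byte order.
misread = int.from_bytes(saved, byteorder="big")
correct = int.from_bytes(saved, byteorder="little")

print(misread)  # 2018915346
print(correct)  # 305419896
```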

Endianness is not specific to character encodings but it is one of the places where it’s most noticeable, as text is commonly sent between computers. Because of this, programs are often designed to read and write both ways, so endianness is no longer a function of the computer being used but dependent on how the program saves the data.

UCS-2 and UCS-4 solve this problem using a Byte-Order-Mark (BOM). The character U+FEFF is placed at the start of the file to indicate the encoding and endianness of the file. U+FFFE is an invalid character so if it shows up at the beginning of a file then it can be assumed that the alternative endianness should be used. Big-endian is assumed if no BOM is present and the format is not otherwise specified.

BOM           Encoding
0xFEFF        UCS-2 Big Endian
0xFFFE        UCS-2 Little Endian
0x0000FEFF    UCS-4 Big Endian
0xFFFE0000    UCS-4 Little Endian

The BOM allows Unicode text to identify its own characteristics so that there’s no external information required to display the data correctly.
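
A reader could check for the BOM with something like the sketch below. The function name `detect_bom` and its return format are made up for illustration. One caveat I'm assuming away: a UCS-2 little endian file whose first real character is U+0000 would start with the same four bytes as the UCS-4 little endian BOM, so this simple check would misidentify it.

```python
def detect_bom(data: bytes):
    """Return (encoding name, BOM length), or (None, 0) if no BOM is found."""
    # Check the 4 byte marks first: the UCS-4 little endian BOM starts
    # with the same two bytes as the UCS-2 little endian BOM.
    boms = [
        (b"\x00\x00\xfe\xff", "UCS-4 Big Endian"),
        (b"\xff\xfe\x00\x00", "UCS-4 Little Endian"),
        (b"\xfe\xff", "UCS-2 Big Endian"),
        (b"\xff\xfe", "UCS-2 Little Endian"),
    ]
    for bom, name in boms:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


print(detect_bom(b"\xfe\xff\x00A"))  # ('UCS-2 Big Endian', 2)
print(detect_bom(b"\xff\xfeA\x00"))  # ('UCS-2 Little Endian', 2)
print(detect_bom(b"A"))              # (None, 0)
```

If no BOM is found the caller would fall back to big endian, as described above, unless the format is specified some other way.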

2015-04-12 - Character Encoding: UCS-2/UCS-4

Extended ASCII encodings allow a large number of characters to be displayed but require the use of multiple character sets within a single encoding. This means that a single value can map to multiple characters, which causes problems when transmitting data or when a single document needs characters from several sets. To solve this problem a universal character set called Unicode was created. The 2 byte Universal Character Set (UCS-2) uses two bytes to encode every character, which allows for a much larger number of possible characters. The first 256 characters are similar to the English Windows-1252 code page, and characters from a wide variety of other languages and symbols make up the rest. There are no character sets, so every value corresponds to only one character. Unicode character codes use the format U+XXXX where U+ indicates that it’s a Unicode character and XXXX is the 4 digit hex value of the character.

Language                   Range
English                    U+0000 - U+00FF
Cyrillic                   U+0400 - U+052F
Arabic                     U+0600 - U+077F
Greek                      U+0370 - U+03FF
Hebrew                     U+0590 - U+05FF
Chinese/Japanese/Korean    U+4E00 - U+9FFF
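
To see the U+XXXX notation in action, here is a small Python sketch that prints the code for a few characters from the ranges above. The helper name `code_point_label` is my own invention.

```python
def code_point_label(ch: str) -> str:
    # "U+" followed by at least four upper case hex digits.
    return f"U+{ord(ch):04X}"


for ch in ["A", "Я", "α", "中"]:
    print(ch, code_point_label(ch))

# A U+0041
# Я U+042F
# α U+03B1
# 中 U+4E2D
```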

As more and more characters were identified and added to the standard it became clear that 2 bytes was not enough. This led to a 4 byte Universal Character Set (UCS-4) to allow for even more characters. Characters in the range U+00000000 - U+0000FFFF are identical to UCS-2 and make up the Basic Multilingual Plane. Characters above U+0000FFFF make up the supplementary planes.

Plane                                  Range
Basic Multilingual Plane               U+0000 - U+FFFF
Supplementary Multilingual Plane       U+10000 - U+1FFFF
Supplementary Ideographic Plane        U+20000 - U+2FFFF
Supplementary Special-purpose Plane    U+E0000 - U+EFFFF
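
Since every plane in the table covers 0x10000 values, the plane of a character can be worked out by dropping the low 16 bits of its code. A rough sketch, with `PLANE_NAMES` and `plane_of` being names I made up and only the planes from the table filled in:

```python
PLANE_NAMES = {
    0x0: "Basic Multilingual Plane",
    0x1: "Supplementary Multilingual Plane",
    0x2: "Supplementary Ideographic Plane",
    0xE: "Supplementary Special-purpose Plane",
}


def plane_of(code_point: int) -> str:
    # Each plane holds 0x10000 code points, so the plane number is
    # whatever is left after shifting out the low 16 bits.
    plane = code_point >> 16
    return PLANE_NAMES.get(plane, f"Plane {plane} (not listed in the table)")


print(plane_of(0x0041))   # Basic Multilingual Plane
print(plane_of(0x1F600))  # Supplementary Multilingual Plane
print(plane_of(0x20000))  # Supplementary Ideographic Plane
```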

These multi-byte character encodings allow a vast number of characters to be encoded, but the majority of these characters are not commonly used. This makes UCS-2 and especially UCS-4 space inefficient. There are also compatibility issues with earlier encoding schemes if they are incorrectly read as UCS-2 or UCS-4.
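
Both problems show up with a short test. Python has no codec named UCS-2 or UCS-4, so this sketch stands in "utf-16-be" and "utf-32-be", on the assumption that the text stays inside the Basic Multilingual Plane, where the bytes come out the same.

```python
text = "Hello, world!"  # 13 characters, all in the ASCII range

ascii_size = len(text.encode("ascii"))
# Stand-ins for UCS-2 and UCS-4. The "-be" variants do not write a BOM.
ucs2_size = len(text.encode("utf-16-be"))
ucs4_size = len(text.encode("utf-32-be"))

print(ascii_size, ucs2_size, ucs4_size)  # 13 26 52

# The compatibility problem: two ASCII bytes misread as one UCS-2
# character pair up into an unrelated code, here U+6869.
misread = b"hi".decode("utf-16-be")
print(f"U+{ord(misread):04X}")  # U+6869
```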

2015-04-03 - Two Questions

I tend to get a lot of ideas for projects. The problem is figuring out before you start whether or not you are going to be motivated to work on and finish a project. After all you don’t get much from not working on a project. I ask myself two questions when considering new project ideas and focus on those that have decent answers to both. The first question is what can I learn from it? The second question is what problem does it solve?

Learning is a big part of why I do things. I want to know more than I currently know and grow my skills and understanding of the universe. So if a project can help me learn something then that provides a big incentive to keep at it. The more I work on the project the more practice I get and the more I learn. Understanding what I can learn from a project will give an indicator of my motivation level through the beginning of the project.

Learning only gets you so far though. There's an upper limit on how much you can learn from a single project. What becomes important later on is what the project will ultimately do. Finishing a project will allow me to use it. Understanding what problem the project solves will give an indicator of my motivation through the end of the project.

Usually I start with a topic that I want to learn about and then I think about a project that will solve a useful problem using that topic. That way I can maintain my motivation to work on the project beyond the learning phase and hopefully end up with something that I can show off. A finished project that shows what I know and makes doing something easier.