Compact Header

Take an input string of text and convert it into chunks of bits whose size depends on the type of data. The bits are then converted to an array of utf-8 characters for transmission.
The receiver converts the utf-8 characters back into bits and decodes the bits into the original data.

type	input	bit length	bits
version number		8	0000 0001
time stamp		20
sender type		2	00
sender id		48	0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
gridsquare		46	0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 00
name		48	0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000

The version number ranges from 1 to 255.

Each section is shown in a different color to help tell the parts of the header apart. The version is the first 8 bits, the timestamp is the next 20 bits, the sender type is the next 2 bits, the sender id is the next 48 bits, the gridsquare is the next 46 bits, and the name is the last 48 bits, for a total of 172 bits. The bits are then converted into utf-8 characters for transmission, and the receiver reverses the process to recover the original data.
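The layout can be written down as a list of field widths. What follows is a minimal Python sketch rather than the project's actual code; the names HEADER_FIELDS, field_offsets, and split_header are hypothetical and only the widths come from the table above.

```python
# Field widths in bits, in transmission order, taken from the header table.
HEADER_FIELDS = [
    ("version", 8),
    ("timestamp", 20),
    ("sender_type", 2),
    ("sender_id", 48),
    ("gridsquare", 46),
    ("name", 48),
]


def field_offsets(fields=HEADER_FIELDS):
    """Return {name: (start, end)} bit positions for each field."""
    offsets = {}
    start = 0
    for name, width in fields:
        offsets[name] = (start, start + width)
        start += width
    return offsets


def split_header(bits, fields=HEADER_FIELDS):
    """Slice a header bit string into its named fields."""
    total = sum(width for _, width in fields)
    if len(bits) < total:
        raise ValueError(f"expected at least {total} bits, got {len(bits)}")
    return {name: bits[a:b] for name, (a, b) in field_offsets(fields).items()}


# The header is 172 bits, which rounds up to 176 bits (22 characters of 8 bits).
TOTAL_BITS = sum(width for _, width in HEADER_FIELDS)
PADDED_BITS = -(-TOTAL_BITS // 8) * 8
```

Keeping the widths in one list means a later version of the header only needs a new list, and the receiver can pick the list based on the version number.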

Conversions

version + timestamp + sender type + sender id + gridsquare + name

"13181080ko6bva~AA00bb11ccalex okita~"

Combined header converted to bits and concatenated into a single string

0000 0000 0000 0000 0000

lots of tofu...

Timestamp

The timestamp is used to synchronize messages sent during the month. Each month the seconds reset to 0. To represent the largest possible number of seconds in a month (31 days), we need to account for 31 * 24 * 60 * 60 = 2,678,400 seconds. This can be represented using 22 bits (2^22 = 4,194,304, which is the smallest power of 2 greater than 2,678,400). However, if we only need to represent every three seconds, we can divide this number by 3, resulting in 892,800 distinct values. This can be represented using 20 bits (2^20 = 1,048,576, which is the smallest power of 2 greater than 892,800). Therefore, 20 bits are sufficient to describe the timestamp with a resolution of every three seconds from the start of the month.
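The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and it assumes the month boundary is taken in whatever time zone the datetime carries (UTC in the usage below), which the text does not specify.

```python
from datetime import datetime, timezone

TIMESTAMP_BITS = 20       # width of the timestamp field
RESOLUTION_SECONDS = 3    # one tick every three seconds


def seconds_into_month(moment):
    """Seconds elapsed since 00:00:00 on the first day of moment's month."""
    month_start = moment.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    return int((moment - month_start).total_seconds())


def timestamp_to_bits(moment):
    """Encode a datetime as a 20-bit count of 3-second ticks."""
    ticks = seconds_into_month(moment) // RESOLUTION_SECONDS
    return format(ticks, f"0{TIMESTAMP_BITS}b")


def bits_to_seconds(bits):
    """Decode the field back to seconds since the start of the month."""
    return int(bits, 2) * RESOLUTION_SECONDS


# The last second of a 31-day month still fits in 20 bits.
last = datetime(2024, 1, 31, 23, 59, 59, tzinfo=timezone.utc)
print(timestamp_to_bits(last), bits_to_seconds(timestamp_to_bits(last)))
```

Decoding loses up to two seconds because of the integer division, which is the price of the 3-second resolution.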

The timestamp bits are placed directly after the version number.

Character Map

Next comes the character map, a reduced set of characters used to simplify sending numbers and letters. ASCII text is normally stored as 8 bits per character, but we want to use as few bits as possible. We will use three different character maps: one exclusively for letters, one for numbers, and one for letters, numbers, and a few symbols.

The combined character map assigns each character a 6-bit binary code, which minimizes the amount of data that needs to be sent over the air. By default the map holds 39 characters. Six bits can represent 64 values, so the unused codes are reserved for special use cases that have yet to be determined.

Letters Map

This map is exclusively for letters and uses 5 bits per character, enough for the 26 letters of the alphabet. Since it is primarily used for names, we also add a space, a "." for abbreviations, a hyphen for hyphenated names, and an "@" for email addresses. Five bits give us 32 codes, so we may as well use them all. A "~" is used as the terminating character.

Numbers Map

This map is exclusively for numbers and uses 4 bits per character to represent the 10 digits (0-9). A 4-bit number (a "nibble") has 16 possible values, so we can fill the rest of the space with characters useful for phone numbers. We also need a terminating character to indicate the end of the number, and as with the letters map we use a "~".

The numbers map also starts at "0001" rather than "0000", since too many 0s together might create an awkward utf-8 character (eight 0 bits is the NUL character).
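As a rough illustration, a numbers map along these lines could be built as follows. The symbol set and its order are assumptions: the text only says digits, some phone-number characters, and a "~" terminator, so the "+", "-", "(" and ")" here are hypothetical placeholders.

```python
# Hypothetical symbol set: 10 digits, 4 phone-number symbols, and the terminator.
# That is 15 symbols, so the codes run from 0001 to 1111 and 0000 stays unused.
NUMBER_CHARS = "0123456789+-()~"

NUMBERS_MAP = {ch: format(i, "04b") for i, ch in enumerate(NUMBER_CHARS, start=1)}
BITS_TO_NUMBER = {bits: ch for ch, bits in NUMBERS_MAP.items()}


def encode_number(text):
    """Encode text with the 4-bit map and append the "~" terminator."""
    return "".join(NUMBERS_MAP[ch] for ch in text + "~")


def decode_number(bits):
    """Decode 4 bits at a time, stopping at "~" or at an unknown code."""
    out = []
    for i in range(0, len(bits) - 3, 4):
        ch = BITS_TO_NUMBER.get(bits[i:i + 4])
        if ch is None or ch == "~":
            break
        out.append(ch)
    return "".join(out)
```

Because "0000" is never assigned, any zero padding added after the terminator cannot be mistaken for a digit.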

Character to Bits

Along the way we'll use cursor.ai to help write up some functions. First we want a function that takes a character and returns its binary representation using a character map.

characterToBits(character, map)

characterToBits looks up the character in the named map and returns its binary representation. Valid maps are "characterMap", "lettersMap", and "numbersMap".
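A minimal Python sketch of such a lookup might look like the following. The map contents are partly assumptions: placing A-Z at 1-26 and 0-9 at 27-36 in the combined map is inferred from the "KC7B" example in the Sender ID section, while the trailing symbols of each map, and the letters map starting at 0 so that all 32 codes are used, are hypothetical.

```python
import string


def build_map(chars, width, start):
    """Assign consecutive fixed-width binary codes to chars."""
    return {ch: format(i, f"0{width}b") for i, ch in enumerate(chars, start=start)}


MAPS = {
    # 39 symbols in 6 bits; " /~" at the end is a hypothetical choice.
    "characterMap": build_map(string.ascii_uppercase + string.digits + " /~", 6, 1),
    # 32 symbols in 5 bits, with "~" as the 32nd; the apostrophe is hypothetical.
    "lettersMap": build_map(string.ascii_uppercase + " .-@'~", 5, 0),
    # 15 symbols in 4 bits starting at 0001; "+-()" is hypothetical.
    "numbersMap": build_map(string.digits + "+-()~", 4, 1),
}


def character_to_bits(character, map_name):
    """Return the binary code for character in the named map."""
    if map_name not in MAPS:
        raise ValueError(f"unknown map: {map_name!r}")
    key = character.upper()
    if key not in MAPS[map_name]:
        raise ValueError(f"{character!r} is not in {map_name}")
    return MAPS[map_name][key]


print("".join(character_to_bits(ch, "characterMap") for ch in "KC7B"))
```

Input is upper-cased before the lookup, which matches the way "Alex~" is shown as "ALEX" in the Name section.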

Bits to Utf-8

After the characters are encoded to bits, we need to encode the bits as utf8 characters. Each character carries 8 bits, so if the bit count is not a multiple of 8 we pad it with a few extra bits before transmitting. The receiver knows how many bits to expect, so after decoding the utf8 characters back into bits it takes the first n bits and discards the unused bits.

We use three functions to convert between bits and utf8 characters.

paddedBits(bits)

bitsToUtf8(bits)

utf8toBits(utf8)

The first function, paddedBits, appends 0s until the number of bits is a multiple of 8. The second function, bitsToUtf8, takes the padded bits and converts them to utf8 characters. The third function, utf8toBits, takes the utf8 characters and converts them back to bits.

Padding should only happen on the very last character in a utf8 string, so the extra bits can be discarded after the conversion. If we had 42 bits, we would pad them to 48 bits and send 6 utf-8 characters. Once received, the decoder should keep the first 42 bits and discard the 6 unused bits at the end.
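Here is one way the three functions could look in Python, as a sketch rather than the page's real implementation. It assumes each 8-bit chunk becomes the character whose code point equals that value, which is what the "ð" and "ÿ" examples in this section suggest. Note that if the resulting string is later serialized as real UTF-8 bytes, code points above 127 take two bytes each.

```python
def padded_bits(bits):
    """Append 0s until the length is a multiple of 8."""
    return bits + "0" * (-len(bits) % 8)


def bits_to_utf8(bits):
    """Turn each 8-bit chunk into the character with that code point (0-255)."""
    if len(bits) % 8 != 0:
        raise ValueError("bits must be padded to a multiple of 8 first")
    return "".join(chr(int(bits[i:i + 8], 2)) for i in range(0, len(bits), 8))


def utf8_to_bits(text):
    """Turn each character back into its 8-bit chunk."""
    if any(ord(ch) > 255 for ch in text):
        raise ValueError("expected only characters with code points 0-255")
    return "".join(format(ord(ch), "08b") for ch in text)


# 42 bits are padded to 48, sent as 6 characters, then trimmed back to 42.
original = "10" * 21
sent = bits_to_utf8(padded_bits(original))
received = utf8_to_bits(sent)[:len(original)]
print(len(sent), received == original)
```

The trim in the last step only works because the receiver already knows the expected length, which is why the padding itself never needs to be marked.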

bits to utf8 \ utf8 to bits Bits:

Utf-8:
Utf-8 Bits:

The above example shows how an input like 1111 becomes 11110000 after padding, which is the utf8 character "ð". We can then convert that back into bits, as seen in the Utf-8 Bits box. An input that needs no padding, such as 11111111, becomes "ÿ", so the unpadded result is correct as well. The utf8 bits will be used to decode back into the header format. A string of 00000000 is possible, but that shouldn't happen unless the input is empty.

Bitstream to Utf-8 Bits:

Utf-8 String to Bits:

Once a stream of bits has been converted to utf8 characters, the padding can no longer be told apart from the data, so the receiver needs to know the original length of the bitstream to remove it properly. The bitstream is divided into chunks of 8 bits, and each chunk is converted to a utf8 character. If the last chunk is short, it is padded with 0s. Converting back to bits and trimming to the original length recovers the original bitstream.

bitstreamToUtf8(bitstream)

utf8ToBitstream(utf8)

We add two more functions to help. The first, bitstreamToUtf8, takes an arbitrary-length string of 1s and 0s and returns utf8 characters. The second, utf8ToBitstream, takes a string of utf8 characters and returns the bitstream.

With the above example we convert "01010101111100001110000010101010" into the utf8 characters "Uðàª" and then back to "01010101111100001110000010101010". This means we can build our bit stream from the different character maps and convert it to utf8 characters for transmission. Once the characters are received, the listener can unpack the utf8 back into bits and convert the bits back into the header information.
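The round trip can be sketched in Python using the standard "latin-1" codec, which maps byte values 0-255 to the code points of the same value. This is an assumed stand-in for the page's own functions, and the optional bit_length argument is a hypothetical addition for trimming the padding.

```python
def bitstream_to_utf8(bitstream):
    """Pack a string of 1s and 0s into characters, zero-padding the last chunk."""
    if not bitstream:
        return ""
    padded = bitstream + "0" * (-len(bitstream) % 8)
    raw = int(padded, 2).to_bytes(len(padded) // 8, "big")
    return raw.decode("latin-1")


def utf8_to_bitstream(text, bit_length=None):
    """Unpack characters into bits; keep only bit_length bits when it is given."""
    raw = text.encode("latin-1")
    bits = "".join(format(byte, "08b") for byte in raw)
    return bits if bit_length is None else bits[:bit_length]


stream = "01010101111100001110000010101010"
text = bitstream_to_utf8(stream)
print(text, utf8_to_bitstream(text) == stream)
```

For a stream whose length is not a multiple of 8, passing the known length as bit_length drops the padding that bitstream_to_utf8 added.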

Sender ID

This section is a work in progress. We're still trying to figure out the smaller details of getting this to work properly. The main purpose of the first couple of bits is to identify the type of sender ID. Based on the type we reserve a different number of bits for the sender ID, and we need to decode it properly when we get to that section.

First we take an input value, which could be one of three things: a ham radio call sign, a GMRS call sign, or a phone number. We reserve two bits to indicate which of the three it is.

We can use the functions as shown above to build our bit stream from the input. Based on the type of sender we will change which character map is used to build the stream of bits.

Sender ID Sender ID:
Sender ID Input:

Bits to Utf-8:
Utf-8 to Bits:

The above example shows how a ham radio call sign "KC7B" is converted to the bits "001011000011100010000010" using the character map. The bits are then converted to the utf8 characters ",8" followed by a non-printing character (10000010), and back to the bits "001011000011100010000010". The result is the same as the input, which means the conversion is working properly. However, when the bitstream length is not a multiple of 8, the bits recovered from utf8 are longer than the original bitstream. Based on the version number we should know the length of the expected bit stream, so we can ignore the extra padded 0s.

Depending on the bitstreamType we use different maps to decode the bits back into characters.
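Since this part is still a work in progress, the following Python sketch is only one possible shape for it. The 2-bit type codes, the choice of map per type, the "~" terminator after the ID, and every name in the code are assumptions. The combined map is laid out so that "KC7B" produces the bits from the example above.

```python
import string


def build_map(chars, width, start=1):
    """Assign consecutive fixed-width binary codes to chars."""
    return {ch: format(i, f"0{width}b") for i, ch in enumerate(chars, start=start)}


CHARACTER_MAP = build_map(string.ascii_uppercase + string.digits + " /~", 6)
NUMBERS_MAP = build_map(string.digits + "+-()~", 4)

# Hypothetical 2-bit type codes and the map each sender type would use.
SENDER_TYPES = {
    "ham": ("00", CHARACTER_MAP),
    "gmrs": ("01", CHARACTER_MAP),
    "phone": ("10", NUMBERS_MAP),
}
TYPE_BY_BITS = {code: (name, table) for name, (code, table) in SENDER_TYPES.items()}


def encode_sender(sender_type, sender_id):
    """Return (type bits, id bits); the id is terminated with "~"."""
    code, table = SENDER_TYPES[sender_type]
    return code, "".join(table[ch] for ch in sender_id.upper() + "~")


def decode_sender(type_bits, id_bits):
    """Pick the map from the type bits, then decode until "~" or an unknown code."""
    name, table = TYPE_BY_BITS[type_bits]
    width = len(next(iter(table.values())))
    reverse = {bits: ch for ch, bits in table.items()}
    chars = []
    for i in range(0, len(id_bits) - width + 1, width):
        ch = reverse.get(id_bits[i:i + width])
        if ch is None or ch == "~":
            break
        chars.append(ch)
    return name, "".join(chars)


print(decode_sender(*encode_sender("ham", "KC7B")))
```

Reading the type bits first is what lets the decoder know whether to step through the ID 6 bits or 4 bits at a time.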

Name

The name is a string of characters that is encoded the same way as the sender id but we use the letters map to encode the characters.

The name could be short or fill the maximum of 20 characters. As a result we need either the length of the name or a terminating character to know when to stop decoding letters. In the letters map the 32nd character is a "~", which we use as the terminator, so an input like "Alex~" is decoded and shown as "ALEX". In the header the name uses a fixed number of bits, so with a short name we fill the unused bits with 1s. As observed before, "00000000" does some strange things when converted to a utf-8 character.
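A small Python sketch of the terminator-and-fill idea follows. The letters map here is an assumption: 26 letters, then space, ".", "-", "@", a hypothetical apostrophe, and "~" as the 32nd symbol, numbered from 0 so that all 32 five-bit codes are used. With that numbering "~" is 11111, so filling with 1s looks like repeated terminators. The default field width of 48 bits comes from the header table.

```python
import string

# Hypothetical 32-symbol letters map; "~" is last, so its code is 11111.
LETTER_CHARS = string.ascii_uppercase + " .-@'~"
LETTERS_MAP = {ch: format(i, "05b") for i, ch in enumerate(LETTER_CHARS)}
BITS_TO_LETTER = {bits: ch for ch, bits in LETTERS_MAP.items()}


def encode_name(name, field_bits=48):
    """Encode name plus a "~" terminator, then fill the field with 1s."""
    bits = "".join(LETTERS_MAP[ch] for ch in name.upper() + "~")
    if len(bits) > field_bits:
        raise ValueError("name does not fit in the field")
    return bits.ljust(field_bits, "1")


def decode_name(bits):
    """Decode 5 bits at a time and stop at the first "~"."""
    chars = []
    for i in range(0, len(bits) - 4, 5):
        ch = BITS_TO_LETTER[bits[i:i + 5]]
        if ch == "~":
            break
        chars.append(ch)
    return "".join(chars)


print(decode_name(encode_name("Alex")))
```

A 48-bit field holds at most nine letters plus the terminator at 5 bits each, so a 20-character name would need a wider field than the one in the table.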

Grid Square

A grid square is a geographic position written as alternating pairs of letters and digits, as in the Maidenhead locator system used by ham radio operators. In each pair the first character gives the longitude and the second gives the latitude. The first pair of letters runs from A to R and the second pair of letters runs from A to X. Because the pattern is always [letter letter], [number number], [letter letter], [number number], we know which kind of character to expect at each position and can save several bits.
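One way to read the 46-bit budget in the header table is a ten-character locator such as "AA00bb11cc" from the Conversions example, with each letter taking 5 bits and each digit taking 4 bits: 6 × 5 + 4 × 4 = 46. The Python sketch below works under that assumption. The codes it uses (A-Z as 0-25, digits as 1-10) and all names are hypothetical.

```python
import string

# Assumed codes: letters A-Z as 5-bit values 0-25, digits 0-9 as 4-bit values 1-10.
LETTER_BITS = {ch: format(i, "05b") for i, ch in enumerate(string.ascii_uppercase)}
DIGIT_BITS = {ch: format(i, "04b") for i, ch in enumerate(string.digits, start=1)}
BITS_TO_LETTER = {bits: ch for ch, bits in LETTER_BITS.items()}
BITS_TO_DIGIT = {bits: ch for ch, bits in DIGIT_BITS.items()}

# L = letter position, D = digit position.
PATTERN = "LLDDLLDDLL"


def encode_gridsquare(locator):
    """Encode a 10-character locator into 46 bits using the fixed pattern."""
    locator = locator.upper()
    if len(locator) != len(PATTERN):
        raise ValueError(f"expected {len(PATTERN)} characters")
    bits = []
    for kind, ch in zip(PATTERN, locator):
        table = LETTER_BITS if kind == "L" else DIGIT_BITS
        if ch not in table:
            raise ValueError(f"unexpected character {ch!r} for position type {kind}")
        bits.append(table[ch])
    return "".join(bits)


def decode_gridsquare(bits):
    """Walk the same pattern to decide how many bits each position uses."""
    chars, pos = [], 0
    for kind in PATTERN:
        width, table = (5, BITS_TO_LETTER) if kind == "L" else (4, BITS_TO_DIGIT)
        chars.append(table[bits[pos:pos + width]])
        pos += width
    return "".join(chars)


print(len(encode_gridsquare("AA00bb11cc")))
```

Encoding the same ten characters with the 6-bit combined map would take 60 bits, so knowing the pattern saves 14 bits.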