A Collection of Code Snippets in as Many Programming Languages as Possible
This project is maintained by TheRenegadeCoder
Welcome to the Base64 Encode Decode page! Here, you'll find a description of the project as well as a list of sample programs written in various languages.
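Base64 is a popular method of encoding strings and other data as plain text. It can encode images, text, JSON, and almost any other format. Note that the standard alphabet used here is not URL-safe, because `+` and `/` have special meanings in URLs; a URL-safe variant replaces them with `-` and `_`.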
In this project you will be encoding normal text to Base64-encoded text and decoding Base64-encoded text into normal text.
The Base64 alphabet, taken from the Base64 table on Wikipedia, is as follows:
Index | Binary | Char | Index | Binary | Char | Index | Binary | Char | Index | Binary | Char |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 000000 | A | 16 | 010000 | Q | 32 | 100000 | g | 48 | 110000 | w |
1 | 000001 | B | 17 | 010001 | R | 33 | 100001 | h | 49 | 110001 | x |
2 | 000010 | C | 18 | 010010 | S | 34 | 100010 | i | 50 | 110010 | y |
3 | 000011 | D | 19 | 010011 | T | 35 | 100011 | j | 51 | 110011 | z |
4 | 000100 | E | 20 | 010100 | U | 36 | 100100 | k | 52 | 110100 | 0 |
5 | 000101 | F | 21 | 010101 | V | 37 | 100101 | l | 53 | 110101 | 1 |
6 | 000110 | G | 22 | 010110 | W | 38 | 100110 | m | 54 | 110110 | 2 |
7 | 000111 | H | 23 | 010111 | X | 39 | 100111 | n | 55 | 110111 | 3 |
8 | 001000 | I | 24 | 011000 | Y | 40 | 101000 | o | 56 | 111000 | 4 |
9 | 001001 | J | 25 | 011001 | Z | 41 | 101001 | p | 57 | 111001 | 5 |
10 | 001010 | K | 26 | 011010 | a | 42 | 101010 | q | 58 | 111010 | 6 |
11 | 001011 | L | 27 | 011011 | b | 43 | 101011 | r | 59 | 111011 | 7 |
12 | 001100 | M | 28 | 011100 | c | 44 | 101100 | s | 60 | 111100 | 8 |
13 | 001101 | N | 29 | 011101 | d | 45 | 101101 | t | 61 | 111101 | 9 |
14 | 001110 | O | 30 | 011110 | e | 46 | 101110 | u | 62 | 111110 | + |
15 | 001111 | P | 31 | 011111 | f | 47 | 101111 | v | 63 | 111111 | / |

The padding character is `=`.
This alphabet is used for both encode and decode.
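As a small illustration, the alphabet can be built from Python's standard `string` constants. The names `ALPHABET`, `PAD`, and `INDEX_OF` are hypothetical and chosen only for this sketch.

```python
import string

# Base64 alphabet in index order: A-Z, a-z, 0-9, then '+' and '/'.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
PAD = "="

# Reverse lookup used when decoding: character -> 6-bit table index.
INDEX_OF = {char: index for index, char in enumerate(ALPHABET)}

print(ALPHABET[26], ALPHABET[54], ALPHABET[37], ALPHABET[52])  # a 2 l 0
print(INDEX_OF["w"], format(INDEX_OF["w"], "06b"))  # 48 110000
```

Indexing the string covers the encode direction, and the dictionary covers the decode direction.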
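Base64 encoding works as follows:

- Break the input into 3-byte chunks.
- Treat each chunk as a single 24-bit group.
- Divide the 24-bit group into four 6-bit groups.
- Look up each 6-bit group in the table to get one Base64 character.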
For example, let's use the string `kitten`. The 3-byte chunks are `kit` and `ten`. Let's focus on the first chunk: `kit`. Using 8-bit ASCII:

- `k` = 107 (`01101011`)
- `i` = 105 (`01101001`)
- `t` = 116 (`01110100`)

So, the 24-bit group is this:
01101011 01101001 01110100
Dividing this into 6-bit groups gives this:
011010 110110 100101 110100
Since it's easier to work with this as decimal, the table indices are these:
26 54 37 52
Looking these indices up in the table results in a Base64 string of `a2l0`.

Following the same procedure for `ten` (left as an exercise for the reader), the Base64 string is `dGVu`. Together, `kitten` encodes to `a2l0dGVu`.
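The walkthrough for a complete 3-byte chunk can be sketched in Python as follows. The function name `encode_full_chunk` is hypothetical, and the sketch deliberately handles only chunks of exactly 3 bytes.

```python
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_full_chunk(chunk: bytes) -> str:
    """Encode exactly 3 bytes into 4 Base64 characters."""
    if len(chunk) != 3:
        raise ValueError("expected a 3-byte chunk")
    # Pack the three bytes into one 24-bit integer.
    group = (chunk[0] << 16) | (chunk[1] << 8) | chunk[2]
    # Extract four 6-bit table indices, most significant first.
    indices = [(group >> shift) & 0b111111 for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[index] for index in indices)


print(encode_full_chunk(b"kit"))  # a2l0
print(encode_full_chunk(b"ten"))  # dGVu
```

Shifting an integer avoids building intermediate bit strings, although a string-based approach mirrors the prose more literally.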
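When the last chunk is shorter than 3 bytes, extend its final group of bits with zeros until it is 6 bits long, and then add padding characters (`=`) until the encoded chunk is 4 characters long.

For example, let's consider the case where the last chunk is `k`:

- `k` is `01101011`.
- Dividing this into 6-bit groups gives `011010 11`.
- Extending the last group with zeros gives `011010 110000`.
- The table indices are `26 48`.
- The Base64 characters are `aw`.
- Adding padding characters gives `aw==`.

For the final encode example, let's consider the case when the last chunk is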
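`ki`:

- `ki` is `01101011 01101001`.
- Dividing this into 6-bit groups gives `011010 110110 1001`.
- Extending the last group with zeros gives `011010 110110 100100`.
- The table indices are `26 54 36`.
- The Base64 characters are `a2k`.
- Adding a padding character gives `a2k=`.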
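Putting the chunking, zero-extension, and padding steps together, a complete encoder might look like the sketch below. It works on bit strings to stay close to the worked examples. The name `base64_encode` is hypothetical, and the input is assumed to be ASCII.

```python
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def base64_encode(text: str) -> str:
    """Encode an ASCII string to Base64, padding the last chunk if needed."""
    data = text.encode("ascii")
    encoded = []
    for start in range(0, len(data), 3):
        chunk = data[start:start + 3]
        # Join the chunk's bytes into one bit string, 8 bits per byte.
        bits = "".join(format(byte, "08b") for byte in chunk)
        # Extend the last 6-bit group with zeros if it is short.
        bits += "0" * (-len(bits) % 6)
        chars = "".join(
            ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6)
        )
        # Add pad characters until the encoded chunk is 4 characters long.
        encoded.append(chars + "=" * (4 - len(chars)))
    return "".join(encoded)


print(base64_encode("k"))            # aw==
print(base64_encode("ki"))           # a2k=
print(base64_encode("hello world"))  # aGVsbG8gd29ybGQ=
```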
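Before diving into the algorithm for how to decode a Base64 string, let's talk about some rules for what constitutes a valid Base64 string:

- The length must be a multiple of 4.
- Every character must be in the Base64 alphabet or be the pad character `=`.
- Pad characters may appear only at the end, and there may be at most two of them.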
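A validity check could be sketched as follows. The conditions are inferred from the invalid decode tests listed later on this page (length not a multiple of 4, invalid characters, too many pad characters, pad characters in the middle, empty string); the function name `is_valid_base64` is hypothetical.

```python
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def is_valid_base64(text: str) -> bool:
    """Return True if text looks like a well-formed, padded Base64 string."""
    # Assumption: an empty string is rejected, matching this project's tests.
    if not text or len(text) % 4 != 0:
        return False
    body = text.rstrip("=")
    # At most two pad characters are allowed, and only at the very end.
    if len(text) - len(body) > 2:
        return False
    # Any '=' left in the body sits in the middle and fails this check.
    return all(char in ALPHABET for char in body)


print(is_valid_base64("aGVsbG8gd29ybGQ="))  # True
print(is_valid_base64("MTIzNDU2=Nzg5M=="))  # False
```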
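Assuming the Base64 string is valid, and ignoring the case where the last 4-byte chunk has padding characters, decoding works as follows:

- Break the string into 4-byte chunks.
- Look up the 6-bit table index of each byte.
- Join the four 6-bit indices into a 24-bit group.
- Divide the 24-bit group into three 8-bit groups and convert each one to its ASCII character.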
Let's use the Base64 string `a2l0dGVu`. The 4-byte chunks are `a2l0` and `dGVu`. Let's focus on the first chunk: `a2l0`.

Looking up each byte in the table results in this:

- `a` is index 26 (`011010`)
- `2` is index 54 (`110110`)
- `l` is index 37 (`100101`)
- `0` is index 52 (`110100`)

Converting the binary indices to 8-bit groups results in this:
01101011 01101001 01110100
Converting this back to ASCII results in this:

- `01101011` (107) = `k`
- `01101001` (105) = `i`
- `01110100` (116) = `t`

Following the same procedure (left as an exercise for the reader), `dGVu` decodes to `ten`. Together, `a2l0dGVu` decodes to `kitten`.
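Decoding a complete 4-character chunk reverses the encode steps. The sketch below assumes the chunk is valid and contains no pad characters; the name `decode_full_chunk` is hypothetical.

```python
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def decode_full_chunk(chunk: str) -> str:
    """Decode exactly 4 unpadded Base64 characters into 3 ASCII characters."""
    if len(chunk) != 4:
        raise ValueError("expected a 4-character chunk")
    # Join the four 6-bit table indices into one 24-bit integer.
    group = 0
    for char in chunk:
        group = (group << 6) | ALPHABET.index(char)
    # Split the 24-bit group into three 8-bit values.
    data = bytes((group >> shift) & 0xFF for shift in (16, 8, 0))
    return data.decode("ascii")


print(decode_full_chunk("a2l0"))  # kit
print(decode_full_chunk("dGVu"))  # ten
```

`str.index` raises `ValueError` for a character outside the alphabet, so invalid input fails loudly rather than decoding to garbage.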
For the case when the last chunk has pad characters, decode any complete 8-bit chunk, and ignore any chunk that is shorter than 8 bits.
For example, let's consider the case where the last chunk is `aw==`. Looking up the non-pad characters results in this:

- `a` is index 26 (`011010`)
- `w` is index 48 (`110000`)

Dividing this into 8-bit groups results in this:
01101011 0000
Ignoring the last 4 bits results in this:
- `01101011` (107) = `k`

Therefore, this decodes to `k`.
For the final decode example, let's consider the case where the last chunk is `a2k=`. Looking up the non-pad characters results in this:

- `a` is index 26 (`011010`)
- `2` is index 54 (`110110`)
- `k` is index 36 (`100100`)

Dividing this into 8-bit groups results in this:
01101011 01101001 00
Ignoring the last 2 bits results in this:
- `01101011` (107) = `k`
- `01101001` (105) = `i`

Therefore, this decodes to `ki`.
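A full decoder that also handles padded final chunks can be sketched as below. It assumes the input has already been checked for validity and decodes to ASCII; the name `base64_decode` is hypothetical.

```python
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def base64_decode(text: str) -> str:
    """Decode a valid Base64 string (with or without padding) to ASCII."""
    # Look up each non-pad character and write its index as 6 bits.
    bits = "".join(
        format(ALPHABET.index(char), "06b") for char in text.rstrip("=")
    )
    # Keep only complete 8-bit groups; leftover bits came from zero-extension.
    usable = len(bits) - len(bits) % 8
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
    return data.decode("ascii")


print(base64_decode("aw=="))              # k
print(base64_decode("a2k="))              # ki
print(base64_decode("aGVsbG8gd29ybGQ="))  # hello world
```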
You can read more about Base64 on Wikipedia.
Write a program that accepts two parameters: the mode (`encode` or `decode`) and some text.

- If the mode is `decode`, it should print the decoded Base64 text.
- If the mode is `encode`, it should print the encoded text.

$ ./base64-encode-decode.lang "encode" "hello world"
aGVsbG8gd29ybGQ=
$ ./base64-encode-decode.lang "decode" "aGVsbG8gd29ybGQ="
hello world
Acceptable language utilities include language features and built-in libraries. External dependencies are unacceptable. Remember, the goal is to show off language features and utilities.
In this project, the algorithm must handle ASCII strings. You don't need to worry about handling a string in the general case.
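Since built-in libraries are acceptable, one possible shape for the program in Python relies on the standard `base64` and `re` modules. This is only a sketch: the helper name `run` and the regular expression are assumptions made for illustration, and input is assumed to be ASCII as stated above.

```python
import base64
import re
import sys

USAGE = "Usage: please provide a mode and a string to encode/decode"
# Alphabet characters followed by at most two pad characters at the end.
VALID_BASE64 = re.compile(r"[A-Za-z0-9+/]*={0,2}")


def run(args):
    """Return the text the program should print for the given arguments."""
    # Missing arguments and empty strings both produce the usage message.
    if len(args) != 2 or not args[1]:
        return USAGE
    mode, text = args
    if mode == "encode":
        return base64.b64encode(text.encode("ascii")).decode("ascii")
    if mode == "decode" and len(text) % 4 == 0 and VALID_BASE64.fullmatch(text):
        return base64.b64decode(text).decode("ascii")
    # Unknown mode or malformed Base64 input.
    return USAGE


if __name__ == "__main__":
    print(run(sys.argv[1:]))
```

Keeping the logic in a function that returns a string, rather than printing directly, makes each row of the test tables easy to check.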
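Every project in the Sample Programs repo should be tested. In this section, we specify the set of tests specific to Base64 Encode Decode. In order to keep things simple, we split up the testing into the following groups, in this order: valid encode tests, valid decode tests, invalid tests common to both modes, invalid encode tests, and invalid decode tests.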
Description | Input | Output |
---|---|---|
Lowercase String | "encode" "hello world" | "aGVsbG8gd29ybGQ=" |
Long String | "encode" "They swam along the boat at incredible speeds." | "VGhleSBzd2FtIGFsb25nIHRoZSBib2F0IGF0IGluY3JlZGlibGUgc3BlZWRzLg==" |
Numbers | "encode" "1234567890" | "MTIzNDU2Nzg5MA==" |
Symbols | "encode" "xyz!#$%&()*+,-./:;<=>?@[\]^_`{|}~" | "eHl6ISMkJSYoKSorLC0uLzo7PD0+P0BbXF1eX2B7fH1+" |
All Base64 Characters | "encode" "! }gggIIT55;qqs!!Gjjb??=~~2$$+;;i::x..4kk,ppnoo" | "ISAgfWdnZ0lJVDU1O3FxcyEhR2pqYj8/PX5+MiQkKzs7aTo6eC4uNGtrLHBwbm9v" |
Description | Input | Output |
---|---|---|
Lowercase String | "decode" "aGVsbG8gd29ybGQ=" | "hello world" |
Long String | "decode" "VGhleSBzd2FtIGFsb25nIHRoZSBib2F0IGF0IGluY3JlZGlibGUgc3BlZWRzLg==" | "They swam along the boat at incredible speeds." |
Numbers | "decode" "MTIzNDU2Nzg5MA==" | "1234567890" |
Symbols | "decode" "eHl6ISMkJSYoKSorLC0uLzo7PD0+P0BbXF1eX2B7fH1+" | "xyz!#$%&()*+,-./:;<=>?@[\]^_`{|}~" |
All Base64 Characters | "decode" "ISAgfWdnZ0lJVDU1O3FxcyEhR2pqYj8/PX5+MiQkKzs7aTo6eC4uNGtrLHBwbm9v" | "! }gggIIT55;qqs!!Gjjb??=~~2$$+;;i::x..4kk,ppnoo" |
Description | Input |
---|---|
No Input | |
Invalid Mode | "blue" "Oh look a Pascal triangle" |
All of these tests should output the following:
Usage: please provide a mode and a string to encode/decode
Description | Input |
---|---|
Missing String | "encode" |
Empty String | "encode" "" |
All of these tests should output the following:
Usage: please provide a mode and a string to encode/decode
Description | Input |
---|---|
Missing String | "decode" |
Empty String | "decode" "" |
Length Number Not Multiple Of 4 | "decode" "hello+world" |
Invalid Characters | "decode" "hello world=" |
Too Many Pad Characters At End | "decode" "MTIzNDU2Nzg5M===" |
Pad Characters In Middle | "decode" "MTIzNDU2=Nzg5M==" |
All of these tests should output the following:
Usage: please provide a mode and a string to encode/decode