Binary ↔ ASCII hex code converter
The binasc program functions similar to a hex editor. Bytes can be converted into hexadecimal byte codes that in turn can be converted back into bytes. Byte codes can be expressed in multiple ways such as hexadecimal, decimal, and binary. In addition, decimal codes can be big- or little-endian, spanning multiple bytes. Also 4- and 8-byte floating point ASCII values can be convert into their binary byte equivalents.
binasc [-a | -b | -c output.bin ] input [ > output.txt ]
cat input.bin | binasc [-a|-b] [ > output.txt ]
cat input.txt | binasc -c output.bin
By default, binasc
7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 02 00 03 00 01 00 00 00 ac
; E L F
8c 04 08 34 00 00 00 68 5e 00 00 00 00 00 00 34 00 20 00 05 00 28 00 16 00
; 4 h ^ 4 (
15 00 06 00 00 00 34 00 00 00 34 80 04 08 34 80 04 08 a0 00 00 00 a0 00 00
; 4 4 4
00 05 00 00 00 04 00 00 00 03 00 00 00 d4 00 00 00 d4 80 04 08 d4 80 04 08
;
13 00 00 00 13 00 00 00 04 00 00 00 01 00 00 00 01 00 00 00 00 00 00 00 00
;
80 04 08 00 80 04 08 78 5a 00 00 78 5a 00 00 05 00 00 00 00 10 00 00 01 00
; x Z x Z
00 00 78 5a 00 00 78 ea 04 08 78 ea 04 08 2c 02 00 00 38 03 00 00 06 00 00
; x Z x x , 8
00 00 10 00 00 02 00 00 00 04 5c 00 00 04 ec 04 08 04 ec 04 08 a0 00 00 00
; \
a0 00 00 00 06 00 00 00 04 00 00 00 2f 6c 69 62 2f 6c 64 2d 6c 69 6e 75 78
; / l i b / l d - l i n u x
2e 73 6f 2e 32 00 00 25 00 00 00 38 00 00 00 00 00 00 00 0d 00 00 00 20 00
; . s o . 2 % 8
The two main viewing options are `-a` and `-b`. The `-a` option will suppress display of the hex bytes and only show ASCII printable characters. Printable characters will be separated by a space when one or more intermediate bytes are not printable (or the printable character is a space). The -a functions similar to the `strings` command-line program available on most unix systems, and is a good way to search for text in a binary file. Here is printable character only output using the same file as in the default style show above:
`binasc -a input`
ELF 4 h^ 4 ( 4 4 4 xZ xZ xZ x x , 8 \ /lib/ld-linux.so.2 % 8 # / 5 ! % , "
& 7 $ 6 ) 1 + 0 - 2 3 4 ( ' * . ) p ? ` h E 1 K " ] L " n \ " | " L h U < i
( < > ( 8 @ ( = D > K > e , v 0 , ) E . l I l 3 y E | Q i a C \ | ' | ! !
__gmon_start__ libg++.so.2.7.2 _DYNAMIC _GLOBAL_OFFSET_TABLE_ _init _fini
__builtin_vec_new __builtin_delete __builtin_new __builtin_vec_delete
__ls__7ostreamPCc __ctype_b __ctype_tolower write__7ostreamPCci
get__7istreamRc _vt.3ios _vt.7ostream.3ios __ls__7ostreami cerr exit
__strtod_internal __ls__7ostreamc cout strchr strcmp atexit
libstdc++.so.2.7.2 __11fstreambasei _vt.7istream.3ios _vt.8ifstream.3ios
__11fstreambaseiPCcii open__11fstreambasePCcii _vt.8iostream.3ios
_vt.7fstream.3ios close__11fstreambase _._7fstream _._8ifstream
getline__7istreamPcic read__7istreamPci hex__FR3ios __ls__7ostreaml
endl__FR7ostream libm.so.6 libc.so.6 __libc_init_first bsearch qsort
__strtol_internal strcpy strncpy strtok _environ __environ environ _start
_etext _edata __bss_start _end 1 0 @ h | - ! ( ' , * + ) $ . / % # " & U S
The width of each text line can be controlled with the `--width` option. For example, here is the same text wrapped into 40 columns instead of the default of 75 columns:
`binasc -a --width 40 input`
ELF 4 h^ 4 ( 4 4 4 xZ xZ xZ x x , 8 \
/lib/ld-linux.so.2 % 8 # / 5 ! % , " & 7
$ 6 ) 1 + 0 - 2 3 4 ( ' * . ) p ? ` h E
1 K " ] L " n \ " | " L h U < i ( < > (
8 @ ( = D > K > e , v 0 , ) E . l I l 3
y E | Q i a C \ | ' | ! !
__gmon_start__ libg++.so.2.7.2 _DYNAMIC
_GLOBAL_OFFSET_TABLE_ _init _fini
__builtin_vec_new __builtin_delete
__builtin_new __builtin_vec_delete
__ls__7ostreamPCc __ctype_b
__ctype_tolower write__7ostreamPCci
get__7istreamRc _vt.3ios
_vt.7ostream.3ios __ls__7ostreami cerr
exit __strtod_internal __ls__7ostreamc
cout strchr strcmp atexit
libstdc++.so.2.7.2 __11fstreambasei
_vt.7istream.3ios _vt.8ifstream.3ios
__11fstreambaseiPCcii
open__11fstreambasePCcii
_vt.8iostream.3ios _vt.7fstream.3ios
close__11fstreambase _._7fstream
_._8ifstream getline__7istreamPcic
read__7istreamPci hex__FR3ios
__ls__7ostreaml endl__FR7ostream
libm.so.6 libc.so.6 __libc_init_first
bsearch qsort __strtol_internal strcpy
strncpy strtok _environ __environ
environ _start _etext _edata __bss_start
_end 1 0 @ h | - ! ( ' , * + ) $ . / % #
" & U S
Alternately, the `-b` option produces only the hex byte code for each byte in the file (similar to the BSD [http://en.wikipedia.org/wiki/Hexdump hexdump] utility). Unlike the [http://en.wikipedia.org/wiki/Od_(Unix) od] command, bytes are not grouped into two-byte words when displayed as hexadecimal numbers (which will switch order of the bytes in the output display on little-endian computers). Here is example output when using the `-b` option using the same file as in previous examples:
7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 02 00 03 00 01 00 00 00 ac
8c 04 08 34 00 00 00 68 5e 00 00 00 00 00 00 34 00 20 00 05 00 28 00 16 00
15 00 06 00 00 00 34 00 00 00 34 80 04 08 34 80 04 08 a0 00 00 00 a0 00 00
00 05 00 00 00 04 00 00 00 03 00 00 00 d4 00 00 00 d4 80 04 08 d4 80 04 08
13 00 00 00 13 00 00 00 04 00 00 00 01 00 00 00 01 00 00 00 00 00 00 00 00
80 04 08 00 80 04 08 78 5a 00 00 78 5a 00 00 05 00 00 00 00 10 00 00 01 00
00 00 78 5a 00 00 78 ea 04 08 78 ea 04 08 2c 02 00 00 38 03 00 00 06 00 00
00 00 10 00 00 02 00 00 00 04 5c 00 00 04 ec 04 08 04 ec 04 08 a0 00 00 00
a0 00 00 00 06 00 00 00 04 00 00 00 2f 6c 69 62 2f 6c 64 2d 6c 69 6e 75 78
2e 73 6f 2e 32 00 00 25 00 00 00 38 00 00 00 00 00 00 00 0d 00 00 00 20 00
00 00 15 00 00 00 00 00 00 00 07 00 00 00 0b 00 00 00 23 00 00 00 01 00 00
00 1d 00 00 00 14 00 00 00 16 00 00 00 0c 00 00 00 00 00 00 00 2f 00 00 00
0e 00 00 00 00 00 00 00 00 00 00 00 35 00 00 00 19 00 00 00 21 00 00 00 1f
binasc file1 > file2
binasc file2 -c file3
; file1 and file3 should be the same
binasc -b file1 > file2
binasc file2 -c file3
; file1 and file3 should be the same
binasc -a file1 > file2
binasc file2 -c file3 ; this results in an error
See the [examples] page for example files to compile with the `-c` option.
#define SEQ 03 04 05
00 01 02 SEQ
SEQ SEQ
Running the above code through the C preprocessor gives:
# 1 "input.txt"
# 1 ""
# 1 ""
# 1 "input.txt"
00 01 02 03 04 05
03 04 05 03 04 05
Example use of the C preprocessor when compiling a file:
cpp input.txt | binasc -c output.bin
A more advanced example that can define the substitution text for `SEQ` externally to the file:
#ifndef SEQ
#define SEQ 03 04 05
#endif
00 01 02 SEQ
SEQ SEQ
cpp -DSEQ="FF EE DD" input.txt
# 1 "input.txt"
# 1 ""
# 1 ""
# 1 "input.txt"
00 01 02 ff ee dd
ff ee dd ff ee dd
3.2 Hexadecimal numbers
Hexadecimal numbers specify one byte and must contain no more than 2 digits in the range from 00 to ff (0 to 255 decimal, or -128 to 127 as signed decimal values). The letter digits A-F can be either upper case or lower case. Examples of valid hexadecimal numbers:
7f 45 4c 46 1 1 1 0 0
8c 04 08 34 0 0 0 8 e
15 00 06 10 0 0 4 0 0
3.3 Binary numbers
Binary numbers can be specified by plain numbers longer than three characters or numbers containing (but not starting with) a comma. A binary number is allowed to have up to 8 digits (bits) since a binary number represents one byte in the output file. An optional comma is expected to split the number into two equal parts with 4 bits on each side of the comma. If there are fewer then 4 digits on either side of the comma, zeros will be inferred to the left of the given digits for each half (nibble) of the byte.
For example "`0010`" is the binary number which is equal to the decimal number "4". The binary number "`0010`" can also be represented equivalently as "`0,0010`" and "`0000,0010`". Note that "`10`" is the hexadecimal number equal to the decimal number "`16`" and is _not_ the binary number equal to the decimal number "2".
3.4 Decimal numbers (ints and floats)
Decimal numbers, unlike hexadecimal or binary numbers, can fill slots of 1`-`4 bytes for integers, and 4 or 8 bytes for floating-point decimal numbers. Decimal numbers may also be either positive or negative unlike the hexadecimal or binary number input to binasc compiling. A decimal number starts with a quote character (') followed by the number with no intervening space. There are two qualifications which can be given just before the quote (in either order):
* a number in the range from 1 to 4 which specifies how many bytes into which the integer decimal number is to be stored. Floating-point numbers can be either stored in either 4 or 8 bytes. The default size for floating-point numbers is 4 bytes if no prefix size is specified.
*the symbol "`u`" can be given before the quote character in a decimal number to indicate the sequence order into which the bytes for the number will be placed in the file. No letter "`u`" means that the most significant byte is written first (_big-endian_), while including the prefix letter "`u`" indicates to write the bytes in reverse order with the smallest byte occurring first (_little-endian_). For example the decimal number 1234 can be represented by the two-byte hexadecimal number 04d2. In big-endian storage the `04` byte is written first, then the `d2` byte. in little-endian storage the `d2` byte is written first then the `04` byte:
|| *decimal* || *hex* || *big endian* || *little endian* ||
|| 1234 || 04d2 || `04 d2` || `d2 04` ||
|| || || `2'1234` || `2u'1234` ||
When a byte size is not specified before the quote character, the default is 1 for integers. When not specifying a byte size, valid decimal numbers are in the range from 0 to 255, or `-`128 to 127 if signed, i.e., the range for one-byte decimal numbers is from `-`128 to 255, and you have to know the representation later (signed or unsigned). If you specify a byte size of 1, then you can give any integer number value, but it will be truncated to fit into one byte. The maximum integer decimal number which can fill 4 bytes is 4294967294 or so. (hexadecimal `ff ff ff ff`).
More examples of decimal numbers:
|| *token* || *decimal #* || *hex*||
|| `'0` || 0 || `00`||
|| `'255` || 255 || `ff`||
|| `'256` || 0 || `00` (truncated) ||
|| `2'256` || 256 || `01 00` (not truncated)||
|| `4'44100` || 44100 || `00 00 ac 44` (big-endian)||
|| `4u44100` || 44100 || `44 ac 00 00` (little-endian)||
|| `4u'453` || 453 || `c5 01 00 00` ||
|| `u4'453` || 453 || `c5 01 00 00' (`u4'` is equivalent to `4u'`) ||
|| `2'-5` || -5 || `ff fb`||
|| `3'500000` || 500000 || `07 a1 20` ||
If a decimal number includes a period character (`.`) it is assumed to be a floating-point number. Floating-point numbers can be either 4 or 8 bytes.
|| *token* || *decimal* || *hex* ||
|| `'3.1415` || 3.1415 || `40 49 0e 56` ||
|| `4'3.1415` || 3.1415 || `40 49 0e 56` ||
|| `u'3.1415` || 3.1415 || `56 0e 49 40` ||
|| `8'3.1415` || 3.1415 || `40 09 21 ca c0 83 12 6f` ||
|| `8u'3.1415` || 3.1415 || `6f 12 83 c0 ca 21 09 40` ||
|| *invalid examples* || *reason* ||
|| `123` || does not start with a quote character ||
|| `'256`|| Exceeds the storage space of one byte (use a multi-byte indication). in this case, `'256' is equivalent to `1'256` which will truncate to `1'0`, or `00` hex. ||
3.5 ASCII characters
To insert literal ASCII characters into compiled output, precede each character with a plus (`+`). Each character is a separate token. For example to place the characters "`cat`" into a file, the tokenization would be "`+c +a +t`".
3.6 Variable Length Values
Variable-length values are used to store delta times in standard MIDI files. They are a form of compression so that small 4-byte integers can be represented by a single byte. To create a VLV, the bits of a 4-byte integer are grouped into 7-bit pieces. Any most-significant groupings containing only zeros are ignored (except for the least-significant grouping). The remaining groups are placed into separate bytes, with the most significant bit of each byte representing a _continuation bit_. If the continuation bit is "`1`", then there is at least one more byte after the current byte in the file which belongs to the VLV. If the continuation bite is "`0`", then the current byte is the last byte in the VLV.
To indicate a variable-length value in the input file for compiling with binasc, prefix a decimal number with the letter `v`, such as `v100` which will be translated into `64` hex. Variable length values can only be used to store up to 4 bytes of an integer. The resulting VLV will be between 1 to 5 bytes long.
Here are more examples of VLVs:
|| *VLV* || *byte expansion* ||
|| `v0` || `00` ||
|| `v127` || `7f`||
|| `v128` || `81 00` ||
|| `v123456` || `87 c4 40` ||
3.7 MIDI pitch-bend data bytes
MIDI pitch-bend data bytes contain a 14-bit integer which is split into two 7-bit values stored with the least-significant byte coming first (little-endian). The minimum value 0 is represented by the two bytes `00 00` and the maximum value is represented by the two bytes `7f 7f`. The middle of the range is `00 40`.
In the input file used to compile a file with the binasc program, use the letter `p` followed (without space) by a floating-point number in the range from `-`1.0 to +1.0. The plus sign is optional for positive values, as is any leading zero. Values outside of the valid range will be truncated to the maximum or minimum value.
Below are example conversions of pitch-bend tokens into hexadecimal values. The cents column shows the number of cents deviation from the standard pitch if the default depth of the pitch bend is a whole tone (which it usually is). If this assumption is true, then `cent = 200 * value`.
|| *pitch bend token* || *hex bytes* || *cents* ||
|| `p0` || `00 40` || 0 ||
|| `p1` or `p+1`|| `7f 7f` || 200 (wholetone) ||
|| `p-1` || `00 00` || `-`200 ||
|| `p0.5` or `p.5` || `7f 5f` || 100 (semitone) ||
|| `p-.25` || `7f 4f` || `-`50 (quartertone) ||
|| `p-0.3333` || `55 2a` || `-`66.67 ||
3.8 MIDI tempo meta message data bytes
Tempo in a standard MIDI file is given as a three-byte integer representing
the duration of a quarter note in microseconds. For example a tempo of
60 beats per minute has one beat per second, and each second is a million
microseconds, so the tempo 60BPM is represented in a MIDI file as 1000000.
To indicate a tempo in the input data for the `-c` compile process, prefix the letter `t` to a floating-point value.
|| *tempo token* || *decimal form* || *hex bytes* ||
|| `t60` || `3'1000000`|| `0f 42 40` ||
|| `t120` || `3'500000` || `07 a1 20` ||
|| `t40` || `3'1500000`|| `16 e3 60` ||
|| `t144` || `3'416667` || `06 5b 9b` ||
|| `t63` || `3'952381` || `0e 88 3d` ||
|| `t132.45` || `3'453001` || `06 e9 89` ||
Tempo is given in meta message 51 hex, so here is an example full event in a MIDI file using the `t` marker for tempo
|| `v0 ff 51 03 t120` || `00 ff 51 03 07 a1 20` ||
4. Examples
Example files for compiling with the `-c` option that demonstrate various methods of representing bytes as described above can be found on the [examples] page. Examples can be downloaded via Mercurial (if installed on your computer) with the command:
`hg clone https://wiki.binasc.googlecode.com/hg binasc-wiki`
The example compiled files and their companion ASCII files are found in `binasc-wki/files/examples`.
5. Downloads
Compiled versions of binasc are available for [https://code.google.com/p/binasc/downloads/detail?name=binasc.linux Linux], [https://code.google.com/p/binasc/downloads/detail?name=binasc.osx#makechanges OS X] and [https://code.google.com/p/binasc/downloads/detail?name=binasc.exe Windows] on the [https://code.google.com/p/binasc/downloads/list Download page].
Source code can be viewed online [https://code.google.com/p/binasc/source/browse/ here]. To download the source code, click on the _zip_ link on that source-code browse page. The source code can also be downloaded using the Mercurial repository system (if you have it installed on your computer):
`hg clone https://code.google.com/p/binasc`
The source code should be easy to compile on linux or OS X by typing:
`cd binasc; make`
To copy the program to `/usr/bin` type:
`make install`
To verify that the program is available from the command-line:
`which binasc`
This command should reply with the path to binasc:
`/usr/bin/binasc`
*option* | *meaning* |
-a | Display only ASCII printable characters contained in binary input file (no hex bytes) |
-b | Display only hex bytes contain in binary input file (no ASCII-printable characters) |
-c file | Input file contains hex bytes (or other formats of bytes described below) which will be compiled into binary data stored in `file` |
--mod # | Set the number of hex bytes displayed on each line. The default is 25 hex bytes. |
--wrap # | Set the line length when the `-a` option is used. The default is 75 characters. |
--midi | parse binary data as a standard MIDI file. |
-h | view help for the program. |