Some files in a computer system are written for humans and contain text.
% file /etc/hosts
/etc/hosts: ASCII text
But many other files are made for the computer to execute, and it isn’t possible to read them using a tool like cat.
% cat /bin/ls | head
����@�
��Z������
This is because they are binary files:
% file /bin/ls
/bin/ls: Mach-O universal binary with 2 architectures: [x86_64:Mach-O 64-bit executable x86_64] [arm64e:Mach-O 64-bit executable arm64e]
/bin/ls (for architecture x86_64): Mach-O 64-bit executable x86_64
/bin/ls (for architecture arm64e): Mach-O 64-bit executable arm64e
TODO
How is it possible to build a binary with 2 architectures? If I copy/paste this file between an Intel and an M1 Mac, it runs properly on both! 🤯
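A partial answer, as a sketch rather than a full explanation: a universal (or "fat") Mach-O file starts with a small big-endian header holding a magic number and a count, followed by one entry per architecture that says where that architecture's own executable sits inside the file. The system then loads the slice that matches the CPU. The bytes below are the first 48 bytes of /bin/ls, copied from the hexdump output further down in this post. The names `parse_fat_header` and `CPU_NAMES` are made up for this illustration, and the CPU table only covers the two types seen here.

```python
import struct

# First 48 bytes of /bin/ls, copied from the hexdump output in this post.
HEADER = bytes.fromhex(
    "ca fe ba be 00 00 00 02 01 00 00 07 00 00 00 03"
    "00 00 40 00 00 01 1c c0 00 00 00 0e 01 00 00 0c"
    "80 00 00 02 00 01 80 00 00 01 5a f0 00 00 00 0e"
)

FAT_MAGIC = 0xCAFEBABE
# Only the two CPU types that appear in this file (illustrative, not complete).
CPU_NAMES = {0x01000007: "x86_64", 0x0100000C: "arm64"}


def parse_fat_header(data):
    """Return one dict per architecture slice listed in a fat header."""
    # The header is two big-endian 32-bit integers: magic, slice count.
    magic, count = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a universal binary")
    slices = []
    for i in range(count):
        # Each entry is five big-endian 32-bit integers (20 bytes).
        cputype, cpusubtype, offset, size, align = struct.unpack_from(
            ">IIIII", data, 8 + 20 * i
        )
        slices.append({
            "cpu": CPU_NAMES.get(cputype, hex(cputype)),
            "offset": offset,  # where this slice starts in the file
            "size": size,      # how many bytes it occupies
        })
    return slices


for entry in parse_fat_header(HEADER):
    print(entry)
# {'cpu': 'x86_64', 'offset': 16384, 'size': 72896}
# {'cpu': 'arm64', 'offset': 98304, 'size': 88816}
```

So the file holds two complete executables back to back, plus a table of contents at the front. The arm64e name reported by file comes from the subtype field, which this sketch reads but does not decode.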
However, it is possible to read binary files using a tool like hexdump:
% hexdump -C /bin/ls | head
00000000 ca fe ba be 00 00 00 02 01 00 00 07 00 00 00 03 |................|
00000010 00 00 40 00 00 01 1c c0 00 00 00 0e 01 00 00 0c |..@.............|
00000020 80 00 00 02 00 01 80 00 00 01 5a f0 00 00 00 0e |..........Z.....|
00000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
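The layout of that output can be reproduced with a few lines of Python. This is only a sketch of the idea, not how hexdump is implemented: `hexdump_rows` is a hypothetical helper, it uses single spaces as shown above, and it does not pad a short final row. Each row is an offset, the bytes as two-character hexadecimal pairs, and the same bytes as text with a dot for anything that is not a printable ASCII character.

```python
def hexdump_rows(data, width=16):
    """Format bytes as rows of: offset, hex pairs, printable ASCII."""
    rows = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        # Two hexadecimal characters per byte.
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Printable ASCII is shown as-is, everything else as a dot.
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        rows.append(f"{offset:08x} {hex_part} |{ascii_part}|")
    return rows


# The second row of the output above, used here as sample data.
sample = bytes.fromhex("00 00 40 00 00 01 1c c0 00 00 00 0e 01 00 00 0c")
print(hexdump_rows(sample)[0])
# 00000000 00 00 40 00 00 01 1c c0 00 00 00 0e 01 00 00 0c |..@.............|
```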
In the hexdump output, each pair of characters is one byte written in hexadecimal. The left letter of each pair is the high 4 bits and the second letter the lower 4 bits. Not all bytes represent a visible character, so I’m going to take 40, which does. When split, 4 can be represented as 0100 and 0 as 0000. Merged back together, they form the binary number 01000000, or 64 in decimal, which happens to be the value for the character @.
| DEC | HEX | BIN | ASCII symbol |
| --- | --- | --- | --- |
| 63 | 3F | 00111111 | ? |
| 64 | 40 | 01000000 | @ |
| 65 | 41 | 01000001 | A |
stateDiagram-v2
    40 --> 4
    40 --> 0
    4 --> 0100
    0 --> 0000
    0100 --> 01000000
    0000 --> 01000000
    01000000 --> 64
    64
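The same split-and-merge can be checked in a few lines of Python. This is a minimal sketch, and the helper names `split_byte` and `merge_nibbles` are made up for the illustration.

```python
def split_byte(value):
    """Return the (high, low) 4-bit halves of a byte."""
    high = value >> 4    # shift the top four bits down
    low = value & 0x0F   # keep only the bottom four bits
    return high, low


def merge_nibbles(high, low):
    """Put two 4-bit halves back together into one byte."""
    return (high << 4) | low


high, low = split_byte(0x40)
print(f"{high:04b} {low:04b}")               # 0100 0000
merged = merge_nibbles(high, low)
print(merged, f"{merged:08b}", chr(merged))  # 64 01000000 @
```

Swapping 0x40 for 0x3F or 0x41 gives the other two rows of the table.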