Hex dump

In computing, a hex dump is a textual hexadecimal view (on screen or paper) of computer data, often but not necessarily binary, taken from memory, a computer file, or a storage device. Hex dumps are usually examined in the context of debugging, reverse engineering, or digital forensics.[1]

In a hex dump, each byte (8 bits) is represented as a two-digit hexadecimal number. Hex dumps are commonly organized into rows of 8 or 16 bytes, sometimes separated by whitespace. Some hex dumps show the hexadecimal memory address or offset at the beginning of each row.
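
The layout described above can be sketched in a few lines of Python. This is only an illustration, not the implementation of any particular tool; the function name hex_rows and the seven-digit offset width are assumptions chosen to resemble the samples in this article.

```python
def hex_rows(data: bytes, width: int = 16):
    """Yield one text row per `width` bytes: hex offset, then two-digit hex bytes."""
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        # Each byte becomes exactly two lowercase hexadecimal digits.
        hex_bytes = " ".join(f"{b:02x}" for b in chunk)
        yield f"{offset:07x} {hex_bytes}"


if __name__ == "__main__":
    for row in hex_rows(b"0123456789ABCDEF\n/* "):
        print(row)
```

The offset printed at the start of each row is the position of the row's first byte, so it advances by the row width (hexadecimal 10 for 16-byte rows).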

Common names for programs that provide this function are hexdump, hd, od, xxd, and simply dump or even D.

Samples

A sample text file:

0123456789ABCDEF
/* ********************************************** */
	Table with TABs (09)
	1       2       3
	3.14	6.28	9.42

as displayed by Unix hexdump:

0000000 30 31 32 33 34 35 36 37 38 39 41 42 43 44 45 46
0000010 0a 2f 2a 20 2a 2a 2a 2a 2a 2a 2a 2a 2a 2a 2a 2a
0000020 2a 2a 2a 2a 2a 2a 2a 2a 2a 2a 2a 2a 2a 2a 2a 2a
*
0000040 2a 2a 20 2a 2f 0a 09 54 61 62 6c 65 20 77 69 74
0000050 68 20 54 41 42 73 20 28 30 39 29 0a 09 31 09 09
0000060 32 09 09 33 0a 09 33 2e 31 34 09 36 2e 32 38 09
0000070 39 2e 34 32 0a                                 
0000075

The leftmost column is the hexadecimal displacement (or address) of the first byte in the row. Each row displays 16 bytes, except for the shorter final row and the row containing a single *. The * indicates that one or more rows identical to the preceding row were omitted. The last line gives the total number of bytes read from the input, in hexadecimal.
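
The suppression of repeated rows can be sketched as follows. This is a simplified model under stated assumptions, not the actual hexdump source: the function name squeeze is hypothetical, and a row is treated as repeated only when it is byte-for-byte identical to the row before it.

```python
def squeeze(data: bytes, width: int = 16):
    """Return dump lines, replacing each run of repeated rows with a single '*'."""
    lines = []
    previous = None   # contents of the last row that was compared
    starred = False   # True once '*' has been emitted for the current run
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        if chunk == previous:
            if not starred:
                lines.append("*")
                starred = True
            continue
        previous = chunk
        starred = False
        lines.append(f"{offset:07x} " + " ".join(f"{b:02x}" for b in chunk))
    # The final line is the total input length in hexadecimal.
    lines.append(f"{len(data):07x}")
    return lines


if __name__ == "__main__":
    print("\n".join(squeeze(b"A" * 16 + b"*" * 48 + b"B")))
```

Because the offset column keeps counting through the omitted rows, the number of skipped bytes can be recovered by subtracting the offsets on either side of the *.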

An additional column shows the corresponding ASCII character translation with hexdump -C or hd:

00000000  30 31 32 33 34 35 36 37  38 39 41 42 43 44 45 46  |0123456789ABCDEF|
00000010  0a 2f 2a 20 2a 2a 2a 2a  2a 2a 2a 2a 2a 2a 2a 2a  |./* ************|
00000020  2a 2a 2a 2a 2a 2a 2a 2a  2a 2a 2a 2a 2a 2a 2a 2a  |****************|
*
00000040  2a 2a 20 2a 2f 0a 09 54  61 62 6c 65 20 77 69 74  |** */..Table wit|
00000050  68 20 54 41 42 73 20 28  30 39 29 0a 09 31 09 09  |h TABs (09)..1..|
00000060  32 09 09 33 0a 09 33 2e  31 34 09 36 2e 32 38 09  |2..3..3.14.6.28.|
00000070  39 2e 34 32 0a                                    |9.42.|
00000075

In the ASCII column, non-printable bytes such as TAB (09) and newline (0a) appear as periods. This is helpful when trying to locate TAB characters in a file that is expected to use multiple spaces.
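
A row in this two-column style might be produced as in the following sketch. The function name canonical_row is hypothetical, and the spacing is inferred from the sample above rather than taken from the hexdump source.

```python
def canonical_row(offset: int, chunk: bytes) -> str:
    """Format up to 16 bytes in a layout resembling the `hexdump -C` sample."""
    # The hex area is split into two groups of eight bytes.
    left = " ".join(f"{b:02x}" for b in chunk[:8])
    right = " ".join(f"{b:02x}" for b in chunk[8:])
    # Printable ASCII is 0x20..0x7e; every other byte is shown as '.'.
    text = "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in chunk)
    # Each group is padded to 23 columns so the text column stays aligned
    # on a short final row.
    return f"{offset:08x}  {left:<23}  {right:<23}  |{text}|"


if __name__ == "__main__":
    print(canonical_row(0x50, b"h TABs (09)\n\t1\t\t"))
    print(canonical_row(0x70, b"9.42\n"))
```

Padding both hex groups to a fixed width is what keeps the | delimiters in the same column on every row, including the last one.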


The -v option causes hexdump to display every row, including repeated rows that would otherwise be replaced by *:

00000000  30 31 32 33 34 35 36 37  38 39 41 42 43 44 45 46  |0123456789ABCDEF|
00000010  0a 2f 2a 20 2a 2a 2a 2a  2a 2a 2a 2a 2a 2a 2a 2a  |./* ************|
00000020  2a 2a 2a 2a 2a 2a 2a 2a  2a 2a 2a 2a 2a 2a 2a 2a  |****************|
00000030  2a 2a 2a 2a 2a 2a 2a 2a  2a 2a 2a 2a 2a 2a 2a 2a  |****************|
00000040  2a 2a 20 2a 2f 0a 09 54  61 62 6c 65 20 77 69 74  |** */..Table wit|
00000050  68 20 54 41 42 73 20 28  30 39 29 0a 09 31 09 09  |h TABs (09)..1..|
00000060  32 09 09 33 0a 09 33 2e  31 34 09 36 2e 32 38 09  |2..3..3.14.6.28.|
00000070  39 2e 34 32 0a                                    |9.42.|
00000075

od

The POSIX od command can be used to display a hex dump with the -t x option. Its offsets are shown in octal by default, so consecutive 16-byte rows advance by 0000020.

# od -tx1 tableOfTabs.txt
0000000    30  31  32  33  34  35  36  37  38  39  41  42  43  44  45  46
0000020    0a  2f  2a  20  2a  2a  2a  2a  2a  2a  2a  2a  2a  2a  2a  2a
0000040    2a  2a  2a  2a  2a  2a  2a  2a  2a  2a  2a  2a  2a  2a  2a  2a
*
0000100    2a  2a  20  2a  2f  0a  09  54  61  62  6c  65  20  77  69  74
0000120    68  20  54  41  42  73  20  28  30  39  29  0a  09  31  09  09
0000140    32  09  09  33  0a  09  33  2e  31  34  09  36  2e  32  38  09
0000160    39  2e  34  32  0a                                            
0000165

Character evaluations can be added with the -c option:

0000000    0   1   2   3   4   5   6   7   8   9   A   B   C   D   E   F
           30  31  32  33  34  35  36  37  38  39  41  42  43  44  45  46
0000020   \n   /   *       *   *   *   *   *   *   *   *   *   *   *   *
           0a  2f  2a  20  2a  2a  2a  2a  2a  2a  2a  2a  2a  2a  2a  2a
0000040    *   *   *   *   *   *   *   *   *   *   *   *   *   *   *   *
           2a  2a  2a  2a  2a  2a  2a  2a  2a  2a  2a  2a  2a  2a  2a  2a
*
0000100    *   *       *   /  \n  \t   T   a   b   l   e       w   i   t
           2a  2a  20  2a  2f  0a  09  54  61  62  6c  65  20  77  69  74
0000120    h       T   A   B   s       (   0   9   )  \n  \t   1  \t  \t
           68  20  54  41  42  73  20  28  30  39  29  0a  09  31  09  09
0000140    2  \t  \t   3  \n  \t   3   .   1   4  \t   6   .   2   8  \t
           32  09  09  33  0a  09  33  2e  31  34  09  36  2e  32  38  09
0000160    9   .   4   2  \n                                            
           39  2e  34  32  0a                                            
0000165

In this output the TAB characters are displayed as \t and NEWLINE characters as \n.
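
The character row of such a listing can be approximated with the sketch below. It assumes the usual od -c conventions (C-style escapes for common control characters and three-digit octal for other non-printable bytes); the function name od_char_row and the fixed four-column cell width are simplifications, so the spacing differs slightly from the sample above.

```python
# Control characters that are shown as C-style escapes.
ESCAPES = {
    0x00: r"\0", 0x07: r"\a", 0x08: r"\b", 0x09: r"\t",
    0x0A: r"\n", 0x0B: r"\v", 0x0C: r"\f", 0x0D: r"\r",
}


def od_char_row(offset: int, chunk: bytes) -> str:
    """Format one row with an octal offset and one character cell per byte."""
    cells = []
    for b in chunk:
        if b in ESCAPES:
            cell = ESCAPES[b]
        elif 0x20 <= b <= 0x7E:
            cell = chr(b)            # printable ASCII is shown as itself
        else:
            cell = f"{b:03o}"        # anything else as three octal digits
        cells.append(f"{cell:>4}")   # right-align each cell in four columns
    return f"{offset:07o}" + "".join(cells)


if __name__ == "__main__":
    print(od_char_row(0o120, b"9)\n\t1"))
```

Unlike the period used in the ASCII column of hexdump -C, this representation distinguishes the individual control characters from one another.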

DUMP, DDT and DEBUG

In the CP/M 8-bit operating system used on early personal computers, the standard DUMP program listed a file 16 bytes per line, with the hex offset at the start of the line and the ASCII equivalent of each byte at the end.[2] Bytes outside the standard range of printable ASCII characters (hexadecimal 20 to 7E) were displayed as a single period to preserve visual alignment. The same format was used to display memory when invoking the D command in the standard CP/M debugger DDT.[3] Later incarnations of the format (e.g. in the DOS debugger DEBUG) changed the space between the 8th and 9th byte to a dash, without changing the overall width.

This notation has been retained in operating systems that were directly or indirectly derived from CP/M, including DR-DOS, MS-DOS, OS/2 and Windows. On Linux systems, the command hexcat produces this classic output format too. The main reason for the design of this format is that it fits the maximum amount of data on a standard 80-character-wide screen or printer, while still being very easy to read and skim visually.

1234:0000: 57 69 6B 69 70 65 64 69 61 2C 20 74 68 65 20 66  Wikipedia, the f
1234:0010: 72 65 65 20 65 6E 63 79 63 6C 6F 70 65 64 69 61  ree encyclopedia
1234:0020: 20 74 68 61 74 20 61 6E 79 6F 6E 65 20 63 61 6E   that anyone can
1234:0030: 20 65 64 69 74 00 00 00 00 00 00 00 00 00 00 00   edit...........

Here the leftmost column represents the address at which the bytes in the following columns are located. CP/M-86 and various DOS systems ran in real mode on x86 CPUs, where addresses are composed of two parts (segment base and offset).
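
In x86 real mode, the two parts combine into a physical address as segment × 16 + offset. The following sketch works through that arithmetic for the address labels in the listing above; the function name linear_address is hypothetical, and the 20-bit wrap-around of the original 8086 is deliberately ignored.

```python
def linear_address(label: str) -> int:
    """Convert a 'SSSS:OOOO' label (trailing colon allowed) to a linear address."""
    segment_text, offset_text = label.rstrip(":").split(":")
    segment = int(segment_text, 16)
    offset = int(offset_text, 16)
    # Shifting left by 4 bits multiplies the segment by 16.
    return (segment << 4) + offset


if __name__ == "__main__":
    for label in ("1234:0000:", "1234:0030:"):
        print(label, hex(linear_address(label)))
```

For example, 1234:0030 corresponds to 12340 + 30 = 12370 in hexadecimal.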

In the above example, the final 00s stand in for non-existent bytes beyond the end of the file. Some dump tools display other characters to make clear that these positions lie beyond the end of the file, typically spaces or asterisks, e.g.:

1234:0000: 57 69 6B 69 70 65 64 69 61 2C 20 74 68 65 20 66  Wikipedia, the f
1234:0010: 72 65 65 20 65 6E 63 79 63 6C 6F 70 65 64 69 61  ree encyclopedia
1234:0020: 20 74 68 61 74 20 61 6E 79 6F 6E 65 20 63 61 6E   that anyone can
1234:0030: 20 65 64 69 74                                    edit

or

1234:0000: 57 69 6B 69 70 65 64 69 61 2C 20 74 68 65 20 66  Wikipedia, the f
1234:0010: 72 65 65 20 65 6E 63 79 63 6C 6F 70 65 64 69 61  ree encyclopedia
1234:0020: 20 74 68 61 74 20 61 6E 79 6F 6E 65 20 63 61 6E   that anyone can
1234:0030: 20 65 64 69 74 ** ** ** ** ** ** ** ** ** ** **   edit
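
The two padding variants above can be reproduced with a sketch like the following. The function name classic_row and its pad parameter are hypothetical; the layout is inferred from the listings in this section and is not taken from any specific tool.

```python
def classic_row(segment: int, offset: int, chunk: bytes,
                width: int = 16, pad: str = "  ") -> str:
    """Format one row, filling positions past the end of the data with `pad`."""
    cells = [f"{b:02X}" for b in chunk]
    # Positions beyond the end of the data get the two-character filler.
    cells += [pad] * (width - len(chunk))
    # Only real bytes contribute to the text column; non-printables become '.'.
    text = "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in chunk)
    return f"{segment:04X}:{offset:04X}: " + " ".join(cells) + "  " + text


if __name__ == "__main__":
    print(classic_row(0x1234, 0x30, b" edit"))
    print(classic_row(0x1234, 0x30, b" edit", pad="**"))
```

Using a filler that cannot be mistaken for a hex byte, such as spaces or **, lets a reader see exactly where the data ends.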

References

  1. "02: hexdump | COMPSCI 365/590F | Digital Forensics (Spring 2017)". people.cs.umass.edu. Retrieved 2022-09-05.
  2. CP/M 2.2 Manual page 1-41 and pages 5-40 to 5-46
  3. CP/M 2.2 Manual page 4-5