Formats&Protocols
- Linux Foundation Referenced Specifications
- CyberChef A simple, intuitive web app for analysing and decoding data without having to deal with complex tools or programming languages. CyberChef encourages both technical and non-technical people to explore data formats, encryption and compression.
- Bit Twiddling Hacks
Endianness
Numeric format
- One hot is an encoding in which exactly one bit is one and all the others are zero
- The secret life of NaN
- Float toy
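A minimal sketch of the one hot encoding described above, using plain Python integers as bit vectors. The helper names (`one_hot`, `is_one_hot`, `one_hot_index`) are made up for this example; the power-of-two test is the classic `x & (x - 1)` trick.

```python
def one_hot(index: int, width: int) -> int:
    """Return an integer of `width` bits with only bit `index` set."""
    if not 0 <= index < width:
        raise ValueError("index out of range")
    return 1 << index


def is_one_hot(value: int) -> bool:
    """True when exactly one bit of `value` is set."""
    # x & (x - 1) clears the lowest set bit; the result is zero
    # only when there was a single bit to clear.
    return value > 0 and (value & (value - 1)) == 0


def one_hot_index(value: int) -> int:
    """Position of the single set bit (hypothetical helper)."""
    if not is_one_hot(value):
        raise ValueError("not a one hot value")
    return value.bit_length() - 1


if __name__ == "__main__":
    for i in range(4):
        print(format(one_hot(i, 4), "04b"))
```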
$$ e^{i\pi} + 1^{{\tt NaN}\cdot\inf} = 0 $$
(source).
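The identity above relies on the floating point rule that 1 raised to any power, NaN included, is 1. A small check in Python, as a sketch: `math.pow` is documented to return 1.0 for a base of 1.0 even with a NaN exponent, and the complex exponential is only approximately -1, so the result is compared against a tolerance rather than exact zero.

```python
import cmath
import math

nan = float("nan")
inf = float("inf")

exponent = nan * inf            # NaN times infinity is still NaN
one = math.pow(1.0, exponent)   # pow(1.0, x) is 1.0 for every x, NaN included

# e^(i*pi) is -1 up to rounding error in the imaginary part.
residue = cmath.exp(1j * math.pi) + one

if __name__ == "__main__":
    print(exponent, one, abs(residue))
```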
Character encoding
Notation
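Caret notation is a notation for unprintable control characters in ASCII encoding. The notation consists of a caret (^) followed by a single character, usually a capital letter: the one whose code is the control character's code plus 64 (for example ^C is code 3 and ^[ is ESC, code 27).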
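A short sketch of the mapping, assuming the usual convention of toggling bit 6 (XOR with 0x40), which also covers DEL (0x7F, written ^?). The function names are hypothetical.

```python
def to_caret(ch: str) -> str:
    """Render one character in caret notation if it is an ASCII control code."""
    code = ord(ch)
    if code < 0x20 or code == 0x7F:
        # Toggling bit 6 maps 0x00..0x1F onto '@'..'_' and 0x7F onto '?'.
        return "^" + chr(code ^ 0x40)
    return ch


def from_caret(notation: str) -> str:
    """Inverse of to_caret for a two character string such as '^C'."""
    if len(notation) != 2 or notation[0] != "^":
        raise ValueError("expected a caret followed by one character")
    return chr(ord(notation[1]) ^ 0x40)


if __name__ == "__main__":
    print(" ".join(to_caret(chr(c)) for c in range(0x20)))
```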
ASCII
Stands for American Standard Code for Information Interchange; it's a 7-bit code.
ASCII reserves the first 32 codes (numbers 0–31 decimal) for control characters: codes originally intended not to represent printable information, but rather to control devices (such as printers) that make use of ASCII, or to provide meta-information about data streams such as those stored on magnetic tape.
Unicode
Unicode is not an encoding but a catalogue of characters: each one is identified by a number (its code point), but this number is not necessarily what is stored in its representation. UTF-8 is one particular encoding, in which the ASCII set is kept as is for compatibility.
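To make the distinction between code point and representation concrete, here is a small sketch that prints both for a few characters. The `describe` helper is made up for this example.

```python
def describe(ch: str) -> tuple[str, bytes]:
    """Return the code point (as U+XXXX) and the UTF-8 bytes of one character."""
    return f"U+{ord(ch):04X}", ch.encode("utf-8")


if __name__ == "__main__":
    # 'A', e-acute, the euro sign and an emoji: 1, 2, 3 and 4 bytes in UTF-8.
    for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
        code_point, encoded = describe(ch)
        print(code_point, encoded.hex(" "))
```

Only the ASCII character has a representation whose byte value equals its code point.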
Links
- http://www.cl.cam.ac.uk/~mgk25/unicode.html
- http://canonical.org/~kragen/strlen-utf8.html
- http://nedbatchelder.com/text/unipain/unipain.html
- http://farmdev.com/talks/unicode/
- http://www.joelonsoftware.com/articles/Unicode.html
- http://www.2ality.com/2013/09/javascript-unicode.html
- http://the-pastry-box-project.net/oli-studholme/2013-october-8/
- http://www.utf8everywhere.org/
- https://speakerdeck.com/mathiasbynens/hacking-with-unicode
- http://agiliq.com/blog/2014/11/character-encoding-and-unicode/
- http://eev.ee/blog/2015/09/12/dark-corners-of-unicode/
- http://reedbeta.com/blog/programmers-intro-to-unicode/
- Hacking GitHub with Unicode's dotless 'i'
For testing purposes, use “Iñtërnâtiônàlizætiøn”, which in UTF-8 is the hex sequence E2 80 9C 49 C3 B1 74 C3 AB 72 6E C3 A2 74 69 C3 B4 6E C3 A0 6C 69 7A C3 A6 74 69 C3 B8 6E E2 80 9D.
An implementation of strlen() that counts the characters, not the bytes, of a UTF-8 string is the following:
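# original from <http://canonical.org/~kragen/strlen-utf8.html>
.global strlen_utf8
strlen_utf8:
push %esi
cld # make lodsb walk forward
mov 8(%esp), %esi # %esi = pointer to the string (first argument)
xor %ecx, %ecx # %ecx = 0
loopa: dec %ecx # count one character (the count is kept negated)
loopb: lodsb # %al = next byte
shl $1, %al # bit 7 goes to the carry flag, bit 6 to the sign flag
js loopa # x1xxxxxx: ASCII or lead byte, count it
jc loopb # 1xxxxxxx: here 10xxxxxx, a continuation byte, skip it
jnz loopa # 00xxxxxx: non-zero ASCII, count it
mov %ecx, %eax # NUL reached: %ecx = -(length + 1)
not %eax # ~(-(n + 1)) = n
pop %esi
ret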
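The same idea, sketched in Python: count every byte that is not a continuation byte (10xxxxxx). Unlike the assembly version this works on a `bytes` object of known length rather than stopping at a NUL terminator, and it assumes the input is valid UTF-8.

```python
def strlen_utf8(data: bytes) -> int:
    """Count the characters of a UTF-8 byte string without decoding it."""
    # Continuation bytes have the top two bits equal to 10.
    return sum(1 for byte in data if (byte & 0xC0) != 0x80)


# The test string given above, as raw UTF-8 bytes.
SAMPLE = bytes.fromhex(
    "E2 80 9C 49 C3 B1 74 C3 AB 72 6E C3 A2 74 69 C3 B4 6E C3 A0 "
    "6C 69 7A C3 A6 74 69 C3 B8 6E E2 80 9D"
)

if __name__ == "__main__":
    print(len(SAMPLE), "bytes,", strlen_utf8(SAMPLE), "characters")
```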
Punycode
Punycode is a representation of Unicode with the limited ASCII character subset used for Internet hostnames.
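A quick illustration with the codecs that ship with Python: `punycode` is the bare algorithm, while `idna` also adds the `xn--` prefix used in hostnames. The label is just an example.

```python
label = "m\u00fcnchen"  # "münchen", written with an escape to avoid encoding issues

puny = label.encode("punycode")  # bare Punycode, ASCII only
host = label.encode("idna")      # hostname form, with the xn-- prefix

if __name__ == "__main__":
    print(puny, host, puny.decode("punycode"))
```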
Regex
Compression related (Zip, GZip)
The most common compression formats all use the DEFLATE algorithm defined in RFC 1951, in particular ZLIB (defined in RFC 1950) and GZip (defined in RFC 1952); the Zip format is instead defined here.
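The relationship between the three RFCs can be checked with the standard library: zlib and gzip are wrappers (header plus checksum trailer) around a raw DEFLATE stream. This is a sketch with arbitrary sample data; it assumes the default settings, so the zlib stream has a 2 byte header and no preset dictionary.

```python
import gzip
import zlib

data = b"formats and protocols " * 20

# Raw DEFLATE (RFC 1951): negative wbits means no header and no trailer.
compressor = zlib.compressobj(wbits=-15)
raw = compressor.compress(data) + compressor.flush()

z = zlib.compress(data)  # ZLIB container (RFC 1950)
g = gzip.compress(data)  # GZip container (RFC 1952)

# ZLIB: 2 byte header, DEFLATE stream, big-endian Adler-32 of the input.
zlib_payload = z[2:-4]
zlib_checksum = z[-4:]

# GZip: magic 1f 8b, method 8 (deflate), then a trailer made of the
# little-endian CRC-32 and the input length modulo 2**32.
gzip_magic = g[:2]
gzip_crc = g[-8:-4]

if __name__ == "__main__":
    print(len(data), len(raw), len(z), len(g))
    print(z[:2].hex(), gzip_magic.hex())
```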
Links
- The Hitchhiker’s Guide to Compression
- LZ77&LZ78
- LZW
- LZSS
- LZMA
- Dissecting the GZIP format
- Zip Files All The Way Down
- The structure of a PKZip file
- A better zip bomb
- How to make compressed file quines, step by step
- YouFLATE An interactive tool that allows you to DEFLATE (compress) data using your own length-distance pairs, not merely the most efficient ones as is default with DEFLATE.
- infgen Deflate disassembler to convert a deflate, zlib, or gzip stream into a readable form.
- Gzip = Zip = Zlib = Deflate ?
JPEG
PNG
- PNG Merge A PoC (Proof of Concept) to create a PNG file that contains different images somehow entangled together.
- How To Bypass CSP By Hiding JavaScript In A PNG Image
PDF
- Reference
- Let's write a PDF file speakerdeck
- PDF file format basic structure
- https://www.aldeid.com/wiki/Analysis-of-a-malicious-pdf
- http://esec-lab.sogeti.com/posts/2009/06/26/at-least-4-ways-to-die-opening-a-pdf.html
- https://www.osdefsec.com/analyzing-malicious-pdf/
- http://eternal-todo.com/tools/peepdf-pdf-analysis-tool
H264
JSON
ELF
An ELF file is identified by the four magic bytes \x7FELF; it has a header that gives general information about the file: the architecture, the entry point and the type of ELF file, which can be
- Relocatable file
- Executable
- Shared object/library
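As a rough illustration of the header fields mentioned above, the following sketch decodes the magic bytes, the file type, the architecture and the entry point with `struct`. The sample header is built by hand with made-up values, and the lookup tables list only a few of the values defined by the specification.

```python
import struct

ELF_TYPES = {1: "relocatable", 2: "executable", 3: "shared object", 4: "core"}
ELF_MACHINES = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64"}


def parse_elf_header(data: bytes) -> dict:
    """Decode the main fields of an ELF header (32 or 64 bit)."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class, ei_data = data[4], data[5]
    endian = "<" if ei_data == 1 else ">"   # 1 = little endian, 2 = big endian
    word = "I" if ei_class == 1 else "Q"    # 1 = 32 bit, 2 = 64 bit
    fmt = f"{endian}HHI{word}{word}{word}IHHHHHH"
    (e_type, e_machine, _version, e_entry, e_phoff, e_shoff, _flags,
     _ehsize, _phentsize, e_phnum, _shentsize, e_shnum,
     _shstrndx) = struct.unpack_from(fmt, data, 16)
    return {
        "bits": 32 if ei_class == 1 else 64,
        "type": ELF_TYPES.get(e_type, hex(e_type)),
        "machine": ELF_MACHINES.get(e_machine, hex(e_machine)),
        "entry": e_entry,
        "phoff": e_phoff, "phnum": e_phnum,
        "shoff": e_shoff, "shnum": e_shnum,
    }


# Hand-made 32 bit little-endian header with illustrative values only.
SAMPLE_IDENT = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)
SAMPLE = struct.pack("<16sHHIIIIIHHHHHH", SAMPLE_IDENT,
                     2, 3, 1, 0x08048054, 52, 0, 0, 52, 32, 1, 0, 0, 0)

if __name__ == "__main__":
    print(parse_elf_header(SAMPLE))
```

On a Linux machine the function can also be tried on the first 64 bytes of any binary, for example `open("/bin/ls", "rb").read(64)`.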
It defines a series of segments and sections (described by the program header table and the section header table), respectively describing the execution and the linking views of the file.
The kernel loads into memory only the segments of type PT_LOAD and, if an interpreter is defined (in the PT_INTERP segment), hands control to the interpreter to do its job (i.e. resolve the dynamic sections).
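A simplified sketch of that decision, for a little-endian 32 bit image only: walk the program header table, collect the PT_LOAD entries and read the interpreter path from PT_INTERP. The sample image is built by hand with made-up addresses, and the function names are hypothetical; the real logic lives in the kernel's binfmt_elf.

```python
import struct

PT_LOAD, PT_INTERP = 1, 3  # p_type values from the ELF specification


def program_headers32(image: bytes):
    """Yield (p_type, p_offset, p_vaddr, p_filesz, p_memsz) for each entry."""
    (e_phoff,) = struct.unpack_from("<I", image, 28)
    e_phentsize, e_phnum = struct.unpack_from("<HH", image, 42)
    for i in range(e_phnum):
        (p_type, p_offset, p_vaddr, _paddr, p_filesz, p_memsz,
         _flags, _align) = struct.unpack_from("<8I", image, e_phoff + i * e_phentsize)
        yield p_type, p_offset, p_vaddr, p_filesz, p_memsz


def loader_view(image: bytes):
    """Return the list of loadable segments and the interpreter path, if any."""
    loads, interp = [], None
    for p_type, offset, vaddr, filesz, memsz in program_headers32(image):
        if p_type == PT_LOAD:
            loads.append((offset, vaddr, filesz, memsz))
        elif p_type == PT_INTERP:
            interp = image[offset:offset + filesz].rstrip(b"\0").decode()
    return loads, interp


def build_sample() -> bytes:
    """Hand-made image: ELF header, two program headers, interpreter string."""
    base, ehsize, phsize = 0x08048000, 52, 32
    path = b"/lib/ld-linux.so.2\0"
    path_off = ehsize + 2 * phsize
    total = path_off + len(path)
    ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)
    ehdr = struct.pack("<16sHHIIIIIHHHHHH", ident, 2, 3, 1, base,
                       ehsize, 0, 0, ehsize, phsize, 2, 0, 0, 0)
    ph_interp = struct.pack("<8I", PT_INTERP, path_off, base + path_off,
                            base + path_off, len(path), len(path), 4, 1)
    ph_load = struct.pack("<8I", PT_LOAD, 0, base, base, total, total, 5, 0x1000)
    return ehdr + ph_interp + ph_load + path


if __name__ == "__main__":
    print(loader_view(build_sample()))
```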
This is a prime that is also an ELF:
7f454c46010101000000000000000000020003000100000054800408340000000000000000000000340020000100000000000000010000000000000000800408008004085b0000005b0000000500000000100000b32a31c040cd80597ec9b11d
- Specification
- Cheating the ELF
- LIEF The purpose of this project is to provide a cross platform library which can parse, modify and abstract ELF, PE and MachO formats
- Why doesn't strip remove section headers from ELF executables?
- 0pack A novel technique to hide code from debuggers
- Analyzing ELF Binaries with Malformed Headers Part 1 - Emulating Tiny Programs
- Palindromic 64 bit ELF binaries
- ELF program header virtual address and file offset
- Chain loading, not preloading: the dynamic linker as a virtualization vector
- ELF interposition and -Bsymbolic
Relocation
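Relocations are the entries used by the linking process, static or dynamic, to patch the addresses that are known only once symbols have been assigned their final location.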
Core dumps
- core(5) man page
- core File Format
- slide Linux Core Dumps
TLS
DWARF
- https://www.ibm.com/developerworks/aix/library/au-dwarf-debug-format/index.html
- http://www.dwarfstd.org/doc/Debugging%20using%20DWARF-2012.pdf
- https://stackoverflow.com/questions/5954140/dumping-c-structure-sizes-from-elf-object-file
- http://wiki.dwarfstd.org/index.php?title=DWARF_FAQ
- https://landley.net/kdocs/ols/2007/ols2007v2-pages-35-44.pdf
- https://maskray.me/blog/2020-11-08-stack-unwinding
Links
- Anatomy of a program in memory POST
- Cheating ELF PDF
- Dissection of an ELF file
- Reversing an ELF from the ground up
- Source code of binfmt_elf
- How programs get run: ELF binaries
- A Whirlwind Tutorial on Creating Really Teensy ELF Executables for Linux
- Executable and Linkable Format
- Writeup about reversing ELF with forged sections/segments
- Executable and Linkable Format 101 - Part 1 Sections and Segments
- Executable and Linkable Format 101. Part 2: Symbols
- Shared libraries and execute permissions
- Custom ELF program headers—what, why and how
- Introduction to the ELF Format
- ELF Header
- Understanding Program Headers
- Section Header Table
- Section Types and Special Sections
- Understanding C start up .init_array and .fini_array sections
MS-DOS&PE
Mach-O
QRcode
Compact disc
UART
USB
- How does a USB keyboard work? video by Ben Eater