Howdy, PNG! | Weblog
By David Buchanan, 16th of January 2023
PNG is my favorite file format of all time. Model 1.0 of the specification was released in 1996 (earlier than I used to be born!) and the format stays broadly used to this present day. I believe the principle causes it caught round for thus lengthy are:
- It is “Ok” at lossless picture compression.
- It builds on current applied sciences (zlib/DEFLATE compression).
- It is easy to implement (helped by the above level).
- It helps quite a lot of modes and bit-depths, together with “true colour” (24-bit RGB) and transparency.
- It is not patented.
There are different similarly-old and similarly-ubiquitous codecs (cough ZIP cough) which are disgusting to cope with resulting from legacy cruft, ad-hoc extensions, spec ambiguities, and mutually incompatible implementations. On the entire, PNG just isn’t like that in any respect, and it is largely resulting from its well-thought-out design and cautious updates through the years.
I am writing this text to fulfil my function as a PNG evangelist, spreading the enjoyment of good-enough lossless picture compression to each nook of the web. Related articles exist already, however this one is mine.
I will be referencing the Working Draft of the PNG Specification (Third Edition) launched in October 2022 (!), however each characteristic I point out right here ought to nonetheless be current within the 1.0 spec. I am going to goal to replace this text as soon as the Third Version releases formally.
Writing a PNG File
I believe the easiest way to familiarize yourself with a file format is to jot down code for studying or writing it. On this occasion we will write a PNG, as a result of we are able to select to concentrate on the best subset of PNG options.
A minimum-viable PNG file has the next construction:
PNG signature || "IHDR" chunk || "IDAT" chunk || "IEND" chunk
The PNG signature (aka “magic bytes”) is defined as:
"89 50 4E 47 0D 0A 1A 0A" (hexadecimal bytes)
Or, expressed as a Python bytes literal:
These magic bytes have to be current firstly of each PNG file, permitting applications to simply detect the presence of a PNG.
PNG Chunks
After the signature, the remainder of the PNG is only a sequence of Chunks. They every have the identical total structure:
Size - A 31-bit unsigned integer (the variety of bytes within the Chunk Information subject) Chunk Sort - 4 bytes of ASCII higher or lower-case characters Chunk Information - "Size" bytes of uncooked knowledge CRC - A CRC-32 checksum of the Chunk Sort + Chunk Information
PNG makes use of Network Byte Order (aka “big-endian”) to encode integers as bytes. “31-bit” just isn’t a typo – PNG defines a “PNG 4 byte integer”, which is proscribed
to the vary 0 to 231-1, to defend in opposition to the existence of C programmers.
If you happen to’re not acquainted with these ideas, don’t be concerned – Python will deal with all of the encoding for us.
The Chunk Sort
, in our occasion, can be considered one of IHDR
, IDAT
, or IEND
(extra on these later).
The CRC subject is a CRC-32 checksum. The spec offers a terse mathematical definition, however we are able to ignore all these particulars and use a library to deal with it for us.
The which means of information inside a bit is determined by the chunk’s kind, and probably, context from prior chunks.
Placing all that collectively, here‘s a Python script that generates a vaguely PNG-shaped file:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 |
import zlib # https://www.w3.org/TR/2022/WD-png-3-20221025/#5PNG-file-signature PNG_SIGNATURE = b'x89PNGrnx1an' # https://www.w3.org/TR/2022/WD-png-3-20221025/#5Chunk-layout def write_png_chunk(stream, chunk_type, chunk_data): # https://www.w3.org/TR/2022/WD-png-3-20221025/#dfn-png-four-byte-unsigned-integer chunk_length = len(chunk_data) if chunk_length > 2**31 - 1: # That is unlikely to ever occur! elevate ValueError("This chunk has an excessive amount of chonk!") # https://www.w3.org/TR/2022/WD-png-3-20221025/#5CRC-algorithm # Fortuitously, zlib's CRC32 implementation is suitable with PNG's spec: crc = zlib.crc32(chunk_type + chunk_data) stream.write(chunk_length.to_bytes(4, "massive")) stream.write(chunk_type) stream.write(chunk_data) stream.write(crc.to_bytes(4, "massive")) if __name__ == "__main__": """ This isn't going to lead to a legitimate PNG file, but it surely's a begin """ ihdr = b" |