V4 Package format

This document describes the RPM file format version 4, which is used by RPM versions 4.x and, with limitations, is readable by 3.x.

THE PROPER WAY TO ACCESS THESE STRUCTURES IS THROUGH THE RPM LIBRARY!!

The RPM file format covers both source and binary packages. An RPM package file is divided into four logical sections:

. Lead      -- 96 bytes of "magic" and other info
. Signature -- collection of "digital signatures"
. Header    -- holding area for all the package information (aka "metadata")
. Payload   -- compressed archive of the file(s) in the package (aka "payload")

All applicable integer quantities are stored in network byte order (big-endian). When data is presented, the first number is the byte offset (address) in hex, followed by the byte values in hex, followed by character “translations” (where appropriate).
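As a minimal sketch of what network byte order means in practice, the helpers below decode 32-bit and 64-bit unsigned integers from a byte buffer. The function names are illustrative only and are not part of the RPM library.

```python
import struct


def read_u32_be(buf: bytes, offset: int = 0) -> int:
    """Decode a 32-bit unsigned integer stored big-endian at `offset`."""
    return struct.unpack_from(">I", buf, offset)[0]


def read_u64_be(buf: bytes, offset: int = 0) -> int:
    """Decode a 64-bit unsigned integer stored big-endian at `offset`."""
    return struct.unpack_from(">Q", buf, offset)[0]


# The bytes 00 00 01 00 read as big-endian give 0x00000100.
assert read_u32_be(bytes([0x00, 0x00, 0x01, 0x00])) == 256
```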

Lead

The Lead is used for identifying RPM package files. The rest of the data in the Lead is historical only.

The Lead is always 96 bytes long and starts with a four-byte “magic” [ 0xED, 0xAB, 0xEE, 0xDB ]. For further details, refer to the Lead Format document.
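The identification step can be sketched as follows. This only checks the length and the four magic bytes given above; it does not interpret the rest of the Lead, and the function name is an illustrative assumption. Real tools should go through the RPM library.

```python
LEAD_SIZE = 96
LEAD_MAGIC = bytes([0xED, 0xAB, 0xEE, 0xDB])


def looks_like_rpm(data: bytes) -> bool:
    """Return True if `data` starts with a full-length Lead carrying the RPM magic."""
    return len(data) >= LEAD_SIZE and data[:4] == LEAD_MAGIC
```

In practice, reading the first 96 bytes of a file and passing them to this check is enough for a quick sanity test before handing the file to the library.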

Signature

The Signature uses the same underlying data structure as the Header, but is zero-padded to a multiple of 8 bytes.
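The padding rule amounts to a small piece of arithmetic, sketched below. The helper names are hypothetical; `size` stands for the unpadded byte length of the Signature.

```python
def signature_padding(size: int) -> int:
    """Number of zero bytes needed to round `size` up to a multiple of 8."""
    return (8 - size % 8) % 8


def padded_size(size: int) -> int:
    """Length of the Signature including its trailing zero padding."""
    return size + signature_padding(size)


# A 13-byte structure needs 3 bytes of padding to reach 16.
assert padded_size(13) == 16
```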

The Signature consists of a single immutable header region denoted by its first tag, 62 (RPMTAG_HEADERSIGNATURES), which can be used to identify it.

The Signature can contain several tags of different types:

Name                  Tag    Header Type
HEADERSIGNATURES      62     BIN
DSA                   267    BIN
RSA                   268    BIN
SHA1                  269    STRING
LONGSIZE              270    INT_64
LONGARCHIVESIZE       271    INT_64
SHA256                272    STRING
FILESIGNATURES        274    STRING_ARRAY
FILESIGNATURELENGTH   275    INT_32
VERITYSIGNATURES      276    STRING_ARRAY
VERITYSIGNATUREALGO   277    INT_32
SIZE                  1000   INT_32
PGP                   1002   BIN
MD5                   1004   BIN
GPG                   1005   BIN
RESERVEDSPACE         1008   BIN

All packages carry at least the HEADERSIGNATURES, (LONG)SIZE, MD5 and SHA1 tags and, since rpm 4.14, the SHA256 tag.

The MD5 binary hash is 16 bytes long. The sizes of the other binary tags vary depending on key parameters and similar factors. As a special case, the RESERVEDSPACE tag is used as a space reservation for signatures to allow for much faster package signing.

The 64-bit size tags are only used in packages 4 GB or larger in size, but otherwise follow the same behavior as their 32-bit counterparts.
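The choice between the 32-bit and 64-bit size tags can be sketched as below. The tag numbers come from the table above; treating "4 GB" as exactly 2^32 bytes is an assumption of this sketch, and the precise threshold used by rpm itself may differ at the boundary.

```python
SIZE_TAG = ("SIZE", 1000)          # INT_32
LONGSIZE_TAG = ("LONGSIZE", 270)   # INT_64
FOUR_GB = 4 * 1024 ** 3            # assumption: 4 GB means 2**32 bytes


def size_tag_for(nbytes: int) -> tuple:
    """Pick the size tag (name, number) that can represent `nbytes`."""
    return LONGSIZE_TAG if nbytes >= FOUR_GB else SIZE_TAG
```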

On digitally signed packages, one of the RSA or DSA tags is present and contains an OpenPGP signature on the header. The RSA tag is used for RSA signatures and the DSA tag is used for both EcDSA and original DSA signatures. Additionally, an RPM v3 signature on the header+payload, made with the same key, may be present. For these, the PGP tag is used for RSA signatures and the GPG tag is used for both EcDSA and original DSA signatures.

In addition, a package may also have either IMA or fsverity signatures on its files. If present, these are in the FILESIGNATURES and VERITYSIGNATURES tags respectively.

Note: some of the tag numbers clash with those of the main header, so care must be taken not to mix them up.
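One way to avoid mixing up the two tag namespaces is to key every lookup by section as well as by tag number, as in the sketch below. The signature tags are taken from the table above. The three main-header entries (NAME, VERSION, RELEASE) are not listed in this section; they are included from general RPM knowledge purely to illustrate the clash, and the function name is hypothetical.

```python
SIGNATURE_TAGS = {
    62: "HEADERSIGNATURES", 267: "DSA", 268: "RSA", 269: "SHA1",
    270: "LONGSIZE", 271: "LONGARCHIVESIZE", 272: "SHA256",
    274: "FILESIGNATURES", 275: "FILESIGNATURELENGTH",
    276: "VERITYSIGNATURES", 277: "VERITYSIGNATUREALGO",
    1000: "SIZE", 1002: "PGP", 1004: "MD5", 1005: "GPG",
    1008: "RESERVEDSPACE",
}

# Assumption: a small illustrative subset of main-header tags.
MAIN_HEADER_TAGS = {
    63: "HEADERIMMUTABLE", 1000: "NAME", 1001: "VERSION", 1002: "RELEASE",
}

TABLES = {"signature": SIGNATURE_TAGS, "header": MAIN_HEADER_TAGS}


def tag_name(section: str, tag: int):
    """Resolve a tag number within one section; returns None if unknown."""
    return TABLES[section].get(tag)


# The same number means different things depending on the section.
assert tag_name("signature", 1000) == "SIZE"
assert tag_name("header", 1000) == "NAME"
```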

Header

The Header contains all the information about a package: name, version, file list, etc. It uses the same underlying data structure as the Signature.

The Header consists of a single immutable header region denoted by its first tag, 63 (RPMTAG_HEADERIMMUTABLE), which can be used to identify it.
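Because the Signature and the Header share one data structure, the first tag is what tells them apart. The sketch below reads that first tag from a raw blob. The byte layout it relies on (a 16-byte intro with the magic 8E AD E8 01, four reserved bytes, an index count and a data size, followed by 16-byte index entries of tag, type, offset and count) is not described in this document; it is an assumption drawn from general knowledge of the RPM header structure. The function names are hypothetical.

```python
import struct

HEADER_MAGIC = bytes([0x8E, 0xAD, 0xE8, 0x01])   # assumed intro magic + version
INTRO = struct.Struct(">4s4xII")   # magic, 4 reserved bytes, index count, data size
ENTRY = struct.Struct(">IIiI")     # tag, type, offset, count


def first_tag(blob: bytes) -> int:
    """Return the tag number of the first index entry in a header blob."""
    magic, nindex, _hsize = INTRO.unpack_from(blob, 0)
    if magic != HEADER_MAGIC:
        raise ValueError("not an RPM header structure")
    if nindex == 0:
        raise ValueError("header has no index entries")
    tag, _type, _offset, _count = ENTRY.unpack_from(blob, INTRO.size)
    return tag


def classify(blob: bytes) -> str:
    """Map the first tag to the section it identifies."""
    return {62: "signature", 63: "header"}.get(first_tag(blob), "unknown")
```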

The complete list of tags is documented here.

Payload

The Payload is a cpio archive, gzipped by default. The cpio archive type used is SVR4 with a CRC checksum.

Because cpio is limited to 4 GB (32-bit unsigned) file sizes, RPM since version 4.12 uses a stripped-down version of cpio for packages with files larger than 4 GB. This format uses 07070X as its magic bytes, and the file header otherwise contains only the index number of the file in the RPM header, as an 8-byte hex string. The file metadata normally found in a cpio file header, including the file name, is omitted entirely because it is already stored in the RPM header.
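A minimal sketch of reading one such stripped-down file header is shown below. It assumes the 8-character hex field follows the magic bytes immediately and does not deal with alignment, padding or the file data that follows; the function name is hypothetical.

```python
STRIPPED_MAGIC = b"07070X"
INDEX_FIELD_LEN = 8


def parse_stripped_cpio_header(buf: bytes, offset: int = 0) -> int:
    """Return the RPM header file index stored in a stripped cpio file header."""
    start = offset + len(STRIPPED_MAGIC)
    if buf[offset:start] != STRIPPED_MAGIC:
        raise ValueError("not a stripped cpio file header")
    field = buf[start:start + INDEX_FIELD_LEN]
    if len(field) != INDEX_FIELD_LEN:
        raise ValueError("truncated file header")
    # The index is an ASCII hex string, e.g. b"0000000a" for file number 10.
    return int(field.decode("ascii"), 16)


assert parse_stripped_cpio_header(b"07070X0000000a") == 10
```

The returned index is then used to look up the file name, mode and other metadata in the RPM header rather than in the archive itself.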

To use a different compression method when building new packages with rpmbuild(8), define the %_binary_payload or %_source_payload macros for the binary or source packages, respectively. These macros accept an RPM IO mode string (only the w mode is applicable).
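One way to set these macros for a single build is to pass them with rpmbuild's --define option. The sketch below only assembles the argument list and does not run anything. The helper name and the spec file name are hypothetical, and the mode strings shown (w9.gzdio, w19.zstdio) are examples whose availability depends on how the local rpm was built.

```python
def rpmbuild_args(spec_path, binary_payload=None, source_payload=None):
    """Build an rpmbuild command line that overrides the payload macros."""
    args = ["rpmbuild", "-ba"]
    if binary_payload:
        args += ["--define", f"_binary_payload {binary_payload}"]
    if source_payload:
        args += ["--define", f"_source_payload {source_payload}"]
    args.append(spec_path)
    return args


# Example: zstd for binary packages, gzip for the source package.
cmd = rpmbuild_args("hello.spec", "w19.zstdio", "w9.gzdio")
```

The resulting list can be handed to subprocess.run(cmd, check=True) on a system where rpmbuild is installed.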

Differences to V3

The main differences of the V4 package format are:

  • The immutable header region in Signature and Header
  • Header-only signatures and hashes on the immutable region
  • File paths stored in “compressed” format (dirname, dirindex, basename)
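The “compressed” path format can be illustrated with a short sketch: each file has a basename and an index into a shared list of directory names, and the full path is the two joined together. The function and parameter names are illustrative, and the sketch assumes directory names end with a slash.

```python
def expand_paths(dirnames, dirindexes, basenames):
    """Rebuild full file paths from the (dirname, dirindex, basename) triple."""
    if len(dirindexes) != len(basenames):
        raise ValueError("dirindexes and basenames must have the same length")
    return [dirnames[d] + b for d, b in zip(dirindexes, basenames)]


paths = expand_paths(
    ["/usr/bin/", "/etc/"],      # each directory is stored once
    [0, 0, 1],                   # per file: which directory it lives in
    ["ls", "cat", "hosts"],      # per file: name within that directory
)
assert paths == ["/usr/bin/ls", "/usr/bin/cat", "/etc/hosts"]
```

Storing each directory name once is what makes the format compact for packages with many files in the same directory.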

Later versions extended the format in various ways that are not backwards compatible with early v4 if used:

  • Support for 64-bit integers (rpm >= 4.6, unreadable by older versions)
  • Support for packages over 4 GB (rpm >= 4.6, unreadable by older versions)
  • Support for individual files over 4 GB (rpm >= 4.12, unreadable by older versions)
  • Per-file signatures (rpm >= 4.13)
  • Separate hash on the payload (rpm >= 4.14)
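The list above can be turned into a small lookup that reports the newest rpm version in which any of a package's features was introduced. The feature keys are hypothetical names invented for this sketch. Note that only the first three entries are stated above to be unreadable by older versions, so the result is the introduction version, not necessarily a hard minimum for reading the package.

```python
FEATURE_INTRODUCED = {
    "int64": (4, 6),
    "package_over_4gb": (4, 6),
    "file_over_4gb": (4, 12),
    "file_signatures": (4, 13),
    "payload_digest": (4, 14),
}


def newest_feature_version(features):
    """Return the highest (major, minor) among the given features, or None."""
    return max((FEATURE_INTRODUCED[f] for f in features), default=None)


assert newest_feature_version(["int64", "file_over_4gb"]) == (4, 12)
```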