Variable-length quantity
Revision as of 01:13, 14 January 2015
Variable-length base-128 refers to a family of formats for encoding arbitrarily large integers. It does not seem to have a well-established name, though it is sometimes called variable-length quantity (VLQ), and specific forms of it may be given names by the specifications that use them.
Details
The integer to be encoded is written in binary, then partitioned into as many 7-bit blocks as necessary, each of which is stored in a byte. The remaining bit in each byte (usually the most-significant bit) is used as a flag to indicate whether there are any more bytes. Most commonly, the flag bit is 1 in every byte except the last.
Both little-endian (where the 7 data bits of the first byte are the least-significant 7 bits of the encoded integer) and big-endian (where they are the most-significant) variants exist.
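The scheme described above can be sketched in Python for unsigned integers. This is a minimal illustration, not taken from any particular specification: it assumes the most common convention (flag in the most-significant bit, set to 1 in every byte except the last), and the function names are hypothetical.

```python
def encode_le(n: int) -> bytes:
    """Little-endian variant: least-significant 7-bit group first."""
    if n < 0:
        raise ValueError("unsigned integers only")
    out = bytearray()
    while True:
        group = n & 0x7F          # low 7 data bits
        n >>= 7
        if n:
            out.append(group | 0x80)  # flag = 1: more bytes follow
        else:
            out.append(group)         # flag = 0: last byte
            return bytes(out)


def encode_be(n: int) -> bytes:
    """Big-endian variant: most-significant 7-bit group first."""
    if n < 0:
        raise ValueError("unsigned integers only")
    groups = [n & 0x7F]           # this group ends up last, so flag = 0
    n >>= 7
    while n:
        groups.append((n & 0x7F) | 0x80)
        n >>= 7
    return bytes(reversed(groups))


def decode_le(data: bytes) -> int:
    """Decode one little-endian value from the start of data."""
    result = 0
    shift = 0
    for b in data:
        result |= (b & 0x7F) << shift
        shift += 7
        if not b & 0x80:
            return result
    raise ValueError("input ended before the final byte")


def decode_be(data: bytes) -> int:
    """Decode one big-endian value from the start of data."""
    result = 0
    for b in data:
        result = (result << 7) | (b & 0x7F)
        if not b & 0x80:
            return result
    raise ValueError("input ended before the final byte")
```

For example, the integer 300 (binary 10 0101100) splits into the 7-bit groups 0000010 and 0101100, so `encode_le(300).hex()` gives `ac02` and `encode_be(300).hex()` gives `822c`.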
The encoded integer may be unsigned, or it may be a signed integer using, e.g., two's complement representation.
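One way to handle signed integers is sketched below, assuming a little-endian layout in which the value is treated as a two's complement number and bit 6 of the final byte acts as the sign bit. This is one possible convention among several, and the function names are hypothetical.

```python
def encode_signed_le(n: int) -> bytes:
    """Little-endian, two's complement; stops once the rest is pure sign extension."""
    out = bytearray()
    while True:
        group = n & 0x7F
        n >>= 7  # Python's >> on int is an arithmetic (sign-preserving) shift
        sign_bit_set = bool(group & 0x40)
        # Finished when the remaining bits only repeat the sign bit of this group.
        done = (n == 0 and not sign_bit_set) or (n == -1 and sign_bit_set)
        if done:
            out.append(group)         # flag = 0: last byte
            return bytes(out)
        out.append(group | 0x80)      # flag = 1: more bytes follow


def decode_signed_le(data: bytes) -> int:
    """Decode one signed little-endian value from the start of data."""
    result = 0
    shift = 0
    for b in data:
        result |= (b & 0x7F) << shift
        shift += 7
        if not b & 0x80:
            if b & 0x40:              # sign bit of the final group
                result -= 1 << shift  # sign-extend
            return result
    raise ValueError("input ended before the final byte")
```

Under this convention, -1 encodes as the single byte `7f`, while 64 needs two bytes (`c0 00`) because a lone `40` would be read as -64.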
Many variations are possible. Instead of bytes, 16-bit (base-32768) or larger code units can be used. If there is a defined limit on the number of bytes (or code units), then for maximal-length codes, the final byte's flag bit can be repurposed.
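The length-limited variation can be illustrated as follows. This sketch assumes a hypothetical format with unsigned 64-bit values, little-endian order, and a limit of 9 bytes; because a ninth byte can only ever be the last one, its flag bit is used here as an eighth data bit (8 × 7 + 8 = 64 bits). The constant and function names are illustrative only.

```python
MAX_BYTES = 9  # hypothetical limit chosen so that 8*7 + 8 = 64 data bits fit


def encode_capped(n: int) -> bytes:
    """Encode an unsigned 64-bit integer in at most MAX_BYTES bytes."""
    if not 0 <= n < 1 << 64:
        raise ValueError("value must fit in 64 bits")
    out = bytearray()
    for _ in range(MAX_BYTES - 1):
        group = n & 0x7F
        n >>= 7
        if n == 0:
            out.append(group)         # flag = 0: last byte
            return bytes(out)
        out.append(group | 0x80)      # flag = 1: more bytes follow
    out.append(n)                     # final byte: all 8 bits are data
    return bytes(out)


def decode_capped(data: bytes) -> int:
    """Decode one value produced by encode_capped from the start of data."""
    result = 0
    for i, b in enumerate(data[:MAX_BYTES]):
        if i == MAX_BYTES - 1:
            return result | (b << (7 * i))  # no flag bit in the final byte
        result |= (b & 0x7F) << (7 * i)
        if not b & 0x80:
            return result
    raise ValueError("input ended before the final byte")
```

Without the repurposed bit, the largest 64-bit values would need a tenth byte carrying a single data bit.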
Related formats
Among the formats that use variable-length base-128 are:
- BPG ("ue7")
- DWARF ("LEB128", "ULEB128", "SLEB128")
- HLP (WinHelp) and Segmented Hypergraphics ("compressed short"). The "compressed long" type is a 16-bit variant.
- MIDI ("variable-length quantity")
- Protobuf ("base 128 varint")
- WBMP ("multi-byte integer")