Variable-length quantity
m (Jsummers moved page Variable-length base-128 to Variable-length quantity: Slightly generalizing this article) |
(Partial rewrite) |
||
Line 3: | Line 3: | ||
|subcat=Elements of File Formats | |subcat=Elements of File Formats | ||
}} | }} | ||
− | + | By '''variable-length quantity''' ('''VLQ'''), we refer to a variety of schemes for encoding an integer in a variable number of [[byte]]s, where the required number of bytes grows as the magnitude of the integer grows. Such formats can encode arbitrarily large integers, though artificial limits are sometimes imposed. | |
− | + | As far as we know, there is no official standard for such number formats. Each file format specification that uses them typically uses its own definition, and invents its own name. | |
− | + | ||
− | + | == Details == | |
+ | The most common VLQ formats use a '''variable-length base-128''' scheme in which 7 bits of each byte constitute a base-128 digit, and the 8th bit indicates whether there are any more digits. | ||
− | The | + | Even this simple case has a combinatorial explosion of variants: |
+ | * The flag bit could be the most significant bit, or the least significant bit. | ||
+ | * The flag for end-of-number could be 0, or 1. | ||
+ | * The most-significant block of 7 bits could come first (big-[[Endianness|endian]] style), or last (little-endian style). | ||
+ | * If negative numbers are supported, there are several different ways to do it. | ||
+ | * There could be a defined maximum number of bytes. If so, then for maximal-length codes, the final byte's flag bit could be repurposed. | ||
− | + | Instead of having one flag bit per byte, there could be one every two bytes, resulting in a base-32768 encoding. | |
== Related formats == | == Related formats == | ||
− | Among the formats that use variable-length | + | Among the formats that use variable-length quantities are: |
* [[BPG]] ("ue7") | * [[BPG]] ("ue7") | ||
* [[DWARF]] ("LEB128", "ULEB128", "SLEB128") | * [[DWARF]] ("LEB128", "ULEB128", "SLEB128") | ||
Line 27: | Line 32: | ||
* [[Wikipedia:LEB128|Wikipedia: LEB128]] | * [[Wikipedia:LEB128|Wikipedia: LEB128]] | ||
− | [[Category: | + | [[Category:Integer formats]] |
Revision as of 19:21, 23 August 2017
By variable-length quantity (VLQ), we refer to a variety of schemes for encoding an integer in a variable number of bytes, where the required number of bytes grows as the magnitude of the integer grows. Such formats can encode arbitrarily large integers, though artificial limits are sometimes imposed.
As far as we know, there is no official standard for such number formats. Each file format specification that uses them typically uses its own definition, and invents its own name.
Details
The most common VLQ formats use a variable-length base-128 scheme in which 7 bits of each byte constitute a base-128 digit, and the 8th bit indicates whether there are any more digits.
Even this simple case has a combinatorial explosion of variants:
- The flag bit could be the most significant bit, or the least significant bit.
- The flag for end-of-number could be 0, or 1.
- The most-significant block of 7 bits could come first (big-endian style), or last (little-endian style).
- If negative numbers are supported, there are several different ways to do it.
- There could be a defined maximum number of bytes. If so, then for maximal-length codes, the final byte's flag bit could be repurposed.
Instead of having one flag bit per byte, there could be one every two bytes, resulting in a base-32768 encoding.
Related formats
Among the formats that use variable-length quantities are:
- BPG ("ue7")
- DWARF ("LEB128", "ULEB128", "SLEB128")
- HLP (WinHelp) and Segmented Hypergraphics ("compressed short"). The "compressed long" type is a 16-bit variant.
- MIDI ("variable-length quantity")
- Protobuf ("base 128 varint")
- WBMP ("multi-byte integer")