Parity Volume Set
Parity Volume Set (also known as parity archive or parchive) is a file format for storing redundant data for one or more input files. These data can be used to repair the input files if they get damaged. The error correction is based on the Reed-Solomon algorithm. Three versions of the format exist: Par1, Par2 and Par3. The Par3 format is in "near-final form";[2] it is used by an old version of the MultiPar tool,[3] as well as by par3cmdline.[4]
Discussion
Historically, these were multi-part archives that were distributed on Usenet (a.k.a. "network news"), but they can still be used to prevent complete data loss during transit or storage. Parchive is like RAID for files instead of a whole file system.
The technology is based on a Reed-Solomon code implementation that allows recovery of any X real data blocks as long as X parity data blocks are present. (Data blocks here are either whole files or much smaller virtual slices of files.)[5]
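The "any X data blocks from X parity blocks" property can be illustrated with a toy erasure code. The sketch below is only an illustration under stated assumptions, not the arithmetic defined by the Parchive specifications: it works on single integer symbols over the small prime field GF(257) using polynomial interpolation, and the names `interpolate`, `make_parity` and `recover` are hypothetical.

```python
P = 257  # toy prime modulus (assumption); real Parchive tools use different field arithmetic


def interpolate(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through points, mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def make_parity(data, m):
    """Return m parity symbols: the data polynomial evaluated at positions k .. k+m-1."""
    k = len(data)
    points = list(enumerate(data))
    return [interpolate(points, x) for x in range(k, k + m)]


def recover(survivors, k):
    """Rebuild all k data symbols from any k surviving {position: value} entries."""
    if len(survivors) < k:
        raise ValueError("not enough surviving symbols to recover the data")
    points = list(survivors.items())[:k]
    return [interpolate(points, x) for x in range(k)]


# Usage: four data symbols, two parity symbols, then lose data positions 1 and 3.
data = [80, 65, 82, 50]
parity = make_parity(data, 2)
survivors = {0: data[0], 2: data[2], 4: parity[0], 5: parity[1]}
assert recover(survivors, len(data)) == data
```

Two lost data symbols are rebuilt from two parity symbols because any four points determine the degree-3 polynomial that encodes the data. The sketch requires the total number of data and parity positions to stay at or below 257 so that all positions are distinct in the field.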
Modern Par2 software can take advantage of a GPU to speed up recovery file creation.[6][7]
While Par3 has yet to be finalized as of writing in 2025, the "2022-01-28 ALPHA DRAFT" specification addresses notable flaws that have existed since the format's conception:
Major differences from Parchive 2.0 are:
...(redacted for brevity)
* replace MD5 hash (It is both slow and less secure.)
...(redacted for brevity)
Part of "support any linear code" is to fix the major bug in Parchive 2.0. Parchive 2.0 did not do Reed-Solomon encoding as it promised. There was a major mistake in the paper that Parchive 2.0 relied on. The problem manifested as a bug in Parchive 1.0 and, while Parchive 2.0 reduced its occurrence, it did not fix the problem. Parchive 2.0 did not use an always invertible matrix; it essentially used a random matrix, which (luckily) is invertible with high probability. Parchive 3.0 fixes that bug.
The other part of "support any linear code" is supporting codes beside Reed-Solomon. Reed-Solomon has excellent data protection, but is slow to compute. LDPC and sparse random matrices will speed things up dramatically, with a slight increase in errors that cannot be recovered from.
Identification
A Par1 file starts with the following byte sequence:
50 41 52 00 00 00 00 00
This corresponds to the ASCII text string PAR, followed by 5 null bytes.
A Par2 file starts with the bytes:
50 41 52 32 00 50 4B 54
This corresponds to the ASCII text string PAR2, followed by a null byte and the text string PKT.
Finally, a Par3 file can be identified by the following 4-byte sequence:
50 41 33 00
This corresponds to the text string PA3, followed by a null byte.
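The three signatures above can be checked programmatically. The following is a minimal sketch, assuming each signature sits at the very start of the file (the text above states this for Par1 and Par2 but not explicitly for Par3); the function names `identify_par` and `identify_par_file` are hypothetical.

```python
# Signatures copied from the byte sequences listed above.
MAGICS = (
    (bytes.fromhex("5041523200504B54"), "Par2"),  # "PAR2", null byte, "PKT"
    (bytes.fromhex("5041520000000000"), "Par1"),  # "PAR" followed by 5 null bytes
    (bytes.fromhex("50413300"), "Par3"),          # "PA3" followed by a null byte
)


def identify_par(header):
    """Return "Par1", "Par2" or "Par3" if header starts with a known signature, else None."""
    for magic, version in MAGICS:
        if header.startswith(magic):
            return version
    return None


def identify_par_file(path):
    """Read the first 8 bytes of the file at path and identify its Par version."""
    with open(path, "rb") as handle:
        return identify_par(handle.read(8))
```

For example, `identify_par_file("example.dwarfs.par2")` would be expected to return `"Par2"` for a valid Par2 file. A match on the signature alone does not prove that the rest of the file is well formed.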
Specifications
Specification version | SourceForge/Internet Archive link | GitHub link |
Parity Volume Set Specification v1.0 | SourceForge | GitHub |
Parity Volume Set Specification 2.0 | SourceForge | GitHub |
proposal for Parchive Specification 3.0 | hp.vector.co.jp IA mirror | GitHub |
par2 Examples
Create uniformly sized recovery files with 100% redundancy for example.dwarfs:
par2 create -u -r100 example.dwarfs
Uniform recovery file sizes make the set behave more like Par1.[9]
Software
- Windows
- Mac
- Linux
Sample files
Par1 sample files
See Search results with par extensions - Discmaster.textfiles.com for sample Par1 files.
Par1 files are usually distributed as a set containing <name>.par and <name>.p<num>, where <name> is typically the name of the file that the parity archive was created for, and <num> is an integer that starts at 01 and increments for each additional Par1 archive in the set.[10]
See Also:
- Search results with p01 extensions - Discmaster.textfiles.com
- Search results with p02 extensions - Discmaster.textfiles.com
Par2 sample files
See Search results with par2 extensions and are likely parity archive - Discmaster.textfiles.com for samples.
These files are usually distributed as a set containing <name>.par2 and <name>.vol<numA>+<numB>.par2, where <name> is typically the name of the file that the parity archive was created for, and each <num> is an incrementing number; <numA> often starts at 0.[11]
Additionally, Par2 files bear the .par2 extension, which makes identification easier and less ambiguous than for Par1, whose .par extension can be confused with other extensions that also begin with .par.
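The naming conventions for Par1 and Par2 sets can be turned into a simple file name classifier. This is a rough sketch: the regular expressions are heuristics derived from the patterns described above rather than rules from the specifications, and `classify_name` and the kind labels are hypothetical. Because the .par extension is ambiguous, a name match is only a hint and is best confirmed against the file's contents.

```python
import re

# Heuristic patterns based on the naming conventions described above (assumptions).
PATTERNS = (
    ("par2-volume", re.compile(r"^(?P<name>.+)\.vol(?P<num_a>\d+)\+(?P<num_b>\d+)\.par2$", re.IGNORECASE)),
    ("par2", re.compile(r"^(?P<name>.+)\.par2$", re.IGNORECASE)),
    ("par1-volume", re.compile(r"^(?P<name>.+)\.p(?P<num>\d+)$", re.IGNORECASE)),
    ("par1", re.compile(r"^(?P<name>.+)\.par$", re.IGNORECASE)),
)


def classify_name(filename):
    """Guess the parity file kind from its name; return (kind, fields) or (None, {})."""
    # The more specific .vol<numA>+<numB>.par2 pattern is tried before plain .par2.
    for kind, pattern in PATTERNS:
        match = pattern.match(filename)
        if match:
            return kind, match.groupdict()
    return None, {}


# Usage: the numeric fields are returned as strings so leading zeros are preserved.
kind, fields = classify_name("example.dwarfs.vol0+1.par2")
assert kind == "par2-volume" and fields["name"] == "example.dwarfs"
```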
Links
- Wikipedia:Parchive
- Par2 Files Explained in Plain English (Broken link) [Internet Archive copy]
- Parchive project page on GitHub
- Parchive project page on SourceForge.net (Legacy)
References
- [1] parchive Files - SourceForge.net
- [2] Commit 4c1b780 - 2022-01-29 - par3cmdline - GitHub
- [3] Par3 support? #46 - MultiPar - GitHub
- [4] par3cmdline - GitHub
- [5] Parchive: Parity Archive Tool - SourceForge.net
- [6] GPU Acceleration via par2j64.exe??? Is it possible? How do I do it? #40 - MultiPar - GitHub
- [7] Added support for GPU acceleration (CUDA) on recovery file creation. #176 - par2cmdline - GitHub
- [8] Parity Volume Set Specification 3.0 (2022-01-28 ALPHA DRAFT) - GitHub
- [9] Why is PAR 2.0 better than PAR 1.0? - par2cmdline - GitHub
- [10] Par2 Files Explained in Plain English - Internet Archive copy
- [11] Par2 Files Explained in Plain English - Internet Archive copy