ZIP

From Just Solve the File Format Problem
* [http://www.info-zip.org/ Info-ZIP]: [http://www.info-zip.org/Zip.html Zip], [http://www.info-zip.org/UnZip.html UnZip]
* [[7-Zip]]
* [[zlib]] - The zlib library does not support ZIP format, but it is distributed with "minizip" code that supports most ZIP files.
* [http://www.nih.at/libzip/ libzip] - Uses zlib.
* [http://www.libarchive.org/ libarchive] - Uses zlib.
* [http://zziplib.sourceforge.net/ zziplib]
** [http://search.cpan.org/~vspader/Archive-ZZip-0.13/ZZip/ZZip.pm Archive::ZZip]: Perl bindings for zziplib
* [http://code.google.com/p/miniz/ miniz]
* PKZIP
** [http://cd.textfiles.com/1stcanadian/utils/pkz110/pkz110.exe PKZIP 1.10] (MS-DOS binary)
** [http://www.ibiblio.org/pub/packages/ccic/software/dos/utils/pkz204g.exe PKZIP 2.04g] (MS-DOS binary)

== Sample files ==

== Links ==
* [[Wikipedia:Zip (file format)|Wikipedia: Zip (file format)]]
* [[Wikipedia:PKZIP|Wikipedia: PKZIP]]
* [http://research.swtch.com/zip Zip files all the way down] (creating an infinitely-regressed ZIP file)
* [http://imgur.com/a/PbN8H#1 ZIP101 an archive walkthrough]

Revision as of 13:33, 22 March 2014

Not to be confused with Zip disk, an unrelated disk cartridge unit.
File Format
Name ZIP
Ontology
Extension(s) .zip
MIME Type(s) application/zip
LoCFDD fdd000354, fdd000355, fdd000362, fdd000361
PRONOM x-fmt/263
UTI com.pkware.zip-archive
Released 1989

ZIP is one of the most popular file compression formats. It was created in 1989 as the native format of the PKZIP program, which Phil Katz introduced in the wake of a lawsuit against him (which he lost) by the makers of the then-popular ARC program and file format. The suit alleged copyright and trademark infringement in his earlier program PKARC, which had been file-compatible with ARC. In response, Katz created a new file format, which rapidly overtook ARC in popularity, to a large extent because BBS sysops, then the primary users of such compression, resented the lawsuit. Many programs have been released for a variety of operating systems to compress and decompress ZIP files, and native support for the format is built into several popular operating systems.

ZIP implementations vary in their support for features in the specification from PKWARE[1], particularly features added since version 2 (1993), some of which are protected by patents and require licensing. Many implementations limit compression to the DEFLATE algorithm, introduced with version 2. The extensions to the specification that have been widely adopted are long filenames, large files (using a technique known as ZIP64), and filenames in UTF-8. In 2011 work began on an interoperable subset of the latest APPNOTE.TXT, intended for publication as ISO/IEC 21320-1, Document Container File -- Part 1: Core. As of November 2012, a discussion draft is available[2]. To promote interoperable implementations, the draft ISO/IEC 21320-1 prohibits compression methods other than DEFLATE, archives that are segmented or split across multiple volumes, and features that are subject to patents.

While .zip is the usual file extension, ZIP-formatted files can be found with many other extensions, because a number of other file formats are built on ZIP but give their files application-specific extensions. See Category:ZIP based file formats for a list of such formats.

See also

Identification

The byte sequence 'P' 'K' 0x05 0x06 (the "end of central directory signature") appears near the end of the file. It almost always begins exactly 22 bytes from the end, which is the size of the end of central directory record when the archive has no trailing comment. However, a few ZIP-based formats have extra non-ZIP data (such as a digital signature) at the end of the file, so robust unzip utilities scan backward from the end of the file for the signature.
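The scan for the signature can be sketched in Python. This is only an illustration, not how any particular unzip utility works: `find_eocd` and `make_sample_zip` are hypothetical helper names, and the default search window assumes the record is followed by nothing but an archive comment, so `window` would need to be widened for formats that append more trailing data.

```python
import io
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"           # 'P' 'K' 0x05 0x06
EOCD_MIN_SIZE = 22                       # record size when there is no comment
DEFAULT_WINDOW = EOCD_MIN_SIZE + 0xFFFF  # the comment length field is 16 bits


def find_eocd(data: bytes, window: int = DEFAULT_WINDOW) -> int:
    """Return the offset of the last EOCD signature in the tail, or -1."""
    start = max(0, len(data) - window)
    return data.rfind(EOCD_SIGNATURE, start)


def make_sample_zip(comment: bytes = b"") -> bytes:
    """Build a small in-memory archive for demonstration purposes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("hello.txt", "hello")
        zf.comment = comment
    return buf.getvalue()


plain = make_sample_zip()
signed = plain + b"TRAILING-DATA"  # stand-in for extra non-ZIP data

# Distance of the signature from the end of each file.
print(len(plain) - find_eocd(plain))
print(len(signed) - find_eocd(signed))
```

A hit on the signature is only a candidate: a careful reader would go on to validate the fields of the record before trusting it.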

Most ZIP files (but not self-extracting ZIP files) happen to begin with 'P' 'K' 0x03 0x04. This is not a global file signature; it is the "local file header signature", which appears once for every file stored inside the ZIP file, whether compressed or not. Some ZIP-based formats are designed such that they necessarily begin in this way.
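As a rough illustration of why the leading bytes are not a reliable global signature, the following Python sketch checks the first four bytes of three in-memory archives. The helper names and the `b"MZ-stub"` prefix (a stand-in for the executable stub of a self-extracting archive) are made up for this example. It relies on Python's standard `zipfile` module locating the archive from its end, which lets it open data that has a prefix in front of the first local file header.

```python
import io
import zipfile

LOCAL_FILE_HEADER = b"PK\x03\x04"  # 'P' 'K' 0x03 0x04


def starts_with_local_header(data: bytes) -> bool:
    """True if the data begins with a local file header signature."""
    return data[:4] == LOCAL_FILE_HEADER


def build_zip(names) -> bytes:
    """Build an in-memory archive with one small member per name."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            zf.writestr(name, "example")
    return buf.getvalue()


ordinary = build_zip(["a.txt", "b.txt"])
empty = build_zip([])                # no members, so no local file header
prefixed = b"MZ-stub" + ordinary     # hypothetical self-extractor layout

for label, data in [("ordinary", ordinary), ("empty", empty), ("prefixed", prefixed)]:
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        print(label, starts_with_local_header(data), zf.namelist())
```

All three inputs open as ZIP archives, but only the first begins with the local file header signature.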

Compression

Each file in a ZIP file is stored using one of several compression methods, each identified by a numeric code. Only compression types 0 (uncompressed) and 8 (DEFLATE) are likely to be seen in modern portable ZIP files. In old ZIP files, types 1 (Shrink) and 6 (Implode) are common.

{|
! Code
! Compression scheme
! Notes and references
|-
|0 || Uncompressed
|-
|1 || Shrink || Used by PKZIP prior to v2.0.
|-
|2–5 || Reduce || Used by PKZIP v0.x.
|-
|6 || Implode (Shannon–Fano coding) || Used by PKZIP v1.x. See also [[TTComp archive]].
|-
|8 || [[DEFLATE]] || Used by PKZIP v2.0+.
|-
|9 || Deflate64
|-
|10 || PKWARE Data Compression Library Imploding (old IBM TERSE)
|-
|12 || Bzip2
|-
|14 || LZMA (EFS)
|-
|18 || IBM TERSE (new)
|-
|19 || IBM LZ77 z Architecture (PFS)
|-
|97 || WavPack
|-
|98 || PPMd version I, Rev 1
|}
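To see which of these codes a given archive actually uses, something like the following Python sketch may help. It reads the method code of each member through the standard `zipfile` module, without decompressing anything, and looks it up in a dictionary copied from the table above. `METHOD_NAMES` and `methods_used` are names invented for this example, and the sample archive is built in memory purely for demonstration.

```python
import io
import zipfile

# Codes and names copied from the table above.
METHOD_NAMES = {
    0: "Uncompressed",
    1: "Shrink",
    2: "Reduce", 3: "Reduce", 4: "Reduce", 5: "Reduce",
    6: "Implode",
    8: "DEFLATE",
    9: "Deflate64",
    10: "PKWARE Data Compression Library Imploding",
    12: "Bzip2",
    14: "LZMA (EFS)",
    18: "IBM TERSE (new)",
    19: "IBM LZ77 z Architecture (PFS)",
    97: "WavPack",
    98: "PPMd version I, Rev 1",
}


def methods_used(data: bytes) -> dict:
    """Map each member name to the name of its compression method."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return {
            info.filename: METHOD_NAMES.get(
                info.compress_type, "unknown (%d)" % info.compress_type
            )
            for info in zf.infolist()
        }


# Demonstration archive with one stored and one DEFLATE-compressed member.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("stored.txt", "abc", compress_type=zipfile.ZIP_STORED)
    zf.writestr("deflated.txt", "abc" * 100, compress_type=zipfile.ZIP_DEFLATED)

print(methods_used(buf.getvalue()))
```

Listing the methods this way works even for members that the library cannot decompress, since only the archive's directory is read.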

Specifications

Software

Sample files

  • 1608A.ZIP → D1-MAC.ZIP: Example of a file that uses the uncommon "Reduce" compression scheme

References

  1. http://www.pkware.com/documents/casestudies/APPNOTE.TXT
  2. http://kikaku.itscj.ipsj.or.jp/sc34/open/1855.pdf

Links
