File identification software
From Just Solve the File Format Problem
Latest revision as of 15:34, 4 March 2024
Software that automates the process of Identifying Files.
- Apache Tika (cross-platform, open source, website: http://tika.apache.org/): "The Apache Tika™ toolkit detects and extracts metadata and structured text content from various documents using existing parser libraries." Written in Java.
- Detect It Easy (website: https://github.com/horsicq/Detect-It-Easy)
- DROID (cross-platform, open source, website: http://digital-preservation.github.com/droid/): "DROID is a software tool developed by The National Archives [of the United Kingdom] to perform automated batch identification of file formats." Requires Java 7 or 8 (as of version 6.1.5).
- FIDO (cross-platform, open source, website: http://www.openplanetsfoundation.org/software/fido): Format Identification for Digital Objects, written in Python.
- FIDOO (web-based online file identification, website: http://www.techmaurice.com/fidoo/)
- File command (various implementations): a standard Unix command, found on almost all Unix and Unix-like (e.g., Linux) systems. See the Debian man page (https://manpages.debian.org/man1/file) for an overview, and this guide (http://openpreservation.org/blog/2012/08/09/magic-editing-and-creation-primer/) to creating "magic" entries for it.
- File Information Tool Set: software from the Harvard University library to identify file formats and extract metadata
- FI Tools (Windows, commercial, website: http://www.forensicinnovations.com/fitools.html)
- GetTyp and GT2
- G-Spot (Windows, freeware, website: http://www.headbands.com/gspot/): Identifies the audio and video codecs needed to play a media file.
- IDArc
- JHOVE (tool to classify/identify/validate file formats)
- Konvertor (freeware tool to display and convert between 4000+ different formats, website: http://www.logipole.com)
- Magika
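- MediaInfo (cross-platform, open source, website: http://mediainfo.sourceforge.net/en): "MediaInfo is a convenient unified display of the most relevant technical and tag data for video and audio files."
- PHP PRONOM drip (open source, website: http://www.phpclasses.org/package/9095-PHP-Recognize-file-formats-using-PRONOM-registry.html): recognizes file formats using the PRONOM registry.
- Siegfried (signature-based file identification tool): website (http://www.itforarchivists.com/siegfried) · blog post (http://www.openplanetsfoundation.org/blogs/2014-09-27-siegfried-pronom-based-file-format-identification-tool) · PRONOM: fmt/883
- TrID (Windows/Linux, free for non-commercial use, website: http://mark0.net/soft-trid-e.html): identifies files using a database of filetype signatures. Also has an online version (http://mark0.net/onlinetrid.html).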
- The Unarchiver: includes an "lsar" tool that identifies archives well.
- 98ripper (https://gitlab.com/bunnylin/98ripper): has an identification mode, `98ripper -i`, that can identify PC-98 disk images.
- ancient (https://github.com/temisu/ancient_format_decompressor): has an identification mode, `ancient identify`, that identifies various archive formats.
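
Several of the tools listed above identify files by comparing their leading bytes against a database of known signatures ("magic" numbers). The following is a minimal sketch of that general idea in Python, not the method of any particular tool. The signature table and the function names `identify_bytes` and `identify_file` are hypothetical and chosen for illustration only. Real tools use far larger databases and also check offsets, internal structure, and container contents.

```python
from typing import Optional

# Tiny illustrative signature table (an assumption for this sketch, not taken
# from any tool listed above). Each entry maps the leading bytes of a file to
# a human-readable format name.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"%PDF-", "PDF document"),
    (b"PK\x03\x04", "ZIP archive"),
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
]


def identify_bytes(data: bytes) -> Optional[str]:
    """Return the format name whose signature prefixes `data`, or None."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return None


def identify_file(path: str) -> Optional[str]:
    """Read only as many leading bytes as the longest signature needs."""
    longest = max(len(magic) for magic, _ in SIGNATURES)
    with open(path, "rb") as handle:
        return identify_bytes(handle.read(longest))
```

Calling `identify_file` on a path returns a format name, or `None` when no signature in the table matches. A `None` result only means the format is not in this small table, not that the file is invalid.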