Siegfried
Automated batch identification of file formats using internal and external signatures. Siegfried's primary signatures are derived from PRONOM and DROID, but it also supports FreeDesktop.org's MIME database, the Library of Congress FDDs, and Wikidata.
About
Like DROID, Siegfried can identify individual files or entire directory trees. It can also look inside aggregate file formats such as ZIP, TAR, WARC and ARC.
Siegfried is open source and developed in Golang. Siegfried writes its results to the command line, where they can be piped into a file for further analysis. Siegfried can also run as a server, which offers access via a REST API. Finally, Siegfried can be compiled to WASM, which enables client-side identification of file formats and client-side digital preservation workflows.
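As a rough illustration of the REST access mentioned above, the sketch below builds a request URL for a locally running Siegfried server. The host and port (`localhost:5138`), the `/identify/` path, and the `format` query parameter are assumptions based on the project's documentation and should be checked against the Siegfried Wiki for the installed version. The helper name `identify_url` is hypothetical.

```python
from urllib.parse import quote, urlencode


def identify_url(path, host="localhost:5138", fmt="json"):
    """Build a URL for a Siegfried server's identify endpoint.

    Assumption: the server expects the file or folder name as a single
    percent-encoded path segment after /identify/.
    """
    # safe="" also encodes "/" so the whole path stays in one URL segment.
    encoded = quote(path, safe="")
    return f"http://{host}/identify/{encoded}?{urlencode({'format': fmt})}"


if __name__ == "__main__":
    print(identify_url("/data/reports/annual report.pdf"))
```

The resulting URL can be fetched with any HTTP client once a server is running.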
GitHub
Siegfried is available on GitHub.
Checksums
Siegfried supports `md5`, `sha1`, `sha256`, `sha512`, and `crc` checksums.
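To cross-check a checksum reported by Siegfried, the same digests can be computed independently. The following is a minimal sketch using Python's standard library. It assumes that `crc` refers to CRC-32 (IEEE) rendered as eight hexadecimal digits, which should be confirmed against Siegfried's own output. The function name `checksums` is hypothetical.

```python
import hashlib
import zlib


def checksums(data: bytes) -> dict:
    """Return hex digests for the checksum types listed above."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
        "sha512": hashlib.sha512(data).hexdigest(),
        # Assumption: "crc" means CRC-32, zero-padded to 8 hex digits.
        "crc": format(zlib.crc32(data) & 0xFFFFFFFF, "08x"),
    }


if __name__ == "__main__":
    for name, digest in checksums(b"123456789").items():
        print(name, digest)
```

For large files, the data would normally be read and hashed in chunks rather than loaded into memory at once.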
Output formats
Siegfried supports output in YAML, JSON, CSV, and DROID-compatible CSV. Siegfried also offers a replay capability that re-runs existing results files through its engine to convert them to one of its other supported formats.
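Because JSON is one of the supported outputs, results can be summarized with a few lines of scripting. The sketch below tallies format identifiers in a report. The field names (`files`, `matches`, `ns`, `id`) are assumptions about the JSON layout, and the embedded sample is made up for illustration rather than taken from a real Siegfried run.

```python
import json
from collections import Counter

# Illustrative sample only; not real Siegfried output.
SAMPLE = """
{"files": [
  {"filename": "a.pdf", "matches": [{"ns": "pronom", "id": "fmt/18"}]},
  {"filename": "b.pdf", "matches": [{"ns": "pronom", "id": "fmt/18"}]},
  {"filename": "c.bin", "matches": [{"ns": "pronom", "id": "UNKNOWN"}]}
]}
"""


def count_formats(report_text):
    """Count (namespace, identifier) pairs across all files in a report."""
    report = json.loads(report_text)
    counts = Counter()
    for entry in report.get("files", []):
        for match in entry.get("matches", []):
            counts[(match.get("ns"), match.get("id"))] += 1
    return counts


if __name__ == "__main__":
    for (ns, fmt_id), n in count_formats(SAMPLE).most_common():
        print(ns, fmt_id, n)
```

The same function could be pointed at a file produced by piping Siegfried's JSON output to disk, provided the field names match.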
Customizing Siegfried
Siegfried can be customized through its partner application Roy. More information can be found on the Roy Wiki.
Wikidata
Folks may be interested in customizing the Wikidata signature file to get more fine-grained or specific results from Wikidata-based identifications. The SPARQL source looks as follows:
# Return all file format records from Wikidata.
SELECT DISTINCT ?uri ?uriLabel ?puid ?extension ?mimetype ?encoding ?referenceLabel ?date ?relativity ?offset ?sig
WHERE
{
  { ?uri (wdt:P31/(wdt:P279*)) wd:Q235557. }
  UNION
  { ?uri (wdt:P31/(wdt:P279*)) wd:Q26085352. }
  FILTER(EXISTS { ?uri (wdt:P2748|wdt:P1195|wdt:P1163|ps:P4152) _:b2. })
  FILTER((STRLEN(?sig)) >= 4 )
  OPTIONAL { ?uri wdt:P2748 ?puid. }
  OPTIONAL { ?uri wdt:P1195 ?extension. }
  OPTIONAL { ?uri wdt:P1163 ?mimetype. }
  OPTIONAL {
    ?uri p:P4152 ?object.
    OPTIONAL { ?object pq:P3294 ?encoding. }
    OPTIONAL { ?object ps:P4152 ?sig. }
    OPTIONAL { ?object pq:P2210 ?relativity. }
    OPTIONAL { ?object pq:P4153 ?offset. }
    OPTIONAL {
      ?object prov:wasDerivedFrom ?provenance.
      OPTIONAL {
        ?provenance pr:P248 ?reference;
                    pr:P813 ?date.
      }
    }
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE], en". }
}
ORDER BY (?uri)
Once the SPARQL has been customized, it can be used with Siegfried by following the instructions on the Siegfried Wiki.
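Before handing a customized query to Siegfried, it may help to preview its results against the public Wikidata Query Service. The sketch below is one way to do that with Python's standard library. It is not part of Siegfried or Roy, the helper names `build_request` and `run_query` are hypothetical, and the User-Agent string is a placeholder that should be replaced with something identifying the caller.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

ENDPOINT = "https://query.wikidata.org/sparql"


def build_request(sparql: str) -> Request:
    """Prepare a GET request that asks the endpoint for JSON results."""
    url = ENDPOINT + "?" + urlencode({"query": sparql, "format": "json"})
    headers = {
        # Placeholder value; use a descriptive agent for real requests.
        "User-Agent": "sparql-preview-example/0.1",
        "Accept": "application/sparql-results+json",
    }
    return Request(url, headers=headers)


def run_query(sparql: str) -> list:
    """Send the query and return the list of result bindings."""
    with urlopen(build_request(sparql), timeout=60) as response:
        return json.load(response)["results"]["bindings"]


if __name__ == "__main__":
    # A tiny query for a quick connectivity check (requires network access).
    rows = run_query("SELECT ?uri WHERE { ?uri wdt:P2748 ?puid. } LIMIT 5")
    for row in rows:
        print(row["uri"]["value"])
```

The full query shown earlier can be passed to `run_query` in the same way, though it returns many more rows and may take noticeably longer.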