Siegfried

From Just Solve the File Format Problem

Siegfried performs automated batch identification of file formats using internal and external signatures. Its primary signatures are derived from PRONOM and DROID, and it also supports FreeDesktop.org's MIME database, the Library of Congress's FDDs, and Wikidata.


About

Like DROID, Siegfried can identify individual files or entire directory trees. It can also look inside aggregate file formats such as ZIP, TAR, WARC, and ARC.

Siegfried is open source and written in Go. It is a command-line tool whose output can be piped into a file for further analysis. Its server mode also offers access via a REST API. Siegfried can also be compiled to WebAssembly (WASM), which enables client-side identification of file formats and client-side digital preservation workflows.
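As a rough illustration of the REST access mentioned above, the sketch below builds the URL for an identification request. It assumes a Siegfried server is already running locally (for example, started with `sf -serve localhost:5138`), that it exposes an `/identify/` route taking a percent-encoded path, and that it accepts a `format` query parameter; the host, port, and helper name `identify_url` are illustrative choices, so check them against the Siegfried documentation before relying on them.

```python
from urllib.parse import quote, urlencode


def identify_url(path, host="localhost:5138", fmt="json"):
    """Build a GET URL for a (hypothetical) local Siegfried server.

    Assumptions: the server exposes /identify/<percent-encoded path>
    and accepts a `format` query parameter.
    """
    # Encode every reserved character, including "/", so the whole
    # file path travels as a single URL segment.
    encoded = quote(path, safe="")
    return f"http://{host}/identify/{encoded}?{urlencode({'format': fmt})}"


if __name__ == "__main__":
    print(identify_url("/data/reports/annual report.pdf"))
```

The resulting URL can be fetched with any HTTP client, such as `urllib.request.urlopen` or `curl`.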

GitHub

Siegfried is available on GitHub.

Checksums

Siegfried supports `md5`, `sha1`, `sha256`, `sha512`, and `crc` checksums.
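To cross-check a checksum reported by Siegfried, the same digests can be computed independently. The following is a minimal sketch using only the Python standard library. It is not Siegfried's own implementation, and it assumes that Siegfried's `crc` option corresponds to CRC-32; the function name `file_checksums` is illustrative.

```python
import hashlib
import zlib


def file_checksums(path, chunk_size=65536):
    """Return hex digests for the checksum types listed above.

    Assumption: `crc` means CRC-32 (as computed by zlib.crc32).
    """
    hashers = {name: hashlib.new(name)
               for name in ("md5", "sha1", "sha256", "sha512")}
    crc = 0
    with open(path, "rb") as fh:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
            crc = zlib.crc32(chunk, crc)
    result = {name: hasher.hexdigest() for name, hasher in hashers.items()}
    result["crc"] = format(crc & 0xFFFFFFFF, "08x")
    return result
```

Reading the file once and updating every hasher per chunk keeps the cost close to a single pass, whichever checksums are needed.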

Output formats

Siegfried supports output in YAML, JSON, CSV, and DROID-compatible CSV. Siegfried also offers a replay capability, which re-runs an existing results file through its engine to convert it to one of the other supported output formats.
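Once output has been saved as JSON, it can be post-processed with ordinary tooling. The sketch below flattens a report into (filename, namespace, identifier) rows. The field names (`files`, `filename`, `matches`, `ns`, `id`) are assumptions about the JSON layout, and the sample document is made up for illustration rather than taken from real Siegfried output; adjust both to match what your version of Siegfried actually emits.

```python
import json

# Hypothetical, hand-written sample; not real Siegfried output.
SAMPLE = """
{
  "files": [
    {
      "filename": "example.pdf",
      "matches": [
        {"ns": "pronom", "id": "fmt/18"}
      ]
    }
  ]
}
"""


def summarize(report_text):
    """Flatten a JSON report into (filename, namespace, id) tuples."""
    report = json.loads(report_text)
    rows = []
    for entry in report.get("files", []):
        # A file may carry one match per identifier namespace.
        for match in entry.get("matches", []):
            rows.append((entry.get("filename"), match.get("ns"), match.get("id")))
    return rows


if __name__ == "__main__":
    for row in summarize(SAMPLE):
        print(row)
```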

Customizing Siegfried

Siegfried can be customized through its companion application, Roy, which builds the signature files that Siegfried uses. More information can be found on the Roy Wiki.

Wikidata

Users may want to customize the Wikidata signature file to get more fine-grained or specific results from Wikidata-based identifications. The SPARQL source is as follows:

# Return all file format records from Wikidata.
SELECT DISTINCT ?uri ?uriLabel ?puid ?extension ?mimetype ?encoding ?referenceLabel ?date ?relativity ?offset ?sig WHERE {
  { ?uri (wdt:P31/(wdt:P279*)) wd:Q235557. }
  UNION
  { ?uri (wdt:P31/(wdt:P279*)) wd:Q26085352. }
  FILTER(EXISTS { ?uri (wdt:P2748|wdt:P1195|wdt:P1163|ps:P4152) _:b2. })
  FILTER((STRLEN(?sig)) >= 4 )
  OPTIONAL { ?uri wdt:P2748 ?puid. }
  OPTIONAL { ?uri wdt:P1195 ?extension. }
  OPTIONAL { ?uri wdt:P1163 ?mimetype. }
  OPTIONAL {
    ?uri p:P4152 ?object.
    OPTIONAL { ?object pq:P3294 ?encoding. }
    OPTIONAL { ?object ps:P4152 ?sig. }
    OPTIONAL { ?object pq:P2210 ?relativity. }
    OPTIONAL { ?object pq:P4153 ?offset. }
    OPTIONAL {
      ?object prov:wasDerivedFrom ?provenance.
      OPTIONAL {
        ?provenance pr:P248 ?reference;
          pr:P813 ?date.
      }
    }
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE], en". }
}
ORDER BY (?uri)


Once the SPARQL has been customized, it can be used with Siegfried by following the instructions on the Siegfried Wiki.
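Before handing a customized query to Siegfried, it can be useful to confirm that it still runs and returns rows. The following is a minimal sketch that sends a query to the public Wikidata Query Service endpoint at `https://query.wikidata.org/sparql`. The helper names, the example user agent string, and the timeout are illustrative assumptions, and this is not how Siegfried or Roy fetch the data themselves.

```python
import json
import urllib.parse
import urllib.request

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_request(sparql, user_agent="sparql-check-example/0.1"):
    """Prepare a GET request asking the endpoint for JSON results."""
    url = WDQS_ENDPOINT + "?" + urllib.parse.urlencode({"query": sparql})
    headers = {
        "Accept": "application/sparql-results+json",
        # Illustrative value; use one that identifies your own tool.
        "User-Agent": user_agent,
    }
    return urllib.request.Request(url, headers=headers)


def count_rows(sparql, timeout=60):
    """Run the query (requires network access) and count result rows."""
    with urllib.request.urlopen(build_request(sparql), timeout=timeout) as resp:
        data = json.load(resp)
    return len(data["results"]["bindings"])
```

Calling `count_rows(query_text)` with the customized query gives a quick sanity check; a sudden drop to zero rows usually means a filter has become too strict.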
