Archive Team hostname file

From Just Solve the File Format Problem

Latest revision as of 06:42, 7 March 2013

File Format
Name Archive Team hostname file
Ontology
Extension(s) .hostnames

When the Archive Team prepares to archive a multi-user, multi-hostname site that is about to be shut down (e.g., Posterous), an early step is often to collect, through automated scripted access, a list of the hostnames used on that site. A later stage of archiving then uses this list to retrieve the web data from those hostnames.

The format is simple: plain ASCII text with Unix-style line breaks (LF, hex 0A) and one hostname per line. Each line consists of a sequential serial number, a tab character (hex 09), and the hostname:

2000001	dwellz.posterous.com
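
As an illustration of the line layout described above, here is a minimal Python sketch of a reader. It is not an official Archive Team tool. The function name parse_hostnames and the strict check that serial numbers increase by one are assumptions made for this example, and the hostname example.posterous.com in the usage note is hypothetical.

```python
def parse_hostnames(text):
    """Parse decompressed .hostnames content into (serial, hostname) pairs.

    Hypothetical helper: assumes LF line breaks and a single tab between
    the serial number and the hostname, as the format description states.
    """
    entries = []
    for line in text.split("\n"):
        if not line:
            continue  # skip the empty string left by a trailing newline
        serial, hostname = line.split("\t", 1)
        entries.append((int(serial), hostname))

    # Assumption: "sequential" means each serial is exactly one greater
    # than the serial on the previous line.
    for (prev, _), (cur, _) in zip(entries, entries[1:]):
        if cur != prev + 1:
            raise ValueError(f"serial {cur} does not follow {prev}")
    return entries
```

Calling parse_hostnames("2000001\tdwellz.posterous.com\n2000002\texample.posterous.com\n") returns a list of two (serial, hostname) tuples, and a gap in the numbering raises ValueError.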

The file is saved with a .hostnames extension and a base name that is one less than the serial number on the first line of the file (e.g., 2000000.hostnames for a file whose first line is numbered 2000001). It is then compressed in gzip format for upload/download.
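
The naming and compression rule can be sketched in Python as follows. This is an illustration under stated assumptions, not the Archive Team's actual tooling: the function name write_hostnames is hypothetical, and appending ".gz" to the compressed file's name is an assumption, since the description only says the file is gzip-compressed.

```python
import gzip
import os


def write_hostnames(hostnames, first_serial, directory="."):
    """Write hostnames in .hostnames format and gzip the result.

    Hypothetical helper. The base name is one less than the first serial
    number; the ".gz" suffix is an assumption, not part of the description.
    """
    base_name = f"{first_serial - 1}.hostnames"
    path = os.path.join(directory, base_name + ".gz")

    # One "serial<TAB>hostname<LF>" line per hostname, numbered sequentially.
    body = "".join(
        f"{first_serial + offset}\t{hostname}\n"
        for offset, hostname in enumerate(hostnames)
    )

    with gzip.open(path, "wb") as handle:
        handle.write(body.encode("ascii"))  # the format is plain ASCII
    return path
```

For example, write_hostnames(["dwellz.posterous.com"], 2000001) produces a file named 2000000.hostnames.gz in the current directory.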
