From Just Solve the File Format Problem

Here are answers to frequently asked questions, along with the ground rules for important aspects of this project.


What is the purpose of this Wiki and this project?

The purpose of this wiki is to "Solve the File Format Problem".

The File Format Problem in this context is that over the last century, many types of electronic information have been presented in formats that are non-intuitive and subject to the rapidly-shifting interests of what is newest and best. As a result, thousands of programs, documents, images and binary files are in danger of being unreadable to later generations as no readily available information about accessing them is to be found. Various projects have been launched in the last 20 years to deal with this problem, but they all lack a groundswell of directed volunteers working to once and for all get all the disparate information into one place and easily referenced by all. This is the goal of this project.

The project started on November 1st, 2012. All are invited to register and contribute. The goal is to have a massive bulk of knowledge in one place for all to use and incorporate into other similar directories and sites, providing a needed resource for generations to come.

What IS a file format, anyway?

That is an excellent, excellent question.

For the purposes of this Wiki, a file format is the description of any self-contained information. While some formats are self-evident, many are more obscure references to pre-known information, or collections of bytes and items that can be misleading or hard to decipher. File formats can also be embedded in each other - on this wiki, each format gets its own entry. For example, a JPEG image file might be on a FAT filesystem on a floppy disk. All of these get individual entries.

The nature of such a wide description is that some entries will skirt the edge of reason, or a file format will not be clear as to where it should be filed. This is understandable and will be dealt with, but the most important priority is to declare the file format exists, and then link it from the locations it should be at.

Wisdom is actually just a file format. Contains less raw data than experience logs, less structure than code. It's basically lazy learnings.
-- Venkatesh Rao

Some of these classifications (Organic, for example) are strange. Who'd want that?

Part of the goal of this project is to extend past the more limited scope of similar projects. Because of funding, direction, and preference, other sites usually focus on a subset of all potential file formats; who would have time to deal with them all? If this project gains more and more contributors, not much more effort is required to go after items on the fuzzy edges of what constitutes a file format or, for that matter, to work out why they count as file formats. So items like DNA and Piano Rolls can also be included and worked on. Priority-wise, however, it's probably best to stick with electronic file formats, especially ones lacking any links or documentation - they could really use your efforts.

Why is it CC0?

Making the information here CC0 provides the most flexibility for institutions and individuals who want to benefit from the project down the line. With no restrictive license for the main site, the efforts can be re-absorbed into a mass of other projects without needless negotiations over "licenses" and "rights" and provenance.

As a result, please refrain from copy-pasting prose and paragraphs from other sources that are not public domain or CC0 (for example, Wikipedia). Instead, sketch out a differently-written summary of the information, and then link to the original information you're interested in having included. People who are comfortable rewriting extant material can then take these links and create a new work for the Just Solve wiki. Duplicating lists and names of formats is OK - make use of the work of others to track down lists of obscure file types to include in this wiki.

Wasn't it "Public Domain" initially?

This Wiki was straight "Public Domain" for the first few weeks of its existence - however, calling something "public domain" has some issues in locations where Public Domain is not recognized by local copyright law. Therefore, CC0 was added to remove this ambiguity. So in effect, this is still Public Domain.

What happens in the event of disagreement and conflict?

Please try and work together as best you can - the goal of this project is to absorb an enormous amount of information and arguing about a point of order doesn't help anyone. Unlike Wikipedia's need for a distinct voice, there is nothing stopping people from making a second paragraph set to explain an alternate overview of a format, or to add a set of other links that the entry didn't have before. It's understandable we're going to have issues about arrangement of materials, like whether a certain format is a "written" or "text" format. It's more important to get the existence of that format in, because the use of Categories and links as we best put them together will make sure the effort is not wasted.

In cases where an impasse is reached, Jason Scott will serve as editor. In the event Jason Scott appears to be unreasonable, a separate discussion can ensue on the talk page. The goal, again, is to get the information in, past prettiness, preciousness and perfection. Cleaning is a much easier process than acquiring and enumerating at this stage.

How can I help? I don't like to ______ and everyone seems to have that under control anyway.

The way I see it, there are two kinds of tasks that the Wiki encourages: tasks involving finding items for the Wiki (be they formats or information about them) and making the Wiki "better" in terms of organization and presentation.

For the tasks of finding items, go to the Sources page, where you will find a wide variety of locations and directories that will always need another set of eyes to find gems. The individual pages for the formats will have links to other examples or directories that reference that format specifically - those could always use more additions. Linking to an already-researched page that has an eye to describing and improving access to a format completely trumps creating such a page on the Wiki - the huge span of formats out there is the daunting part. So search far and wide and bring back knowledge!

For the task of improving the Wiki itself, there are a lot of cases where formatting could be improved, design could look nicer, and classification could be clearer. We expect the look of the place to change significantly over the month as people include MediaWiki Templates and other tricks to make it easier to browse and find information. Short, informative blurbs about each format help people find what they're looking for. Check it out.

Why are there so many darned file formats anyway?

This XKCD comic offers as good an explanation as any.

Here's a webinar giving some of the reasons behind file format multiplicity.
