Open Document Text

From Just Solve the File Format Problem
(Difference between revisions)
(Created page with "'''Open Document Text''' is actually a zip archive with XML files describing text and relationships and JPEG, PNG, and other graphical files for pictures and o...")
 
(Change to redirect now that content is merged)
 
Line 1: Line 1:
An '''Open Document Text''' file is a [[zip archive]] containing [[XML]] files that describe the text and its relationships, plus [[JPEG]], [[PNG]], and other graphics files for pictures and other media included in the document.
+
#REDIRECT [[OpenDocument Text]]
 
+
A typical ODT file has the following layout:
+
* META-INF
+
** manifest.xml
+
* Thumbnails
+
** thumbnail.png
+
* content.xml
+
* manifest.rdf
+
* meta.xml
+
* mimetype
+
* settings.xml
+
* styles.xml
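
Because the container is an ordinary zip archive, the layout above can be inspected with any zip library. The following is a minimal sketch using Python's standard <code>zipfile</code> module; the function name <code>list_odt_members</code> is hypothetical and chosen only for illustration.

```python
import zipfile


def list_odt_members(odt):
    """Return the names of all entries stored in an ODT archive.

    `odt` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(odt) as archive:
        # namelist() returns entries in the order they appear in the archive.
        return archive.namelist()
```

Calling it on a real document should give names such as <code>content.xml</code> and <code>META-INF/manifest.xml</code>, though the exact set depends on the software that wrote the file.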
+
 
+
==Inner files description==
+
===manifest.xml===
+
Lists all the other XML files in the document. For a simple document, its contents may be something like:
+
 
+
<?xml version="1.0" encoding="UTF-8"?>
+
<!DOCTYPE manifest:manifest PUBLIC "-//OpenOffice.org//DTD Manifest 1.0//EN" "Manifest.dtd">
+
<manifest:manifest xmlns:manifest="urn:oasis:names:tc:opendocument:xmlns:manifest:1.0">
+
  <manifest:file-entry manifest:media-type="application/vnd.oasis.opendocument.text" manifest:full-path="/"/>
+
  <manifest:file-entry manifest:media-type="text/xml" manifest:full-path="content.xml"/>
+
  <manifest:file-entry manifest:media-type="text/xml" manifest:full-path="styles.xml"/>
+
  <manifest:file-entry manifest:media-type="text/xml" manifest:full-path="meta.xml"/>
+
  <manifest:file-entry manifest:media-type="text/xml" manifest:full-path="settings.xml"/>
+
</manifest:manifest>
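
A manifest like the one above can be read with a namespace-aware XML parser. The sketch below assumes the raw bytes of <code>META-INF/manifest.xml</code> have already been extracted from the archive; <code>manifest_entries</code> is a hypothetical helper name, not part of any ODF tooling.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the xmlns:manifest declaration in the example.
MANIFEST_NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"


def manifest_entries(manifest_xml):
    """Map each manifest:full-path to its manifest:media-type.

    `manifest_xml` is the raw bytes of META-INF/manifest.xml.
    """
    root = ET.fromstring(manifest_xml)
    entries = {}
    for entry in root.iter(f"{{{MANIFEST_NS}}}file-entry"):
        path = entry.get(f"{{{MANIFEST_NS}}}full-path")
        media_type = entry.get(f"{{{MANIFEST_NS}}}media-type")
        entries[path] = media_type
    return entries
```

The result is a plain dictionary, so checking whether <code>content.xml</code> is declared is a simple key lookup.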
+
 
+
===content.xml===
+
This is the file that contains all the text in the document.
+
 
+
The root element is always &lt;office:document-content&gt;. To reach the text without metadata, descend through the following hierarchy:
+
* office:document-content
+
** office:body
+
*** office:text
+
 
+
There you will find tags in the ''text'' namespace that either mirror HTML in their names or are, for the most part, self-explanatory. Some examples are:
+
* text:p - paragraph
+
* text:list - a list that contains one or more text:list-item elements
+
* text:list-item - a single item of the list
+
 
+
Each text tag may have a text:style-name attribute that links it to a style defined in office:document-content > office:automatic-styles > style:style.
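
The hierarchy described in this section can be walked with a namespace-aware parser. The following is a minimal sketch that collects every text:p element under office:text together with its text:style-name attribute; it assumes the raw bytes of <code>content.xml</code> are already available, and the helper name <code>paragraphs</code> is hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard OpenDocument namespace URIs for the two prefixes used here.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}


def paragraphs(content_xml):
    """Return (style name, text) pairs for every text:p in the document body."""
    root = ET.fromstring(content_xml)  # root is office:document-content
    body_text = root.find("office:body/office:text", NS)
    if body_text is None:
        return []
    style_attr = f"{{{NS['text']}}}style-name"
    return [
        # itertext() also gathers text held in nested elements such as spans.
        (p.get(style_attr), "".join(p.itertext()))
        for p in body_text.iter(f"{{{NS['text']}}}p")
    ]
```

Because <code>iter</code> searches all descendants, paragraphs nested inside text:list-item elements are returned as well, in document order.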
+
 
+
===manifest.rdf===
+
[[RDF]] metadata. Most often the contents are just:
+
 
+
  <?xml version="1.0" encoding="utf-8"?>
+
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
+
  </rdf:RDF>
+
 
+
===meta.xml===
+
This is the metadata that somebody fills in to describe the document or that is automatically recorded by the software. The root element is always office:document-meta. The contents are defined rather loosely; editing software is advised not to delete tags that it doesn't recognise, since other software may be using them. In practice, deleting all the contents of office:document-meta > office:meta will not damage the document, so it can be considered non-essential information.
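
Since the set of metadata tags is loosely defined, a reader can simply enumerate whatever office:meta contains rather than expect fixed fields. The sketch below does that under the assumption that the raw bytes of <code>meta.xml</code> are at hand; <code>meta_fields</code> is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"


def meta_fields(meta_xml):
    """Return (qualified tag, text) pairs for each direct child of office:meta."""
    root = ET.fromstring(meta_xml)  # root is office:document-meta
    meta = root.find(f"{{{OFFICE_NS}}}meta")
    if meta is None:
        return []
    # Tags come back in ElementTree's "{namespace-uri}local-name" form.
    return [(child.tag, child.text) for child in meta]
```

Unrecognised tags are returned untouched, which matches the advice that software should not discard metadata it does not understand.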
+
 
+
===mimetype===
+
A plain-text file whose entire content is:
+
  application/vnd.oasis.opendocument.text
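
The mimetype entry gives a cheap way to tell an ODT file apart from other zip archives. The following is a minimal sketch of such a check; the function name <code>is_odt</code> is hypothetical, and stripping surrounding whitespace is a leniency added here as an assumption, not something the format requires.

```python
import zipfile

ODT_MIMETYPE = "application/vnd.oasis.opendocument.text"


def is_odt(odt):
    """Return True if the archive's mimetype entry declares an ODT document.

    `odt` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(odt) as archive:
        try:
            declared = archive.read("mimetype")
        except KeyError:
            # No mimetype entry in the archive at all.
            return False
    # Assumption: tolerate stray whitespace or a trailing newline.
    return declared.decode("ascii", errors="replace").strip() == ODT_MIMETYPE
```

A zip archive without a mimetype entry, or with a different declared type, is reported as not being an ODT document.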
+
 
+
===settings.xml===
+
Software-specific settings of the document. The root tag is &lt;office:document-settings&gt;. No inner contents are required for a functioning document.
+
 
+
===styles.xml===
+
Non-automatic document styles, held in the &lt;office:document-styles&gt; tag.
+

Latest revision as of 23:08, 14 April 2015

  1. REDIRECT OpenDocument Text