ID3

From Just Solve the File Format Problem
(Difference between revisions)
(add link)
(adding informal formats; more correct way to skip ID3v2)
 
(3 intermediate revisions by one user not shown)
Line 5: Line 5:
 
|locfdd={{LoCFDD|fdd000106}}, {{LoCFDD|fdd000107}}, {{LoCFDD|fdd000108}}
 
|locfdd={{LoCFDD|fdd000106}}, {{LoCFDD|fdd000107}}, {{LoCFDD|fdd000108}}
 
|released=1996
 
|released=1996
 +
|kaitai struct=id3v1_1
 +
|wikidata={{wikidata|Q1054220}}
 
}}
 
}}
 
'''ID3''', or '''ID3 tag''', is a metadata format, mainly used in [[MP3]] audio files. It stores information such as the song title, artist, and album.
 
'''ID3''', or '''ID3 tag''', is a metadata format, mainly used in [[MP3]] audio files. It stores information such as the song title, artist, and album.
Line 18: Line 20:
  
 
== Identification ==
 
== Identification ==
For an MP3 file with an ID3v1 tag, ASCII "{{magic|TAG}}" appears beginning 128 bytes from the end of the file.
+
* For an MP3 file with an ID3v1 tag, ASCII "{{magic|TAG}}" appears beginning 128 bytes from the end of the file.
 
+
* An MP3 file with an ID3v2 tag usually begins with ASCII "{{magic|ID3}}".
An MP3 file with an ID3v2 tag usually begins with ASCII "{{magic|ID3}}".
+
* Alternatively, the signature "{{magic|3DI}}" could appear 10 bytes from the ''end'' of the file, or 138 bytes from the end of the file if there is also an ID3v1 tag. This is much less likely.
 
+
Alternatively, the signature "{{magic|3DI}}" could appear 10 bytes from the ''end'' of the file, or 138 bytes from the end of the file if there is also an ID3v1 tag. This is much less likely.
+
  
 
There are other (rare) ways to use ID3v2, not covered by the above identification logic.
 
There are other (rare) ways to use ID3v2, not covered by the above identification logic.
Line 28: Line 28:
 
=== How to skip past an ID3v2 segment ===
 
=== How to skip past an ID3v2 segment ===
 
To identify an audio file's format, it is best to skip past any ID3v2 segment at the beginning of the file before looking for a magic signature, and not just assume that ID3 implies MP3. Unfortunately, doing so is not trivial. Here is an attempt to summarize the algorithm:
 
To identify an audio file's format, it is best to skip past any ID3v2 segment at the beginning of the file before looking for a magic signature, and not just assume that ID3 implies MP3. Unfortunately, doing so is not trivial. Here is an attempt to summarize the algorithm:
* Let OFFSET = 0.
+
# Let OFFSET = 0.
* Read and remember the first 10 bytes of the file.
+
# Read and remember the first 10 bytes of the file.
* If bytes 0-2 are not ASCII "ID3", stop. An ID3v2 segment is not present.
+
# If bytes 0-2 are not ASCII "ID3", stop. An ID3v2 segment is not present.
* Let OFFSET = 10 (for the 10-byte header).
+
# Let OFFSET = 10 (for the 10-byte header).
* Decode bytes 6-9 as a 32-bit "synchsafe int" (refer to any ID3v2 spec). Let OFFSET = OFFSET + this decoded int.
+
# Decode bytes 6-9 as a 32-bit "synchsafe int" (which are 28 bit of size, because you only read 7 bit of each 4 bytes, refer to any ID3v2 spec). Let OFFSET = OFFSET + this decoded int.
* If the 0x10 bit of byte 5 is set, let OFFSET = OFFSET + 10 (for the footer).
+
# If the 0x10 bit of byte 5 is set, the file's last 10 bytes have an ID3 footer, which is not audio data.
OFFSET is now the file offset of the payload audio data.
+
# OFFSET is now behind the ID3v2 tag. Although uncommon, the docs nowhere forbid to have multiple ID3v2 tags (f.e. v2.2 and v2.3 following), so the whole scannings may start anew at the current offset and you should go back to 1. while your true exit condition is 3. when not seeing the "ID3" magic.
 +
 
 +
== Similar/Depending, but informal formats ==
 +
All are in front of an ID3v1.0 or ID3v1.1 tag at the end of the file, but there's no priority between them and they're somewhat mutually exclusive:
 +
* [http://www.birdcagesoft.com/ID3v12.txt ID3.1v2], extending ID3v1 fields by 15-30 more bytes. Identification: {{magic|EXT}} 128 bytes in front.
 +
* [https://web.archive.org/web/20120310015458/http://www.fortunecity.com/underworld/sonic/3/id3tag.html Enhanced TAG], extending ID3v1 fields by 60 more bytes. Identification: {{magic|TAG+}} 227 bytes in front.
 +
* [https://web.archive.org/web/20220217102229/https://id3.org/Lyrics3 Lyrics3], adding subtitles=lyrics with a precision down to 1 second. Identification: {{magic|LYRICSEND}} in front.
 +
* [https://web.archive.org/web/20220416222957/https://id3.org/Lyrics3v2 Lyrics3v2], adding subtitles=lyrics, allowing up to 250 bytes for fields, adding cover picture. Identification: {{magic|LYRICS200}} in front.
  
 
== Specifications ==
 
== Specifications ==
Line 47: Line 54:
 
* [http://wiki.hydrogenaud.io/index.php?title=ID3v1.1 Hydrogenaudio Knowledgebase: ID3v1.1]
 
* [http://wiki.hydrogenaud.io/index.php?title=ID3v1.1 Hydrogenaudio Knowledgebase: ID3v1.1]
 
* [http://wiki.hydrogenaud.io/index.php?title=ID3v2 Hydrogenaudio Knowledgebase: ID3v2]
 
* [http://wiki.hydrogenaud.io/index.php?title=ID3v2 Hydrogenaudio Knowledgebase: ID3v2]
 +
* [https://formats.kaitai.io/id3v1_1/ Kaitai Struct: ID3v1.1]
 +
* [https://formats.kaitai.io/id3v2_3/ Kaitai Struct: ID3v2.3]
 +
* [https://formats.kaitai.io/id3v2_4/ Kaitai Struct: ID3.2.4]
  
 
== Software ==
 
== Software ==

Latest revision as of 12:18, 20 October 2023

File Format
Name ID3
Ontology
Extension(s) .mp3, others
LoCFDD fdd000106, fdd000107, fdd000108
Wikidata ID Q1054220
Kaitai Struct Spec id3v1_1.ksy
Released 1996

ID3, or ID3 tag, is a metadata format, mainly used in MP3 audio files. It stores information such as the song title, artist, and album.

Although designed for use with (and named after) MP3, ID3 is sometimes used with other audio formats. This can be done in two fundamental ways:

  • Embedding the ID3 data inside the file, in a manner appropriate for that audio format. For example, ID3 data has been found embedded in WMA files.
  • Prepending and/or appending the ID3 data to the file, MP3-style. This practice is not necessarily approved by any standard, but it has been done, for example with Ogg and FLAC. Here, the tail is wagging the dog, and ID3 can be thought of as a container format for an arbitrary audio format.

Format details

There are two major versions. ID3v1 defines a fixed-length data block that is always placed at the end of the file. ID3v2, which has very little in common with ID3v1, defines a block with variable-length frames and allows more flexibility and verbosity. ID3v2 data usually appears at the beginning of the file. It is possible, and common, for a file to have both ID3v1 and ID3v2 metadata.

As of 2017, there are three versions of ID3v2 to be aware of: v2.2.x, v2.3.x, and v2.4.x. These formats have some critical differences, and are definitely not compatible with each other.

Identification

  • For an MP3 file with an ID3v1 tag, ASCII "TAG" appears beginning 128 bytes from the end of the file.
  • An MP3 file with an ID3v2 tag usually begins with ASCII "ID3".
  • Alternatively, the signature "3DI" could appear 10 bytes from the end of the file, or 138 bytes from the end of the file if there is also an ID3v1 tag. This is much less likely.

There are other (rare) ways to use ID3v2, not covered by the above identification logic.
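
The three signature checks listed above can be sketched in Python as follows. This is only an illustration of the stated offsets, not a complete detector; the function name `detect_id3` and the keys of the returned dictionary are made up for this example.

```python
def detect_id3(data: bytes) -> dict:
    """Check the three signature positions described above.

    `data` is the complete file content. The function name and the
    result keys are hypothetical, chosen only for this sketch.
    """
    # ID3v1: "TAG" begins 128 bytes from the end of the file.
    has_v1 = len(data) >= 128 and data[-128:-125] == b"TAG"

    # ID3v2: the file usually begins with "ID3".
    has_v2_header = data[:3] == b"ID3"

    # Appended ID3v2: "3DI" begins 10 bytes from the end, or 138 bytes
    # from the end when a 128-byte ID3v1 tag follows it.
    distance = 138 if has_v1 else 10
    start = len(data) - distance
    has_v2_footer = start >= 0 and data[start:start + 3] == b"3DI"

    return {
        "id3v1": has_v1,
        "id3v2_header": has_v2_header,
        "id3v2_footer": has_v2_footer,
    }
```

A caller would read the file in binary mode and pass its bytes, for example `detect_id3(open(path, "rb").read())`, where `path` is whatever file is being examined.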

How to skip past an ID3v2 segment

To identify an audio file's format, it is best to skip past any ID3v2 segment at the beginning of the file before looking for a magic signature, and not just assume that ID3 implies MP3. Unfortunately, doing so is not trivial. Here is an attempt to summarize the algorithm:

  1. Let OFFSET = 0.
  2. Read and remember the first 10 bytes of the file.
  3. If bytes 0-2 are not ASCII "ID3", stop. An ID3v2 segment is not present.
  4. Let OFFSET = 10 (for the 10-byte header).
  5. Decode bytes 6-9 as a 32-bit "synchsafe int": only the low 7 bits of each of the 4 bytes are used, so the decoded value is 28 bits wide (refer to any ID3v2 spec). Let OFFSET = OFFSET + this decoded int.
  6. If the 0x10 bit of byte 5 is set (the ID3v2.4 footer flag), the tag is followed by a 10-byte ID3 footer, which is not audio data and is not counted in the size field. Let OFFSET = OFFSET + 10.
  7. OFFSET now points just past the ID3v2 tag. Although uncommon, the specs do not forbid multiple consecutive ID3v2 tags (e.g. a v2.2 tag followed by a v2.3 tag), so repeat the whole procedure, treating the current OFFSET as the new start of the data. The true exit condition is step 3, when the "ID3" magic is not found.
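
A minimal Python sketch of this procedure is given below. The function names `synchsafe_to_int` and `skip_id3v2` are hypothetical. The sketch assumes the layout described in the ID3v2.4 spec, where a footer, when flagged, immediately follows the tag data and is not counted in the size field, so it adds 10 bytes for it. It also loops to handle consecutive tags, and it masks the high bit of each size byte rather than rejecting invalid values.

```python
def synchsafe_to_int(b: bytes) -> int:
    """Decode a 4-byte synchsafe integer (7 useful bits per byte, 28 bits total)."""
    if len(b) != 4:
        raise ValueError("a synchsafe size field is exactly 4 bytes")
    return (
        ((b[0] & 0x7F) << 21)
        | ((b[1] & 0x7F) << 14)
        | ((b[2] & 0x7F) << 7)
        | (b[3] & 0x7F)
    )


def skip_id3v2(data: bytes) -> int:
    """Return the offset of the first byte after any leading ID3v2 tags."""
    offset = 0
    while True:
        header = data[offset:offset + 10]
        # Exit condition: no complete header, or no "ID3" magic here.
        if len(header) < 10 or header[:3] != b"ID3":
            return offset
        # 10-byte header plus the synchsafe-encoded tag size.
        offset += 10 + synchsafe_to_int(header[6:10])
        # Assumption (ID3v2.4): flag bit 0x10 means a 10-byte footer follows.
        if header[5] & 0x10:
            offset += 10
```

The returned offset may lie beyond the end of a truncated file, so a caller should compare it against `len(data)` before looking for the audio format's own magic signature there.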

Similar/Depending, but informal formats

Each of these is placed directly in front of an ID3v1.0 or ID3v1.1 tag at the end of the file. There is no defined priority between them, and they are somewhat mutually exclusive:

  • ID3v1.2, extending ID3v1 fields by 15-30 more bytes. Identification: "EXT" begins 128 bytes before the ID3v1 tag.
  • Enhanced TAG, extending ID3v1 fields by 60 more bytes. Identification: "TAG+" begins 227 bytes before the ID3v1 tag.
  • Lyrics3, adding lyrics (subtitles) with a precision down to 1 second. Identification: "LYRICSEND" immediately before the ID3v1 tag.
  • Lyrics3v2, adding lyrics (subtitles), allowing up to 250 bytes for fields, and adding a cover picture. Identification: "LYRICS200" immediately before the ID3v1 tag.
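
As a rough illustration, the identification rules above could be checked as in the following Python sketch. It assumes that "in front" means the distance from the start of the signature to the start of the 128-byte ID3v1 tag, and that the Lyrics3 end markers sit immediately before that tag. The function names and the returned labels are invented for this example.

```python
# (label, distance from signature start to ID3v1 tag start, signature)
_INFORMAL_SIGNATURES = [
    ("ID3v1.2", 128, b"EXT"),
    ("Enhanced TAG", 227, b"TAG+"),
    ("Lyrics3", 9, b"LYRICSEND"),
    ("Lyrics3v2", 9, b"LYRICS200"),
]


def _magic_at(body: bytes, distance: int, magic: bytes) -> bool:
    """True if `magic` starts `distance` bytes before the end of `body`."""
    start = len(body) - distance
    return start >= 0 and body[start:start + len(magic)] == magic


def detect_informal_extensions(data: bytes) -> list:
    """List the informal extensions found in front of a trailing ID3v1 tag."""
    # All of these formats depend on an ID3v1 tag at the end of the file.
    if len(data) < 128 or data[-128:-125] != b"TAG":
        return []
    body = data[:-128]  # everything in front of the ID3v1 tag
    return [
        label
        for label, distance, magic in _INFORMAL_SIGNATURES
        if _magic_at(body, distance, magic)
    ]
```

The sketch reports every signature it finds and leaves any decision about precedence to the caller, since no priority between the formats is defined.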

Specifications

Software

[Ed. note: There are many utilities that can read and write ID3 tags, including Windows Explorer to some extent. We suggest searching the web.]

Resources
