DICOM

From Just Solve the File Format Problem
(Difference between revisions)
(Overview of file format)
m (Sample files)
 
(26 intermediate revisions by 6 users not shown)
Line 1: Line 1:
 
{{FormatInfo
 
{{FormatInfo
|subcat=Scientific Data formats
+
|subcat=Health and Medicine
|extensions={{ext|dcm}}, others
+
|extensions={{ext|dcm}}, {{ext|dic}}, {{noext}}, others
 +
|mimetypes={{mimetype|application/dicom}}, {{mimetype|image/dicom-rle}}
 +
|pronom={{PRONOM|fmt/574}}
 +
|kaitai struct=dicom
 +
|released=1985
 
}}
 
}}
 
+
'''DICOM''' (Digital Imaging and Communications in Medicine) is far and away the most widely-used (and probably the oldest) electronic file format in medical imaging. Nearly every device that acquires medical images – ultrasound, CT, PET, and MRI – acquires DICOM images in normal operation. There's a [http://medical.nema.org/standard.html 20-part specification] detailing the file format and its ecosystem. The IANA has assigned TCP and UDP port 104 to DICOM-related traffic.
== General description ==
+
 
+
'''DICOM''' (Digital Imaging and Communications in Medicine) is far and away the most widely-used (and probably the oldest) electronic file format in medical imaging. Nearly every device that acquires medical images -- ultrasound, CT, PET, and MRI -- acquire DICOM images in normal operation. There's a [http://medical.nema.org/standard.html 20-part specification] detailing the file format and its ecosystem. The IANA has assigned TCP and UDP port 104 to DICOM-related traffic.
+
  
 
It's kind of a big deal.
 
Line 12: Line 13:
 
However, as with any sufficiently-adopted standard, there are splinter factions. The most common format is 2-dimensional images or "slices" that can be formed into a 3-dimensional image; however, some manufacturers have extended the standard to save 3 or even 4-dimensional images in a "mosaic" format.
 
  
== File format ==
+
The earliest versions of the standard were known as ACR/NEMA, after the American College of Radiology and National Electrical Manufacturers Association.
  
While there are many complications involved in decoding a DICOM file, fundamentally it is simply a sequence of data blocks called ''attributes'' or ''elements''. Each attribute contains a 16-bit ''group number'' and a 16-bit ''element number'', conventionally written in hexadecimal and separated with a comma, e.g. (0028,0011).
+
== Format ==
 +
While there are many complications involved in decoding a DICOM file, fundamentally it is simply a sequence of data blocks called ''attributes'' or ''elements''. Each attribute contains a 16-bit ''group number'' and a 16-bit ''element number'', conventionally written in hexadecimal and separated with a comma, e.g. "(0028,0011)".
  
 
=== Standard attributes ===
 
Line 33: Line 35:
 
* http://svn.sourceforge.jp/svnroot/pgctn/pgctn/trunk/main_tree/dicomviewer/univiewer/dcmdict.txt
 
  
== Software ==
+
== Types of DICOM files ==
  
Software that reads DICOM files is pretty much everywhere. Most neuroimaging analysis packages have some way of importing DICOMs and turning them in to a higher-dimensional file; open-source stand-alone libraries abound, as well:
+
=== Little-endian vs. big-endian ===
 +
A DICOM file may use either little-endian or big-endian [[Endianness|byte order]] for certain representations of numbers. Little-endian is more common.
  
* [http://code.google.com/p/pydicom/ PyDICOM]
+
=== Explicit VR vs. Implicit VR ===
 +
A DICOM file may use either ''Explicit VR'' or ''Implicit VR'' format. VR stands for ''Value Representation''.
 +
 
 +
''Explicit VR'' means that each attribute has its data type stored in the file.
 +
 
 +
''Implicit VR'' means that the attribute types are not stored in the file. The decoder will have to use a data dictionary of its own to figure them out.
 +
 
 +
=== With header vs. Without header ===
 +
When stored on disk, DICOM files are supposed to begin with a header, though not all of them do. Files with a header are sometimes called '''Part 10''' files.
 +
 
 +
When a header is present, the file begins with a 128-byte preamble that is usually set to all zero bytes, but which may be used for application-specific purposes. The next 4 bytes are the ASCII signature "DICM". Following the signature is a set of "Group 2" attributes, in little-endian, explicit-VR format. After the Group 2 attributes is the main part of the file, using the format given by the Transfer Syntax UID (0002,0010) attribute. ("Transfer Syntax" is the DICOM term for "file format".)
 +
 
 +
Files without a header ''usually'' use Implicit VR, little-endian format.
 +
 
 +
=== Modality ===
 +
One of the most important attributes in a DICOM file is ''Modality'' (0008,0060). It indicates the type of data stored in the file, and often corresponds to the type of machine that created the file. For example, a modality of "MR" means MRI, and "US" means ultrasound. Different modalities have different required attributes, and may have different conventions for how to display images contained in the file, etc.
 +
 
 +
== Identifiers ==
 +
The most common filename extension is '''.dcm'''. Not all DICOM files have a filename extension.
 +
 
 +
== Identification ==
 +
DICOM files with a header have the ASCII signature "<code>DICM</code>" at byte offset 128.
 +
 
 +
Files without a header cannot be readily identified, though many begin with bytes <code>08 00 ?? 00</code>.
 +
 
 +
== Image formats ==
 +
If a DICOM file contains image data, it contains either a single image, or a video clip (usually composed of multiple still images all having the same size and color format). There is an extension called [[Papyrus (DICOM extension)|Papyrus]] that can store multiple different images in a single file.
 +
 
 +
The image format is determined by attribute (0002,0010): Transfer Syntax UID. If there is no such attribute, the image is uncompressed. Defined formats include:
 +
* [[Run-length encoding]]: UID 1.2.840.10008.1.2.5
 +
* [[DEFLATE]]: UID 1.2.840.10008.1.2.1.99
 +
* [[JPEG]] (lossy): UID 1.2.840.10008.1.2.4.50, etc.
 +
* [[Lossless JPEG (original)|Lossless JPEG]]: UID 1.2.840.10008.1.2.4.57, etc.
 +
* [[JPEG-LS]]: UID 1.2.840.10008.1.2.4.80 and .81
 +
* [[JPEG 2000]]: UID 1.2.840.10008.1.2.4.90, etc.
 +
* [[MPEG-2]]: UID 1.2.840.10008.1.2.4.100, etc.
 +
* [[H.264|MPEG-4 AVC/H.264]]: UID 1.2.840.10008.1.2.4.102, etc.
 +
 
 +
== Specifications ==
 +
* [http://medical.nema.org/standard.html The DICOM Standard]
 +
 
 +
== Software ==
 +
Software that reads DICOM files is pretty much everywhere. Most neuroimaging analysis packages have some way of importing DICOMs and turning them into a higher-dimensional file; open-source stand-alone libraries abound, as well.
 +
 
 +
* [http://www.pydicom.org/ Pydicom]
 
* [http://www.cabiatl.com/mricro/mricron/dcm2nii.html dcm2nii]
 
 
* [http://sourceforge.net/apps/mediawiki/gdcm/index.php?title=Main_Page Grassroots DICOM]
 
 +
* [http://www.healthcare.philips.com/main/about/connectivity/ Philips DICOM Viewer] For Microsoft Windows. Linked to in the sidebar of pages such as this one.
 +
* [[ImageMagick]] (read-only)
 +
* [[XnView]]
 +
* [http://snisurset.net/code/abydos/ abydos]
 +
 +
== Sample files ==
 +
* [http://www.aycan.de/lp/sample-dicom-images.html Examples of DICOM images]
 +
* {{DexvertSamples|image/dicom}}
 +
 +
== Links ==
 +
* [http://medical.nema.org/ DICOM home page]
 +
* [[Wikipedia:DICOM|Wikipedia article]]
 +
* [http://www.textfiles.com/programming/FORMATS/acr-nema.txt A little bit of discussion]
 +
* [https://orthanc.chu.ulg.ac.be/book/dicom-guide.html Understanding DICOM with Orthanc], "a gentle, informal, high-level introduction to DICOM"
 +
* [https://www.vladsiv.com/dicom-file-format-basics/ DICOM File Format Basics]
 +
 +
[[Category:Scientific Data formats]]
 +
[[Category:Graphics]]

Latest revision as of 04:20, 28 December 2023

File Format
Name DICOM
Ontology
Extension(s) .dcm, .dic, (none), others
MIME Type(s) application/dicom, image/dicom-rle
PRONOM fmt/574
Kaitai Struct Spec dicom.ksy
Released 1985

DICOM (Digital Imaging and Communications in Medicine) is by far the most widely used (and probably the oldest) electronic file format in medical imaging. Nearly every device that acquires medical images, including ultrasound, CT, PET, and MRI scanners, produces DICOM images in normal operation. There's a 20-part specification detailing the file format and its ecosystem. The IANA has assigned TCP and UDP port 104 to DICOM-related traffic.

It's kind of a big deal.

However, as with any sufficiently adopted standard, there are splinter factions. The most common arrangement is a series of 2-dimensional images, or "slices", that can be assembled into a 3-dimensional image; however, some manufacturers have extended the standard to save 3- or even 4-dimensional images in a "mosaic" format.

The earliest versions of the standard were known as ACR/NEMA, after the American College of Radiology and National Electrical Manufacturers Association.


Format

While there are many complications involved in decoding a DICOM file, fundamentally it is simply a sequence of data blocks called attributes or elements. Each attribute contains a 16-bit group number and a 16-bit element number, conventionally written in hexadecimal and separated with a comma, e.g. "(0028,0011)".
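As a minimal sketch of the tag notation described above, the following Python reads two 16-bit numbers and prints them in the conventional form. The function names `read_tag` and `format_tag` are made up for this illustration, and little-endian byte order is assumed as the default.

```python
import struct

def format_tag(group, element):
    """Render a (group, element) pair in the conventional hex notation."""
    return f"({group:04X},{element:04X})"

def read_tag(data, offset=0, little_endian=True):
    """Read two consecutive 16-bit unsigned integers as (group, element)."""
    fmt = "<HH" if little_endian else ">HH"
    return struct.unpack_from(fmt, data, offset)

# Little-endian bytes of the example tag (0028,0011).
raw = bytes([0x28, 0x00, 0x11, 0x00])
group, element = read_tag(raw)
print(format_tag(group, element))
```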

Standard attributes

If an attribute's group number is even, then it is a standard attribute defined in the DICOM specification, and the group and element number together uniquely identify the meaning of the attribute.

Private attributes

If the group number is odd, then it is a private attribute, and it will have been preceded by a special attribute supplying a "private creator" identification string. A private attribute is uniquely identified by the combination of its creator identifier, group number, and the low byte of its element number.

Some examples of creator identifiers are GEMS_IMAG_01 and Philips Imaging DD 001. An identifier is usually specific to a manufacturer of medical equipment, not to a particular medical device. Unfortunately, instead of having one specification per manufacturer, private attributes are usually only documented in device-specific "DICOM Conformance Statements", which list only the attributes used by that one device.
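The even/odd rule and the private-attribute identity can be sketched in a few lines of Python. This is only an illustration: the helper names are invented here, the element numbers in the example are chosen arbitrarily, and the sketch relies on the convention (not spelled out above) that the creator string for a block of private elements is stored at element number 00bb, where bb is the high byte of the private element numbers it covers.

```python
def is_private(group):
    """Odd group numbers mark private attributes; even ones are standard."""
    return group % 2 == 1

def creator_tag_for(group, element):
    """Tag of the "private creator" attribute covering this private element.

    Assumes the creator for elements (gggg,bb00)-(gggg,bbFF) lives at (gggg,00bb).
    """
    return (group, element >> 8)

def private_key(creator, group, element):
    """Identity of a private attribute: creator, group, low byte of element."""
    return (creator, group, element & 0xFF)

# Illustrative data set: one creator string and one private element.
dataset = {
    (0x0027, 0x0010): "GEMS_IMAG_01",   # private creator
    (0x0027, 0x1030): b"example value", # hypothetical private element
}

tag = (0x0027, 0x1030)
if is_private(tag[0]):
    creator = dataset[creator_tag_for(*tag)]
    print(private_key(creator, *tag))
```

Keying on the low byte rather than the full element number matters because the block number (the high byte) can differ from file to file, depending on which creator slots were already taken.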

Examples of DICOM Conformance Statements (search the documents for "private creator"):

Compilations:

Types of DICOM files

Little-endian vs. big-endian

A DICOM file may use either little-endian or big-endian byte order for multi-byte binary values, such as group and element numbers, lengths, and integer or floating-point data. Little-endian is more common.
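To make the difference concrete, here is a small Python sketch that decodes the same two bytes as a 16-bit unsigned integer under each byte order. The byte values are arbitrary and chosen only for illustration.

```python
import struct

raw = b"\x00\x02"  # two bytes holding a 16-bit unsigned integer

little = struct.unpack("<H", raw)[0]  # least significant byte first
big = struct.unpack(">H", raw)[0]     # most significant byte first

print(little, big)  # prints "512 2"
```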

Explicit VR vs. Implicit VR

A DICOM file may use either Explicit VR or Implicit VR format. VR stands for Value Representation.

Explicit VR means that each attribute has its data type stored in the file.

Implicit VR means that the attribute types are not stored in the file. The decoder will have to use a data dictionary of its own to figure them out.
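The following Python sketch shows how a decoder might read one little-endian element in each mode. It is a simplified illustration, not a complete parser: the field layout (a 2-character type code followed by a 16-bit or 32-bit length in explicit VR; a bare 32-bit length in implicit VR) reflects the standard's encoding rules as commonly described, undefined-length elements are not handled, and `TINY_DICTIONARY` is a two-entry stand-in for a real data dictionary.

```python
import struct

# VRs that, in explicit-VR encoding, use 2 reserved bytes and a 32-bit length.
LONG_VRS = {"OB", "OD", "OF", "OL", "OV", "OW", "SQ",
            "SV", "UC", "UN", "UR", "UT", "UV"}

# Stand-in for the data dictionary an implicit-VR decoder must supply itself.
TINY_DICTIONARY = {(0x0008, 0x0060): "CS", (0x0028, 0x0011): "US"}

def read_element(data, offset, explicit_vr):
    """Parse one little-endian element; return (tag, vr, value, next_offset)."""
    group, element = struct.unpack_from("<HH", data, offset)
    offset += 4
    if explicit_vr:
        # The type is stored in the file as two ASCII characters.
        vr = data[offset:offset + 2].decode("ascii")
        if vr in LONG_VRS:
            (length,) = struct.unpack_from("<I", data, offset + 4)
            offset += 8
        else:
            (length,) = struct.unpack_from("<H", data, offset + 2)
            offset += 4
    else:
        # The type is not stored; look it up, falling back to "UN" (unknown).
        vr = TINY_DICTIONARY.get((group, element), "UN")
        (length,) = struct.unpack_from("<I", data, offset)
        offset += 4
    value = data[offset:offset + length]
    return (group, element), vr, value, offset + length

# The same Modality attribute, encoded both ways.
explicit = struct.pack("<HH2sH", 0x0008, 0x0060, b"CS", 2) + b"MR"
implicit = struct.pack("<HHI", 0x0008, 0x0060, 2) + b"MR"

print(read_element(explicit, 0, explicit_vr=True))
print(read_element(implicit, 0, explicit_vr=False))
```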

With header vs. Without header

When stored on disk, DICOM files are supposed to begin with a header, though not all of them do. Files with a header are sometimes called Part 10 files.

When a header is present, the file begins with a 128-byte preamble that is usually set to all zero bytes, but which may be used for application-specific purposes. The next 4 bytes are the ASCII signature "DICM". Following the signature is a set of "Group 2" attributes (the File Meta Information), in little-endian, explicit-VR format. After the Group 2 attributes is the main part of the file, using the format given by the Transfer Syntax UID (0002,0010) attribute. ("Transfer Syntax" is the DICOM term for how the data is encoded: byte order, explicit or implicit VR, and any compression of the image data.)

Files without a header usually use Implicit VR, little-endian format.
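The header layout described above can be sketched in Python as follows. This is a hedged illustration under simplifying assumptions: the function names are invented, the sample "file" is built in memory rather than read from disk, and truncated or malformed input is not handled beyond the signature check.

```python
import struct

# VRs that use 2 reserved bytes and a 32-bit length in explicit-VR encoding.
LONG_VRS = {"OB", "OD", "OF", "OL", "OV", "OW", "SQ",
            "SV", "UC", "UN", "UR", "UT", "UV"}

def read_file_meta(data):
    """Return ({tag: raw value} for Group 2, offset of the main data set)."""
    if len(data) < 132 or data[128:132] != b"DICM":
        raise ValueError("no preamble and DICM signature found")
    offset = 132
    meta = {}
    while offset + 8 <= len(data):
        group, element = struct.unpack_from("<HH", data, offset)
        if group != 0x0002:
            break  # first attribute of the main data set
        vr = data[offset + 4:offset + 6].decode("ascii")
        if vr in LONG_VRS:
            (length,) = struct.unpack_from("<I", data, offset + 8)
            start = offset + 12
        else:
            (length,) = struct.unpack_from("<H", data, offset + 6)
            start = offset + 8
        meta[(group, element)] = data[start:start + length]
        offset = start + length
    return meta, offset

def transfer_syntax_uid(meta):
    """Decode attribute (0002,0010), dropping the padding byte if present."""
    raw = meta.get((0x0002, 0x0010), b"")
    return raw.rstrip(b"\x00 ").decode("ascii")

# Synthetic file: preamble, signature, one Group 2 attribute, one Group 8 attribute.
uid = b"1.2.840.10008.1.2.1\x00"  # padded to an even length
sample = (bytes(128) + b"DICM"
          + struct.pack("<HH2sH", 0x0002, 0x0010, b"UI", len(uid)) + uid
          + struct.pack("<HH2sH", 0x0008, 0x0060, b"CS", 2) + b"MR")

meta, body_offset = read_file_meta(sample)
print(transfer_syntax_uid(meta), body_offset)
```

A real decoder would go on to parse the bytes from `body_offset` onward using whatever byte order and VR mode the returned UID implies.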

Modality

One of the most important attributes in a DICOM file is Modality (0008,0060). It indicates the type of data stored in the file, and often corresponds to the type of machine that created the file. For example, a modality of "MR" means MRI, and "US" means ultrasound. Different modalities have different required attributes, and may have different conventions for how to display images contained in the file, etc.

Identifiers

The most common filename extension is .dcm. Not all DICOM files have a filename extension.

Identification

DICOM files with a header have the ASCII signature "DICM" at byte offset 128.

Files without a header cannot be readily identified, though many begin with bytes 08 00 ?? 00, which is the little-endian encoding of a tag in group 0008 with a small element number.
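A minimal Python sketch of these two checks follows. The function name and the returned labels are invented for this example, and the second check is only a heuristic, so a match should be treated as a hint rather than a positive identification.

```python
def sniff_dicom(data):
    """Classify the start of a file as 'part10', 'maybe-headerless', or 'unknown'."""
    # Signature check: "DICM" right after the 128-byte preamble.
    if len(data) >= 132 and data[128:132] == b"DICM":
        return "part10"
    # Heuristic: bytes 08 00 ?? 00 at the very start of the file.
    if len(data) >= 4 and data[0] == 0x08 and data[1] == 0x00 and data[3] == 0x00:
        return "maybe-headerless"
    return "unknown"

print(sniff_dicom(bytes(128) + b"DICM"))
print(sniff_dicom(b"\x08\x00\x05\x00\x0a\x00\x00\x00"))
print(sniff_dicom(b"GIF89a"))
```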

Image formats

If a DICOM file contains image data, it contains either a single image, or a video clip (usually composed of multiple still images all having the same size and color format). There is an extension called Papyrus that can store multiple different images in a single file.

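The image format is determined by attribute (0002,0010): Transfer Syntax UID. If there is no such attribute, the image is uncompressed. Defined formats include:
* Run-length encoding: UID 1.2.840.10008.1.2.5
* DEFLATE: UID 1.2.840.10008.1.2.1.99
* JPEG (lossy): UID 1.2.840.10008.1.2.4.50, etc.
* Lossless JPEG: UID 1.2.840.10008.1.2.4.57, etc.
* JPEG-LS: UID 1.2.840.10008.1.2.4.80 and .81
* JPEG 2000: UID 1.2.840.10008.1.2.4.90, etc.
* MPEG-2: UID 1.2.840.10008.1.2.4.100, etc.
* MPEG-4 AVC/H.264: UID 1.2.840.10008.1.2.4.102, etc.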
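A decoder might turn this rule into a simple lookup, sketched here in Python. The table holds only the UIDs this article names explicitly (the "etc." variants are left out), and the function name and fallback strings are invented for the example.

```python
# UID-to-format pairs named in this article; not a complete registry.
ENCODINGS = {
    "1.2.840.10008.1.2.5": "Run-length encoding",
    "1.2.840.10008.1.2.1.99": "DEFLATE",
    "1.2.840.10008.1.2.4.50": "JPEG (lossy)",
    "1.2.840.10008.1.2.4.57": "Lossless JPEG",
    "1.2.840.10008.1.2.4.80": "JPEG-LS",
    "1.2.840.10008.1.2.4.81": "JPEG-LS",
    "1.2.840.10008.1.2.4.90": "JPEG 2000",
    "1.2.840.10008.1.2.4.100": "MPEG-2",
    "1.2.840.10008.1.2.4.102": "MPEG-4 AVC/H.264",
}

def describe_transfer_syntax(uid):
    """Name the image encoding for a Transfer Syntax UID (None if absent)."""
    if uid is None:
        # Per the rule above: no (0002,0010) attribute means uncompressed.
        return "uncompressed"
    return ENCODINGS.get(uid, "not in this table")

print(describe_transfer_syntax("1.2.840.10008.1.2.4.90"))
print(describe_transfer_syntax(None))
```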

Specifications

Software

Software that reads DICOM files is pretty much everywhere. Most neuroimaging analysis packages have some way of importing DICOMs and turning them into a higher-dimensional file; open-source stand-alone libraries abound, as well.

Sample files

Links
