Scientific Data formats

From Just Solve the File Format Problem

Revision as of 16:23, 3 November 2012


General

  • cdf (Common Data Format)
  • EAS3 (binary file format for structured data)
  • hdf (Hierarchical Data Format, from NASA)
  • NetCDF (Network Common Data Format)
  • SDF (Simple Data Format, a platform-independent, precision-preserving binary data I/O format capable of handling large, multi-dimensional arrays)
  • SDXF (Structured Data Exchange Format)
  • Silo (a storage format for visualization developed at Lawrence Livermore National Laboratory)
  • XDF (eXtensible Data Format)
  • XSIL (Extensible Scientific Interchange Language)

Astronomical and Space

  • FITS (Flexible Image Transport System)
  • PDS/ODL (Planetary Data System)

Biological

  • AB1 (Chromatogram files used by DNA sequencing instruments from Applied Biosystems)
  • ACE (Sequence assembly format)
  • BAM (Binary compressed SAM format)
  • BED (Browser extensible display format describing genes and other features of DNA sequences)
  • CAF (Common Assembly Format for sequence assembly)
  • EMBL (Flatfile format used by the EMBL for nucleotide and peptide sequences)
  • FASTA and FASTQ (File formats for sequence data; FASTQ adds quality scores)
  • GenBank (Flatfile format used by NCBI for nucleotide and peptide sequences)
  • GFF (General feature format for describing genes and other features of DNA, RNA and protein sequences)
  • GTF (Gene transfer format holds information about gene structure)
  • NEXUS (Encodes mixed information about genetic sequence data in a block structured format)
  • PDB (Structures of biomolecules deposited in Protein Data Bank)
  • PHD (Output from the basecalling software Phred)
  • SAM (Sequence Alignment/Map format)
  • SCF (Staden chromatogram files used to store data from DNA sequencing)
  • SBML (Systems Biology Markup Language used to store biochemical network computational models)
  • Stockholm (Representing multiple sequence alignments)
  • Swiss-Prot (Flatfile format used for protein sequences from the Swiss-Prot database)
  • VCF (Variant Call Format)

Biomedical signals (time series)

  • ACQ (AcqKnowledge)
  • BCI2000 (The BCI2000 project)
  • BDF (BioSemi data format)
  • BKR (EEG data format)
  • CFWB (Chart Data File Format)
  • DICOM-Waveform (An extension of DICOM for storing waveform data)
  • ecgML (A markup language for electrocardiogram data acquisition and analysis)
  • EDF/EDF+ (European Data Format)
  • FEF (File Exchange Format for Vital signs, CEN TS 14271)
  • GDF v1.x (General Data Format for biomedical signals - Version 1.x)
  • GDF v2.x (The General Data Format for biomedical signals - Version 2.x)
  • HL7aECG (Health Level 7 v3 annotated ECG)
  • OpenXDF (Open Exchange Data Format)
  • SCP-ECG (Standard Communication Protocol for Computer assisted electrocardiography)
  • SIGIF (A digital SIGnal Interchange Format)
  • WFDB (Format of Physiobank)


Chemical

  • CCP4 (X-ray crystallography voxels (electron density))
  • Chemical data
  • CTab (Chemical table file .mol, .sd, .sdf)
  • HITRAN (spectroscopic data with one optical/infrared transition per line in the ASCII file (.hit))
  • JCAMP (Joint Committee on Atomic and Molecular Physical Data, .dx, .jdx)
  • MRC (voxels in cryo-electron microscopy)
  • SMILES (Simplified molecular input line entry specification, .smi)
  • SPC (spectroscopic data)

Ecological

  • Darwin Core (Standard for sharing information about biological diversity)
  • EML (Ecological Metadata Language)

Geographic and Geospatial

See also Geospatial

  • DEM (Digital Elevation Model)
  • DOQ (Digital Orthophotos)
  • e00 (ESRI ArcInfo Interchange File)
  • FGDC (Content Standard for Digital Geospatial Metadata)
  • GeoTIFF (Geospatial extensions to TIFF)
  • GML (Geography Markup Language)
  • HDFEOS, HD2, HD4 (Hierarchical Data Format-Earth Observing System)
  • KML (formerly Keyhole Markup Language; Version 2.2)
  • NDF (National Landsat Archive Production System (NLAPS) Data Format)
  • SAIF (Spatial Archive and Interchange Format, Canadian)
  • SDTS (Spatial Data Transfer Standard)
  • shp and shx (ESRI Shapefile, required components; other optional components exist as well, see entry)
  • SID (MrSID, Multi-resolution Seamless Image Database)
  • TAB (MapInfo dataset format, required component)

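Mathematical

  • graph6, sparse6 (ASCII encoding of adjacency matrices (.g6, .s6))
  • M (Mathematica package file)
  • MathML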

Medical Imaging

  • AFNI (data, meta-data (.BRIK,.HEAD))
  • .MGH (uncompressed)
  • .MGZ (gzip-compressed)
  • Analyze data, meta-data (.img,.hdr)
  • DICOM (Digital Imaging and Communications in Medicine (.dcm))
  • MINC (Medical Imaging NetCDF format; since version 2.0, based on HDF5 (.mnc))
  • OME-TIFF (Open Microscopy Environment TIFF format)
  • OST (Open Spatio-Temporal) (extensible, open alternative for microscope images)
  • nii (Neuroimaging Informatics Technology Initiative (NIfTI) single-file (combined data and meta-data))
  • gii (NIfTI offspring for brain surface data, single-file (combined data and meta-data) style)
  • .img,.hdr (NIfTI dual-file (separate data and meta-data, respectively) style)
  • SDM (Signed Differential Mapping brain maps (.sdm))


Oceanographic, Atmospheric and Meteorological

  • GRIB (Grid in Binary)
  • BUFR (Binary Universal Form for the Representation of meteorological data)
  • IOAPI (netCDF augmented with metadata from the I/O API)
  • PP (UK Met Office format for weather model data)

Physics

  • CGNS (Computational Fluid Dynamics General Notation System)
  • NeXuS (Common data format for neutron, x-ray and muon science)
  • QCDml (Lattice QCD gauge configuration markup language)

Scientific Signal data

  • ACQ (AcqKnowledge File Format for Windows)
  • BKR (EEG data format)
  • BDF (BioSemi data format)
  • CFWB (Chart Data File Format)
  • EDF (European data format)
  • FEF (File Exchange Format for Vital signs)
  • GDF (General data formats for biomedical signals)
  • GMS (Gesture And Motion Signal format)
  • IROCK (intelliRock Sensor Data File Format)
  • MFER (Medical waveform Format Encoding Rules)
  • SCP-ECG (Standard Communication Protocol for Computer assisted electrocardiography)
  • SEG Y (Reflection seismology data format)
  • SIGIF (SIGnal Interchange Format)

Social Sciences

  • DDI (Data Documentation Initiative)
  • SAS (Statistical package)
  • SPSS (Statistical package)