From Just Solve the File Format Problem
File Format
Name EDI
Extension(s) .edi, .x12, .edifact
MIME Type(s) application/edifact, application/edi-x12
For the installation software, see EDI Install archive and related articles.

Electronic Data Interchange, or EDI, is a set of text-based data formats for structured communication between business systems.



EDI (Electronic Data Interchange) is an umbrella term for many distinct standards for machine-to-machine communication, and it has been in continuous use since the 1970s. Despite the existence of numerous detailed standards, an EDI transaction set (i.e. an EDI document) will usually have customer-specific mappings applied before transmission. As a result, each EDI integration is often effectively a custom integration built to meet specific business requirements. Documents are plain text or XML and are normally transmitted via FTP or SFTP, or with specialized client/server software such as AS1 (SMTP + S/MIME) or AS2 (HTTP + S/MIME).

Parsing EDI

XML-based EDI can be parsed by any XML parser and then handled accordingly, but text-based EDI requires EDI-specific tools.

Text-based EDI files consist of a stream of segments. Each segment contains an array of elements, and each element contains either data or sub-elements that themselves contain data. Data generally consists of ASCII characters, and any binary data must be encoded in an ASCII-safe manner (for example, Base64). Specific implementations may support UTF-8, UTF-16 or other locale-specific character sets in place of ASCII. Although EDI tools often display one segment per line, the EDI data file itself often contains no line breaks. This is because the "segment terminator" (the character or characters used as a delimiter between segments) is specified in a header segment (UNA for EDIFACT, ISA for X12) rather than being a line break (\n or \r\n) as is common with text files. EDIFACT traditionally uses a single quote (') and X12 a tilde (~). Each segment begins with a segment tag. In EDIFACT this is three alphabetic characters,[1] while in ANSI X12 it is two or three alphanumeric characters.[2]
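
The segment/element/sub-element structure described above can be illustrated with a minimal Python sketch. It is not a complete or validating EDI parser. The names Delimiters, detect_delimiters, split_escaped, unescape and parse_segments are hypothetical and chosen only for this illustration. The sketch assumes that the delimiters sit at fixed positions in the UNA and ISA headers, that EDIFACT defaults apply when no UNA is present, and that X12 has no release (escape) character. It ignores repetition separators, and the sample data is a toy interchange rather than a valid message.

```python
from typing import List, NamedTuple


class Delimiters(NamedTuple):
    segment: str    # segment terminator
    element: str    # data element separator
    component: str  # sub-element (component) separator
    release: str    # escape character, or "" if the syntax has none


def detect_delimiters(data: str) -> Delimiters:
    """Read delimiters from a UNA or ISA header, else assume EDIFACT defaults."""
    if data.startswith("UNA") and len(data) >= 9:
        # UNA is followed by six service characters: component separator,
        # element separator, decimal mark, release character, reserved,
        # segment terminator.
        return Delimiters(data[8], data[4], data[3], data[6])
    if data.startswith("ISA") and len(data) >= 106:
        # ISA is fixed width: the element separator follows the tag, the
        # component separator is the last element (ISA16), and the segment
        # terminator is the 106th character.
        return Delimiters(data[105], data[3], data[104], "")
    return Delimiters("'", "+", ":", "?")


def split_escaped(text: str, sep: str, release: str) -> List[str]:
    """Split on sep, skipping separators preceded by the release character."""
    parts, current, i = [], [], 0
    while i < len(text):
        if release and text[i] == release and i + 1 < len(text):
            current.append(text[i:i + 2])  # keep the escape pair for later passes
            i += 2
        elif text[i] == sep:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(text[i])
            i += 1
    parts.append("".join(current))
    return parts


def unescape(text: str, release: str) -> str:
    """Drop release characters, keeping the character each one protects."""
    if not release:
        return text
    out, i = [], 0
    while i < len(text):
        if text[i] == release and i + 1 < len(text):
            i += 1  # skip the release character itself
        out.append(text[i])
        i += 1
    return "".join(out)


def parse_segments(data: str) -> List[List[List[str]]]:
    """Return segments as lists of elements, each a list of sub-elements."""
    d = detect_delimiters(data)
    body = data[9:] if data.startswith("UNA") else data
    segments = []
    for raw in split_escaped(body, d.segment, d.release):
        raw = raw.strip("\r\n")  # tolerate optional line breaks between segments
        if not raw:
            continue
        segments.append([
            [unescape(c, d.release) for c in split_escaped(e, d.component, d.release)]
            for e in split_escaped(raw, d.element, d.release)
        ])
    return segments


sample = ("UNA:+.? 'UNB+UNOA:1+SENDER+RECEIVER+210101:1200+1'"
          "UNH+1+ORDERS:D:96A:UN'FTX+AAI+++IT?'S OK'UNT+3+1'UNZ+1+1'")
segments = parse_segments(sample)
print([seg[0][0] for seg in segments])  # ['UNB', 'UNH', 'FTX', 'UNT', 'UNZ']
```

The first sub-element of the first element of each segment is the segment tag. In the X12 branch the ISA16 element is the component separator itself, so splitting it yields two empty sub-elements. A real tool would treat the ISA segment specially and would also honour the repetition separator used by newer X12 versions.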



  1. EDIFACT Index of segments by alphabetical sequence by tag (revision 21a)
  2. ASC X12 Segment Directory