HTML
HTML (Hypertext Markup Language) was originally an SGML-based markup language. XHTML is HTML reformulated under the stricter rules of XML. Disagreement among some browser vendors over the direction of W3C development led to the formation of the Web Hypertext Application Technology Working Group (WHATWG). WHATWG maintains the specification variously known as HTML5, HTML Next, or the HTML Living Standard, which is no longer based on SGML. The W3C standardisation group will work to formalise the WHATWG specification as a series of standardised 'snapshots' of the living standard.
Specs
- W3C specifications:
- HTML (1) specification
- HTML 2.0 specification (see also the RFC)
- HTML 3.2 specification
- HTML 4.01 specification
- HTML5 working draft
- XHTML 1.0 specification
- XHTML 1.1 specification
- The HTML Landscape enumerates the differences between the W3C HTML 5.0, 5.1 and the WHATWG Living Standard. The source for the landscape site is available here.
- Web Hypertext Application Technology Working Group (WHATWG) specifications:
HTML vs. XHTML
In HTML versions prior to HTML5, there was a "fork" between HTML and XHTML, with the former being SGML-based and the latter XML-based. While the features of the two are for the most part very similar, there are some syntactic differences which can trap the unwary. These usually cause no actual rendering problems in common browsers (which are very forgiving of errors), but they do prevent validation. For instance, any tag that does not require a matching end tag (e.g., <br>) needs an added slash in XHTML to make it self-closing (<br />); this form should not be used in HTML. Another difference is that HTML tags and attributes are case-insensitive, so they can be entered in either uppercase or lowercase, while XHTML is case-sensitive and its standard tags are all lowercase. Some parts of the respective syntaxes cannot be mixed in a document that still validates as either variety, which is a problem when webmasters paste in code from diverse sources (including ad-network and affiliate links and scripts, which may have terms-of-service contracts mandating that they be used in unmodified form). HTML5, however, which is not directly based on either SGML or XML, is more forgiving of such mixed syntax.
The "forgiving" processing of mixed syntax applies only to documents served with the MIME type "text/html"; if an XML MIME type is used, browsers are supposed to be stricter in interpreting the syntax and rejecting documents which are improper or which are of a form they don't understand.
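The contrast between lenient and strict handling can be sketched with Python's standard library. This is only an illustration, not how any browser is implemented: `html.parser` is a tolerant tokenizer rather than a full HTML5 parser, `xml.etree.ElementTree` stands in for a strict XML parser, and the helper names (`html_start_tags`, `is_well_formed_xml`, `accepts`) and the short list of XML MIME types are assumptions made for this example.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TagCollector(HTMLParser):
    """Record the start tags reported by Python's lenient HTML tokenizer."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lowercase, mirroring the
        # case-insensitivity of HTML tag names.
        self.tags.append(tag)


def html_start_tags(markup):
    """Return the start-tag names found by the lenient tokenizer."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


def is_well_formed_xml(markup):
    """Return True only if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Hypothetical, deliberately short list of XML MIME types.
XML_MIME_TYPES = ("application/xhtml+xml", "application/xml", "text/xml")


def accepts(markup, mime_type):
    """Hypothetical dispatcher: strict for XML MIME types, lenient otherwise."""
    if mime_type in XML_MIME_TYPES:
        return is_well_formed_xml(markup)
    # The text/html path recovers from errors instead of rejecting the input.
    return True


html_style = "<P>first line<BR>second line</P>"     # unclosed, uppercase tags
xhtml_style = "<p>first line<br />second line</p>"  # self-closing, lowercase

print(html_start_tags(html_style))
print(html_start_tags(xhtml_style))
print(accepts(html_style, "text/html"))
print(accepts(html_style, "application/xhtml+xml"))
print(accepts(xhtml_style, "application/xhtml+xml"))
```

In this sketch the lenient tokenizer reports the same lowercase tag names for both strings, while the strict path rejects the HTML-style string because its `<BR>` is never closed, and accepts the XHTML-style one.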
Nonstandard extensions
The formal specs, of course, do not fully describe the HTML documents in use in the "real world". Quite a number of nonstandard elements, attributes, and other extensions have been implemented in various browsers (including the most popular ones). Browsers have also tended to be very forgiving of invalid markup, which has made sloppy coding widespread on the grounds that "it works in [name of popular browser], so that's all that matters!"
In 2013, the Mozilla organization announced the removal of support for the nonstandard BLINK element, which had been supported in various browsers since its introduction in the 1990s as a Netscape extension and had persisted despite the widespread belief that it was annoying. New versions of Firefox and other Gecko-based browsers no longer blink text enclosed in this element, nor text styled with the various CSS rules that suggest blinking or flashing.
Software
Validators
Test suites
Format conversion
Historical information
- W3C Web History Community Group
- Tim Berners-Lee discusses Web protocols/formats in Jan 1992
- Dive into HTML5 - How did we get here? also documents how HTML has developed.
- The Lost Tags of HTML, documenting early HTML versions and the tags that have been dropped from the standards.
- The Origins of the <Blink> Tag
- CERN's 'The birth of the web'. Includes work on restoring the first website and building a line-mode/terminal web browser simulation.
Other resources
- Markdown CSS: makes HTML look like plain text
- EFF Makes Formal Objection to DRM in HTML5
- W3C green-lights adding DRM to the Web's standards, says it's OK for your browser to say "I can't let you do that, Dave"
- Bug 923590 - Pledge never to implement HTML5 DRM (Bugzilla@Mozilla)
- Stop standardizing HTML
- BridgeIt: JavaScript library to add native mobile features to HTML 5 web apps