What is SGML?
- Standard Generalized Markup Language: ISO standard ISO 8879:1986.
- A meta language for describing data, regardless of what the data is.
- The base from which XML was developed.
- An open, fully extensible standard for data.
- SGML is a markup language used to impose identification and structure on information.
- SGML is extensible, which means it is not really a markup language at all!
- Instead, SGML is a set of rules you can use to create your own markup language.
- Basing markup languages on a common set of syntax rules allows:
- use of generic processing tools,
- evolving markup vocabularies, as needs change, and
- "multi-lingual" systems, which use multiple vocabularies.
- A way to separate data from its display properties.
- A better, faster, cheaper and much more flexible way of dealing with data.
- A way to free applications from data storage architecture characteristics.
- A way to leverage industry standard data definitions and naming conventions.
- A standard way to describe ALL data, serialized and optimized for use over the Web.
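As a rough illustration of the points above about user-defined vocabularies and separating data from its display properties, the sketch below uses Python's standard `xml.etree.ElementTree` on an XML document (XML being the SGML subset that standard-library tools can parse). The `memo` vocabulary and both rendering functions are invented for this example; they are not part of any standard.

```python
import xml.etree.ElementTree as ET

# A made-up "memo" vocabulary: the markup names what each piece of data is
# and says nothing about fonts, layout, or any other display property.
MEMO = """<memo>
  <to>Editorial staff</to>
  <from>Standards group</from>
  <subject>Style guide update</subject>
  <body>The new style guide takes effect next month.</body>
</memo>"""


def render_plain(memo):
    """One possible presentation: a plain-text header block plus body."""
    lines = [
        f"TO: {memo.findtext('to')}",
        f"FROM: {memo.findtext('from')}",
        f"RE: {memo.findtext('subject')}",
        "",
        memo.findtext("body"),
    ]
    return "\n".join(lines)


def render_summary(memo):
    """A second presentation of the same data: a one-line summary."""
    return f"{memo.findtext('subject')} ({memo.findtext('from')})"


memo = ET.fromstring(MEMO)
plain = render_plain(memo)
summary = render_summary(memo)
print(plain)
print(summary)
```

The document itself never changes; only the code that consumes it decides how the data is presented.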
SGML has been and continues to be in wide use around the world. Existing SGML-based information processing systems are just as useful and relevant today as they were when they were first built. However, because XML is a proper subset of SGML and is supported by tools that cannot process non-XML SGML documents, it would be rare for a new project to use those features of SGML that XML does not include. That is, existing SGML-based systems can continue indefinitely, but there is no reason not to build new systems as XML-only systems: everything that was useful in SGML remains useful in XML, and the features of SGML that were pared away to create XML were removed because they offered little value relative to their cost. But the powerful SGML ideas of data abstraction, descriptive markup, and separating data from processing and presentation are just as important today as they were in 1986, when the SGML standard was published.
There is no real sense in which SGML has been supplanted by XML; rather, XML is the next evolutionary step in the refinement of the ideas first standardized in SGML.
How SGML Works:
From the standpoint of how it works and is used, there is no useful difference between SGML and XML. See the XML entry for a discussion of how generalized markup works and is used.
How Does SGML Differ From XML?
The development of XML was the act of paring away from SGML all the features that had been found over the years to complicate SGML without adding sufficient value. The features of SGML not included in XML are:
- Markup minimization. In SGML it is possible to omit a lot of markup that the SGML parser can infer. This is convenient for hand authoring of SGML documents but seriously complicates parsers. In XML the only form of markup minimization is attribute value defaults.
- Required document type declarations. SGML requires that all documents have explicit document type declarations. This enables markup minimization, among other things, but again complicates parsers. By eliminating markup minimization, XML removes the need to require doctype declarations. It also allows alternative ways of defining document types, such as XML Schemas.
- Data attributes (attributes on external, unparsed entity declarations). This was a rarely used feature of SGML. Given the trend away from using entities at all, it was difficult to justify this feature in XML.
- A number of rarely or never-used features of SGML, including CONCUR, RANK, and DATATAG.
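The first two items in the list above can be seen from the XML side with a small sketch. It uses Python's standard `xml.etree.ElementTree`; the `list` and `item` element names and the `is_well_formed` helper are hypothetical, chosen only for illustration. An XML parser rejects a document whose end tags are omitted (something an SGML parser could accept when the document type declaration permits it), and it accepts a fully tagged document that has no document type declaration at all.

```python
import xml.etree.ElementTree as ET

# SGML-style minimized markup: the </item> end tags are omitted. An SGML
# parser could infer them from a suitable DTD; an XML parser will not.
MINIMIZED = "<list><item>one<item>two</list>"

# The same content with every end tag written out and no DOCTYPE declaration.
FULLY_TAGGED = "<list><item>one</item><item>two</item></list>"


def is_well_formed(text):
    """Return True if the XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


minimized_ok = is_well_formed(MINIMIZED)
full_ok = is_well_formed(FULLY_TAGGED)
items = [item.text for item in ET.fromstring(FULLY_TAGGED)]
print(minimized_ok, full_ok, items)
```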
XML also imposes a number of additional syntax constraints, simplifying the rules for markup declarations, comments, and marked sections. It changes the syntax of processing instructions from “<?...>” to “<?...?>”.
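The change in processing instruction syntax can be checked with a short sketch using Python's standard `xml.dom.minidom`. The target name `render` and its data are made up for this example. An XML parser reads the instruction closed with “?>” and rejects the form closed with “>” alone.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# XML form: the processing instruction is closed with "?>".
# The target "render" and its data are hypothetical.
XML_STYLE = '<doc><?render mode="draft"?></doc>'

# SGML form as described above: closed with ">" alone.
SGML_STYLE = '<doc><?render mode="draft"></doc>'

# The XML parser exposes the instruction as a node with a target and data.
pi = minidom.parseString(XML_STYLE).documentElement.firstChild
print(pi.target, pi.data)

# The SGML-style form never reaches a "?>", so the XML parser reports an error.
try:
    minidom.parseString(SGML_STYLE)
    sgml_style_accepted = True
except ExpatError:
    sgml_style_accepted = False
print(sgml_style_accepted)
```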
Note that since the development of XML, the SGML standard has been amended so that XML's feature set and additional constraints can be fully defined in an SGML declaration (the mechanism for specifying a particular set of SGML features and concrete syntax). However, these revisions to SGML are not widely implemented, so most commercial SGML tools still reflect the SGML standard as it was before the latest XML-based amendments.