InfoMedia Inc.

What Are GROVEs?

An ISO/IEC standard (Annex A.4 of ISO/IEC 10744:1997, HyTime).
A generic, implementation-independent data model for representing data of any type.
A mechanism for enabling reliable, robust addressing of data regardless of syntax or format.
A data abstraction layer that can be used to unify disparate data systems.

Benefits:

Application Development

Provides a common data model and API for data access.
Allows hyperlinking systems to meaningfully mix and match data from different sources to create new results.

How GROVEs Work:

The basic idea underlying GROVEs is that notations like XML and SGML exist only as a syntax for some underlying data model.

GROVEs are usually, but not always, tied to particular media types ("notations"), so GROVEs for CGM documents look fairly different from GROVEs for XML documents. The terminology and semantics of these two notations are quite different, so you would expect their APIs and query languages to be fairly different too. The GROVE model is designed to allow them to be exactly as different as they need to be, and no more. The basic concepts are the same, but every notation defines its own vocabulary of "properties" in terms of the concepts underlying that notation. We call these vocabularies "property sets." In a GROVE-based view of the world, an XML document is a collection of hundreds of properties, all drawn from the XML property set. This is analogous to the way that a valid XML document is a collection of elements whose types are all drawn from some DTD.
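The relationship between a shared node model and notation-specific property sets can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and property names below (including the whole CGM vocabulary) are invented for the example and are not copied from any published property set, and the Node class is hypothetical rather than part of any GROVE API.

```python
from dataclasses import dataclass, field

# Hypothetical, heavily abbreviated property sets: node class -> property names.
# The names are illustrative assumptions, not the real XML or CGM vocabularies.
XML_PROPSET = {
    "element": {"gi", "attributes", "content"},
    "attribute": {"name", "value"},
}
CGM_PROPSET = {
    "picture": {"name", "primitives"},
    "polyline": {"points", "line_width"},
}


@dataclass
class Node:
    """A container of properties, all drawn from one property set."""
    propset: dict
    node_class: str
    properties: dict = field(default_factory=dict)

    def __post_init__(self):
        # The node model is shared; only the vocabulary differs per notation.
        if self.node_class not in self.propset:
            raise ValueError(f"unknown node class: {self.node_class}")
        unknown = set(self.properties) - self.propset[self.node_class]
        if unknown:
            raise ValueError(f"properties not in property set: {sorted(unknown)}")


# The same Node type serves both notations.
para = Node(XML_PROPSET, "element", {"gi": "para", "content": []})
line = Node(CGM_PROPSET, "polyline", {"points": [(0, 0), (10, 5)]})
```

Asking for an XML element with a "points" property raises ValueError, because that name belongs to the other vocabulary.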

All properties are held in containers called nodes. Nodes represent everything in an XML document: elements, attributes, every significant character, all insignificant whitespace, and so on. GROVEs are so complete that, given a complete implementation, HyTime can make a hypertext link to the keyword "#REQUIRED" in an attribute list declaration, and XSL stylesheets can (in theory) vary their formatting based on the amount of whitespace between attribute values in a start-tag. Of course, nobody is likely to go that far, but the point is that all of that information is available and addressable. For someone creating (for example) an XML editing or maintenance system, these issues could be important. Consider the case where the only difference between a checked-in document and the version in the archive is insignificant whitespace. A smart repository might choose not to increment the version number.
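The repository scenario can be sketched as a comparison that skips whitespace nodes. This is a minimal illustration, assuming a grove flattened into a list of nodes; the node class names ("element", "datachar", "insignificant-ws") and the helper functions are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    node_class: str   # illustrative class names, not taken from a real property set
    value: str = ""


# Node classes the repository treats as irrelevant to versioning (an assumption).
IGNORABLE = {"insignificant-ws"}


def significant(nodes):
    """Drop nodes whose class the repository chooses to ignore."""
    return [n for n in nodes if n.node_class not in IGNORABLE]


def needs_new_version(archived, checked_in):
    """True only if the documents differ in something other than ignorable nodes."""
    return significant(archived) != significant(checked_in)


archived = [Node("element", "para"), Node("datachar", "Hi")]
reindented = [
    Node("element", "para"),
    Node("insignificant-ws", "\n  "),
    Node("datachar", "Hi"),
]
edited = [Node("element", "para"), Node("datachar", "Ho")]
```

Here needs_new_version(archived, reindented) is False, while needs_new_version(archived, edited) is True. The comparison is only possible because the whitespace is present in the grove as its own addressable node rather than being discarded by the parser.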

System designers can use a "GROVE plan" to ask a GROVE builder to trim the nodes they do not need from the GROVE. This means that your applications do not need to keep track of all of that information if they are not using it. GROVE builders with limited capabilities can describe what they support in terms of GROVE plans. Two products that claim to support the same GROVE plan should build identical GROVEs for a particular document.
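One way to picture a GROVE plan is as a set of node classes to keep, applied while the tree is built. The sketch below is an assumption-laden simplification: it treats a plan as a plain Python set and prunes an already-built tree, and the Node class, the apply_plan function, and the class names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_class: str
    value: str = ""
    children: list = field(default_factory=list)


def apply_plan(node, plan):
    """Return a copy of the subtree, keeping only child classes named in the plan.

    The root passed in is always kept; only descendants are filtered.
    """
    kept = [apply_plan(c, plan) for c in node.children if c.node_class in plan]
    return Node(node.node_class, node.value, kept)


full = Node("element", "para", [
    Node("comment", "draft"),
    Node("datachar", "Hello"),
    Node("insignificant-ws", "\n"),
])

# Hypothetical plan: this application only cares about elements and characters.
plan = {"element", "datachar"}
trimmed = apply_plan(full, plan)
```

Because the result depends only on the document and the plan, applying the same plan twice yields equal trees, which mirrors the expectation that two builders supporting the same plan produce identical GROVEs.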

Property sets are defined in documents that conform to the "propset" DTD. You can think of these documents as simple schemas for property sets. They can specify that particular properties must contain particular types of values (integers, strings, nodes, lists of nodes). They can also specify that some properties are so-called "subnodal" properties, which means that in the logical tree, the node with the property "owns" the node that is the value of the property. For example, elements in the SGML property set have a subnodal property called "attributes." This means that elements "own" their attributes.
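The two things a property set definition constrains, value types and subnodal ownership, can be sketched as a small schema table. This is not the propset DTD itself: PropertyDef, validate, and owned_nodes are hypothetical names, and apart from "attributes" the property names (in particular "linked_to") are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PropertyDef:
    name: str
    datatype: type          # the kind of value the property must hold
    subnodal: bool = False  # True if the owning node "owns" the value nodes


# Hypothetical declarations for an element class.
ELEMENT_PROPS = {
    "gi": PropertyDef("gi", str),
    "attributes": PropertyDef("attributes", list, subnodal=True),
    "linked_to": PropertyDef("linked_to", list),  # refers to nodes, does not own them
}


def validate(props):
    """Check that every property is declared and holds the declared type."""
    for name, value in props.items():
        if name not in ELEMENT_PROPS:
            raise KeyError(f"undeclared property: {name}")
        expected = ELEMENT_PROPS[name].datatype
        if not isinstance(value, expected):
            raise TypeError(f"{name} must be {expected.__name__}")


def owned_nodes(props):
    """Collect the nodes reachable through subnodal properties only."""
    owned = []
    for name, value in props.items():
        if ELEMENT_PROPS[name].subnodal:
            owned.extend(value)
    return owned


element = {
    "gi": "para",
    "attributes": ["id-attr", "class-attr"],
    "linked_to": ["other-element"],
}
validate(element)
```

In this sketch owned_nodes(element) returns only the two attribute nodes: the "linked_to" value is also a list of nodes, but because the property is not subnodal, the element merely refers to it.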