Introduction to XML
Extensible Markup Language (XML) is a standard, simple, self-descriptive way of encoding both text and data so that content can be processed with relatively little human intervention and exchanged across diverse hardware, operating systems, and applications. It is a subset of Standard Generalized Markup Language (SGML), which was first developed in the 1970s as a means of exchanging text files between printers. The primary driving force behind the adoption of XML on the Internet was to simplify the delivery of information.
XML Specification
XML documents consist entirely of Unicode characters. According to the specification, an XML document must be well-formed; that is, it must satisfy a list of syntax rules provided in the specification. The list is fairly extensive; some key points are:
- It contains only properly encoded legal Unicode characters.
- None of the special syntax characters such as "<" and "&" appear except when performing their markup-delineation roles.
- The start, end, and empty-element tags that delimit the elements are correctly nested, with none missing and none overlapping.
- The element tags are case-sensitive; the start and end tags must match exactly.
- There is a single "root" element which contains all the other elements.
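The rules above can be checked mechanically. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree parser as the checker; the helper name is_well_formed and the sample documents are illustrative and not part of the specification.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents, one per rule discussed above.
samples = {
    "properly nested": "<a><b>text</b></a>",
    "overlapping tags": "<a><b>text</a></b>",
    "case mismatch": "<Note>text</note>",
    "bare ampersand": "<a>fish & chips</a>",
    "escaped ampersand": "<a>fish &amp; chips</a>",
    "two root elements": "<a/><b/>",
}

results = {name: is_well_formed(doc) for name, doc in samples.items()}
for name, ok in results.items():
    print(f"{name}: {'well-formed' if ok else 'not well-formed'}")
```

Only the properly nested document and the one with the escaped ampersand pass; each of the others breaks one of the listed rules and is rejected by the parser.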
Example of XML
Consider a small but complete XML document with five elements: painting, img, caption, and two date elements. The date elements are children of caption, which is a child of the root element painting. The img element has two attributes, src and alt.
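A document matching this description might look like the one embedded in the sketch below. Only the element and attribute names come from the description; the file name, alt text, caption wording, and years are illustrative assumptions. The Python code around it, using the standard xml.etree.ElementTree module, simply confirms the structure.

```python
import xml.etree.ElementTree as ET

# Illustrative document: attribute values, caption text, and years are assumed.
PAINTING_XML = """<?xml version="1.0"?>
<painting>
  <img src="painting.jpg" alt="A sample painting"/>
  <caption>Painted in <date>1511</date> to <date>1512</date>.</caption>
</painting>"""

root = ET.fromstring(PAINTING_XML)

# All elements in document order, starting with the root.
element_names = [element.tag for element in root.iter()]

# The two date elements are direct children of caption.
caption = root.find("caption")
caption_children = [child.tag for child in caption]

# The img element carries the two attributes.
img_attributes = sorted(root.find("img").attrib)

print(element_names)
print(caption_children)
print(img_attributes)
```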

Benefits of XML
Information encoded in XML is easy for people to read and understand, and it can be processed easily by computers.
Openness: XML is a W3C standard, endorsed by software industry market leaders.
Extensibility: There is no fixed set of tags. New tags can be created as they are needed.
Self-description: In traditional databases, data records require schemas set up by the database administrator. XML documents can be stored without such definitions, because they contain metadata in the form of tags and attributes. XML provides a basis for author identification and versioning at the element level. Any XML tag can possess an unlimited number of attributes such as author or version.
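As a small illustration of element-level metadata, the sketch below reads author and version attributes directly from the elements, without any external schema. The report and section element names and the attribute values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: each section records its own author and version.
REPORT_XML = """<report>
  <section author="kim" version="2">Summary</section>
  <section author="lee" version="1">Details</section>
  <section author="kim" version="1">Appendix</section>
</report>"""

report = ET.fromstring(REPORT_XML)

# Select sections by author using only the metadata stored in the document.
sections_by_kim = [
    section.text for section in report.iter("section")
    if section.get("author") == "kim"
]

# Find the highest version number recorded on any section.
latest_version = max(int(section.get("version")) for section in report.iter("section"))

print(sections_by_kim)
print(latest_version)
```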
Contains machine-readable context information: Tags, attributes and element structure provide context information that can be used to interpret the meaning of content, opening up new possibilities for highly efficient search engines, intelligent data mining, agents, etc. This is a major advantage over HTML or plain text, where context information is difficult or impossible to evaluate.
Separates content from presentation: XML tags describe meaning, not presentation. The motto of HTML is: "I know how it looks", whereas the motto of XML is: "I know what it means, and you tell me how it should look." The look and feel of an XML document can be controlled by XSL style sheets, allowing the look of a document (or of a complete Web site) to be changed without touching the content of the document. Multiple views or presentations of the same content are easily rendered.
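The idea of several presentations over one unchanged document can be sketched in Python. This is not XSL: the Python standard library has no XSLT processor, so two ordinary functions stand in for style sheets here. The article document and the function names render_plain and render_html are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical content document; it says what things are, not how they look.
CONTENT_XML = "<article><title>XML Basics</title><author>A. Writer</author></article>"


def render_plain(doc):
    """Render the article as a single line of plain text."""
    return f"{doc.findtext('title')} by {doc.findtext('author')}"


def render_html(doc):
    """Render the same article as an HTML fragment."""
    title = escape(doc.findtext("title"))
    author = escape(doc.findtext("author"))
    return f"<h1>{title}</h1><p>{author}</p>"


article = ET.fromstring(CONTENT_XML)
plain_view = render_plain(article)
html_view = render_html(article)

print(plain_view)
print(html_view)
```

Changing either rendering function alters the presentation while the content document stays untouched, which is the role the text assigns to XSL style sheets.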
Supports multilingual documents and Unicode: This is important for the internationalization of applications.
Facilitates the comparison and aggregation of data: The tree structure of XML documents allows documents to be compared and aggregated efficiently element by element.
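An element-by-element comparison can be sketched as a recursive walk over two trees. This is a minimal illustration under simplifying assumptions: children are compared by position, surrounding whitespace is ignored, and text that follows a child element is not examined. The function name diff_elements and the order documents are hypothetical.

```python
import xml.etree.ElementTree as ET


def diff_elements(a, b, path=""):
    """Return a list of differences between two elements, compared by position."""
    here = f"{path}/{a.tag}"
    if a.tag != b.tag:
        return [f"{here}: tag differs ({a.tag} vs {b.tag})"]
    diffs = []
    if a.attrib != b.attrib:
        diffs.append(f"{here}: attributes differ")
    if (a.text or "").strip() != (b.text or "").strip():
        diffs.append(f"{here}: text differs")
    children_a, children_b = list(a), list(b)
    if len(children_a) != len(children_b):
        diffs.append(f"{here}: child count differs")
    # Descend into the children that both trees have in common.
    for child_a, child_b in zip(children_a, children_b):
        diffs.extend(diff_elements(child_a, child_b, here))
    return diffs


old_order = ET.fromstring('<order id="1"><item>pen</item><qty>2</qty></order>')
new_order = ET.fromstring('<order id="1"><item>pen</item><qty>3</qty></order>')

differences = diff_elements(old_order, new_order)
print(differences)
```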
Can embed multiple data types: XML documents can carry or reference any kind of data, from multimedia data (image, sound, video) to active components (Java applets, ActiveX). Because XML itself is text, binary content is typically referenced by URI or embedded in a text encoding such as Base64.
Can embed existing data: Mapping existing data structures like file systems or relational databases to XML is simple. XML supports multiple data formats and can cover all existing data structures.
Provides a 'one-server view' for distributed data: XML documents can consist of nested elements that are distributed over multiple remote servers. XML is currently the most sophisticated format for distributed data - the World Wide Web can be seen as one huge XML database.
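The mapping of existing data structures to XML mentioned above can be sketched for rows of a relational table. The table name, column names, and values are hypothetical, and the sketch assumes every column name is also a valid XML element name.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows, as they might come back from a database query.
rows = [
    {"id": 1, "name": "Ada", "city": "London"},
    {"id": 2, "name": "Linus", "city": "Helsinki"},
]


def rows_to_xml(table_name, table_rows):
    """Map each row to a <row> element and each column to a child element."""
    root = ET.Element(table_name)
    for row in table_rows:
        record = ET.SubElement(root, "row")
        for column, value in row.items():
            ET.SubElement(record, column).text = str(value)
    return root


customers = rows_to_xml("customers", rows)
xml_text = ET.tostring(customers, encoding="unicode")
print(xml_text)
```

Each column becomes a child element here; mapping columns to attributes instead would be an equally reasonable design choice.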
Rapid adoption by industry: Software AG, IBM, Sun, Microsoft, Netscape, DataChannel, SAP and many others have already announced support for XML. Microsoft will use XML as the exchange format for its Office product line, while both Microsoft's and Netscape's Web browsers support XML. SAP has announced support of XML through the SAP Business Connector with R/3. Software AG supports XML in its Natural product line and provides Tamino, a native XML database.


