XML Means Business
XML, the eXtensible Markup Language, is being
touted as the hottest Internet technology since Java.
XML is a standard for defining and sharing information (data) in
a clean, platform-neutral
way. It allows organizations and groups to define custom markup
languages for specific tasks, such as supply-chain integration,
e-commerce, or travel. XML is being developed by the World
Wide Web Consortium (W3C) and has been embraced by numerous technology
vendors including IBM, Oracle, Sun, and Microsoft.
The Need for Portable Data
Today, most web content is based on HTML.
While this has allowed the Internet to rapidly grow to its present
size, HTML does not readily support large-scale e-Business as it
is concerned only with the display of text.
HTML = (semi) Portable Text
HTML describes the presentation aspect of text
(e.g., heading level, font color) without regard to the actual information.
For example, an order status page on the Acme
Widget Company's web site might display a nicely formatted table
of information using the following snippet of HTML:
<TABLE BORDER="1">
  <TR>
    <TD BGCOLOR="#FFFFCC"><B>Order Number</B></TD>
    <TD BGCOLOR="#FFFFCC"><B>Status</B></TD>
  </TR>
  <TR>
    <TD>100035</TD>
    <TD>shipped</TD>
  </TR>
  <TR>
    <TD>100064</TD>
    <TD><B><FONT COLOR="red">credit hold</FONT></B></TD>
  </TR>
</TABLE>
While this HTML may present information in
a format suitable for human interaction, it does not lend itself to
automated processing; it is very difficult to program a computer to
decipher the information presented.
A further problem with HTML is that it has
reached the limit of its usefulness as a way of describing information
and is overburdened with incompatible extensions from different
browser manufacturers.
XML = Portable Data
XML will allow specialized communities to create
their own customized markup languages or vocabularies for exchanging
information in their domain (e.g., general business, manufacturing,
music, finance, education). Once standardized vocabularies are
created and shared across an industry, information may be easily exchanged.
Coming back to the order status example,
consider the following XML:
<?xml version="1.0" standalone="yes"?>
<ORDER_STATUS>
  <ORDER>
    <ORDER_NUMBER>100035</ORDER_NUMBER>
    <STATUS>shipped</STATUS>
  </ORDER>
  <ORDER>
    <ORDER_NUMBER>100064</ORDER_NUMBER>
    <STATUS>credit hold</STATUS>
  </ORDER>
</ORDER_STATUS>
If this XML data were sent, it would be easy
to process this information automatically.
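As a rough illustration of how little code such processing can take, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree module. The function name read_order_statuses and the inline sample document are assumptions made for this example; they are not part of any real Acme system.

```python
import xml.etree.ElementTree as ET

# Sample document mirroring the order status example above.
ORDER_STATUS_XML = """<?xml version="1.0" standalone="yes"?>
<ORDER_STATUS>
  <ORDER>
    <ORDER_NUMBER>100035</ORDER_NUMBER>
    <STATUS>shipped</STATUS>
  </ORDER>
  <ORDER>
    <ORDER_NUMBER>100064</ORDER_NUMBER>
    <STATUS>credit hold</STATUS>
  </ORDER>
</ORDER_STATUS>
"""


def read_order_statuses(xml_text):
    """Return a dict mapping each order number to its status."""
    root = ET.fromstring(xml_text)
    statuses = {}
    for order in root.findall("ORDER"):
        number = order.findtext("ORDER_NUMBER", default="").strip()
        # Collapse any stray whitespace or line breaks inside the status text.
        status = " ".join(order.findtext("STATUS", default="").split())
        statuses[number] = status
    return statuses


if __name__ == "__main__":
    for number, status in read_order_statuses(ORDER_STATUS_XML).items():
        print(number, status)
```

Because the element names carry the meaning, the program looks up ORDER_NUMBER and STATUS directly instead of guessing which table cell holds which value.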
XML can significantly change the Web and
provide the needed mechanism for widespread e-Business. It will
not only be an excellent mechanism for providing information
to people, but will also make automated transactions commonplace.
What is XML?
XML is a metalanguage for defining markup languages
for classes of documents containing structured information.
The structured information is made of both content (e.g. words and
pictures) and an indication of what role that content plays (e.g.
certain content in a page heading has a different meaning from the
same content in the body).
XML lets developers define or use customized
markup languages for specific classes of documents (e.g., purchase
order, invoice).
The XML 1.0 standard was approved and published
by the World Wide Web Consortium (W3C) on February 10, 1998.
Since then, XML technology has quickly gained favor as a universal
data interchange format. XML has a number of important characteristics:
Simplicity – XML uses a text-based tag
language that is easy to understand.
Sophistication – While conceptually simple,
XML can model data to any level of complexity.
Extensibility – New "tags" and
"vocabularies" to support a particular domain can be invented
when needed.
Validation – Data can be checked for semantic
or structural correctness.
Independence – XML is media, vendor, and
platform independent.
Maturity – Even though XML itself is relatively
young, it is a proven technology based on over a decade of experience
with the SGML markup language from which XML is derived.
International Support – XML has built-in
support for Unicode, an international language-encoding standard.
To use XML technology, one must create or
use a domain-specific markup language. For example, XML could
be used to create an Order Markup Language or an Invoice Markup
Language.
What Does XML Look Like?
As we've seen above, the basic structure
of XML is similar to HTML. Documents can be very simple, with
no document type declaration and straightforward nested markup:
<?xml version="1.0" standalone="yes"?>
<dialogue>
<question>What color is the sky?</question>
<answer>Blue, sometimes.</answer>
</dialogue>
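A document this simple can also be generated by a program. The following is a minimal sketch in Python using the standard library's xml.etree.ElementTree module; the helper name build_dialogue is an assumption made for this example.

```python
import xml.etree.ElementTree as ET


def build_dialogue(question, answer):
    """Build a <dialogue> element holding one question and one answer."""
    dialogue = ET.Element("dialogue")
    ET.SubElement(dialogue, "question").text = question
    ET.SubElement(dialogue, "answer").text = answer
    return dialogue


if __name__ == "__main__":
    root = build_dialogue("What color is the sky?", "Blue, sometimes.")
    # Serialize the element tree back to markup text.
    print(ET.tostring(root, encoding="unicode"))
```

The library takes care of matching tags and escaping special characters such as "<", so the generated markup stays well-formed.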
Or they can be significantly more complex,
with a DTD (Document Type Definition) specified and a much more
elaborate structure.
DTD and Validity
In XML, a DTD specifies the definition of
the constituent elements and attributes, and rules for their use.
A DTD may be embedded within an XML document or external to it.
If the DTD is stored externally then the XML document must provide
a reference to the DTD. A document that provides a DTD and
adheres to the rules it defines is termed valid.
XML does not require that a DTD be used.
Documents without DTDs that follow the rules of XML are designated
as well-formed, but not valid.
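The well-formedness half of this distinction is easy to check in code. Below is a minimal sketch in Python; the function name is_well_formed is an assumption for this example. Note that the standard library's ElementTree parser checks well-formedness only and does not validate against a DTD, so checking validity would require a validating parser.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as well-formed XML, else False."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        # Raised for problems such as mismatched or unclosed tags.
        return False
    return True


if __name__ == "__main__":
    print(is_well_formed("<dialogue><question>Q</question></dialogue>"))
    print(is_well_formed("<dialogue><question>Q</dialogue>"))
```

The second document fails because the question element is never closed before its parent ends.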
Related Technologies
XSL
The eXtensible Stylesheet Language (XSL)
is a language for defining stylesheets. An XSL stylesheet
controls the transformation of XML documents and specifies formatting
semantics. In this way, it supports the XML notion of separating
the content (XML) from the presentation.
XSL can be used server-side and client-side.
Server-side processing of XML into HTML allows display within older
browsers; however, all the content semantics are replaced by display
formatting, which negates many of the advantages of XML for information
delivery.
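To make the loss of semantics concrete, the sketch below performs a server-side conversion of order status XML into an HTML table. It is written in plain Python as a stand-in for an XSL stylesheet, not as an example of XSL itself; the function name orders_to_html and the element names ORDER, ORDER_NUMBER, and STATUS are assumptions carried over from the order status example.

```python
import xml.etree.ElementTree as ET
from html import escape


def orders_to_html(xml_text):
    """Render ORDER_STATUS XML as a simple HTML table (presentation only)."""
    root = ET.fromstring(xml_text)
    rows = ["<TR><TD><B>Order Number</B></TD><TD><B>Status</B></TD></TR>"]
    for order in root.findall("ORDER"):
        number = (order.findtext("ORDER_NUMBER") or "").strip()
        status = (order.findtext("STATUS") or "").strip()
        # Escape the text so data values cannot break the generated HTML.
        rows.append(f"<TR><TD>{escape(number)}</TD><TD>{escape(status)}</TD></TR>")
    return '<TABLE BORDER="1">\n' + "\n".join(rows) + "\n</TABLE>"


if __name__ == "__main__":
    sample = (
        "<ORDER_STATUS><ORDER><ORDER_NUMBER>100035</ORDER_NUMBER>"
        "<STATUS>shipped</STATUS></ORDER></ORDER_STATUS>"
    )
    print(orders_to_html(sample))
```

The output contains only table rows and cells; the ORDER_NUMBER and STATUS names that told a program what each value means do not survive the conversion.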
Java
Sun has announced a Java extension for XML.
This will provide standard classes to generate and manipulate XML,
and as a standard extension, these classes will be available on
just about every Java platform. Sun has also announced that
it is adding a standard extension based upon XML technology to the
next release of the Enterprise JavaBeans architecture—this will
increase the portability of enterprise beans components.
Electronic Data Exchange and E-Commerce
One would think that universal data exchange would
be simple given the current state of computing and the Internet. Sadly,
this is not the case. Deciphering and validating data formats
and interpreting content are difficult. Thankfully, using
XML as the basis for data exchange should help for a number of reasons.
Off-the-shelf Parsers –
Historically, electronic data exchange of non-standard data formats
required construction of proprietary parsers. With XML, standard
parsers are available.
Validation – As discussed earlier,
a validating XML parser can immediately check the content structurally
against a DTD, ensuring that all required fields are present and
in the proper relationship.
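The kind of structural check described above can be approximated by hand. The following minimal Python sketch is a stand-in for DTD validation, not a validating parser: it uses the standard library's non-validating parser and then tests for required child elements. The names REQUIRED_FIELDS and missing_fields, and the choice of required elements, are assumptions for this example.

```python
import xml.etree.ElementTree as ET

# Assumed set of child elements that every ORDER must contain.
REQUIRED_FIELDS = ("ORDER_NUMBER", "STATUS")


def missing_fields(xml_text):
    """Return (order_index, field_name) pairs for each absent required field."""
    root = ET.fromstring(xml_text)
    problems = []
    for index, order in enumerate(root.findall("ORDER")):
        for field in REQUIRED_FIELDS:
            if order.find(field) is None:
                problems.append((index, field))
    return problems


if __name__ == "__main__":
    incomplete = (
        "<ORDER_STATUS><ORDER><ORDER_NUMBER>100035</ORDER_NUMBER></ORDER>"
        "</ORDER_STATUS>"
    )
    print(missing_fields(incomplete))
```

With a DTD and a validating parser, rules like these live in the shared vocabulary definition rather than in each application's code, which is the advantage the text describes.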
Electronic Data Interchange (EDI)
EDI is a special type of electronic data
exchange. It relies on either the X12 or EDIFACT standards
to format documents (information) being exchanged. Additionally,
EDI is nearly always transmitted using a VAN (Value-Added Network).
Thus, EDI is very expensive to install, usually requires customization
depending upon the terms established by the exchanging parties,
and incurs ongoing costs for the VAN.
The high cost and complexity of EDI have
been a significant obstacle to its implementation. According
to the XML/EDI Group, only 2 percent of U.S. businesses are using
EDI. XML is viewed as the up-and-coming alternative that will
supplant EDI.
Conclusion
XML holds much promise. It has rapidly become
an industry-wide standard for defining and sharing data
in a clean, platform-neutral way. Work is under way to extend
the technology, and industry groups are developing domain-specific
markup languages.
TRADEMARKS. PowerVision and We Make
IT Happen! are service marks of PowerVision Corporation. Other
product and company names mentioned herein may be the trademarks
of their respective owners.