The Extensible Markup Language (XML) was developed in 1996 by the World Wide Web Consortium (W3C) XML Working Group. XML is a widely supported open technology for describing data that has become the standard format for data exchanged between applications over the Internet. In this tutorial, we introduce basic XML syntax. We also present an overview of technologies used to parse, validate and format XML documents.

[Notes: This tutorial is an excerpt (Section 19.2) of Chapter 19, XML, from our textbook Visual C# 2005 How to Program, 2/e (pages 931-934). This tutorial may refer to other chapters or sections of the book that are not included here. Permission Information: Deitel, Harvey M. and Paul J., Visual C# How to Program, 2/E ©2006. Electronically reproduced by permission of Pearson Education, Inc., Upper Saddle River, New Jersey.]

19.2   XML Basics (Continued)
Processing XML Documents
Processing an XML document requires software called an XML parser (or XML processor). A parser makes the document's data available to applications. While reading the contents of an XML document, a parser checks that the document follows the syntax rules specified by the W3C's XML Recommendation (www.w3.org/XML). XML syntax requires a single root element, a start tag and an end tag for each element, and properly nested tags (i.e., the end tag for a nested element must appear before the end tag of the enclosing element). Furthermore, XML is case sensitive, so the capitalization of each end tag must match that of its start tag. A document that conforms to this syntax is a well-formed XML document; that is, it is syntactically correct. We present fundamental XML syntax in Section 19.3. If an XML parser can process an XML document successfully, that XML document is well formed. Parsers can provide access to XML-encoded data only in well-formed documents.
Often, XML parsers are built into software such as Visual Studio or are available for download over the Internet. Popular parsers include Microsoft XML Core Services (MSXML), the Apache Software Foundation's Xerces (xml.apache.org) and the open-source Expat XML Parser (expat.sourceforge.net). In this chapter, we use MSXML.
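The well-formedness rules above can be tried out with any parser. The following is a minimal sketch, not the MSXML-based approach used in this chapter: it assumes Python and its standard xml.etree.ElementTree module (which is built on Expat), and the function name is_well_formed and the sample documents are hypothetical, invented for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts xml_text, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents (not taken from Fig. 19.1).
good = "<player><firstName>John</firstName><lastName>Doe</lastName></player>"
bad_nesting = "<player><firstName>John</player></firstName>"  # tags overlap
bad_case = "<player><firstName>John</firstname></player>"     # case mismatch
two_roots = "<player/><player/>"                              # two root elements

for label, doc in [
    ("good", good),
    ("bad nesting", bad_nesting),
    ("bad case", bad_case),
    ("two roots", two_roots),
]:
    print(label, is_well_formed(doc))
```

Only the first document is accepted. Each of the other three violates one of the rules listed above, so the parser reports an error and gives no access to the data.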
Validating XML Documents
An XML document can optionally reference a Document Type Definition (DTD) or a schema that defines the proper structure of the XML document. When an XML document references a DTD or a schema, some parsers (called validating parsers) can read the DTD/schema and check that the XML document follows the structure defined by the DTD/schema. If the XML document conforms to the DTD/schema (i.e., the document has the appropriate structure), the XML document is valid. For example, if the document in Fig. 19.1 referenced a DTD specifying that a player element must have firstName, lastName and battingAverage elements, then omitting the lastName element (line 8 in Fig. 19.1) would cause the XML document player.xml to be invalid. However, the XML document would still be well formed, because it follows proper XML syntax (i.e., it has one root element, and each element has a start tag and an end tag). By definition, a valid XML document is well formed. Parsers that cannot check for document conformity against DTDs/schemas are nonvalidating parsers; they determine only whether an XML document is well formed, not whether it is valid.
We discuss validation, DTDs and schemas, as well as the key differences between these two types of structural specifications, in Sections 19.5 and 19.6. For now, note that schemas are XML documents themselves, whereas DTDs are not. As you will learn in Section 19.6, this difference presents several advantages in using schemas over DTDs.
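The distinction between well formed and valid can be sketched in a few lines. The example below is a simplified stand-in, not a validating parser: it assumes Python's standard xml.etree.ElementTree module (which is nonvalidating) and hand-codes the one rule from the example above, that a player element must contain firstName, lastName and battingAverage. The function check_player and the sample documents are hypothetical; a real DTD or schema could also constrain element order, content and attributes.

```python
import xml.etree.ElementTree as ET

# Child elements that the hypothetical rule requires inside <player>.
REQUIRED_CHILDREN = ("firstName", "lastName", "battingAverage")


def check_player(xml_text: str) -> tuple[bool, bool]:
    """Return (well_formed, valid) for a document whose root should be <player>."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # A document that is not well formed cannot be valid.
        return (False, False)
    missing = [name for name in REQUIRED_CHILDREN if root.find(name) is None]
    valid = root.tag == "player" and not missing
    return (True, valid)


# Hypothetical sample documents (Fig. 19.1 is not reproduced here).
complete = (
    "<player><firstName>John</firstName><lastName>Doe</lastName>"
    "<battingAverage>0.375</battingAverage></player>"
)
no_last_name = (
    "<player><firstName>John</firstName>"
    "<battingAverage>0.375</battingAverage></player>"
)
broken = "<player><firstName>John</player>"

print(check_player(complete))      # well formed and valid
print(check_player(no_last_name))  # well formed but not valid
print(check_player(broken))        # not well formed, so not valid
```

The second document mirrors the case described above: omitting lastName leaves the syntax intact, so the document is still well formed, but it no longer has the required structure.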
Software Engineering Observation 19.1
DTDs and schemas are essential for business-to-business (B2B) transactions and mission-critical systems. Validating XML documents helps ensure that disparate systems can manipulate data structured in standardized ways and helps prevent errors caused by missing or malformed data.
Formatting and Manipulating XML Documents
XML documents contain only data, not formatting instructions, so applications that process XML documents must decide how to manipulate or display each document's data. For example, a PDA (personal digital assistant) may render an XML document differently than a wireless phone or a desktop computer. You can use Extensible Stylesheet Language (XSL) to specify rendering instructions for different platforms. We discuss XSL in Section 19.7.
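To illustrate the separation of data from presentation, the following minimal sketch renders the same XML data in two ways. It is not XSL; it assumes Python's standard xml.etree.ElementTree and html modules, and the names PLAYER_XML, render_text and render_html, as well as the sample data, are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical sample data; the XML carries no formatting instructions.
PLAYER_XML = (
    "<player><firstName>John</firstName><lastName>Doe</lastName>"
    "<battingAverage>0.375</battingAverage></player>"
)


def render_text(player: ET.Element) -> str:
    """Render a player as one compact line, e.g. for a small screen."""
    return (
        f"{player.findtext('lastName')}, {player.findtext('firstName')}: "
        f"{player.findtext('battingAverage')}"
    )


def render_html(player: ET.Element) -> str:
    """Render the same player as an HTML table row, e.g. for a browser."""
    cells = "".join(f"<td>{escape(child.text or '')}</td>" for child in player)
    return f"<tr>{cells}</tr>"


player = ET.fromstring(PLAYER_XML)
print(render_text(player))
print(render_html(player))
```

Each application decides how to present the data. XSL lets you express such rendering rules declaratively instead of hard-coding them in the program.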
XML-processing programs can also search, sort and manipulate XML data using technologies such as XSL. Some other XML-related technologies are XPath (XML Path Language, a language for accessing parts of an XML document), XSL-FO (XSL Formatting Objects, an XML vocabulary used to describe document formatting) and XSLT (XSL Transformations, a language for transforming XML documents into other documents). We present XSLT in Section 19.7. We also introduce XPath in Section 19.7, then discuss it in greater detail in Section 19.8.
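As a rough illustration of searching and sorting XML data, the sketch below assumes Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath (it is not a full XPath or XSLT processor). The document TEAM_XML and the functions find_by_last_name and last_names_by_average are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, invented for illustration.
TEAM_XML = """
<team>
  <player><firstName>John</firstName><lastName>Doe</lastName><battingAverage>0.375</battingAverage></player>
  <player><firstName>Jane</firstName><lastName>Roe</lastName><battingAverage>0.412</battingAverage></player>
  <player><firstName>Sam</firstName><lastName>Poe</lastName><battingAverage>0.290</battingAverage></player>
</team>
""".strip()


def find_by_last_name(team: ET.Element, last_name: str) -> list:
    """Search: select player elements whose lastName child has the given text.

    The name is pasted into the path as-is, so this sketch assumes it
    contains no quote characters.
    """
    return team.findall(f"./player[lastName='{last_name}']")


def last_names_by_average(team: ET.Element) -> list:
    """Sort: return last names ordered by batting average, highest first."""
    players = sorted(
        team.findall("player"),
        key=lambda p: float(p.findtext("battingAverage")),
        reverse=True,
    )
    return [p.findtext("lastName") for p in players]


team = ET.fromstring(TEAM_XML)
print([p.findtext("firstName") for p in find_by_last_name(team, "Roe")])
print(last_names_by_average(team))
```

The path expression in find_by_last_name uses a predicate in square brackets to filter elements by the text of a child element, which is the kind of selection XPath is designed for.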
 