Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable.
The design goals of XML emphasize simplicity, generality, and usability across the Internet. It is a textual data format with strong support via Unicode for different human languages.
Although the design of XML focuses on documents, the language is widely used for the representation of arbitrary data structures such as those used in web services.
XML is mainly designed to carry (or transmit) data, not to display it. XML tags are not predefined; users define their own tags.
Example:
<root>
<child>
<subchild>...........</subchild>
</child>
</root>
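The nesting in the example above can be explored with any XML parser. The following is a minimal sketch that assumes Python's standard-library xml.etree.ElementTree module; the helper name outline and the placeholder text are illustrative and not part of XML itself.

```python
import xml.etree.ElementTree as ET

# The nested structure from the example above, with placeholder text.
DOCUMENT = "<root><child><subchild>some text</subchild></child></root>"


def outline(element, depth=0):
    """Return one (depth, tag, text) tuple per element, in document order."""
    text = (element.text or "").strip()
    rows = [(depth, element.tag, text)]
    for child in element:
        rows.extend(outline(child, depth + 1))
    return rows


root = ET.fromstring(DOCUMENT)
for depth, tag, text in outline(root):
    # Indent each tag by its nesting depth to make the tree visible.
    print("  " * depth + tag, text)
```

The tag names carry no built-in meaning for the parser; it only reports the tree that the user-defined tags describe.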
XML Architecture :
CDATA & PCDATA :
By default, all text inside an XML document is parsed for markup. Text inside a CDATA section is the exception: the parser does not interpret it as markup and passes it through as literal character data.
PCDATA - Parsed Character Data
XML parsers normally parse all the text in an XML document. When an XML element is parsed, the text between the XML tags is also parsed:
<message>This text is also parsed</message>
The parser does this because XML elements can contain other elements, as in this example, where the <name> element contains two other elements (first and last):
Example:
<name>
<first>Bill</first>
<last>Gates</last>
</name>
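One consequence of parsing element text is that a bare "<" inside the text is read as the start of a tag. The sketch below illustrates this, assuming Python's standard-library xml.etree.ElementTree module; the helper is_well_formed is a hypothetical name used only for this illustration.

```python
import xml.etree.ElementTree as ET

# The parser finds the two child elements inside <name>.
name = ET.fromstring("<name><first>Bill</first><last>Gates</last></name>")
children = [(child.tag, child.text) for child in name]


def is_well_formed(text):
    """Return True if the parser accepts the text, False if it reports an error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# A bare "<" in parsed text looks like the start of a tag, so it is rejected.
raw_ok = is_well_formed("<message>1 < 2</message>")
# Escaping it as &lt; keeps the document well-formed.
escaped_ok = is_well_formed("<message>1 &lt; 2</message>")

print(children, raw_ok, escaped_ok)
```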
CDATA - (Unparsed) Character Data
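The term CDATA refers to text data that should not be parsed by the XML parser. The parser does not look for markup inside a CDATA section, so characters such as "<" and "&" are treated as literal text. To avoid errors, script code can be placed in a CDATA section.
A CDATA section starts with "<![CDATA[" and ends with "]]>". Because "]]>" marks the end, a CDATA section cannot itself contain that string.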
Example:
<child>
<![CDATA[
function matchtwo(a, b)
{
    if (a < b && a < 0)
    {
        return 1;
    }
    else
    {
        return 0;
    }
}
]]>
</child>
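The effect of a CDATA section can be checked with a parser. The following sketch assumes Python's standard-library xml.etree.ElementTree module and uses a shortened, illustrative version of the script text.

```python
import xml.etree.ElementTree as ET

# The same script text, once wrapped in a CDATA section and once not.
with_cdata = "<child><![CDATA[if (a < b && a < 0) { return 1; }]]></child>"
without_cdata = "<child>if (a < b && a < 0) { return 1; }</child>"

# Inside CDATA, "<" and "&" reach the application unchanged as element text.
script = ET.fromstring(with_cdata).text

# Outside CDATA, the same characters are treated as markup and cause an error.
try:
    ET.fromstring(without_cdata)
    plain_parses = True
except ET.ParseError:
    plain_parses = False

print(script)
print(plain_parses)
```

Note that the application receives the CDATA content as ordinary element text; the CDATA markers themselves are not part of the text.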
Document Type Definitions :
A Document Type Definition (DTD) is a set of structural rules, called declarations, which specify the elements that can appear in a document as well as how and where these elements may appear.
The purpose of a DTD is to provide a standard form for a collection of XML documents. Not all XML documents have or need a DTD.
Types of DTD:
There are two types of DTD:
Internal DTD (appears within an XML document)
External DTD (appears as an external file – can be used with more than one document)
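The two forms can be sketched as follows. This example assumes Python's standard-library xml.etree.ElementTree module; the note, to, and body element names and the file name note.dtd are hypothetical. The standard-library parser reads the DOCTYPE declaration but does not validate against it and does not load external DTD files, so a validating parser would be needed to enforce the rules.

```python
import xml.etree.ElementTree as ET

# Internal DTD: the declarations sit inside the document's DOCTYPE.
INTERNAL = """<!DOCTYPE note [
  <!ELEMENT note (to, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note><to>Reader</to><body>Hello</body></note>"""

# External DTD: the DOCTYPE only points at a separate file (hypothetical name).
EXTERNAL = """<!DOCTYPE note SYSTEM "note.dtd">
<note><to>Reader</to><body>Hello</body></note>"""

# This document breaks the declared content model (the <to> element is missing).
INVALID = """<!DOCTYPE note [
  <!ELEMENT note (to, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note><body>Hello</body></note>"""

internal_fields = {child.tag: child.text for child in ET.fromstring(INTERNAL)}
external_fields = {child.tag: child.text for child in ET.fromstring(EXTERNAL)}

# A non-validating parser still accepts the invalid document.
invalid_tags = [child.tag for child in ET.fromstring(INVALID)]

print(internal_fields, external_fields, invalid_tags)
```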
Declaring Elements within DTD:
DTD contains declarations that define elements, attributes, etc.
Syntax:
<!ELEMENT element-name (element-content)>
Example:
<!ELEMENT person (parent, age, spouse, sibling)>
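The comma-separated list in the person declaration means that the child elements must appear exactly once each, in the given order. The following hand-rolled check is only a sketch of that rule, assuming Python's standard-library xml.etree.ElementTree module; it is not a DTD validator, and the function name matches_sequence is illustrative.

```python
import xml.etree.ElementTree as ET

# Child sequence taken from the person declaration in the example above.
PERSON_CHILDREN = ["parent", "age", "spouse", "sibling"]


def matches_sequence(element, expected_children):
    """True if the element's children are exactly the expected tags, in order."""
    return [child.tag for child in element] == expected_children


valid = ET.fromstring("<person><parent/><age/><spouse/><sibling/></person>")
reordered = ET.fromstring("<person><age/><parent/><spouse/><sibling/></person>")
missing = ET.fromstring("<person><parent/><age/></person>")

print(matches_sequence(valid, PERSON_CHILDREN))
print(matches_sequence(reordered, PERSON_CHILDREN))
print(matches_sequence(missing, PERSON_CHILDREN))
```

A real validator would also check the declarations of the child elements themselves, which this sketch leaves out.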
Declaring Empty Elements within DTD:
Empty elements are declared with the keyword EMPTY.
Syntax:
<!ELEMENT element-name EMPTY>
Example:
<!ELEMENT br EMPTY>
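An element declared EMPTY may have neither child elements nor text. A rough check of that condition might look like the sketch below, assuming Python's standard-library xml.etree.ElementTree module; is_empty is an illustrative helper, not a validator.

```python
import xml.etree.ElementTree as ET


def is_empty(element):
    """True if the element has no child elements and no text at all."""
    return len(element) == 0 and element.text is None


self_closed = ET.fromstring("<br/>")
open_close = ET.fromstring("<br></br>")
with_text = ET.fromstring("<br>text</br>")

# <br/> and <br></br> are two spellings of the same empty element.
print(is_empty(self_closed), is_empty(open_close), is_empty(with_text))
```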
XML Namespaces :
XML Namespaces provide a method to avoid element name conflicts. To use XML Namespaces, elements are given qualified names.
In XML, element names are defined by the developer. This often results in a conflict when trying to mix XML documents from different XML applications.
For example, this file carries HTML table information:
<table>
<tr>
<td>Apples</td>
<td>Bananas</td>
</tr>
</table>
Whereas this XML file uses user-defined tags to describe a table (a piece of furniture):
<table>
<name>African Coffee Table</name>
<width>80</width>
<length>120</length>
</table>
If these two fragments were combined in one document, there would be a name conflict. Both contain a <table> element, but the elements have different content and meaning. An XML parser has no way to tell the two kinds of <table> apart.
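The conflict can be made concrete with a short sketch, assuming Python's standard-library xml.etree.ElementTree module and a hypothetical <root> wrapper around the two fragments.

```python
import xml.etree.ElementTree as ET

# Both fragments placed under one (hypothetical) root element.
COMBINED = """<root>
  <table><tr><td>Apples</td><td>Bananas</td></tr></table>
  <table><name>African Coffee Table</name><width>80</width><length>120</length></table>
</root>"""

root = ET.fromstring(COMBINED)

# A search by name returns both elements; the name alone cannot separate them.
tables = root.findall("table")
child_tags = [[child.tag for child in table] for table in tables]

print(len(tables))
print(child_tags)
```

The document is well-formed, so the parser raises no error; the ambiguity only surfaces when an application looks up <table> and receives two unrelated structures.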
Name conflicts in XML can easily be avoided by giving each set of element names its own prefix.
Example :
<h:table>
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
<f:table>
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
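A prefix on its own is not enough for a namespace-aware parser: the prefix must also be bound to a namespace. The sketch below illustrates this, assuming Python's standard-library xml.etree.ElementTree module, which is namespace-aware; parse_error is an illustrative helper name.

```python
import xml.etree.ElementTree as ET


def parse_error(text):
    """Return the parser's error message, or None if the text parses."""
    try:
        ET.fromstring(text)
        return None
    except ET.ParseError as error:
        return str(error)


# The prefix h is used but never declared, so the parser reports an error.
undeclared = parse_error("<h:table><h:td>Apples</h:td></h:table>")

# Declaring the prefix with an xmlns:h attribute makes the same markup acceptable.
declared = parse_error(
    '<h:table xmlns:h="http://www.w3.org/1999/xhtml/"><h:td>Apples</h:td></h:table>'
)

print(undeclared)
print(declared)
```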
The xmlns Attribute:
When using prefixes in XML, a so-called namespace for the prefix must be defined. The namespace is defined by the xmlns attribute in the start tag of an element.
Syntax :
xmlns:prefix="URI"
Example :
<root>
<h:table xmlns:h="http://www.w3.org/1999/xhtml/">
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
<f:table xmlns:f="http://www.myschools.com/furniture">
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
</root>
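Once the prefixes are declared, a parser identifies each element by its namespace URI together with its local name, so the two table elements no longer collide. The following is a minimal sketch assuming Python's standard-library xml.etree.ElementTree module; the NAMESPACES mapping is local to the script, and only its URIs need to match the document.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """<root>
  <h:table xmlns:h="http://www.w3.org/1999/xhtml/">
    <h:tr><h:td>Apples</h:td><h:td>Bananas</h:td></h:tr>
  </h:table>
  <f:table xmlns:f="http://www.myschools.com/furniture">
    <f:name>African Coffee Table</f:name>
    <f:width>80</f:width>
    <f:length>120</f:length>
  </f:table>
</root>"""

# Prefix-to-URI mapping used for searching; the prefixes could be renamed freely.
NAMESPACES = {
    "h": "http://www.w3.org/1999/xhtml/",
    "f": "http://www.myschools.com/furniture",
}

root = ET.fromstring(DOCUMENT)

# ElementTree stores each tag as "{namespace-uri}local-name".
table_tags = [child.tag for child in root]

# Each search now matches only the table from its own namespace.
cells = [td.text for td in root.findall("h:table/h:tr/h:td", NAMESPACES)]
furniture_name = root.findtext("f:table/f:name", namespaces=NAMESPACES)

print(table_tags)
print(cells)
print(furniture_name)
```

The URI serves only as a unique identifier; the parser does not fetch anything from it.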
Applications of XML:
Hundreds of document formats using XML syntax have been developed, including RSS, Atom, SOAP, SVG, and XHTML. XML-based formats have become the default for many office-productivity tools, including Microsoft Office (Office Open XML), OpenOffice.org and LibreOffice (OpenDocument), and Apple's iWork.
XML has come into common use for the interchange of data over the Internet. IETF RFC 7303 gives rules for the construction of Internet Media Types for use when sending XML. It also defines the media types application/xml and text/xml, which say only that the data is in XML, and nothing about its semantics.