PHP provides several functions to process
XML-compliant documents by using the expat library. These functions are
made available to the programmer if PHP has been compiled with expat.
Processing XML with PHP
In this tutorial you will learn the basics of using PHP to interpret XML.
A markup
language is a collection of directions and special symbols that are
inserted throughout a document. These directions can identify special
sections of a document and further define the information's background
and meaning.
A commonly used markup language on the Internet
today is Hyper Text Markup Language (HTML). HTML directions (which are
known as elements) are interspersed throughout a text file that can
then be viewed by a web browser program. The elements can affect the
formatting of text, break the document up into logical sections such as
headers, paragraphs and lists and can even link several documents
together.
The eXtensible Markup Language (XML) isn't really a
language like HTML. Rather, XML is a set of rules, derived from a
larger standard known as SGML, that lays down guidelines on how new
markup languages should be written. Many new and specialized markup
languages (called Applications) have cropped up, and anyone can use XML
to write their own markup language to fit a certain need. Because an
Application follows the XML specification, the information within its
documents can be easily shared with others, and it is easier to write
the programs that interpret them.
Some examples of popular XML Applications are:
BSML - Bioinformatic Sequence Markup Language
CDF - Channel Definition Format
CKML - Conceptual Knowledge Markup Language
CML - Chemical Markup Language
EAD - Encoded Archival Description
GedML - Genealogical Data in XML
ICE - Information and Content Exchange
IMS - Information Metadata Specification
JSML - Java Speech Markup Language
MathML - Mathematical Markup Language
OFX - Open Financial eXchange
OSD - Open Software Description
RDF - Resource Description Framework
RSS - Really Simple Syndication
SMIL - Synchronized Multimedia Integration Language
SVG - Scalable Vector Graphics
TIM - Telecommunications Interchange Markup
UXF - UML eXchange Format
XHTML - eXtensible Hypertext Markup Language
XML/EDI - XML/Electronic Data Interchange
Processing XML with PHP - Create the Parser Object
An XML parser object is created with the function xml_parser_create to
process the XML document. This object will be destroyed automatically
when PHP has finished processing the script, but it may still be a
good idea to destroy the object and free up resources by using the
xml_parser_free function, especially if there is still more script
processing after the last use of the parser.
<?php
$parser = xml_parser_create();
/* xml processing code here */
xml_parser_free($parser);
?>
Processing XML with PHP - Configure the Parser
The behavior of the XML processor can be fine-tuned by passing the parser
handle, the option and the desired value to xml_parser_set_option. Most
often you will find yourself using this function to disable case
folding, which converts element names to upper case (oddly enough, it
is enabled by default even though XML names are case-sensitive).
<?php
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);
?>
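To make the effect of case folding concrete, here is a minimal sketch that parses the same fragment twice and records the element names the parser reports. The helper name collect_element_names and the tiny XML string are assumptions made for illustration only; the handlers are passed as closures rather than function names.

```php
<?php
// Hypothetical helper: return the element names reported by the parser,
// with case folding switched on or off.
function collect_element_names(string $xml, bool $fold): array {
    $names = [];
    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, $fold);
    xml_set_element_handler(
        $parser,
        // opening-tag handler: remember the name exactly as it is reported
        function ($parser, $element, $attributes) use (&$names) {
            $names[] = $element;
        },
        // closing-tag handler: nothing to do in this sketch
        function ($parser, $element) {
        }
    );
    // the third argument marks this as the final (and only) chunk of data
    xml_parse($parser, $xml, true);
    return $names;
}

$xml = '<list><name>PHP</name></list>';
$folded   = collect_element_names($xml, true);   // names are upper-cased
$unfolded = collect_element_names($xml, false);  // names are kept as written
print_r($folded);
print_r($unfolded);
```

With folding enabled, a comparison such as $element == "name" inside a callback would never match, which is why the option is usually switched off before any handlers run.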
Processing XML with PHP - Define the Callbacks
The processor makes use of callback functions to handle the opening
tags, closing tags and content of the XML document. These functions are
identified to the processor through the xml_set_element_handler and
xml_set_character_data_handler functions. Each accepts the parser
handle and the names of the callback functions, whose bodies we'll
fill in later.
<?php
function opening_element($parser, $element, $attributes) {
    /* opening XML element callback function */
}

function closing_element($parser, $element) {
    /* closing XML element callback function */
}

function character_data($parser, $data) {
    /* callback function for character data */
}

xml_set_element_handler($parser, "opening_element", "closing_element");
xml_set_character_data_handler($parser, "character_data");
?>
Note that the callback functions receive arguments: the opening XML element
callback function accepts the parser handle, the current element and
the element's attributes. The closing element callback function accepts
the parser handle and the current element. The callback function
responsible for handling the character data accepts the parser handle
as well as the data. The attributes passed to the opening element
function arrive as an array in which the attribute names act as keys
and the attribute values as the corresponding values.
Processing XML with PHP - Read the XML Document
Once the framework has been set up for the parser, it's time to read the XML
document. This is accomplished by opening the file for standard reading
and passing its contents to the xml_parse function. Like the other
functions, xml_parse expects the parser handle as its first argument.
<?php
$document = file("sample.xml");
foreach ($document as $line) {
    xml_parse($parser, $line);
}
/* tell the parser that no more data will follow */
xml_parse($parser, "", true);
?>
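Reading line by line works, but it silently ignores documents that are not well-formed. A possible variation, sketched here under the assumption of a hypothetical helper named parse_xml_file, feeds the file to the parser in fixed-size chunks and reports the first error using xml_get_error_code, xml_error_string and xml_get_current_line_number. The temporary file only exists to keep the example self-contained.

```php
<?php
// Hypothetical helper: feed a file to an existing parser in chunks.
// Returns null on success or a short error description on failure.
function parse_xml_file($parser, string $path): ?string {
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return "could not open $path";
    }
    $error = null;
    while (!feof($handle)) {
        $chunk = fread($handle, 4096);
        if ($chunk === false) {
            $error = "could not read from $path";
            break;
        }
        // the third argument is true only for the last chunk of the file
        if (!xml_parse($parser, $chunk, feof($handle))) {
            $error = sprintf(
                '%s at line %d',
                xml_error_string(xml_get_error_code($parser)),
                xml_get_current_line_number($parser)
            );
            break;
        }
    }
    fclose($handle);
    return $error;
}

// usage: one well-formed and one broken document in a temporary file
$path = tempnam(sys_get_temp_dir(), 'xml');

file_put_contents($path, '<list><name>PHP</name></list>');
$good = parse_xml_file(xml_parser_create(), $path);

file_put_contents($path, '<list><name>PHP</list>');
$bad = parse_xml_file(xml_parser_create(), $path);

unlink($path);
var_dump($good, $bad);
```

Chunked reading also avoids loading a large document into memory all at once, which is the main reason to prefer it over file() for bigger inputs.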
Processing XML with PHP - Conclusion
The contents of your callback functions are obviously dependent upon the
XML document your script will process and how you want to display its
information. Let's assume the following sample XML document:
<list>
  <languages type="interpreted">
    <name>PHP</name>
    <name>Python</name>
    <name>Ruby</name>
  </languages>
  <languages type="compiled">
    <name>C</name>
    <name>Fortran</name>
    <name>Pascal</name>
  </languages>
</list>
We can write the functions of our framework to manipulate the contents of the sample XML file.
<?php
$compiled_langs = array();
$interprt_langs = array();
$flag = "";
$count = 0;

function opening_element($parser, $element, $attributes) {
    /* opening XML element callback function:
       remember the type of the <languages> group being entered */
    global $flag;
    if ($element == "languages")
        $flag = $attributes["type"];
}

function closing_element($parser, $element) {
    /* closing XML element callback function:
       forget the type when the <languages> group ends */
    global $flag;
    if ($element == "languages")
        $flag = "";
}

function character_data($parser, $data) {
    /* callback function for character data */
    global $flag, $compiled_langs, $interprt_langs;
    /* the whitespace between elements is reported here as well; skip it */
    $data = trim($data);
    if ($data === "")
        return;
    if ($flag == "compiled")
        $compiled_langs[] = $data;
    if ($flag == "interpreted")
        $interprt_langs[] = $data;
}

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);
xml_set_element_handler($parser, "opening_element", "closing_element");
xml_set_character_data_handler($parser, "character_data");

$document = file("sample.xml");
foreach ($document as $line) {
    xml_parse($parser, $line);
}
/* tell the parser that no more data will follow */
xml_parse($parser, "", true);
xml_parser_free($parser);

echo "The following compiled languages were found...<br />";
foreach ($compiled_langs as $name) {
    $count++;
    echo "$count. $name <br />";
}
echo "<br />";

$count = 0;
echo "The following interpreted languages were found...<br />";
foreach ($interprt_langs as $name) {
    $count++;
    echo "$count. $name <br />";
}
?>
When run, the script produces the following output:
The following compiled languages were found...
1. C
2. Fortran
3. Pascal
The following interpreted languages were found...
1. PHP
2. Python
3. Ruby
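The parser is free to deliver the text of a single element in several pieces, so a more defensive variation collects the pieces in a buffer and stores the value only when the closing tag arrives. The sketch below is one possible way to do this without global variables; the function name group_languages and the closure-based handlers are assumptions for illustration, not part of the script above.

```php
<?php
// Hypothetical helper: group <name> values by the type attribute of the
// enclosing <languages> element.
function group_languages(string $xml): array {
    $groups = [];
    $type   = null;  // type of the current <languages> group, if any
    $buffer = null;  // text collected for the current <name>, null outside one

    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);
    xml_set_element_handler(
        $parser,
        function ($parser, $element, $attributes) use (&$type, &$buffer) {
            if ($element === 'languages') {
                $type = $attributes['type'] ?? 'unknown';
            } elseif ($element === 'name') {
                $buffer = '';  // start collecting text
            }
        },
        function ($parser, $element) use (&$groups, &$type, &$buffer) {
            if ($element === 'name') {
                if ($type !== null && $buffer !== null) {
                    $groups[$type][] = trim($buffer);
                }
                $buffer = null;  // stop collecting text
            } elseif ($element === 'languages') {
                $type = null;
            }
        }
    );
    xml_set_character_data_handler(
        $parser,
        // append every piece of text that arrives while inside a <name>
        function ($parser, $data) use (&$buffer) {
            if ($buffer !== null) {
                $buffer .= $data;
            }
        }
    );
    xml_parse($parser, $xml, true);
    return $groups;
}

$xml = '<list>
  <languages type="interpreted">
    <name>PHP</name><name>Python</name><name>Ruby</name>
  </languages>
  <languages type="compiled">
    <name>C</name><name>Fortran</name><name>Pascal</name>
  </languages>
</list>';
$groups = group_languages($xml);
print_r($groups);
```

Because text is only buffered between an opening and a closing name tag, the whitespace between elements never reaches the result.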