Processing XML with PHP
Admin: Posted Date: April 4, 2010

PHP provides several functions to process XML-compliant documents by using the expat library. These functions are made available to the programmer if PHP has been compiled with expat.


In this tutorial you will learn the basics of using PHP to interpret XML.

A markup language is a collection of directions and special symbols that are inserted throughout a document. These directions can identify special sections of a document and further define the information's background and meaning.

A commonly used markup language on the Internet today is HyperText Markup Language (HTML). HTML directions (which are known as elements) are interspersed throughout a text file that can then be viewed by a web browser. The elements can affect the formatting of text, break the document up into logical sections such as headers, paragraphs and lists, and can even link several documents together.

The eXtensible Markup Language (XML) isn't really a language like HTML. Rather, XML is a subset of rules taken from a larger language known as SGML that lays down guidelines on how new markup languages should be written. Many new and specialized markup languages (called Applications) have cropped up, and anyone can use XML to write their own markup language to fit a certain need. Because an Application follows the XML specification, the information within its documents can be easily shared with others, and it is easier to write the programs that interpret them.

Some examples of popular XML Applications are:

   BSML - Bioinformatic Sequence Markup Language
   CDF - Channel Definition Format
   CKML - Conceptual Knowledge Markup Language
   CML - Chemical Markup Language
   EAD - Encoded Archival Description
   GedML - Genealogical Data in XML
   ICE - Information and Content Exchange
   IMS - Information Metadata Specification
   JSML - Java Speech Markup Language
   MathML - Mathematical Markup Language
   OFX - Open Financial eXchange
   OSD - Open Software Description
   RDF - Resource Description Framework
   RSS - Really Simple Syndication
   SMIL - Synchronized Multimedia Integration Language
   SVG - Scalable Vector Graphics
   TIM - Telecommunications Interchange Markup
   UXF - UML eXchange Format
   XHTML - eXtensible Hypertext Markup Language
   XML/EDI - XML/Electronic Data Interchange

The functions used in the rest of this tutorial belong to PHP's XML parser extension, which is built on the expat library. They are available only if PHP has been compiled with expat support.
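Before relying on these functions, a script can check whether they are present. The following is a minimal sketch, assuming the extension is registered under the name "xml"; the helper name xml_functions_available is made up for this illustration.

```php
<?php
// Hypothetical helper: reports whether the XML parser functions can be called.
// Assumption: the extension is registered under the name "xml".
function xml_functions_available() {
  return extension_loaded("xml") && function_exists("xml_parser_create");
}

if (xml_functions_available()) {
  echo "XML parser functions are available.\n";
} else {
  echo "XML parser functions are missing; enable or rebuild the xml extension.\n";
}
```

A check like this lets the script fail with a clear message instead of a fatal "undefined function" error.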

Processing XML with PHP - Create the Parser Object

An XML parser object is created to process the XML document with the function xml_parser_create. This object will be destroyed automatically when PHP has finished processing the script, but it may still be a good idea to destroy the object and free up resources with the xml_parser_free function, especially if there is still more script processing after the last use of the parser.

<?php
$parser = xml_parser_create();
/* xml processing code here */
xml_parser_free($parser);
?>

Processing XML with PHP - Configure the Parser


The behavior of the XML processor can be fine-tuned by passing the parser handle, the option and the desired value to xml_parser_set_option. Most often you will find yourself using this function to disable case folding (which oddly enough is enabled by default even though it goes against the XML specification).

<?php
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);
?>
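To see what case folding does, the same document can be parsed with the option on and off. This is only a sketch: element_names is a hypothetical helper, not part of PHP, and it uses anonymous functions as callbacks, which assumes PHP 5.3 or later.

```php
<?php
// Hypothetical helper: returns the element names reported to the
// opening element callback, with case folding switched on or off.
function element_names($xml, $case_folding) {
  $names = array();
  $parser = xml_parser_create();
  xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, $case_folding);
  xml_set_element_handler(
    $parser,
    function ($parser, $element, $attributes) use (&$names) {
      $names[] = $element;
    },
    function ($parser, $element) {
      /* closing tags are ignored in this sketch */
    }
  );
  // true marks this string as the final (and only) piece of the document
  xml_parse($parser, $xml, true);
  return $names;
}

print_r(element_names("<list><name>PHP</name></list>", true));  // LIST, NAME
print_r(element_names("<list><name>PHP</name></list>", false)); // list, name
```

With folding enabled the callbacks receive upper-cased names, so a comparison such as $element == "languages" would never match.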

Processing XML with PHP - Define the Callbacks



The processor will make use of callback functions to handle the opening tags, closing tags and content of the XML document. These functions are identified to the processor through use of the xml_set_element_handler and xml_set_character_data_handler functions. Each accepts the parser handle and the names of the callback functions we'll need to write later.

<?php
function opening_element($parser, $element, $attributes) {
  /* opening XML element callback function */
}

function closing_element($parser, $element) {
  /* closing XML element callback function */
}

function character_data($parser, $data) {
  /* callback function for character data */
}

xml_set_element_handler($parser, "opening_element", "closing_element");
xml_set_character_data_handler($parser, "character_data");
?>

Note that the callback functions receive arguments: the opening element callback function accepts the parser handle, the current element and the element's attributes. The closing element callback function accepts the parser handle and the current element. The callback function responsible for handling the character data accepts the parser handle as well as the data. The attributes passed to the opening element function arrive as an array in which the attribute names act as keys.
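The shape of the attributes array is easiest to see by collecting it. The sketch below uses a hypothetical helper, collect_attributes, and anonymous functions as callbacks (PHP 5.3 or later assumed); it is an illustration, not the only way to wire up the handlers.

```php
<?php
// Hypothetical helper: maps each element name to the attribute array
// that was passed to the opening element callback.
function collect_attributes($xml) {
  $found = array();
  $parser = xml_parser_create();
  xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);
  xml_set_element_handler(
    $parser,
    function ($parser, $element, $attributes) use (&$found) {
      $found[$element] = $attributes;
    },
    function ($parser, $element) {
      /* closing tags are ignored in this sketch */
    }
  );
  xml_parse($parser, $xml, true);
  return $found;
}

$found = collect_attributes('<languages type="interpreted"><name>PHP</name></languages>');
echo $found["languages"]["type"], "\n"; // interpreted
echo count($found["name"]), "\n";       // 0, because <name> has no attributes
```

An element without attributes still receives an array; it is simply empty. Case folding is disabled here because, when it is enabled, attribute names are upper-cased along with element names.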

Processing XML with PHP - Read the XML Document

Once the framework has been set up for the parser, it's time to read the XML document. This is accomplished by opening the file for standard reading and passing the information to the xml_parse function. Like the other functions, xml_parse first expects the parser handle.

<?php
$document = file("sample.xml");

foreach ($document as $line) {
  xml_parse($parser, $line);
}
?>
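The xml_parse function also takes an optional third argument that marks the final piece of data, and it returns a false value when the document is not well formed. A possible variation, sketched below with a hypothetical helper named parse_file and an arbitrary chunk size of 4096 bytes, reads the file in chunks and reports the first error it meets.

```php
<?php
// Hypothetical helper: feeds a file to the parser chunk by chunk.
// Returns an empty string on success or an error message on failure.
function parse_file($parser, $path) {
  if (!is_readable($path)) {
    return "could not read $path";
  }
  $handle = fopen($path, "r");
  while (!feof($handle)) {
    $chunk = fread($handle, 4096);
    // The third argument tells the parser whether this is the last chunk.
    if (!xml_parse($parser, $chunk, feof($handle))) {
      $message = sprintf(
        "XML error: %s at line %d",
        xml_error_string(xml_get_error_code($parser)),
        xml_get_current_line_number($parser)
      );
      fclose($handle);
      return $message;
    }
  }
  fclose($handle);
  return "";
}

$parser = xml_parser_create();
$error = parse_file($parser, "sample.xml");
echo $error === "" ? "parsed without errors\n" : "$error\n";
```

Some errors are only detected once the parser knows the document has ended, which is why the final flag matters for error reporting.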

Processing XML with PHP - Conclusion

The contents of your callback functions obviously depend on the XML document your script will process and how you want to display its information. Let's assume the following sample XML document, saved as sample.xml:
<list>
  <languages type="interpreted">
    <name>PHP</name>
    <name>Python</name>
    <name>Ruby</name>
  </languages>
  <languages type="compiled">
    <name>C</name>
    <name>Fortran</name>
    <name>Pascal</name>
  </languages>
</list>

We can write the functions of our framework to manipulate the contents of the sample XML file.

<?php
$compiled_langs = array();
$interprt_langs = array();
$flag = "";
$count = 0;

function opening_element($parser, $element, $attributes) {
  /* opening XML element callback function */
  global $flag;

  /* remember which kind of language list we are inside */
  if ($element == "languages")
    $flag = $attributes["type"];
}

function closing_element($parser, $element) {
  /* closing XML element callback function */
  global $flag;

  if ($element == "languages")
    $flag = "";
}

function character_data($parser, $data) {
  /* callback function for character data */
  global $flag;

  /* skip the whitespace that separates the elements */
  if (trim($data) === "")
    return;

  if ($flag == "compiled") {
    global $compiled_langs;
    $compiled_langs[] = $data;
  }

  if ($flag == "interpreted") {
    global $interprt_langs;
    $interprt_langs[] = $data;
  }
}

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);
xml_set_element_handler($parser, "opening_element", "closing_element");
xml_set_character_data_handler($parser, "character_data");

$document = file("sample.xml");

foreach ($document as $line) {
  xml_parse($parser, $line);
}

xml_parser_free($parser);

echo "The following compiled languages were found...<br />";
foreach ($compiled_langs as $name) {
  $count++;
  echo "$count. $name <br />";
}
echo "<br />";

$count = 0;
echo "The following interpreted languages were found...<br />";
foreach ($interprt_langs as $name) {
  $count++;
  echo "$count. $name <br />";
}
?>

When run, the script produces the following output:

The following compiled languages were found...
  1. C
  2. Fortran
  3. Pascal

The following interpreted languages were found...
  1. PHP
  2. Python
  3. Ruby

 

 
 