Introduction
One of the
incubator projects in the Zend Framework is Zend_Db_Xml. Zend_Db_Xml,
also known as the XML Content Store (XCS), allows web applications that
use XML data to easily update, save, and otherwise manage this data. In
this article I will introduce the XCS persistence API and discuss an
implementation using IBM’s DB2 9 database with its pureXML technology.
Finally, I’ll discuss a sample social networking application to show
how easy and fun it is to develop XML-centric applications using the
XCS.
XML Content Store (XCS)
One of the major
advantages of using an application development framework, such as the
Zend Framework (ZF), is that it provides an abstraction to the database
layer for data-driven Web sites. Data abstraction is helpful because it
allows the developer to concentrate on the behavior of data rather than
the often tedious details of database access and manipulation. We have
seen this with the Zend_Db package in the ZF, with such objects as
Zend_Db_Adapter, Zend_Db_Table, and Zend_Db_Select. With the
ever-increasing proliferation of XML data over the Web for better or
for worse, there is also a need to abstract the mechanics of persisting
XML data, including Create, Read, Update, and Delete (CRUD) operations.
The
XCS is an incubator project in the ZF and provides both a persistence
data access layer and an API for managing XML data easily. As an
introduction to the XCS, I’ll provide an architectural overview by
first describing the components that make up the XCS and by explaining
how they work together. Then we will put it all into action by using the
XCS to create a small social networking application built on top of the
ZF.
XMLContent (Zend_Db_Xml_XmlContent)
As
developers, we may encounter XML data in many forms, such as Web
Service messages, RSS/Atom feeds, and configuration files. Once we
determine that our application must “talk” XML and that the XML needs
to be saved somewhere, we can assume several things about this data.
- First,
we need a way to uniquely identify it, so that once it is saved, it can
easily be found and retrieved programmatically. The unique name can be
a numeric id or a user-provided name.
- The second assumption
is that the XML data will be stored as is. It will not be modified or
changed in any way. No additional header or metadata elements or
attributes will be added and certainly nothing will be removed. Any
modifications that are required to the actual data elements will be
done by the application outside the XCS. Internally, the XML data is
stored as a DOM document, but an application is free to access the data
as a file stream, a string, or several other convenient access methods
which may or may not be implementation-dependent.
- Third, if
metadata is needed, the capability will be provided to add it, but it
will be saved separate from the XML data. For example, if the XML data
is a blog entry, perhaps the application would care to know the date
and title of the entry, or the hostname where the entry originated. The
metadata is saved in an “about” property and is also XML.
- Finally,
XML data is often accompanied by binary data, such as .jpeg,
.pdf, or .doc files. An “attachment” property associates this
binary data with the XML data. In the current implementation of
Zend_Db_Xml_XmlContent, the attachment property can contain at most
one item, though a future version may allow any
number of items.
The XML data and its properties
(id, about, and attachment) are encapsulated in an object called
Zend_Db_Xml_XmlContent. Zend_Db_Xml_XmlContent objects are the
fundamental components of the XCS as they are the XCS representation of
XML data. As we will see next, the Zend_Db_Xml_XmlContentStore
component needs to know about the persistence technology (for example,
a relational database) used and how to access it, but
Zend_Db_Xml_XmlContent objects need not know anything about it.
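To make the four properties concrete, here is a minimal sketch of such a value object. The class name SketchXmlContent is hypothetical and is not the incubator class; it only assumes the properties described above (id, data, about, attachment) and uses the standard PHP DOM extension.

```php
<?php
// Hypothetical stand-in for the XCS content object; not the incubator class itself.
class SketchXmlContent
{
    /** Unique identifier, typically assigned when the object is saved. */
    public $id = null;
    /** DOMDocument holding the XML payload, stored as is. */
    public $data;
    /** DOMDocument holding separate XML metadata, or null. */
    public $about = null;
    /** Binary attachment as a string, or null (at most one item). */
    public $attachment = null;

    public function __construct(DOMDocument $data)
    {
        $this->data = $data;
    }
}

// A blog entry as the payload
$doc = new DOMDocument();
$doc->loadXML('<entry><title>Hello</title></entry>');

// Metadata lives in its own document and never touches the payload
$about = new DOMDocument();
$about->loadXML('<about><host>example.org</host></about>');

$content = new SketchXmlContent($doc);
$content->about = $about;

$payload = $content->data->saveXML($content->data->documentElement);
echo $payload, "\n";
```

Note that the object knows nothing about where it will be stored; that concern belongs to the content store.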
XMLContentStore (Zend_Db_Xml_XmlContentStore)
Zend_Db_Xml_XmlContentStore
is an abstract class that represents a repository of XML documents. It
is responsible for updating the data source based on changes made to a
Zend_Db_Xml_XmlContent object in the repository, as well as retrieving
Zend_Db_Xml_XmlContent objects from the data source based on search or
id criteria. A data source is defined very generally as the persistence
layer where the XML data is stored. It can be a relational database, an
XML database, or a file system, and it stores the XML data in its own
format. When a Zend_Db_Xml_XmlContentStore object is instantiated, it
receives a “connection handle” that describes the data source and how
to reach it.
In the ZF, it becomes very convenient to
allow the connection handle to be a Zend_Db_Adapter object which allows
the persistence layer to be a relational database. Then, a call to the
insert() method on a Zend_Db_Xml_XmlContentStore object will allow the
underlying Zend_Db_Adapter object to build an appropriate SQL insert
statement based on the structure of the underlying tables used and the
contents of the Zend_Db_Xml_XmlContent object. It will connect to the
database and execute the statement. Other CRUD methods work in a
similar fashion and include: update(), delete(), deleteById(), and
selectAll().
Zend_Db_Xml_XmlContentStore also contains a
simple search facility that retrieves Zend_Db_Xml_XmlContent objects by
id or by searching within the XML data or the “about” metadata. The
search on XML data is done using XPath
expressions. These methods are find() and findById(). There is also a
method, executeXPathPredicateQuery(), that does simple XPath searches on
the data.
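The idea of looking documents up by id or by an XPath expression can be sketched without a database. The class below is a hypothetical in-memory store written only for illustration; its method names echo the ones mentioned above, but the signatures are assumptions and it is not how the DB2 implementation works internally. It relies only on the standard DOMDocument and DOMXPath classes.

```php
<?php
// Hypothetical in-memory store; illustrates id and XPath lookups only.
class SketchMemoryStore
{
    private $docs = array();   // id => DOMDocument
    private $nextId = 1;

    // Saves a document and returns the id assigned to it.
    public function insert(DOMDocument $doc)
    {
        $id = $this->nextId++;
        $this->docs[$id] = $doc;
        return $id;
    }

    // Returns the document with the given id, or null if there is none.
    public function findById($id)
    {
        return isset($this->docs[$id]) ? $this->docs[$id] : null;
    }

    // Returns the ids of all documents in which the expression selects at least one node.
    public function findByXPath($expression)
    {
        $matches = array();
        foreach ($this->docs as $id => $doc) {
            $xpath = new DOMXPath($doc);
            $nodes = $xpath->query($expression);
            if ($nodes !== false && $nodes->length > 0) {
                $matches[] = $id;
            }
        }
        return $matches;
    }
}

$store = new SketchMemoryStore();
foreach (array('CA', 'NY', 'CA') as $state) {
    $doc = new DOMDocument();
    $doc->loadXML("<member><state>$state</state></member>");
    $store->insert($doc);
}

// Find every member whose <state> is CA
$californians = $store->findByXPath('/member[state="CA"]');
print_r($californians);
```

A real store would push the XPath expression down to the persistence layer rather than loop over every document in PHP.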
An Implementation using the DB2 Database
DB2
Express C V9 provides innovative pureXML technology to store and manage
XML data as a native data structure. This makes it very easy to store
and retrieve XML data without having to map or “shred” the XML data
into relational columns. The class Zend_Db_Xml_XmlContentStore_Db2 is
implemented using the Zend_Db_Adapter for DB2. The adapter uses the
ibm_db2 CLI driver. You can get a copy of DB2 Express C V9 and the
ibm_db2 driver by installing the Zend Core for IBM product. Please see
the resources section at the end of this article to learn where to
download Zend Core for IBM.
Because DB2 V9 supports a native
XML data type, one Zend_Db_Xml_XmlContentStore_Db2 object maps to one
table with four columns. These columns are:
- id, a unique integer and used as the primary key of the table
- data, defined as an XML column
- about, also defined as an XML column
- attachment, defined as a BLOB column
Using DB2, each row in the table represents one Zend_Db_Xml_XmlContent object.
The following classes are helper classes that allow for easier processing of Zend_Db_Xml_XmlContent objects.
Zend_Db_Xml_XmlIterator
It is possible that a search returns one
Zend_Db_Xml_XmlContent or a set of Zend_Db_Xml_XmlContent objects. In
the case where a set is returned, the Zend_Db_Xml_XmlIterator class is
used to easily iterate over the set of XML documents that meet the
search criteria. Zend_Db_Xml_XmlIterator implements the Iterator
interface so it knows several essential things about the set of
Zend_Db_Xml_XmlContent objects over which it is iterating. These
include its current location in the set, how to retrieve the next
object in the set, how to go back to the beginning of the set, and when
it has reached the last item in the set. This allows the developer to
assign behavior on the XML data at each iteration, using a foreach
construct for example, without having to worry about the details of
loop control.
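The four responsibilities listed above map directly onto the methods of PHP's built-in Iterator interface. The following is a minimal sketch over an array of XML strings; the class name SketchXmlIterator is hypothetical, and the typed properties and return type declarations assume PHP 8.0 or later.

```php
<?php
// Hypothetical iterator over a result set; illustrates the Iterator contract only.
class SketchXmlIterator implements Iterator
{
    private array $items;
    private int $position = 0;

    public function __construct(array $items)
    {
        $this->items = array_values($items); // reindex so positions run 0..n-1
    }

    // The item at the current location in the set
    public function current(): mixed { return $this->items[$this->position]; }
    // The current location itself
    public function key(): mixed { return $this->position; }
    // Advance to the next item
    public function next(): void { $this->position++; }
    // Go back to the beginning of the set
    public function rewind(): void { $this->position = 0; }
    // False once we have moved past the last item
    public function valid(): bool { return isset($this->items[$this->position]); }
}

$results = new SketchXmlIterator(array(
    '<member><fname>Sal</fname></member>',
    '<member><fname>Salvador</fname></member>',
));

// foreach drives rewind(), valid(), current(), and next() for us
$names = array();
foreach ($results as $xml) {
    $names[] = (string) simplexml_load_string($xml)->fname;
}
echo implode(', ', $names), "\n";
```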
Zend_Db_Xml_XmlUtil
Zend_Db_Xml_XmlUtil is a utility class that provides static
convenience methods for passing XML data back and forth from the
application to Zend_Db_Xml_XmlContent, either for the raw XML data or
for the “about” metadata. Though the XML is stored internally as a DOM,
Zend_Db_Xml_XmlUtil allows an application to use strings, file streams,
SimpleXML, or any other implementation-specific object representation
of XML data. Convenience methods for converting back and forth between
these different types of representations and DOM are provided.
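The conversions that such a utility class wraps are available in plain PHP. The sketch below uses only the built-in functions simplexml_import_dom() and dom_import_simplexml() and does not call Zend_Db_Xml_XmlUtil itself, so it shows the underlying technique rather than the XCS API.

```php
<?php
$xml = '<contacts><entry id="101"><relationship>Co-worker</relationship></entry></contacts>';

// String -> DOM
$dom = new DOMDocument();
$dom->loadXML($xml);

// DOM -> SimpleXML, then make a change using the friendlier SimpleXML API
$sxml = simplexml_import_dom($dom);
$sxml->entry[0]->addChild('relationship', 'Mentor');

// SimpleXML -> DOM: dom_import_simplexml() returns a DOMElement,
// and its ownerDocument gives us the document again
$element = dom_import_simplexml($sxml);
$roundTrip = $element->ownerDocument;

// DOM -> string
echo $roundTrip->saveXML($roundTrip->documentElement), "\n";
```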
Some XCS Applications
So
now you have seen the components that make up the XCS:
Zend_Db_Xml_XmlContent, Zend_Db_Xml_XmlContentStore,
Zend_Db_Xml_XmlIterator, and Zend_Db_Xml_XmlUtil. But what types of
applications would use the XCS?
There are several types of applications that will benefit from the XCS architecture. Here are a few examples:
- RSS/Atom Feed Aggregator
An RSS/Atom Feed Aggregator application is well-suited for the XCS. A
typical use case is that a user can input different feed URLs and the
application can periodically go out and retrieve feed updates and store
them in the XCS. These feeds can be displayed, searched, and possibly
be published as a “feed of feeds” as well.
- Content Management Systems
A content management system (CMS) is a computer software system for
organizing and facilitating collaborative creation of documents and
other content. Storing content as XML in the XCS allows easy retrieval
and search. “about” metadata can be used for workflow management and
processing. Though the data is stored as XML, a CMS application can
export documents as needed in .pdf, .doc, html, etc. by transforming
the XML into the required format. Storing the data as XML will be
transparent to the user of the application.
- Web Services and Mashups
Web Services send messages using XML data. Many applications can
benefit by storing data to be served in the XCS. As a web service
request comes in, the result can be easily retrieved or even composed
of several Zend_Db_Xml_XmlContent objects. Similarly, a mashup is a
website or web application that seamlessly combines content from more
than one source into an integrated experience. The content used in
mashups typically comes from a third party via a public interface or
API. Most of these APIs return data as XML, and the data is often
refreshed using AJAX, which commonly returns XML data. The XCS can be
used to maintain and manage this data. By querying and joining the
data, interesting scenarios can be created dynamically, in essence
creating a “mashup of mashups”.
I
have only listed a few applications of the XCS, but essentially, any
application that requires the storage, processing, and interchange of
XML data can be easily implemented using the XCS. In the world of Web
2.0, this may mean all applications!
A Sample Application: my.Net.wrk
Enough of the theory behind the XCS. Let’s see some code!
Sites
such as myspace.com (http://www.myspace.com/), LinkedIn
(http://www.linkedin.com/), and Friendster
(http://www.friendster.com/index.php) allow people to interact with
each other and to form social networks. We will build a simple social
networking site using the ZF and the XCS. It will not contain the full
functionality and features that the other sites contain, but will
contain the same, basic functionality:
- Create and maintain a user profile for social/professional networking
- Search for people within networks
- Make contacts with people and collaborate with them
Application Overview and Architecture
Users
will be able to register and log into the my.Net.wrk application. User
login information will be stored in a member table in the database.
This data will be updated and searched using the standard
Zend_Db_Adapter.
By registering, a user is also creating a
profile with their interests and experience. A user’s profile will be
stored as XML data in the XCS. A typical profile for someone
hypothetically named George Smith might look like this:
<member>
    <fname>George</fname>
    <lname>Smith</lname>
    <email>gsmith@company.com</email>
    <city>San Francisco</city>
    <state>CA</state>
    <zipcode>11111</zipcode>
    <org>USA</org>
    <title>President</title>
    <industry>Travel and Hospitality</industry>
    <education>Masters Degree</education>
    <exp>leader, some experience as a general</exp>
</member>
If you recall, a Zend_Db_Xml_XmlContent object has an
“about” property. This is a convenient place to store a user’s contacts
and relationships. We don’t have to store all the contact information.
Since the Zend_Db_Xml_XmlContent object also has an id property,
knowing your contacts’ id is enough to be able to pull the information
if we need it. George’s contact list might look like this:
<contacts>
    <entry id="101">
        <relationship>Co-worker</relationship>
    </entry>
    <entry id="211">
        <relationship>Co-worker</relationship>
        <relationship>Mentor</relationship>
    </entry>
</contacts>
The id gives us an easy way to look up information on
George’s contacts. Also notice how one person can have multiple
relationships with someone.
This is the basic data model. The XCS and the Zend Framework will help put it all together.
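As a quick illustration of how little code the contact list needs, here is a sketch that reads George's contacts into an array keyed by contact id. It uses plain SimpleXML on a string and assumes nothing about the XCS; the variable names are made up for the example.

```php
<?php
$about = <<<XML
<contacts>
    <entry id="101">
        <relationship>Co-worker</relationship>
    </entry>
    <entry id="211">
        <relationship>Co-worker</relationship>
        <relationship>Mentor</relationship>
    </entry>
</contacts>
XML;

$sxml = simplexml_load_string($about);

// Build contact id => list of relationships
$byId = array();
foreach ($sxml->entry as $entry) {
    $relationships = array();
    foreach ($entry->relationship as $rel) {
        $relationships[] = (string) $rel;
    }
    $byId[(int) $entry['id']] = $relationships;
}
print_r($byId);
```

Each id in the resulting array can then be used to fetch the contact's own profile.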
Plugging into the Zend Framework
The
application is built on top of the ZF. I will assume that you are
already familiar with the basic MVC setup of applications in the
framework so I will only describe the important pieces of the
application as they pertain to the XCS. I will also assume that you are
using the Apache httpd server. Finally, I won’t talk too much about the
views. I will just say that the views grab and output data used by the
application. For more information about the ZF, please see the
resources at the end of this article.
To use the XCS
functionality, you will need the following directory from version 0.1.4
or later of the ZF: /incubator/library/Zend/Db/Xml. Copy this directory
to lib/Zend/Db/Xml of your actual framework installation under the
Apache htdocs directory. The general framework set up on my machine
looks like this under htdocs/zframework:
/app
    /controllers
    /views
/www
    /images
    /styles
    .htaccess
    index.php
/lib
    /Zend
        /Db
            /Xml
            _other Zend_Db components, including Zend_Db_Adapter
        _other ZF components
| Directory | Name | Type | Description |
| www | index.php | Bootstrap | XCS is initialized here |
| | style1.css | CSS | Style sheet for html presentation |
| controllers | IndexController.php | Controller | Main controller |
| | AddController.php | Controller | Creating a profile and adding contacts |
| | ViewController.php | Controller | Main logic for navigating the site |
| views | index.php | View | Initial view |
| | member.php | View | Member profile data |
| | searchResult.php | View | Displays search results |
| | view.php | View | Used to get new user profile data |
| | thanks.php | View | Acknowledgement of successful user action |
| | error.php | View | Displays error messages |
| lib | Database.php | Encapsulates XCS | XCS and database functions |
Setting up the Database
The database
for the my.Net.wrk application is simple. We will store login
information in a table called db2admin.member. Remember that profiles
and contact information will be stored in the XCS as XML data.
We
will use the Zend_Db_Xml_XmlContent “data” property to track the user
profile and the “about” property to track a user’s set of contacts.
There are only a couple of DB2 administration commands needed. These may be executed on the DB2 command line:
-- create the database
create database contacts using codeset utf-8 territory us;
connect to contacts;
-- create the member table
create table db2admin.member(
xmlid bigint,
email varchar(50) not null primary key,
passwd varchar(10),
fname varchar(30),
lname varchar(30));
That’s it, we are done with the database! Notice that we
didn’t create the XCS. When the Zend_Db_Xml_XmlContentStore_Db2 object
is instantiated, the underlying DB2 table is created automatically
(that is, if it didn’t exist already – we wouldn’t want to delete
existing data)! The XCS checks for this.
Let’s take a
closer look at how the Zend_Db_Xml_XmlContentStore_Db2 object is
created in the bootstrap index.php file. The factory() method is called
to create the DB2 Zend_Db_Adapter, which is passed into the constructor
for the Database object. The Database object is then placed in the ZF
registry since we will use the database in various places throughout
the application. The Database class encapsulates the XCS.
$dbuser = "db2admin";
$dbpass = "db2admin";
$dbname = "CONTACTS";
$params = array( 'username' => $dbuser,
                 'password' => $dbpass,
                 'dbname'   => $dbname );
$conn = Zend_Db::factory('Db2', $params);
$db = new Database($conn);
Zend::register('db', $db);
The Database Class
Below are the first few lines of the Database class, including the constructor:
<?php
require_once('Zend/Db/Xml/Db2.php');
class Database
{
    private $_db;
    private $_db1;
    public function __construct($conn)
    {
        $this->_db = $conn;
        $this->_db1 = new Zend_Db_Xml_XmlContentStore_Db2($conn);
    }
The constructor instantiates the XCS with a DB2
Zend_Db_Adapter and assigns it to the $_db1 instance variable. We also
have one more database table that we will need to access so we will
keep the Zend_Db_Adapter available by assigning it to the $_db instance
variable.
The first time an operation occurs on $_db1, it
checks to make sure the underlying table exists. If the name of the
table is not passed in the constructor, it assumes the table is called
“xmldata”. If the table does not exist it creates it automatically
along with some indexes on the XML data to improve performance for
searches.
All database operations on the XCS and on the MEMBER table are encapsulated in the Database class.
XCS: A Simple Example
Let’s
see a simple example of how to use the XCS. When a new user joins,
he/she creates a profile. The AddController::memberAction() takes care
of grabbing all the form data and creating a Zend_Db_Xml_XmlContent
object with the XML data.
After performing some basic validation steps on the data, it creates the XML document using the PHP DOMDocument object:
$doc = new DOMDocument();
$root = $doc->createElement("member");
$doc->appendChild($root);
// grab all the $_POST variables
// and create elements out of each
// using DOM methods
foreach ($_POST as $key => $value) {
    $elem = $doc->createElement($key);
    $root->appendChild($elem);
    $elemtext = $doc->createTextNode($value);
    $elem->appendChild($elemtext);
}
// create the XMLContent object
// takes the DOM as a parameter
$myDoc = new Zend_Db_Xml_XmlContent($doc);
// we also want to create the contact information
// but since this is a new user, the contacts list
// will be empty
$about = new DOMDocument();
$abtRoot = $about->createElement("contacts");
$about->appendChild($abtRoot);
// $about is a property of Zend_Db_Xml_XmlContent
$myDoc->about = $about;
// save it!!
$db->saveNew($myDoc);
$id = $myDoc->id;
// keep track of the user/pwd information in
// the member table
// these are all $_POST variables that we already grabbed
// please see the attached .zip file to view the entire
// source
$db->addMember($id, $email, $passwd, $fname, $lname);
You probably noticed that once the XML document is created,
it is sent to the Database class with a call to the saveNew() method.
It looks like this:
public function saveNew($entry)
{
    return $this->_db1->insert($entry);
}
$entry is a Zend_Db_Xml_XmlContent object and to save,
simply call the insert() method on the $_db1 XCS instance variable. The
point is that the developer does not have to worry about the SQL or
other implementation details of data persistence and can spend most of
his/her time with other aspects of the application such as implementing
business rules or making a really nice, interactive GUI, perhaps using
AJAX.
Figure 3 is a screen shot of a new user profile (notice that this user does not have any contacts set up as of yet):
XCS: A More Interesting Example
Now, let’s look at a slightly more interesting and complex example. What happens when a member wants to add a contact?
This
happens in AddController::contactAction(). Again, after performing some
validation and making sure the user is actually logged in, we add the
contact:
$idToAdd = $_POST['contactId'];
$relationShip = $_POST['relationship'];
// we don't want a user to add himself as a contact
if ($id != $idToAdd) {
    $contacts = $db->addContact($id, $idToAdd, $relationShip);
    // let's give the contact list
    // to a view for display
    $view->contacts = $contacts;
    $view->member = $db->getMember($id);
    $view->profile = $db->getProfile($id);
    echo $view->render('member.php');
}
We let the database class do the work of adding the contact
by calling addContact(). It returns a list of contacts, including the
one we just added, so that the view can display the latest list of
contacts and relationships. Also, notice that all this happens only if
the id you are adding is different than your own. You wouldn’t want to
add yourself as a contact for yourself!
The fun code is in the Database class addContact() method:
public function addContact($id, $idToAdd, $relationship)
{
    $result = $this->_db1->findById($id);
    $xmlContent = $result->current();
    // contact info is in 'about'
    // but it's easier to process using SimpleXML
    $sxml = Zend_Db_Xml_XmlUtil::exportToSimpleXML($xmlContent, Zend_Db_Xml_XmlUtil::ABOUT);
    $exists = false;
    foreach ($sxml->xpath('//entry') as $curr) {
        // person is a contact already?
        if ((int) $curr['id'] == $idToAdd) {
            $exists = true;
            $relExists = false;
            foreach ($curr->relationship as $currRel) {
                if ((string) $currRel == $relationship) {
                    $relExists = true;
                    break;
                }
            }
            // add new relationship to the existing contact
            if (!$relExists) {
                $curr->addChild('relationship', $relationship);
            }
            // the contact was found, so there is no need to look further
            break;
        }
    }
    // add new contact and relationship
    if (!$exists) {
        $entry = $sxml->addChild('entry');
        $entry->addAttribute('id', $idToAdd);
        $entry->addChild('relationship');
        $entry->relationship = $relationship;
    }
    // save changes back to the persistence layer
    $xmlContent->about = Zend_Db_Xml_XmlUtil::importSimpleXML($sxml);
    $this->_db1->update($xmlContent);
    return $this->processContacts($sxml);
}
We get the user information by searching by the user’s
unique id. Recall that the contact information is stored in the ‘about’
property as a DOM. Sometimes working with DOM can be cumbersome, so
let’s use a utility method to extract it as a SimpleXML object.
The
code checks to see if the person is already a contact. If so, and the
new relationship is different than any existing ones, then we add the
new <relationship>. If the person is not yet a contact, we add a
new <entry> element with the new relationship information.
Finally, import the SimpleXML back into a DOM and save the update by
calling the update() method.
The method returns an updated
list of contacts so that the view can display it. The processContacts()
method is a helper function that extracts the contacts, looks up the
names, and returns a nice array for easy manipulation by the view.
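The contact-merging rules described above can be tried on their own, without a database or the XCS. The following sketch applies them to a SimpleXML contact list held in memory. The function name sketchAddRelationship() and its boolean return value are assumptions made for this example and are not part of the sample application.

```php
<?php
// Hypothetical helper: records $relationship for contact $idToAdd.
// Returns true if the contact list changed, false if the relationship was already there.
// Note: values containing '&' would need escaping before being passed to addChild().
function sketchAddRelationship(SimpleXMLElement $contacts, $idToAdd, $relationship)
{
    foreach ($contacts->entry as $entry) {
        if ((int) $entry['id'] !== (int) $idToAdd) {
            continue; // not the contact we are looking for
        }
        foreach ($entry->relationship as $existing) {
            if ((string) $existing === $relationship) {
                return false; // relationship already recorded
            }
        }
        // known contact, new relationship
        $entry->addChild('relationship', $relationship);
        return true;
    }
    // unknown contact: add a new <entry> with its first relationship
    $entry = $contacts->addChild('entry');
    $entry->addAttribute('id', (string) $idToAdd);
    $entry->addChild('relationship', $relationship);
    return true;
}

$contacts = simplexml_load_string(
    '<contacts><entry id="101"><relationship>Co-worker</relationship></entry></contacts>'
);

$first  = sketchAddRelationship($contacts, 101, 'Co-worker'); // already present
$second = sketchAddRelationship($contacts, 101, 'Mentor');    // new relationship
$third  = sketchAddRelationship($contacts, 211, 'Co-worker'); // new contact

echo $contacts->asXML(), "\n";
```

Returning a flag makes it easy for a caller to skip the update() call when nothing changed.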
XCS: A Complex Example
As
users of the ZF, we are familiar with its design philosophy (I am
paraphrasing a bit): Simple enough to easily meet the needs of 80% of
use-cases, while powerful and flexible enough that advanced developers
can implement the remaining more difficult 20%, if desired.
The
XCS was designed with the same philosophy. The XCS API allows
developers to essentially forget about mundane CRUD tasks by hiding
these tasks in API calls. But sometimes the type of query you would
like to issue just cannot be done in a simple API call. When this
situation arises, the flexibility of the XCS allows us to take control
of the database ourselves and issue an SQL query, an SQL/XML query, or
an XQuery directly to the XCS table.
As you may recall, member
names and passwords are stored in the MEMBER table, while the member
profile is stored in the XCS. So what happens when we would like to view a
member’s profile by doing a search on someone’s name? What if we only
know a first name or only a last name or only a portion of a name? The
solution is that we need to do a fuzzy search on a join of two tables
and we don’t care about case. Some of the data is in a relational table
while some of it is stored as XML in the XCS. Oh my!
Never
fear. Let’s get our hands dirty and bypass the XCS API by issuing a
SQL/XML query directly to DB2. Here is the search method found in the
Database class:
public function search($fname, $lname)
{
    // select a bunch of data from 2 tables
    // use SQL/XML to retrieve data from the XML doc
    $sql = "SELECT m.xmlid, m.fname, m.lname, m.email,";
    $sql .= "xmlserialize(xmlquery('\$data/member/org/text()' ";
    $sql .= "passing x.data as \"data\") as varchar(120)) as company,";
    $sql .= "xmlserialize(xmlquery('\$data/member/title/text()' ";
    $sql .= "passing x.data as \"data\") as varchar(120)) as title ";
    $sql .= "from db2admin.member m, db2admin.xmldata x where ";
    // the where clause tells us what data to retrieve
    // but we have to take into account where we have last name
    // only, first name only, or both
    // $param holds the bound values; it stays empty if no name was given
    $param = array();
    if ($fname && $lname) {
        $sql .= "(upper(m.fname) like ? or upper(m.lname) like ?) AND ";
        $param[] = strtoupper("%$fname%");
        $param[] = strtoupper("%$lname%");
    } else if ($fname && !$lname) {
        $sql .= "(upper(m.fname) like ?) AND ";
        $param[] = strtoupper("%$fname%");
    } else if (!$fname && $lname) {
        $sql .= "(upper(m.lname) like ?) AND ";
        $param[] = strtoupper("%$lname%");
    }
    $sql .= "m.xmlid = x.id";
    if ($param && $result = $this->_db->fetchAssoc($sql, $param)) {
        return $result;
    }
    return false;
}
We are selecting some information to display by joining, by
id, the “member” table and the XCS table, which is called “xmldata”. To
get data from within the XML document, we use the SQL/XML function
XMLQuery(). XMLQuery() can be used to execute XQueries, but for our
purposes we use it with a simple XPath expression. XMLSerialize() will
convert the XML to a string in order for us to be able to easily
manipulate the data in PHP. Finally, our where clause checks to see
which parameters were passed in: last name, first name, or both. To
return as many results as possible that might match our search, we
ignore case, use wildcard characters, and use the OR operator.
Since we have crossed the boundary outside the XCS and into SQL, we use the Zend_Db_Adapter, $_db, to fetch the results.
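The name-matching part of the where clause can be isolated and tested without a database connection. Below is a sketch under that assumption: the function sketchNameFilter() is hypothetical and simply returns the SQL fragment together with the values to bind, following the same rules as the search method (case-insensitive, wildcards on both sides, OR when both names are given).

```php
<?php
// Hypothetical helper: builds the name filter and its bound values.
// Returns array(sqlFragment, params), or null when neither name was supplied.
function sketchNameFilter($fname, $lname)
{
    $clauses = array();
    $params = array();
    if (trim((string) $fname) !== '') {
        $clauses[] = 'upper(m.fname) like ?';
        $params[] = '%' . strtoupper(trim($fname)) . '%';
    }
    if (trim((string) $lname) !== '') {
        $clauses[] = 'upper(m.lname) like ?';
        $params[] = '%' . strtoupper(trim($lname)) . '%';
    }
    if (!$clauses) {
        return null; // nothing to search for
    }
    return array('(' . implode(' or ', $clauses) . ')', $params);
}

// First name only, as in a search for "sal"
list($fragment, $params) = sketchNameFilter('sal', '');
echo $fragment, "\n";
print_r($params);
```

Because the user's input only ever reaches the query as a bound value, it cannot change the structure of the SQL statement.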
Figure
5 is a screenshot of a search result. The search term was “sal” as a
first name and no last name was entered. Two “Sals” and one “Salvador”
were returned. By clicking on one of the links, you will be directed to
their profile and you will be given the option to add this person to
your contacts list. This is shown in Figure 6.
Also, please
note that some of the links in the application, such as “Our Privacy
Policy”, “About My.Net.wrk”, and “Customer Service”, were not
implemented, but hopefully you get the idea of what kind of view should
be rendered by each of those links.
IndexController
indexAction: Renders the main page where a user can register, login, or search for someone.
AddController
newAction: Renders the registration page for someone to register
memberAction: Processes the form data from the registration page to add a new member
contactAction: Adds a new contact relationship for a member
ViewController
searchAction: Executes a search by calling the appropriate Database class search method. Renders results.
displayAction: Looks up and displays a member profile
__call():
Processes a user login and starts a session for the user. If the login
is successful, the user’s profile and contacts are displayed.
Conclusion
So
now you’ve seen the XCS, its components and API, and a cool social
networking application built on top of the ZF and XCS. There is already
a DB2 implementation in the class Zend_Db_Xml_XmlContentStore_Db2 that
takes advantage of DB2’s pureXML technology. There may be a need for
other implementations using your favorite database engine or other
persistence mechanisms, such as a file system or cache. Or
alternatively, download DB2 Express C V9 edition or Zend Core for IBM
and give the XCS a spin to see if it suits your needs. As the XCS is
still in the incubator, there are plenty of opportunities to provide
comments and suggestions for improvement.