One of the best things about programming in PHP
is how easy it is to learn the language and its libraries. Everything
seems to fit together and you can almost guess how to do a specific
task, even not knowing much about the function that you need to use.
Improving Performance by Profiling PHP Applications
One of the best things about programming in PHP is how easy it is to
learn the language and its libraries. Everything seems to fit together
and you can almost guess how to do a specific task, even not knowing
much about the function that you need to use. A lot of times you will
just try doing something and be surprised that the language has that
specific feature.
All of this is very nice, but it takes a little while to get to know
PHP's tricks, especially those related to performance and memory
consumption. There are lots of subtle things in PHP that can save a lot
of memory and give a small extra boost in performance. This article
serves as an introduction to profiling PHP applications, and to making
changes to a script and comparing the values of the optimizations.
To quote the dictionary, a profile is a formal summary or analysis
of data, often in the form of a graph or table, representing
distinctive features or characteristics. By placing timers within the
code and running them over and over, we can obtain a profile of how
fast or slow specific areas of a program are executed. This data can be
used to discover bottlenecks and other areas that can be optimized to
improve the program's performance.
Assuming you have read my series of articles about PEAR, the concept
of profiling PHP scripts is covered in a simple manner by a PEAR
library. We will use this library to create examples of profiling
scenarios, as this is the easiest and most compatible way to analyze
your current code. You don't even have to use a commercial product or
compile an extension to profile your code.
The library name is PEAR::Benchmark and it is very useful to profile
and do quick benchmarks on your code set. One of the classes available
in this library is named Benchmark_Timer() and tracks the
time difference from one function call to the next. When you are done
with the benchmark, you can get verbose result of the script execution.
It is as simple as this:
<?php
include_once("Benchmark/Timer.php");
$bench = new Benchmark_Timer;
$bench->start();
$bench->setMarker('Start of the script');
// now sleep for a few seconds
sleep(5);
$bench->stop();
// get the profiling information from this timer
print_r($bench->getProfiling());
?>
This script will output something like this:
Array
(
[0] => Array
(
[name] => Start
[time] => 1013214253.05751200
[diff] => -
[total] => 0
)
[1] => Array
(
[name] => Start of the script
[time] => 1013214253.05761100
[diff] => 9.8943710327148E-05
[total] => 9.8943710327148E-05
)
[2] => Array
(
[name] => Stop
[time] => 1013214258.04920700
[diff] => 4.9915959835052
[total] => 4.9916949272156
)
)
That might seem like a bunch of weird numbers, but it is very useful if used in a bigger scale.
Dissecting the Numbers
As most people will guess, the topmost array entries are the actual calls to the Benchmark_Timer() class, like $bench->start(), $bench->setMarker(), and $bench->stop(). The number associated with each of these array entries is pretty straightforward. Let's look closer at them:
[0] => Array
(
[name] => Start
[time] => 1013214253.05751200
[diff] => -
[total] => 0
)
The entry time refers to the UNIX timestamp of when the call to the start() method of Benchmark_Timer() was done. The diff
item shows the difference in time between this call and the previous
one -- since there is no previous one, it shows a dash. The total
item is the total time of execution since the beginning of the
benchmark until this specific method call. Seeing the output of the
next array entry is useful as well.
[1] => Array
(
[name] => Start of the script
[time] => 1013214253.05761100
[diff] => 9.8943710327148E-05
[total] => 9.8943710327148E-05
)
We can see from these numbers that the call to $bench->setMarker(....) was made 9.8943710327148E-05 seconds after the call to $bench->start(). No need to go after a calculator, that means 0.0000989 seconds. :)
A Benchmark Anecdote
The example was good, but it is really not a good example of how to
make decisions related to your site, and especially to your site code.
I will use a personal experience that I had while working as a
contractor for a company that had a huge Web community on a niche Web
site and was having performance problems.
I really didn't understand much of the code because it was written
along the years with specific needs -- one block of include files
relating to the site translation code, another one to log the site's
usage, and so on. We both knew (me and the main programmer of that
site) that something needed to be optimized, but didn't know what the
problem was.
To shorten the story, I ended up going into the main script and adding several lines of $bench->setMarker() all over the script and its included files. We then analyzed the output of $bench->getProfiling()
and were surprised by the outcome. It seemed that the problem was
actually related to a function call that was being used hundreds of
times on every page to get the translation code for a specific language
name (i.e. "en" for "english"). Every time the function was called, the
script would do a query to a MySQL database to get the real name from a
table.
We ended up creating a cache system for this type of information. In
two days we had a huge performance boost, and consequently the number
of page views increased by 40% in the first week. All in all, this
serves as an interesting example of how profiling your code can help on
the general health of your Web application or Web site.
Benchmarking Function Calls
While the Benchmark_Timer() class is very useful in
profiling the code on a script or page (and its included files), it is
not very scientific, since you have to reload the script several times
to get the mean of the profiled values, and it is not very specific to
a class or function call.
The other class in the PEAR::Benchmark library, called Benchmark_Iterator,
is intended to tackle this very issue -- running a series of executions
for a very specific function or class method and displaying its
profiling information. Its purpose is to get a consistent response from
the benchmarks, since everybody knows that running a script one time
and receiving an execution time of 10 seconds doesn't mean that it
always performs like that.
In any case, let's see some examples:
<?php
// code to connect to the database here
include_once("DB.php");
$dsn = array(
'phptype' => 'mysql',
'hostspec' => 'localhost',
'database' => 'database_name',
'username' => 'user_name',
'password' => 'password'
);
$dbh = DB::connect($dsn);
function getCreatedDate($id)
{
global $dbh;
$stmt = "SELECT created_date FROM users WHERE id=$id";
// as always, let's use PEAR::DB here
$created_date = $dbh->getOne($stmt);
if ((PEAR::isError($created_date)) ||
(empty($created_date))) {
return false;
} else {
return $created_date;
}
}
include_once 'Benchmark/Iterate.php';
$bench = new Benchmark_Iterate;
// run the getDate function 10 times
$bench->run(10, 'getCreatedDate', 1);
// print the profiling information
print_r($bench->get());
?>
The code above will generate output similar to this:
Array
(
[1] => 0.055413007736206
[2] => 0.0012860298156738
[3] => 0.0010279417037964
[4] => 0.00093603134155273
[5] => 0.00094103813171387
[6] => 0.00092899799346924
[7] => 0.0010659694671631
[8] => 0.00096404552459717
[9] => 0.0010690689086914
[10] => 0.00093603134155273
[mean] => 0.0064568161964417
[iterations] => 10
)
This should be pretty self explanatory -- the mean item shows the average value for all 10 iterations of the getCreatedDate() function. The benchmark should probably run for more than a thousand iterations, but this example is good enough :).
I hope that with this article I've given some insight into practical
ideas of how to quickly profile PHP scripts. Please be aware that
analyzing your code is not something simple, as you need to know a lot
about the language's specific features.
|