Mitigating XPath Injection Attacks in PHP

PHP has two libxml-based extensions that can execute XPath 1.0 expressions: DOM (through the DOMXPath class) and SimpleXML (through its xpath() method).

Both extensions are prone to XPath injection, a common form of attack. Although general information about the topic is available, concrete PHP code for dealing with it seems harder to find.

XPath String Quote

In the Stack Overflow question "How to handle double quotes in string before XPath evaluation?" I posted the following code as an answer. It encodes a string value so that it can be used in an XPath query:

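/**
 * xpath string handling xpath 1.0 "quoting"
 *
 * @param string $input
 * @return string
 */
function xpath_string($input) {

    // no single quote: wrap in single quotes
    if (false === strpos($input, "'")) {
        return "'$input'";
    }

    // no double quote: wrap in double quotes
    if (false === strpos($input, '"')) {
        return "\"$input\"";
    }

    // both kinds: split at each single quote and join the parts with concat()
    return "concat('" . strtr($input, array("'" => '\', "\'", \'')) . "')";
}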

Use it to encode a string value that may contain both single and double quotes. Alternatively, take the more conservative route and disallow single and double quotes in the first place.

Note that a string used as an XPath expression in PHP always has to be UTF-8 encoded.
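The conservative route mentioned above can be sketched as a small guard that rejects input before it reaches the query. The function name xpath_string_strict() is hypothetical and not part of the original answer, and checking the encoding via preg_match() with the u modifier is just one possible way to validate UTF-8:

```php
<?php
/**
 * Hypothetical helper: quote a string for XPath 1.0, refusing any input
 * that contains quote characters or is not valid UTF-8.
 *
 * @param string $input
 * @return string
 * @throws InvalidArgumentException
 */
function xpath_string_strict($input) {
    // preg_match() with the "u" modifier returns 1 only for valid UTF-8
    if (1 !== preg_match('//u', $input)) {
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }

    // disallow both quote characters instead of encoding them
    if (false !== strpbrk($input, "'\"")) {
        throw new InvalidArgumentException('Quotes are not allowed.');
    }

    return "'$input'";
}
```

Whether rejecting quotes is acceptable depends on the application; a search box for free text probably needs the encoding variant instead.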

A small usage example of the xpath_string() function:

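// the root element name is arbitrary here, the query addresses it as /*
$xml = simplexml_load_string(<<<'XML'
<root>
    <element>Nobody complains.</element>
    <element>She said: "It's the time of my life."</element>
    <element>Jan moans: "It's too late when it's too late."</element>
    <element>Frida comments: "..."</element>
</root>
XML
);
$term   = '"It\'s';

$xpath  = sprintf('/*/element[contains(., %s)]', xpath_string($term));
$result = $xml->xpath($xpath);

echo "XPath Query: $xpath\n\n";
foreach ($result as $element) echo $element, "\n";

XPath Query: /*/element[contains(., concat('"It', "'", 's'))]

She said: "It's the time of my life."
Jan moans: "It's too late when it's too late."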

Writing such an XPath query by hand is error-prone. Using xpath_string() together with sprintf(), where the XPath query contains a placeholder at which the quoted string gets inserted, is rather straightforward.
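To illustrate what the quoting protects against, and that the same approach carries over to DOMXPath, here is a minimal sketch. The document, the secret attribute and the attacker-supplied term are made up for this illustration; xpath_string() is repeated from above so that the snippet runs on its own:

```php
<?php
// xpath_string() as defined above, repeated so the snippet is self-contained
function xpath_string($input) {
    if (false === strpos($input, "'")) {
        return "'$input'";
    }
    if (false === strpos($input, '"')) {
        return "\"$input\"";
    }
    return "concat('" . strtr($input, array("'" => '\', "\'", \'')) . "')";
}

// made-up document for illustration
$doc = new DOMDocument();
$doc->loadXML('<root>
    <element secret="no">public</element>
    <element secret="yes">private</element>
</root>');
$xpath = new DOMXPath($doc);

// attacker-controlled term that tries to break out of the string literal
$term = "x') or ('1'='1";

// naive concatenation: the term rewrites the predicate
$naive = "/*/element[@secret='no' and contains(., '$term')]";

// quoted: the term stays a plain string value
$safe = sprintf("/*/element[@secret='no' and contains(., %s)]", xpath_string($term));

echo "naive: ", $xpath->query($naive)->length, " match(es)\n";
echo "safe:  ", $xpath->query($safe)->length, " match(es)\n";
```

With the naive query the injected or ('1'='1') makes the predicate true for every element, so both elements match, including the one marked secret. With the quoted query the term is only ever compared as text and nothing matches.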

Constant XPath Expression with Parameters

There is a second way to do this without such a function: re-use an (attribute) node value that is part of the document and can be set prior to running the query. The following example demonstrates this. On the root element, the attribute named parameter is set to the search term; the XPath expression then uses that attribute node for the comparison. I posted this as an answer on Stack Overflow as well (Searching XML items PHP XPath), although the idea is much older, see Encoding XPath Expressions with both single and double quotes.

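// $xml is the SimpleXMLElement from the previous example
$term = '"It\'s';

$xml['parameter'] = $term;

$xpath  = '/*/element[contains(., /*/@parameter)]';
$result = $xml->xpath($xpath);

echo "XPath Query: $xpath\n\n";
foreach ($result as $element) echo $element, "\n";

XPath Query: /*/element[contains(., /*/@parameter)]

She said: "It's the time of my life."
Jan moans: "It's too late when it's too late."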

As is clearly visible, the XPath expression is now constant; nothing is injected into the string any longer. The downside is that the document gets changed.

Pre-compiled, cached or otherwise secured XPath queries are not possible in PHP yet; there is simply no such facility.
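The downside can be softened by removing the attribute again once the query has run. The following is a minimal sketch with DOMXPath under that assumption; the function name xpath_contains() and the try/finally cleanup are illustrative additions, not part of the original answer. It also assumes that the root element has no parameter attribute of its own:

```php
<?php
/**
 * Hypothetical helper: find <element> children of the root whose text
 * contains $term, passing the term via a temporary attribute.
 *
 * @param DOMDocument $doc
 * @param string $term UTF-8 encoded search term
 * @return string[] text content of the matching elements
 */
function xpath_contains(DOMDocument $doc, $term) {
    $root  = $doc->documentElement;
    $xpath = new DOMXPath($doc);

    // the term travels as data inside the document, not inside the query
    $root->setAttribute('parameter', $term);
    try {
        $matches = array();
        foreach ($xpath->query('/*/element[contains(., /*/@parameter)]') as $node) {
            $matches[] = $node->textContent;
        }
        return $matches;
    } finally {
        // remove the temporary attribute again, even if the query failed
        $root->removeAttribute('parameter');
    }
}

$doc = new DOMDocument();
$doc->loadXML('<root>
    <element>Nobody complains.</element>
    <element>She said: "It\'s the time of my life."</element>
</root>');

print_r(xpath_contains($doc, '"It\'s'));
```

The document is still modified for the duration of the query, so this does not help when the same document is shared with code that must never see the extra attribute.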

A parameterized approach similar to xpath_string()/sprintf() above can be found in fDOMDocument, an XML library written by Arne Blankerts. It wraps the DOMXPath object and comes with everything needed for quoting (similar to xpath_string() above) as well as for preparation and binding. The classes of interest are fDOMXPath and XPathQuery.

All this should help to protect your application’s flow against XPath injection of malicious strings. I know that other code exists as well, but this, plus the linked fDOMXPath, is what I can recommend with a good feeling. Other routines often look very complicated, and that attracts errors. They are also harder to compare, so I picked the ones that are easier to compare and therefore easier to review. I’m open to feedback, so drop a line in the comments.
