This is not the full PHP syntax as regex (which would not make much sense; use the tokenizer instead), but a collection of regular expressions for various elements of the PHP syntax that I came across lately.
PHP Names
While compiling this list, I noticed that the pattern is the same for constant, variable, function and class names:
Subject | Regex | Comment |
---|---|---|
Constant | [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]* | Source |
Variable | [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]* | Source |
Function | [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]* | Source |
Class | [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]* | Source |
Namespace | ??? * | Source |
* Unknown so far. It must somehow contain "\". But isn't that the same for class and function names? The PHP Manual probably needs an update because of PHP 5.3.
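The name pattern from the table can be tried out with a short sketch like the following. It assumes the whole string should be a single name, so the pattern is anchored with `^` and `$` and uses the `D` modifier to keep `$` from matching before a trailing newline. The constant `PHP_NAME_PATTERN` and the helper `isValidPhpName()` are names made up for this illustration; they are not part of PHP.

```php
<?php
// Shared pattern for constant, variable, function and class names,
// copied from the table above and anchored to the whole string.
// PHP_NAME_PATTERN is a hypothetical name chosen for this sketch.
const PHP_NAME_PATTERN = '/^[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*$/D';

// Hypothetical helper: true if $name consists of exactly one valid name.
function isValidPhpName(string $name): bool
{
    return preg_match(PHP_NAME_PATTERN, $name) === 1;
}

var_dump(isValidPhpName('fooBar'));   // bool(true)
var_dump(isValidPhpName('_private')); // bool(true)
var_dump(isValidPhpName('9lives'));   // bool(false), starts with a digit
var_dump(isValidPhpName('foo-bar'));  // bool(false), "-" is not allowed
var_dump(isValidPhpName("foo\n"));    // bool(false), thanks to the D modifier
```

Without the `D` modifier the last call would report a match, because `$` on its own also matches right before a final newline.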
PHP Values
Values are mostly quite easy, aren’t they? So for the moment I only pick those I find noteworthy.
Type | Regex | Comment |
---|---|---|
Integer | (([+-]?([1-9][0-9]*|0))|([+-]?(0[xX][0-9a-fA-F]+))|([+-]?(0[0-7]+))) | As specified in the PHP Manual for Integers |
Float | ([+-]?INF|[+-]?(([0-9]+|([0-9]*[\.][0-9]+)|([0-9]+[\.][0-9]*))|([0-9]+|(([0-9]*[\.][0-9]+)|([0-9]+[\.][0-9]*)))[eE][+-]?[0-9]+)) | Float was named Double in the past. I needed to extend the expression from the Float Manual as it didn’t care about the INF part and some other details. |
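As a rough sketch of how these two expressions behave, the snippet below wraps each of them in `^...$` and checks a few literals. The helper `matchesWhole()` and the variable names are assumptions made for this example only. Both expressions are copied unchanged from the table.

```php
<?php
// Expressions copied verbatim from the table above.
$integer = '(([+-]?([1-9][0-9]*|0))|([+-]?(0[xX][0-9a-fA-F]+))|([+-]?(0[0-7]+)))';
$float   = '([+-]?INF|[+-]?(([0-9]+|([0-9]*[\.][0-9]+)|([0-9]+[\.][0-9]*))|([0-9]+|(([0-9]*[\.][0-9]+)|([0-9]+[\.][0-9]*)))[eE][+-]?[0-9]+))';

// Hypothetical helper: true if $pattern matches all of $subject.
// "/" is safe as delimiter because neither expression contains it.
function matchesWhole(string $pattern, string $subject): bool
{
    return preg_match('/^' . $pattern . '$/D', $subject) === 1;
}

var_dump(matchesWhole($integer, '42'));   // bool(true), decimal
var_dump(matchesWhole($integer, '0x1A')); // bool(true), hexadecimal
var_dump(matchesWhole($integer, '0755')); // bool(true), octal
var_dump(matchesWhole($integer, '089'));  // bool(false), 8 and 9 are not octal digits

var_dump(matchesWhole($float, '1.5'));    // bool(true)
var_dump(matchesWhole($float, '.5'));     // bool(true)
var_dump(matchesWhole($float, '1e5'));    // bool(true), exponent form
var_dump(matchesWhole($float, '-INF'));   // bool(true)
var_dump(matchesWhole($float, '1.5.2'));  // bool(false)
```

Note that a plain `42` also matches the float expression, so if you need to tell the two types apart, test for integer first.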
Serialized Values
If you’re dealing with the string representation of a PHP value in its serialized form, you need to differentiate: serialized values do not strictly match the regexes above. They follow similar rules, but rules of their own. If you want to find out more, unserialize() returns false and issues a notice if something did not work.
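One way to probe the serialized format is to let unserialize() do the checking, roughly as sketched here. The helper `isSerialized()` is a made-up name for this illustration. The sketch assumes PHP 7.0 or later, because it passes the `allowed_classes` option so that no objects get instantiated from the input.

```php
<?php
// Hypothetical helper: true if $data is a well-formed serialized value.
function isSerialized(string $data): bool
{
    // serialize(false) is the one valid input for which
    // unserialize() legitimately returns false.
    if ($data === 'b:0;') {
        return true;
    }
    // "@" silences the diagnostic that unserialize() raises on bad input.
    return @unserialize($data, ['allowed_classes' => false]) !== false;
}

var_dump(isSerialized('i:42;'));      // bool(true)
var_dump(isSerialized('s:3:"foo";')); // bool(true)
var_dump(isSerialized('b:0;'));       // bool(true)
var_dump(isSerialized('foo'));        // bool(false)
var_dump(isSerialized('i:0x1A;'));    // bool(false), no hex in serialized integers
```

The last call shows the difference to the integer expression above: `0x1A` is a valid integer literal in PHP source code, but not inside a serialized integer.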
I will probably add some more as I need them and come across them. The same probably goes for serialized values. You can leave suggestions in the comments as well.
Image Credits: Citations from Along the River During Qingming Festival, 18th century remake of a 12th century original by Chinese artist Zhang Zeduan; via Trialsanderrors