Regular Expressions for Source Code Analysis
Some codebases are stunningly large. Because there's no way to read over every line of code, I use complex search queries to look for interesting, potentially vulnerable code.
I recently started to keep track of some of the better queries I've written while doing penetration tests at work so that I could publish them here.
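For anyone who wants to run queries like these outside an IDE, here is a minimal sketch of a line-by-line scanner in Python. The function name `scan` and its parameters are my own illustration, not part of any tool, and it assumes the files can be read as UTF-8 text (undecodable bytes are replaced rather than raising).

```python
import re
from fnmatch import fnmatch
from pathlib import Path


def scan(root, pattern, file_globs=("*",), flags=0):
    """Yield (path, line_number, line) for every line under root that matches pattern."""
    regex = re.compile(pattern, flags)
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        # Only look at files whose name matches one of the globs, e.g. "*.cs".
        if not any(fnmatch(path.name, glob) for glob in file_globs):
            continue
        # errors="replace" keeps the scan going on binary or oddly encoded files.
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                yield path, number, line.strip()


if __name__ == "__main__":
    # Hypothetical usage: look for XmlDocument construction in C# files under ./src.
    for path, number, line in scan("src", r"XmlDocument\(", ("*.cs",)):
        print(f"{path}:{number}: {line}")
```

Matching one line at a time keeps the output easy to triage, at the cost of missing statements that span several lines.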
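Just to clarify a couple of potentially confusing columns: noise is a rough estimate of how many false positives a query returns, and type says whether the query is a regular expression or a simple wildcard search.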
| vuln | lang | noise | type | files | query |
|---|---|---|---|---|---|
| XSS | | low | regex | (backend) | `` <(a\|span\|p\|img\|button\|script\|style\|input\|form\|i\|b\|u\|iframe\|body\|base\|select\|label\|div\|td\|tr\|table) (.*?href\|.*?src\|onload\|id\|title)=[`'"].*?[{\+`"'] `` |
| XSS | | med | regex | (backend) | `<a href=['"].*?[{\+"']` |
| SQLi | | med | regex | (backend) | `(((SELECT\|DELETE)[^\S].*?[^\S]FROM[^\S]\|INSERT[^\S].*?[^\S]INTO))(.*?[^\S]WHERE.*?(=\|IS))?` |
| Secrets | | med | regex | | `(secret\|pass(word)?\|API_?KEY)\s*?=[^=]?\s*?("(.*?)"\|'(.*?)')?` |
| Cmd Inj | | med | regex | | `(Process\.Start\|[^\S]system\|execve\|(\.getRuntime\(\)\|runtime\|[^\S]rt).exec)\(("([^"]*?)"\|.*?)\)` |
| XSS | ASP.NET | high | regex | `*.ts` | `<[\S ]{1,5}>?.*(\$\{.*?\})` |
| XSS | ASP.NET | high | regex | `*.cs` | `<(a\|span\|p\|img\|button\|script\|style\|input\|form\|i\|b\|u)( \|>).*(\{.*?\})` |
| XSS | ASP.NET | med | regex | `*.cs` | `(\.Append\()?.*?(<(a\|span\|p\|img\|button\|script\|style\|input\|form\|i\|b\|u)( \|>)).*?([^\$]?".?\+.?[^\$]?")` |
| SQLi | ASP.NET | high | regex | `*.cs` | `(ExecuteSQLCommand)` |
| XXE | ASP.NET | high | regex | `*.cs` | `XmlDocument\(` |
| rXSS | Java | med | simple | `*.java, *.jsp` | `<%=*request.getParameter*%>` |
| rXSS | Java | low | regex | | `<%=((?!Constant\|getContext\|Globals\|Sanitize\|StringEscapeUtils\|new (Long\|Integer\|Boolean)\|application\.getAttribute).)*?%>` |
| SQLi | Java | med | regex | `*.java, *.jsp` | `^(?!\s*(\/\/\|\<logic\|\*\|\{\|logger\|whereTo\|The\|\(like\|\/\*\|if\|boolean\|Voter\|see\:\|return\|\<bean\|bp\.set\|Action\|public\ \|\<input\|\<label\|\<\%\|2\.\|private)).+where.*` |
| CRLFi | Java | high | simple | `*.java, *.jsp` | `*.setHeader(*` |
| Double Formatting | Python | low | regex | `*.py` | `[^%\S]f'.*%[a-zA-Z]` |

Pipes inside the queries are written as `\|` so that the table renders; use a plain `|` when running them.
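To get a feel for what a query actually captures, it helps to run it against a few sample lines before pointing it at a whole codebase. The sketch below does that for the Secrets and generic SQLi queries using Python's `re` module. The helper `matched_text` and the sample lines are made up for illustration, and Python's regex dialect may differ slightly from the one in your IDE.

```python
import re

# Queries copied from the table above.
SECRETS = r"""(secret|pass(word)?|API_?KEY)\s*?=[^=]?\s*?("(.*?)"|'(.*?)')?"""
SQLI = r"(((SELECT|DELETE)[^\S].*?[^\S]FROM[^\S]|INSERT[^\S].*?[^\S]INTO))(.*?[^\S]WHERE.*?(=|IS))?"


def matched_text(pattern, lines, flags=0):
    """Return the matched substring for each line the pattern hits."""
    regex = re.compile(pattern, flags)
    hits = []
    for line in lines:
        match = regex.search(line)
        if match:
            hits.append(match.group(0))
    return hits


# Made-up sample lines, not taken from any real codebase.
sample = [
    'password = "hunter2"',
    "retries = 3",
    'query = "SELECT * FROM users WHERE id = " + user_id',
]

print(matched_text(SECRETS, sample))  # ['password = "hunter2"']
print(matched_text(SQLI, sample))     # ['SELECT * FROM users WHERE id =']
```

One thing this kind of check surfaces quickly: because `[^=]?` is optional, the Secrets query also hits comparisons such as `if password == expected:`, which is part of why I rate it as medium noise.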
This table is pretty small right now. As I do more penetration tests at work over this summer, I will continue to update this table.
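The `simple` queries in the table use `*` as a wildcard rather than regex syntax. If you want to run them with a regex engine, one option is to translate them with Python's `fnmatch.translate`. This is only a sketch: the helper name `simple_to_regex` is hypothetical, and it assumes `*` means "any run of characters" and that the query contains no `?` or `[`, which `fnmatch` would also treat as special.

```python
import re
from fnmatch import translate


def simple_to_regex(query):
    """Compile a `*` wildcard query into a regex that matches a line containing it."""
    # fnmatch.translate anchors the pattern to the whole string, so pad the
    # query with wildcards on both sides to get "contains" semantics.
    return re.compile(translate("*" + query + "*"))


set_header = simple_to_regex("*.setHeader(*")
get_param = simple_to_regex("<%=*request.getParameter*%>")

# Made-up sample lines for illustration.
print(bool(set_header.match('response.setHeader("Location", url);')))
print(bool(get_param.match('<td><%= request.getParameter("name") %></td>')))
```

Use `.match` on a single line at a time here, since the translated pattern is written to cover the entire string.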