# Generic malware detection patterns
## Backend code
- Gzip header (magic bytes `1f 8b`) in a non-gzip file
- Random var/func names of exactly 16/32 chars `$pojnekdohucyiyur(($gbqjmsxsbwfjdajm`
- `curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);`
- Unformatted code `;function`
- Lines of 500 or more characters: `/.{500}/` regex
- Inconsistent formatting (spaces vs. tabs)
- Disabling of error reporting
  - `@ini_set("error_log",null);@ini_set("log_errors",0)`
  - `;error_reporting(0);`
- PHP silence operator
  - `@eval`
  - `@file_get_contents`
- Check for var names with single-character uppercase variations `$hBcS('onfr64_rapbqr')` (the argument is ROT13 for `base64_encode`)
- Long lines/strings: `[^\n]{140}` or `[\S]{140}`
- External mail address: `grep --color=none -rP '\w{3}@\w{3}' . | egrep -v '@(example|swiftotter|webtexsoftware|opencommercellc|paradoxlabs|vandyke.com|libssh.org|php.net|domain.com|domain.tld|magentocommerce.com|zend.com|author|copyright|magento.com|varien.com)'`
- Find large files (> 10 MB), which may be exfiltration storage
- Find growing files
- Find image files with incorrect image header (mime)
- Find direct access to input vars: `grep -rP '\$_(REQUEST|POST|GET|COOKIE)' .`
- Find files with mtime < ctime
- Suspicious PHP functions: `grep -rP 'base64_encode|shell_exec|str_rot13|die|goto|create_function' .`
- Credit card references: `grep -rP '\bcc_' .`
- Standard store framework:
  - Compare found core files to expected core files
  - Compare found core file sizes to expected core file sizes
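
The two header checks in this section (gzip header in a non-gzip file, image file with an incorrect header) could be sketched roughly as follows. This is a minimal illustration, not a complete scanner: the names `check_header` and `scan_tree`, the short list of image types, and the choice to treat `.gz` and `.tgz` as legitimate gzip extensions are all assumptions made for the example.

```python
from pathlib import Path
from typing import Optional

GZIP_MAGIC = b"\x1f\x8b"

# Well-known leading bytes for a few image formats (deliberately incomplete).
IMAGE_MAGIC = {
    ".png": (b"\x89PNG\r\n\x1a\n",),
    ".jpg": (b"\xff\xd8\xff",),
    ".jpeg": (b"\xff\xd8\xff",),
    ".gif": (b"GIF87a", b"GIF89a"),
}


def check_header(filename: str, head: bytes) -> Optional[str]:
    """Return a reason string if the first bytes contradict the extension."""
    suffix = Path(filename).suffix.lower()
    # Assumption: only .gz and .tgz are expected to carry a gzip header.
    if head.startswith(GZIP_MAGIC) and suffix not in {".gz", ".tgz"}:
        return "gzip header in non-gzip file"
    magics = IMAGE_MAGIC.get(suffix)
    if magics is not None and not head.startswith(magics):
        return "image extension without matching image header"
    return None


def scan_tree(root: str):
    """Yield (path, reason) for every file under root that fails a check."""
    for path in Path(root).rglob("*"):
        if path.is_file():
            with path.open("rb") as fh:
                reason = check_header(path.name, fh.read(8))
            if reason:
                yield path, reason
```

Something like `for path, reason in scan_tree("."): print(path, reason)` would list candidates. Other formats that are legitimately gzip-compressed (for example `.svgz`) would show up as hits and need to be allowlisted.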
## Log analysis
- DB name is present in access logs
- same IP in short timeframe in access logs
- high volume of requests on a single endpoint
- suspicious user agent (outdated user agent, or user agent changes constantly for the same IP)
- suspect country codes (RU, UA, CH etc)
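
The "same IP in short timeframe" and "changing user agent" checks could be sketched like this, assuming the access log has already been parsed into `(ip, timestamp, user_agent)` tuples. The function name `flag_ips` and all thresholds (`max_hits`, `window`, `max_agents`) are hypothetical values chosen for illustration; sensible limits depend on the site's normal traffic.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def flag_ips(entries, max_hits=20, window=timedelta(seconds=10), max_agents=3):
    """Flag IPs that burst requests or rotate user agents.

    entries: iterable of (ip, timestamp, user_agent) tuples.
    Returns a dict mapping ip -> list of reasons.
    """
    times = defaultdict(list)
    agents = defaultdict(set)
    for ip, ts, ua in entries:
        times[ip].append(ts)
        agents[ip].add(ua)

    flagged = {}
    for ip, stamps in times.items():
        stamps.sort()
        reasons = []
        # Sliding window: more than max_hits requests within `window`?
        start = 0
        for end, ts in enumerate(stamps):
            while ts - stamps[start] > window:
                start += 1
            if end - start + 1 > max_hits:
                reasons.append("burst")
                break
        # Many distinct user agents from one IP.
        if len(agents[ip]) > max_agents:
            reasons.append("rotating user agent")
        if reasons:
            flagged[ip] = reasons
    return flagged
```

Shared NAT or proxy addresses can legitimately show many user agents, so hits from this kind of check are leads to review rather than verdicts.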
## Frontend
- file size/content changes when on checkout
- file size/content changes with a different UA and/or IP location
- detect outbound requests initiated by site
- HTML elements with scripts in tags (`onload='<script>'`)
- HTML elements which are created but have no position/representation (could produce many false positives from tracking elements, though)
- detection of leading spaces (used when attacker wants to hide code when viewer has no wraparound)
- comparison of *some* popular widgets/scripts to what is found (e.g. we can compare what a Google script should look like with what is present)
  - Google/Facebook
  - Chat scripts (Zopim)
- unexpected filenames or files in unexpected dir
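
As a rough illustration of the "scripts in tags" check, the sketch below lists every inline event-handler attribute (`onload`, `onerror`, and so on) found in a page, using only the Python standard library. The names `InlineHandlerFinder` and `find_inline_handlers` are made up for this example, and treating any attribute that starts with `on` as a handler is a simplifying assumption.

```python
from html.parser import HTMLParser


class InlineHandlerFinder(HTMLParser):
    """Collect (tag, attribute, value) for inline event handlers."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before this call.
        for name, value in attrs:
            if name.startswith("on") and value:
                self.findings.append((tag, name, value))


def find_inline_handlers(html: str):
    finder = InlineHandlerFinder()
    finder.feed(html)
    finder.close()
    return finder.findings
```

Many legitimate sites use inline handlers, so the output is most useful when compared against a known-good copy of the same page.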
## Front- & Backend
- Literal string concatenation: `'+'`, `"."` (plus any whitespace)
- Base64 source: `[a-z0-9]{70,}` (match case-insensitively, since Base64 mixes upper and lower case) plus automatic decoding
- Detect too much whitespace or newlines
- Detect unexpected formatting of code style
- No newlines in context of code with newlines
- Minification/obfuscation
- Inconsistent use of var/func naming or style
- Detect unusual UTF-8 characters
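
The "Base64 source plus automatic decoding" idea might look something like the sketch below. It widens the character class from the list above to the full standard Base64 alphabet (upper case, `+`, `/`, optional `=` padding), which is an assumption of this example; the function name `decode_base64_runs` is likewise hypothetical.

```python
import base64
import binascii
import re

# Runs of at least 70 Base64 characters, optionally followed by padding.
B64_RUN = re.compile(r"[A-Za-z0-9+/]{70,}={0,2}")


def decode_base64_runs(text: str):
    """Return (offset, decoded_bytes) for each long run that decodes cleanly."""
    results = []
    for match in B64_RUN.finditer(text):
        candidate = match.group(0)
        # Add any missing padding so the length is a multiple of four.
        candidate += "=" * (-len(candidate) % 4)
        try:
            decoded = base64.b64decode(candidate, validate=True)
        except (binascii.Error, ValueError):
            continue  # not valid Base64 after all
        results.append((match.start(), decoded))
    return results
```

Long hex hashes or identifiers also decode "successfully" into meaningless bytes, so a follow-up step (for example, checking whether the decoded bytes are printable or contain `eval`) would be needed to cut down noise.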
## Manual checkout inspection
- Use devtools regex or `curl -sL <site> | grep -P` to find:
  - Long strings of spaces, trying to hide something out of view
    - `[^ ] {30}`
  - Long strings of newlines, often indicating injection
    - ???
  - Very long lines, often indicating injection
    - `.{10000}`
  - Base64 strings
    - `[a-z0-9]{70,}`
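
For repeated inspections, the patterns above could be applied to a saved copy of the checkout page with a small helper such as the one below. This is only a sketch: `inspect_page` and the label strings are invented for the example, the Base64 pattern is applied case-insensitively as an assumption, and the newline check is omitted because no pattern is given for it above.

```python
import re

# Patterns taken from the checklist; labels are illustrative.
PATTERNS = {
    "long run of spaces": re.compile(r"[^ ] {30}"),
    "very long line": re.compile(r".{10000}"),
    "possible Base64": re.compile(r"[a-z0-9]{70,}", re.IGNORECASE),
}


def inspect_page(html: str):
    """Return (line_number, label) for every line matching a pattern."""
    hits = []
    for number, line in enumerate(html.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((number, label))
    return hits
```

The HTML passed in could come from `curl -sL <site>` redirected to a file, which keeps the fetching step identical to the manual procedure.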
## Specific searches
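- Find stacks of `curl_setopt`
  - `grep -rlPz '(curl_setopt\(\$.{2,30}, CURLOPT_.{5,60}\n\s{0,50}){6,12}' .` (`-z` treats each file as one record so `\n` can match across lines; `-l` lists the matching files)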
- Find everystylish codestyle
  - `grep -rP '\.\.\. .{10,30};;' .`
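
Because the `curl_setopt` pattern has to match across line breaks, it can be convenient to run it over whole file contents instead of line by line. The sketch below does that with the same expression; the function name `find_curl_setopt_stacks` is hypothetical.

```python
import re

# Same expression as the grep search: 6 to 12 consecutive curl_setopt lines.
CURL_STACK = re.compile(
    r"(?:curl_setopt\(\$.{2,30}, CURLOPT_.{5,60}\n\s{0,50}){6,12}"
)


def find_curl_setopt_stacks(source: str):
    """Return (offset, number_of_calls) for each stack of curl_setopt calls."""
    return [
        (m.start(), m.group(0).count("curl_setopt("))
        for m in CURL_STACK.finditer(source)
    ]
```

Reading each PHP file as text and passing it to this function gives the offset of every stack, which makes it easier to jump to the surrounding code than a bare list of filenames.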