# Generic malware detection patterns

## Backend code

- gz header in non-gz file
- Random var/func names of exactly 16/32 chars `$pojnekdohucyiyur(($gbqjmsxsbwfjdajm`
- `curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);`
- Unformatted code `;function`
- `/.{500}/` regex
- Formatting errors (spaces vs tabs)
- Disabling of error reporting
  - `@ini_set("error_log",null);@ini_set("log_errors",0)`
  - `;error_reporting(0);`
- PHP silence operator
  - `@eval`
  - `@file_get_contents`
- Check for var names with single-character uppercase variations `$hBcS('onfr64_rapbqr')` (the argument is rot13 for `base64_encode`)
- Long lines/strings: `[^\n]{140}` or `[\S]{140}`
- External mail address: `grep --color=none -rP '\w{3}@\w{3}' . | grep -Ev '@(example|swiftotter|webtexsoftware|opencommercellc|paradoxlabs|vandyke.com|libssh.org|php.net|domain.com|domain.tld|magentocommerce.com|zend.com|author|copyright|magento.com|varien.com)'`
- Find large files (exfil storage?) > 10MB
- Find growing files
- Find image files with incorrect image header (mime)
- Find direct access to input vars: `grep -rP '\$_(REQUEST|POST|GET|COOKIE)' .`
- Find files with mtime < ctime
- Suspicious PHP functions: `grep -rP 'base64_encode|shell_exec|str_rot13|die|goto|create_function' .`
- Credit card references: `grep -rP '\bcc_' .`
- Standard store framework:
  - compare found core files to expected core files
  - compare found core file sizes to expected core file sizes

## Log analysis

- db name is present in access logs
- same IP in short timeframe in access logs
- high volume of requests on access point
- suspicious UA (old UA, UA changes constantly for same IP)
- suspect country codes (RU, UA, CH etc)

## Frontend

- file size/content changes when on checkout
- file size/content changes with a different UA and/or IP location
- detect outbound requests initiated by site
- HTML elements with scripts in tags (`onload='<script>'`)
- HTML elements which are created but have no position/representation (could give many false positives from tracking elements, though)
- detection of leading spaces (used when the attacker wants to hide code from a viewer that has no line wrapping)
- comparison of *some* popular widgets/scripts to what is found (ex: we can compare what a Google script should look like with what is present)
  - Google/Facebook
  - Chat scripts (zopim)
- unexpected filenames or files in unexpected dir

## Front- & Backend

- Literal string concatenation: `'+'`, `"."` (plus any whitespace)
- Base64 source: `[a-z0-9]{70,}` + automatic decoding
- Detect too much whitespace or newlines
- Detect unexpected formatting of code style
  - No newlines in context of code with newlines
  - Minification/obfuscation
  - Inconsistent use of var/func naming or style
- Detect unusual UTF-8 characters

## Manual checkout inspection

- Use devtools regex or `curl -sL <site> | grep -P` to find:
  - Long strings of spaces, trying to hide something out of view
    - `[^ ] {30}`
  - Long strings of newlines, often indicating injection
    - ???
  - Very long lines, often indicating injection
    - `.{10000}`
  - Base64 strings
    - `[a-z0-9]{70,}`

## Specific searches

- Find stacks of curl_setopt (`-z` is needed so that `\n` can match across lines)
  - `grep -rPzo '(curl_setopt\(\$.{2,30}, CURLOPT_.{5,60}\n\s{0,50}){6,12}' .`
- Find everystylish codestyle
  - `grep -rP '\.\.\. .{10,30};;' .`
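
Several of the line-based patterns listed above can be combined into one pass over a source tree. The following is a minimal sketch in Python, not a finished scanner: the labels, the function names `scan_text` and `scan_tree`, the file suffixes, and the exact regex used for "16/32 char" names are all assumptions made for illustration. The other regexes are taken from the lists above. Expect false positives, since legitimate code also has long lines and 16-character variable names.

```python
import re
from pathlib import Path

# Regexes adapted from the lists above. Labels are illustrative only.
PATTERNS = {
    # Assumption: "random" names are all-lowercase and exactly 16 or 32 letters.
    "random 16/32-char variable name": re.compile(r"\$[a-z]{16}(?:[a-z]{16})?\b"),
    "disabled error reporting": re.compile(
        r"error_reporting\(0\)|ini_set\(\s*[\"'](?:error_log|log_errors)[\"']"
    ),
    "silenced call": re.compile(r"@(?:eval|file_get_contents)\b"),
    "direct input access": re.compile(r"\$_(?:REQUEST|POST|GET|COOKIE)"),
    "long line": re.compile(r"[^\n]{140}"),
    "base64-like string": re.compile(r"[a-z0-9]{70,}", re.IGNORECASE),
    "run of spaces": re.compile(r"[^ ] {30}"),
}


def scan_text(text):
    """Return (line_number, label) pairs for every pattern that hits a line."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((number, label))
    return hits


def scan_tree(root, suffixes=(".php", ".phtml", ".js")):
    """Yield (path, line_number, label) for matching files under root."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            # Undecodable bytes are replaced so binary junk does not abort the scan.
            text = path.read_text(errors="replace")
            for number, label in scan_text(text):
                yield path, number, label
```

Iterating over `scan_tree(".")` and printing each tuple gives a grep-like report. Keeping the patterns in one dictionary makes it easy to add or drop checks as they prove noisy on a given codebase.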