# AppSec Manifesto

## Rule #0

### Absolute Zero

> No code = no issues. No sinks = no vulnerabilities. No user-controlled input = no vector of attack.

#### TODO

- always delete `obsolete`, `dead`, `unreachable`, and `unreferenced` code.
- do not ask the user to provide more input than needed. In many situations the system can generate the needed data itself, e.g. the name of an uploaded file.
- use the least expressive language to talk to the application.

> **Obsolete code**: code that may have been useful in the past but is no longer used, e.g. code for a deprecated protocol. It may be called dead code as well.

> **Dead code**: code that is executed but redundant: either its results are never used, or it adds nothing to the rest of the program. It wastes CPU time.

```php
function deadCodeExample(int $foo, int $bar): void
{
    // ...
    // Dead code: the sum is calculated but never saved or used anywhere.
    $foo + $bar;
}
```

> **Unreachable code**: code that will never be reached regardless of logic flow.

```php
function unreachableExample(): string
{
    return 'foobar';
    // The following line is unreachable.
    $a = $b + 1;
}
```

> **Unreferenced**: a variable (method, function, etc.) that is defined but never used.

`Unreachable`, `unreferenced` and `dead` code can be found with static analysis ([PHPStan](https://github.com/phpstan/phpstan), [Phan](https://github.com/phan/phan), [Psalm](https://github.com/vimeo/psalm)).

`Obsolete` (dead) code can be found with dynamic analysis and the [`tombstone`](https://github.com/krakjoe/tombs) concept.

#### *Complication*

[The Halting problem](https://en.wikipedia.org/wiki/Halting_problem) is reducible to the problem of finding dead code. That is, if you had an algorithm that could detect dead code in any program, you could use it to test whether any program halts. Since that has been proven impossible, a general algorithm that detects all dead code in any program is impossible as well.

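
The reduction described above can be made concrete with a small sketch. The function `collatzSteps` is a hypothetical example, not part of any tool mentioned here. Whether the `return` statement after the loop is reachable for every positive `$n` is exactly the question of whether the loop always halts, which for this particular loop is the open Collatz conjecture.

```php
<?php

// Hypothetical example: counts Collatz steps until $n reaches 1.
function collatzSteps(int $n): int
{
    if ($n < 1) {
        throw new InvalidArgumentException('n must be a positive integer');
    }

    $steps = 0;
    while ($n !== 1) {
        $n = ($n % 2 === 0) ? intdiv($n, 2) : 3 * $n + 1;
        $steps++;
    }

    // This line is reached only if the loop above halts.
    // An analyzer that decided reachability for every program and input
    // would have to settle questions like this one.
    return $steps;
}

// 6 -> 3 -> 10 -> 5 -> 16 -> 8 -> 4 -> 2 -> 1
echo collatzSteps(6), PHP_EOL; // prints 8
```

This is presumably why practical analyzers approximate: they can flag the obvious cases, such as a statement placed after an unconditional `return`, without answering the general question.
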
#### *WHY?*

- Unused code adds complexity
- Unused code is misleading
- Dead code can come alive

> **Rise of dead code**: During the summer of 2012, Knight Capital Group caused a major stock market disruption and suffered a loss of over $400 million when a botched software deployment caused dead order-handling code to be executed. The code had not been tested in many years, and it produced a deluge of orders hitting the market that could not be canceled.

## Rule #1

### The Lord of the Sinks

Do context-specific escaping on the context boundary. Caution is the parent of safety:

#### TODO

- escape data as close to the sink as possible.
- escape any data, no matter whether it is user-provided or system-generated.
- extract the raw value of a `ValueObject` right before escaping.

## Rule #2

### [Least (Computational) Power Principle](https://en.wikipedia.org/wiki/Fail-fast)

Access to computational power is a privilege

- [`Parse`, don't `validate`](https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/) input data as close to the `source` as possible
- Use always-valid `ValueObjects` (force invariants)
- Do not transfer malicious data from `source` to `sink`

> **Parsing**: [A parser](https://en.wikipedia.org/wiki/Parsing) is a software component that takes input data (frequently text) and builds a data structure.

> **Validation**: [Validation](https://en.wikipedia.org/wiki/Data_validation) is a process that uses routines, often called "validation rules", "validation constraints", or "check routines", that check the correctness, meaningfulness, and security of data that is input to the system.

> **Source**: The program point that reads from an external resource (user input or any other data that can be manipulated by a potential attacker).

> **Sink**: The program point that writes to an external resource.

## Rule #3

### Forget-me-not

Don't forget information regarding the validity of a certain input.

#### TODO

- Do not use strings as a ubiquitous data type for unstructured data.
  Declare custom types instead, using `ValueObjects` to distinguish different kinds of data.
- Pass instances of custom types (`ValueObjects`) from `source` to `sink`.
- Use the raw value of a `ValueObject` only inside a single context; always pass `ValueObjects` across boundaries.

#### *WHY?*

This removes the need to repeat validation here and there, or to blindly assume that validation was done earlier. In other words, it prevents the [shotgun parsing](http://langsec.org/papers/langsec-cwes-secdev2016.pdf) problem.

## Rule #4

### Declaration of Sources Rights

All sources are born at the same architectural level and should be treated equally.

#### TODO

- Apply identical parsing/validation/escaping/sanitization to the same data coming from different `sources`.

---

## Appendix

### Quod licet Iovi, non licet bovi

Throw exceptions (fail hard) for interactive input; degrade gracefully for non-interactive input.

### Chinglish is not English

*Avoid ambiguity*

In multi-tier software, ensure that every tier parses input identically. (See **HTTP request smuggling**, **HTTP Parameter Pollution**)

### A word spoken is past recalling

Use read-once `ValueObjects` for sensitive data. Their main purpose is to facilitate detection of unintentional use.
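
A read-once `ValueObject` could look roughly like the sketch below. The class `ReadOncePassword` and its `consume()` method are hypothetical names chosen for illustration, and the typed property assumes PHP 7.4 or newer. The object hands out its raw value exactly once and throws on any further read, so an unintended second use surfaces as an error instead of going unnoticed.

```php
<?php

// Hypothetical read-once wrapper for a sensitive value such as a password.
final class ReadOncePassword
{
    private ?string $value;

    public function __construct(string $value)
    {
        // Always-valid invariant: an empty password is rejected at the source.
        if ($value === '') {
            throw new InvalidArgumentException('Password must not be empty');
        }
        $this->value = $value;
    }

    // Returns the raw value exactly once; any later call fails loudly.
    public function consume(): string
    {
        if ($this->value === null) {
            throw new LogicException('Sensitive value has already been consumed');
        }
        $raw = $this->value;
        $this->value = null;

        return $raw;
    }

    // Keep the secret out of logs and string interpolation.
    public function __toString(): string
    {
        return '********';
    }

    // Keep the secret out of var_dump() output.
    public function __debugInfo(): array
    {
        return ['value' => '********'];
    }
}

$password = new ReadOncePassword('s3cret');
$hash = password_hash($password->consume(), PASSWORD_DEFAULT);

try {
    $password->consume(); // second read: unintentional use is detected
} catch (LogicException $e) {
    echo $e->getMessage(), PHP_EOL;
}
```

Throwing on the second read is one possible design choice; returning a masked value instead would hide exactly the kind of mistake this pattern is meant to reveal.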