Detecting SSRF Attack by using Deep Learning Techniques

# Detecting SSRF Attack by using Deep Learning Techniques

## What is an SSRF attack

![](https://i.imgur.com/7QK75aR.png)

<br/>

- By submitting a URL inside a web request to the web application, an attacker can make the server reach internal servers that sit behind firewalls.

---

## Types of SSRF

- **Non-blind SSRF**

  ![](https://i.imgur.com/Vd7lVMM.png)

  The attacker can read the data via the HTTP response. The server retrieves the resource located at the submitted URL, without verification, and returns its contents to the user in an HTTP response.

- **Blind SSRF**

  ![](https://i.imgur.com/hKrfjfD.png)

  Here the attacker submits a URL that they control, and the server sends an HTTP request to that URL. This reveals that the application is vulnerable, but the attacker may not obtain any sensitive data.

## How to Prevent an SSRF Attack

1. Use a whitelist of approved domains and protocols through which the web server can fetch remote resources.
2. Always sanitise or validate user input. Verify that the response the server receives is what was expected, to avoid leaking response data to an attacker. The raw response body of the request sent by the server should never be delivered to the client.
3. If your application only uses HTTP or HTTPS to make requests, allow only these URL schemes. With schemes such as file:///, dict://, ftp:// and gopher:// disabled, the attacker cannot use the web application to make dangerous requests through them.
4. Services like Memcached, Redis, Elasticsearch, and MongoDB do not require authentication by default. An attacker can use a Server Side Request Forgery vulnerability to access any of these services without authentication. It is therefore best to enable authentication wherever possible, even for services on the local network.
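
Prevention steps 1 and 3 can be sketched as a small check that runs before the server fetches a user-supplied URL. This is a minimal sketch, not a complete defence: the function name `is_url_allowed` and the host names in `ALLOWED_HOSTS` are hypothetical, and the check does not cover redirects or DNS rebinding.

```python
from urllib.parse import urlsplit

# Assumption: the application only ever needs HTTP(S) (step 3).
ALLOWED_SCHEMES = {"http", "https"}
# Hypothetical whitelist of approved domains (step 1).
ALLOWED_HOSTS = {"api.example.com", "images.example.com"}


def is_url_allowed(url: str) -> bool:
    """Return True only if the URL uses an approved scheme and host."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # Malformed URLs (e.g. a broken IPv6 literal) are rejected outright.
        return False
    if parts.scheme.lower() not in ALLOWED_SCHEMES:
        return False
    # hostname strips any "user@" prefix and port, so "a.com@evil.test"
    # is judged by its real host, "evil.test".
    host = (parts.hostname or "").lower()
    return host in ALLOWED_HOSTS
```

For example, `is_url_allowed("https://api.example.com/v1/items")` is accepted, while `file:///etc/passwd` and `http://127.0.0.1:6379/` are rejected because the scheme or host is not on the whitelist.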
---

## Architecture to defend against SSRF

![](https://i.imgur.com/ZpuOtf9.png)

### What is a reverse proxy?

![](https://i.imgur.com/Luzr6nM.png)

1. Load balancing: with traffic from millions of users every day, a single server may not be able to handle all incoming requests. A reverse proxy can act as a load balancer, distributing incoming traffic evenly across different servers so that no server is overloaded. If one server fails completely, the other servers can take over.
2. Attack protection: with a reverse proxy in place, a website or service never needs to reveal the IP address of its origin server. This makes targeted attacks such as DDoS harder to mount, so the service is more secure and has more resources to withstand network attacks.

---

![](https://i.imgur.com/ldieZ2B.png)

## How does the deep learning model work

- The researcher presented a deep learning-based model using an auto-encoder that learns from sequences of words and weights each word according to its presence in the sequence.

### What is an auto encoder

![](https://i.imgur.com/GCVVkxK.jpg)

- The model receives an incoming request to the web application, encodes and then decodes the request vector, and calculates the reconstruction error (loss).
- If the loss is above a given threshold θ, the request is classified as anomalous; if it is below θ, the request is classified as normal.

---

## CNN

![](https://i.imgur.com/kkc7CRP.png)

## LSTM

![](https://i.imgur.com/ebqm4Gy.png)

- **Why do we prefer LSTM over CNN**

  An LSTM can be trained on longer sequences of data. When the sequence is long, the CNN multiplies the input by the weight over and over: a weight < 1 (e.g. 0.5) causes the gradient to vanish, and a weight > 1 (e.g. 2) causes the gradient to explode, which makes the network **hard to train**.

- **How does the sigmoid function affect the LT-mem**

  It squashes its input to a value between 0 and 1, which sets the fraction of the long-term memory that is remembered.

- **Three inputs: LT-mem, ST-mem, Input val**

  $$STmem \ast weight + Input \ast weight + Bias$$

***Calculate the potential memory to remember, scale it by the sigmoid function's output, and add it to the long-term memory***

![](https://i.imgur.com/BurR4J6.png)

***Calculate the potential memory to remember, scale it by the sigmoid function's output, and write it to the short-term memory***

![](https://i.imgur.com/YQQ5o2K.png)

## A set of these makes a cell

A cell can be used to predict stock prices: feed each day's stock price in as the input value, and the long short-term memory predicts today's price.

![](https://i.imgur.com/LPq42XO.png)

## Metric measures

1) True Positive (TP): the data instances correctly predicted as an Attack by the classifier.
2) False Negative (FN): the Attack instances wrongly predicted as Normal.
3) False Positive (FP): the Normal instances wrongly classified as an Attack.
4) True Negative (TN): the instances correctly classified as Normal.

***Using NumPy, pandas, Keras, scikit-learn, and an LSTM model with a batch size of 25, 100 epochs, a learning rate of 0.005, and MSE as the loss function, the result is an F1 score of 0.970, an accuracy of 0.969, and the confusion matrix below.***

![](https://i.imgur.com/8eDn2x2.png)

![](https://i.imgur.com/NVZ0dFj.png)
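
The threshold rule (loss above θ means anomalous) and the four metric definitions can be tied together in a short sketch. This is an illustration only, in plain Python rather than the Keras and scikit-learn pipeline mentioned above: the function names, the reconstruction errors, the labels, and the threshold value are all made-up assumptions, not data or results from the study.

```python
def classify(errors, theta):
    """Label a request 1 (attack) if its reconstruction error exceeds theta."""
    return [1 if e > theta else 0 for e in errors]


def confusion_counts(y_true, y_pred):
    """Count TP, FN, FP, TN with 1 = attack and 0 = normal."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fn, fp, tn


def accuracy_and_f1(tp, fn, fp, tn):
    """Accuracy = (TP + TN) / all; F1 = 2TP / (2TP + FP + FN)."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total if total else 0.0
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Illustrative values only (hypothetical errors, labels, and threshold).
errors = [0.02, 0.40, 0.03, 0.55, 0.01, 0.08]
labels = [0, 1, 0, 1, 0, 1]
theta = 0.1

predictions = classify(errors, theta)
counts = confusion_counts(labels, predictions)
accuracy, f1 = accuracy_and_f1(*counts)
```

With these toy numbers the last attack has an error below θ and is missed, so it counts as a false negative, which lowers both accuracy and F1.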