# Web Apps 101

###### tags: `cybersecurity` `web security`

## Walking a web app

In-built browser tools can be used in the following ways:

* View Source - Use your browser to view the human-readable source code of a website.
* Inspector - Inspect page elements and make changes to view content that is usually blocked.
* Debugger - Inspect and control the flow of a page's JavaScript.
* Network - See all the network requests a page makes.

Often, the parts of a website that are exploitable are the ones requiring interaction with the user. A good way to start is exploring the website and noting the individual pages/areas/features with a summary of each one.

**Inspect element** provides a real-time view of what is being displayed.

**Debugger** is intended for debugging JavaScript. In Firefox and Safari it is labelled Debugger; in Chrome it is labelled Sources.

**Network tab** can be used to keep track of every external request a webpage makes. If you click on the network tab and then refresh the page, you will see all the files the page is requesting.

Ajax is a method for sending and receiving data in the background without needing to reload the current web page.

## Content Discovery

The main ways of discovering content on a website are:

* Manually
* Automated
* OSINT

### Manual Discovery - Robots.txt

The robots.txt file tells search engines which pages they are or aren't allowed to show in search results, and can ban specific search engines from crawling the site altogether.

### Manual Discovery - Favicon

You can use the [OWASP favicon database](https://wiki.owasp.org/index.php/OWASP_favicon_database) to identify the framework a site is built on, in case the framework's default icon was left in place before the site was pushed to production.

```=1
# Get the favicon URI from the page source
# Get the MD5 hash of the favicon with the curl and md5sum commands
curl [favicon uri] | md5sum
# Search for the hash in the OWASP favicon database
```

### Manual Discovery - Sitemap.xml

Sitemap.xml gives a list of every page the website owner wishes to be listed on a search engine. This typically includes areas of the website that are a bit more difficult to navigate, or old pages which are still functional but no longer in use.

### Manual Discovery - HTTP headers

Requests to web servers return various HTTP headers. At times these headers can point us to the web server software and technology stack the site is running on.

```=1
curl [ip] -v
# the -v (verbose) flag prints the request and response headers
```

### Manual Discovery - Framework stack

Comments in source code can point to the implementation framework. Once you deduce the implementation framework from one of the aforementioned techniques, look at the framework documentation to establish:

* default paths, e.g. where the site would be deployed by default
* default credentials, since sometimes admins don't bother to change these.

### OSINT - Google dorking

Read up on Google dorking [here](https://en.wikipedia.org/wiki/Google_hacking).

### OSINT - Wappalyzer

The Wappalyzer browser extension can show you the technologies a website is running.

### OSINT - Wayback machine

The [Wayback Machine](https://archive.org/web/) is a historical archive of websites that dates back to the late 90s. You can search a domain name, and it will show you all the times the service scraped the web page and saved the contents. This service can help uncover old pages that may still be active on the current website.

### OSINT - GitHub

GitHub is a hosted version of Git on the internet. Repositories can either be set to public or private and have various access controls. You can use GitHub's search feature to look for company names or website names to try and locate repositories belonging to your target. Once discovered, you may have access to source code, passwords or other content that you hadn't yet found.

### OSINT - S3 Buckets

S3 buckets are a storage service provided by Amazon AWS. Sometimes access permissions are incorrectly set and inadvertently allow access to files that shouldn't be available to the public. The format of an S3 bucket URL is http(s)://{name}.s3.amazonaws.com, where {name} is decided by the owner.
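Candidate bucket names can be checked directly from the command line. A minimal sketch, assuming a few guessed name patterns derived from the target's name (the names below are hypothetical):

```
# 404 (NoSuchBucket) -> the bucket does not exist
# 403 (AccessDenied) -> the bucket exists but is not publicly readable
# 200               -> the bucket exists and its listing may be public
for name in targetsite targetsite-assets targetsite-backup; do
  curl -s -o /dev/null -w "%{http_code}  $name\n" "https://$name.s3.amazonaws.com"
done
```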
### Automated Discovery

Alternatively, you can use tools for content discovery. This is made possible by wordlists, e.g. [SecLists](https://github.com/danielmiessler/SecLists).

### Automated Discovery - ffuf

Get [ffuf here](https://github.com/ffuf/ffuf).

```
ffuf -w /usr/share/seclists/Discovery/Web-Content/common.txt -u [ip/FUZZ] -c -v
```

### Automated Discovery - dirb

```
dirb [ip] /usr/share/seclists/Discovery/Web-Content/common.txt
```

### Automated Discovery - gobuster

```
gobuster dir --url http://10.10.101.159/ -w /usr/share/seclists/Discovery/Web-Content/common.txt
```

## Subdomain Enumeration

This is the process of finding valid subdomains for a domain in order to expand the attack surface and find more infiltration points.

### OSINT - SSL / TLS Certificates

When a certificate authority (CA) issues an SSL/TLS certificate for a domain, the certificate is recorded in Certificate Transparency (CT) logs. These CT logs are publicly accessible logs of every TLS/SSL certificate created for a domain name; their aim is to stop malicious and accidentally made certificates from being used. To leverage this for subdomain enumeration, we can use [crt.sh](https://crt.sh/).

### OSINT - Search Engines

Google fu allows you to use a search filter like:

```
-site:www.targetsite.com site:*.targetsite.com
```

which will return subdomains of targetsite.com while excluding the main www domain.

### DNS Bruteforce

This enumeration involves trying tens, hundreds, thousands or even millions of different possible subdomains from a pre-defined list of commonly used subdomains.

```
dnsrecon -t brt -d [domain]
```

### OSINT - sublist3r

See the [github repo for sublist3r](https://github.com/aboul3la/Sublist3r).

```
# usage:
./sublist3r.py -d [domain name]
```

### Virtual Host

Some subdomains are not listed in public DNS at all: the DNS record could be kept on a private DNS server, or recorded on the developers' machines in their /etc/hosts file (or c:\windows\system32\drivers\etc\hosts for Windows users), which maps domain names to IP addresses. A server hosting multiple websites knows which website the user wants from the **Host** header. Make changes to this header and monitor the responses accordingly.

```=1
ffuf -w /usr/share/seclists/Discovery/DNS/namelist.txt -H "Host: FUZZ.targetdomain.com" -u http://[ip]
# the -H flag adds/edits a header, in this case the Host header
# the FUZZ keyword marks where each subdomain name from the wordlist is inserted
# filter the output by noting the most frequently occurring response size, then excluding it with the -fs flag as shown below
ffuf -w /usr/share/seclists/Discovery/DNS/namelist.txt -H "Host: FUZZ.targetdomain.com" -u http://[ip] -fs [size]
```
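Once the fuzzing above turns up a likely virtual host, it can be verified with a single request and, optionally, mapped locally so a browser can reach it as well. A minimal sketch, assuming a discovered name of `admin` (hypothetical):

```
# confirm the virtual host by requesting it explicitly
curl -s -H "Host: admin.targetdomain.com" http://[ip]

# optionally map the name in /etc/hosts so a browser can resolve it too
echo "[ip] admin.targetdomain.com" | sudo tee -a /etc/hosts
```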
## Authentication Bypass

### Username enumeration

Sometimes when you try to sign up for an account on a website and the user already exists, the site lets you know the username is taken. This is useful for enumeration. We can use ffuf to enumerate through such a site with a list of common names, e.g. from SecLists.

```=1
# use ffuf to sign up with a list of usernames, catch the usernames the site labels as taken,
# and save them to a file called vunames to be used later
ffuf -w /usr/share/seclists/Usernames/Names/names.txt -X POST -d "username=FUZZ&email=x&password=x&cpassword=x" -H "Content-Type: application/x-www-form-urlencoded" -u http://[ip]/customers/signup -mr "username already exists" | tee vunames
```

The parameters are as follows:

* -w : path to the wordlist.
* -X : specifies the request method; the default is GET, but since we are targeting a signup form we use POST.
* -d : specifies the data we are going to send. We set the value of username to FUZZ as that's the parameter we want our wordlist inserted into.
* -H : adds additional headers to our request; we set Content-Type so the webserver knows we are sending form data.

### Brute force

In some instances, given a list of valid usernames, you can iterate through each of them and fuzz for the password value.

```=1
# -fc 200 excludes responses with HTTP status code 200 from the results
ffuf -w /usr/share/seclists/Passwords/Common-Credentials/10-million-password-list-top-100.txt -X POST -d 'username=[valid username]&password=FUZZ' -H 'Content-Type: application/x-www-form-urlencoded' -u http://[url for login page] -fc 200
```

### Logic flaws

A logic flaw is when the typical logical path of an application is bypassed, circumvented or manipulated by an attacker.

Take an example of an application where the password reset email is sent to the address in the PHP variable $_REQUEST. The $_REQUEST variable is an array containing data received from both the query string and the POST data; if the same name is used in both, the application may favour the POST data field over the query string field. Leveraging this, we can create an account and add our own address as a POST field, controlling where the password reset email will be sent.

### Cookie Tampering

When cookies are used for authentication and authorization purposes, such as setting privileges through admin flags, you can tamper with the cookie for privilege escalation.

```=1
# set the header as a cookie with the desired values, for example:
curl -H "Cookie: logged_in=true; admin=true" http://[url]
```
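To confirm that the tampered flags actually change what the server returns, compare responses with and without the elevated values; a minimal sketch using response size as a rough indicator (the cookie names mirror the example above):

```
# a noticeably different size with admin=true suggests the server
# trusts client-supplied cookie values for authorization decisions
curl -s -H "Cookie: logged_in=true; admin=false" http://[url] | wc -c
curl -s -H "Cookie: logged_in=true; admin=true"  http://[url] | wc -c
```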