Manual Web Forensics: Hunting SQLi and Web Shells via CLI Logs

The 2:14 AM Incident: A Reality Check

2:14 AM. The pager goes off. Your monitoring dashboard shows a 95% CPU spike across the primary web cluster. You SSH in, run top, and see a dozen PHP-FPM processes fighting for every available cycle. Your first instinct is a traffic surge, but netstat reveals unusual outbound connections to a known malicious IP range. This isn’t a viral post. It is a breach in progress.

When your SIEM is lagging five minutes behind reality, a fancy GUI won’t save you. You need a terminal, grep, and the raw access.log files. Think of CLI forensics as digital tracking; you are looking for footprints in the dirt while the storm is still raging. Mastering these manual skills is what separates senior engineers from those who just follow runbooks.

Ground Truth: Raw Logs vs. Automated Tools

During an investigation, you usually choose between automated security suites (WAFs, IDS, SIEM) and manual CLI analysis.

Automated Tools (WAF/SIEM): These are great for real-time blocking. They use signatures to flag obvious patterns like <script>alert(1)</script>. However, they can be bypassed by clever encoding or custom zero-day payloads.
Manual CLI Analysis: This is the “ground truth.” You use standard Linux utilities—grep, awk, sed, and uniq—to slice through data. It’s fast, raw, and unedited.

A WAF like ModSecurity is your first line of defense, but manual forensics is the scalpel for the post-mortem. Why did the filter fail? Did the attacker use hex encoding to slip past the regex? Only the raw logs provide the unfiltered sequence of events.

The Pros and Cons of Staying in the Terminal

Pros

Portability: Every Linux box on the planet has grep. You don’t need to wait for a 5GB log file to upload to a central server before you start hunting.
Zero Latency: You see requests the millisecond they hit the disk, bypassing ingestion delays in logging pipelines.
Infinite Flexibility: You can write complex, nested regular expressions on the fly to catch obfuscated attacks that standard signatures missed.

Cons

Margin for Error: One typo in a regex can lead to a false negative. Precision is everything.
Steep Learning Curve: You need to understand HTTP status codes and how common attack payloads actually look when URL-encoded.
Visual Fatigue: Looking at thousands of lines of white-on-black text makes it easy to miss volume-based anomalies that a graph would highlight instantly.

Configuration: Don’t Fly Blind

You can’t hunt what you don’t record. If your Nginx or Apache logs only track IP addresses and URLs, you are missing half the story. At a minimum, use the “Combined” log format, which adds the referrer and the User-Agent to every line. On Nginx, consider also logging $request_body for specific endpoints (never login pages, or you will write credentials to disk) and make sure $http_user_agent is always logged.
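A quick way to check whether an existing log is really in Combined format is to count the lines that are missing the quoted referrer and User-Agent fields. The sketch below is only an illustration: the function name count_non_combined is hypothetical, and the default path assumes a stock Nginx layout.

```shell
# Hypothetical helper: count log lines that do NOT look like Combined format.
# Splitting a Combined line on double quotes yields 7 fields:
# prefix, request, status/bytes, referrer, separator, User-Agent, trailing empty.
count_non_combined() {
  awk -F'"' 'NF < 7 { n++ } END { print n + 0 }' "${1:-/var/log/nginx/access.log}"
}

# Usage (path is an assumption):
#   count_non_combined /var/log/nginx/access.log
```

A result of 0 suggests every line carries a referrer and User-Agent; anything else means some requests were logged in a shorter format.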

Don’t let log rotation kill your evidence. During a breach, you often need to look back 48 or 72 hours to find the initial probe, so keep at least that much history on disk. If your server is under heavy load, consider writing logs to a separate disk partition so forensic operations don’t starve the OS of I/O cycles.
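Looking back 48 or 72 hours usually means searching rotated files as well as the live one, and some of them will be gzip-compressed. A minimal sketch, assuming rotated files sit next to the live log and using a hypothetical function name, search_all_logs:

```shell
# Hypothetical helper: grep one pattern across plain and gzip-compressed logs.
# "gzip -cdf" decompresses .gz input and copies plain files through unchanged.
search_all_logs() {
  pattern="$1"
  shift
  gzip -cdf -- "$@" | grep -iE -- "$pattern"
}

# Usage (paths are an assumption):
#   search_all_logs 'union|select' /var/log/nginx/access.log*
```

The shell glob does not sort rotated files chronologically, so treat the output as a set of hits rather than a timeline.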

Implementation Guide: Hunting the Attacker

1. Detecting SQL Injection (SQLi)

SQL Injection remains one of the most efficient ways to dump a database. Attackers look for vulnerable GET parameters where they can inject commands like UNION SELECT or GROUP_CONCAT. If you see 20 requests in a row to the same URL with slightly different SQL keywords, you’ve found a probe.

Search for suspicious requests with this command. Spaces in a logged URL usually appear as %20 or +, so the pattern allows for both:

grep -iE "(select|union|concat|order(%20|\+|\s)+by|group_concat|information_schema)" /var/log/nginx/access.log
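To turn the keyword search above into a list of likely probers, you can count the matching requests per client IP. This is a rough sketch: the function name sqli_by_ip and the shortened keyword list are illustrative choices, not a complete signature set.

```shell
# Hypothetical helper: rank client IPs by number of SQLi-looking requests.
# Field 1 of a Combined-format line is the client IP.
sqli_by_ip() {
  grep -iE '(union|select|information_schema|%27)' "${1:-/var/log/nginx/access.log}" \
    | awk '{ print $1 }' | sort | uniq -c | sort -nr
}

# Usage (path is an assumption):
#   sqli_by_ip /var/log/nginx/access.log | head
```

An IP with dozens of hits in a short span is a far stronger signal than a single request that happens to contain the word "select".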

Clever attackers use URL encoding to bypass basic filters. They swap ' for %27 or spaces for %20. Use sed to decode these characters in your terminal view so the patterns become obvious:

grep -iE "(%27|%22|%3B|%2D%2D)" /var/log/nginx/access.log | sed -e 's/%27/'\''/g' -e 's/%22/"/g' -e 's/%3[Bb]/;/g' -e 's/%2[Dd]/-/g' -e 's/%20/ /g'
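Individual sed substitutions only cover the escapes you thought of in advance. A more general approach is to decode every %XX sequence in one pass. The sketch below does this in portable awk; the function name urldecode is hypothetical, it leaves + untouched, and it assumes ASCII payloads (multi-byte sequences may not render correctly in every awk).

```shell
# Sketch: decode every %XX escape on stdin (or in the named files).
urldecode() {
  awk '
    # Convert a two-digit hex string to its numeric value.
    function hexval(h,    i, n) {
      h = tolower(h)
      n = 0
      for (i = 1; i <= length(h); i++)
        n = n * 16 + index("0123456789abcdef", substr(h, i, 1)) - 1
      return n
    }
    {
      rest = $0
      out = ""
      # Consume the line left to right so decoded text is never decoded twice.
      while (match(rest, /%[0-9A-Fa-f][0-9A-Fa-f]/)) {
        out = out substr(rest, 1, RSTART - 1) sprintf("%c", hexval(substr(rest, RSTART + 1, 2)))
        rest = substr(rest, RSTART + 3)
      }
      print out rest
    }
  ' "$@"
}

# Usage (path is an assumption):
#   grep -iE "(%27|%22|%3B|%2D%2D)" /var/log/nginx/access.log | urldecode
```

Because it decodes exactly once, a double-encoded payload such as %2527 shows up as %27, which is itself a useful hint that someone is trying to slip past a filter.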

Once you confirm a successful injection (look for 200 OK responses where you expected an error such as a 404), your first priority is rotating credentials. I use the password generator at toolcraft.app/en/tools/security/password-generator for new server secrets. It runs entirely in-browser, which is vital when you can’t trust your own network environment.
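One rough way to triage whether an injection succeeded is to list the SQLi-looking requests that returned 200, largest response first. The size ordering rests on an assumption of mine, not a rule: a dumped table tends to produce an unusually large body. The function name sqli_responses is hypothetical, and the field positions assume Combined format with no literal spaces in the URL.

```shell
# Hypothetical helper: SQLi-looking requests that returned 200, biggest body first.
# Whitespace-split Combined format: $1 = client IP, $7 = URL, $9 = status, $10 = bytes.
sqli_responses() {
  grep -iE '(union|select|information_schema)' "${1:-/var/log/nginx/access.log}" \
    | awk '$9 == 200 { print $10, $1, $7 }' | sort -nr
}

# Usage (path is an assumption):
#   sqli_responses /var/log/nginx/access.log | head
```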

2. Exposing Hidden Web Shells

A web shell is a backdoor script (typically a .php or .aspx file) that gives an attacker persistent access. Attackers rarely name it shell.php. Instead, it will be disguised as something like legacy_backup.php or hidden in a /uploads/ directory.

A common “tell” for a web shell is a POST request to a file that shouldn’t be receiving data. Use awk to list POST targets and rank them by frequency:

awk '$6 == "\"POST" {print $7}' /var/log/nginx/access.log | sort | uniq -c | sort -nr

Look for the outliers. If contact.php has 500 POSTs, that’s normal. If assets/js/theme-min.php has 12 POSTs, you have likely found your shell. Also, check for requests with empty or generic User-Agents like curl or python-requests:

awk -F'"' '$6 == "-" || $6 == "" || $6 ~ /^(curl|python-requests)/ {print $1, $2, $4, $6}' /var/log/nginx/access.log
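Since shells tend to hide in directories that should only serve static files, it can help to narrow the POST ranking to PHP files under those paths. In this sketch the directory names (uploads, assets, images, static) and the function name posts_to_static_php are assumptions; adjust them to your own layout.

```shell
# Hypothetical helper: POSTs to .php files under directories that should be static.
# Output is "count client_ip path", most frequent first.
posts_to_static_php() {
  awk '$6 == "\"POST" && $7 ~ /\/(uploads|assets|images|static)\/.*\.php/ { print $1, $7 }' \
    "${1:-/var/log/nginx/access.log}" | sort | uniq -c | sort -nr
}

# Usage (path is an assumption):
#   posts_to_static_php /var/log/nginx/access.log
```

On a healthy site this list is normally empty, so any line it prints deserves a closer look.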

3. Persistence and Outbound Movement

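Once they have a shell, attackers often download post-exploitation tools like linpeas.sh. Those downloads are outbound connections made by the server, so they never appear in the access log; what does appear are the inbound requests to the shell that triggered them. Check for 200 responses on the suspected shell path and note which external IP addresses sent them. A smart attacker will clear the .bash_history of the www-data user immediately, so don’t rely on it.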

Use find to identify every file modified in the web root over the last 24 hours:

find /var/www/html -mtime -1 -ls

Cross-reference these timestamps with your access logs. Who accessed these files immediately after they were created? What parameters did they pass? The logs will tell you.
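The cross-referencing step can be partly automated by looping over the recently modified files and pulling every request that mentions each one. This is a sketch under a big assumption: that URL paths map directly onto files under the web root (no rewrites or aliases). The function name xref_modified is hypothetical, and the web root should be passed without a trailing slash.

```shell
# Hypothetical helper: for each file changed in the last 24 hours,
# print the access-log lines that request it.
# $1 = web root (no trailing slash), $2 = access log
xref_modified() {
  find "$1" -type f -mtime -1 | while IFS= read -r file; do
    url_path="${file#"$1"}"            # strip the web root to get the URL path
    printf '== %s\n' "$url_path"
    grep -F -- " $url_path" "$2" || echo "   (no matching requests)"
  done
}

# Usage (paths are an assumption):
#   xref_modified /var/www/html /var/log/nginx/access.log
```

A freshly modified file with no matching requests is not automatically safe; it may have been written by a process that never touched the web server.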

Closing the Gaps

Forensics isn’t just about finding the intruder; it’s about identifying the open door. If you find a web shell, look at the logs for the 10 minutes leading up to its creation. You will likely find a POST request to a vulnerable plugin or an unauthenticated upload form.
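Pulling the minutes before a shell appeared out of a large log is easier with a small time-window filter. The sketch below assumes the default timestamp layout ([10/Oct/2023:13:55:36 +0000]) and a window that does not cross midnight; the function name log_window and the example times are hypothetical.

```shell
# Hypothetical helper: print log lines from one day between two times (inclusive).
# $1 = log file, $2 = day as written in the log (e.g. 10/Oct/2023),
# $3 = start HH:MM:SS, $4 = end HH:MM:SS
log_window() {
  awk -v day="$2" -v from="$3" -v to="$4" '
    {
      ts = substr($4, 2)        # drop the leading "[" from the timestamp field
      d = substr(ts, 1, 11)     # DD/Mon/YYYY
      t = substr(ts, 13, 8)     # HH:MM:SS
      if (d == day && t >= from && t <= to) print
    }
  ' "$1"
}

# Usage (values are an assumption): the 10 minutes before a file created at 14:00:00
#   log_window /var/log/nginx/access.log 10/Oct/2023 13:50:00 14:00:00 | grep '"POST'
```

Zero-padded times compare correctly as plain strings, which is why no date parsing is needed here.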

CLI digging is a grind, but it builds a mental model of what “normal” looks like. This makes it much easier to spot the “abnormal” the next time the pager screams at 2 AM. Keep your tools sharp, your logs verbose, and your passwords unique.