Linux Digital Forensics: Recovering Evidence with Autopsy and TSK

The Monday Morning Crisis: Files Vanished After a Breach

It’s 8:00 AM on a Monday, and a critical production server is red-lining at 98% CPU usage. You log in to investigate, but the trail is cold. The logs are gone. The suspicious scripts flagged by your EDR at 3:00 AM have vanished. The attacker was thorough; they used rm -rf on the evidence and wiped the shell history to hide their tracks.

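Standard system tools will tell you these files no longer exist. However, in digital forensics, “deleted” rarely means “erased.” When a file is removed from an ext4 filesystem, the kernel removes the directory entry, frees the inode, and marks the data blocks as available for reuse; it does not scrub their contents. On a spinning disk, the bits usually sit untouched until a new write lands on them. Two caveats apply. First, ext4 typically clears the deleted inode's extent pointers, so recovery often depends on the journal or on file carving rather than on the inode alone. Second, an SSD with TRIM enabled may discard freed blocks shortly after deletion.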

To catch an intruder, you have to stop looking at the filesystem through the OS and start looking at the raw disk. This guide explores how to use The Sleuth Kit (TSK) and the Autopsy graphical interface to reconstruct a timeline and pull “ghost” data back from the void.

The Trap: Why You Shouldn’t Investigate a Live System

Using commands like ls, find, or grep on a compromised machine is a fundamental mistake. If an attacker has unlinked a file, the kernel’s API won’t report it to these tools. Even worse, a sophisticated rootkit can hook system calls to hide malicious processes from ps or top while you’re looking right at them.

Every second a compromised system stays online, you risk losing evidence. Background processes, log rotations, and even your own investigation commands write new data to the disk. This activity can overwrite the deleted attacker scripts permanently. Effective forensics requires a “dead” analysis: you pull the plug, image the drive, and work on a bit-for-bit copy on a clean workstation.

Choosing Your Tools: TSK vs. Autopsy

Forensic investigators generally rely on three levels of analysis depth:

Manual Hex Analysis: Using hexedit to parse raw bytes. It is grueling work, reserved for when you need to manually reconstruct a corrupted file header.
The Sleuth Kit (TSK): A powerhouse collection of CLI tools. fls lists file entries, while icat extracts data by inode. It’s fast and perfect for automation scripts.
Autopsy: The GUI wrapper for TSK. It handles the heavy lifting of indexing, keyword searching, and visual timeline construction.

For most scenarios, the most efficient workflow involves using Autopsy for the broad discovery phase and TSK for surgical, granular file recovery.

A Practical Step-by-Step Investigation

1. Creating a Forensic Image

Never perform analysis on the original hardware. You need a bit-for-bit clone. For a standard 500GB SATA drive, I use dc3dd because it provides progress bars and integrated hashing. If the suspect disk is at /dev/sdb, run:

# Create a raw image with built-in hashing
sudo dc3dd if=/dev/sdb of=evidence_disk.img hash=sha256 log=image_log.txt

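dc3dd records the SHA-256 of the data it read from the source in image_log.txt. Recompute the hash of evidence_disk.img and confirm that the two values match. If the image's hash ever changes afterward, the copy has been altered and is unlikely to hold up in a formal audit or legal proceeding.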
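The comparison can be scripted so it is repeatable every time the image is moved or reopened. This is a minimal sketch: the function name verify_image_hash is made up for illustration, and the expected hash is assumed to be copied by hand from image_log.txt rather than parsed out of it.

```shell
# verify_image_hash IMAGE EXPECTED_SHA256
# Prints MATCH or MISMATCH; returns 0 on match, 1 on mismatch,
# 2 if the image cannot be read. (Hypothetical helper name.)
verify_image_hash() {
    image=$1
    # Normalize the expected hash to lowercase, as sha256sum prints lowercase hex
    expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    [ -r "$image" ] || { echo "cannot read $image" >&2; return 2; }
    actual=$(sha256sum "$image" | awk '{ print $1 }')
    if [ "$actual" = "$expected" ]; then
        echo "MATCH $actual"
    else
        echo "MISMATCH expected=$expected actual=$actual"
        return 1
    fi
}

# Usage (hash value copied from image_log.txt):
# verify_image_hash evidence_disk.img "<sha256 from image_log.txt>"
```

Running the same check before and after each analysis session gives you a simple record that the working copy has not drifted.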

2. Preparing Your Environment

Analyze the data on a dedicated, air-gapped machine if possible. Before I begin a case, I generate unique, high-entropy passwords for my encrypted containers. I use the tool at toolcraft.app/en/tools/security/password-generator because it runs locally in the browser, so the generated credentials should not need to leave the machine during the setup phase.

Install the necessary toolkit on your analysis station:

sudo apt update && sudo apt install sleuthkit autopsy -y
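Before opening a case, it can help to confirm that the binaries used in the rest of this guide are actually on the PATH. The sketch below is one way to do that; the helper name missing_tools is an assumption for illustration, and it relies only on the standard `command -v` lookup.

```shell
# missing_tools TOOL...
# Prints the name of every tool that cannot be found on PATH.
# (Hypothetical helper name.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Report anything the later steps need that is not installed.
missing=$(missing_tools mmls fls icat sha256sum)
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing
else
    echo "All required tools found"
fi
```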

3. Rapid Recovery with TSK

Sometimes you need a quick win before diving into a full Autopsy case. First, find the partition offset using mmls. This tells you exactly where the Linux partition starts.

# Find the starting sector (e.g., 2048)
mmls evidence_disk.img

# List deleted files recursively on that partition
fls -o 2048 -r -d evidence_disk.img

Look for entries marked with an asterisk (*). If you spot a suspicious file at inode 45678, extract it immediately for analysis:

icat -o 2048 evidence_disk.img 45678 > recovered_malware.bin
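When there are dozens of deleted entries, copying offsets and inode numbers by hand gets error-prone. The following sketch parses the text output of mmls and fls instead. It assumes the usual mmls column layout (slot id, slot, start, end, length, description) and the fls line format of metadata, a tab, then the path; the function names linux_offset and deleted_entries are hypothetical. The -p flag is added to fls so that each line carries a full path.

```shell
# linux_offset: read `mmls` output on stdin and print the starting sector
# of the first partition whose description mentions "Linux".
linux_offset() {
    awk '$1 ~ /^[0-9]+:$/ && /Linux/ {
        start = $3
        sub(/^0+/, "", start)            # strip the zero padding
        print (start == "" ? "0" : start)
        exit
    }'
}

# deleted_entries: read `fls -r -d -p` output on stdin and print
# "inode<TAB>path" for each entry.
deleted_entries() {
    awk -F'\t' 'NF >= 2 {
        n = split($1, meta, " ")
        inode = meta[n]                  # e.g. "45678:" or "45678(realloc):"
        sub(/:$/, "", inode)
        sub(/\(realloc\)$/, "", inode)
        print inode "\t" $2
    }'
}

# Usage against the image from this guide (left commented so the helpers
# can be sourced without the image present):
# offset=$(mmls evidence_disk.img | linux_offset)
# fls -o "$offset" -r -d -p evidence_disk.img | deleted_entries |
#     while IFS="$(printf '\t')" read -r inode path; do
#         icat -o "$offset" evidence_disk.img "$inode" > "recovered_${inode}.bin"
#     done
```

Entries flagged (realloc) point at an inode that has since been reused, so whatever icat returns for them may belong to a different file. An empty output file is also possible when the filesystem has already cleared the inode's block pointers.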

4. Deep Analysis in Autopsy

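Launch the interface by typing autopsy in your terminal, then open the URL it prints in your browser (by default http://localhost:9999/autopsy). Create a new case and import your evidence_disk.img. Be aware that the autopsy package in the Debian and Ubuntu repositories is usually the older browser-based 2.x release. The ingest modules and timeline features described in the rest of this guide belong to the Autopsy 4.x desktop application, which the project distributes separately.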

The real power lies in the Ingest Modules. Enable File Type Identification and Extension Mismatch Detector. Together they flag files whose content does not match their extension, which catches attackers who hide executable scripts by renaming them to .jpg or .txt. For a web server breach, the Keyword Search module is indispensable for finding specific IP addresses or PHP shells hidden in unallocated space.
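For a quick look outside Autopsy, a rough command-line analogue of a keyword search over unallocated space is to pipe blkls (which outputs unallocated blocks by default) through strings and grep. The sketch below is not how Autopsy's module works internally; it only tallies IPv4-looking strings, and the function name count_ipv4 is hypothetical.

```shell
# count_ipv4: read text on stdin and print "count<TAB>address" for every
# IPv4-looking string, most frequent first.
# The pattern is deliberately loose: it also matches things like 999.1.1.1.
count_ipv4() {
    grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}' |
        sort | uniq -c | sort -rn |
        awk '{ print $1 "\t" $2 }'
}

# Usage against the image from this guide (commented so the helper can be
# sourced without the image present):
# blkls -o 2048 evidence_disk.img | strings -n 7 | count_ipv4 | head
```

An address that appears hundreds of times in freed blocks, but nowhere in the surviving logs, is a reasonable lead to feed back into Autopsy's keyword list.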

5. Reconstructing the Timeline

The Timeline View is your most effective weapon. It aggregates every file's modification, access, and change (MAC) times, plus the creation time where the filesystem records one, as ext4 does, into a single graph. Attackers often leave a “noise signature”: a cluster of 30 or 40 file modifications in /etc/ or /var/www/ within a 10-second window. Pinpointing this moment narrows down how they entered the system and which backdoors they planted.
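The same burst hunting can be approximated on the command line. TSK's fls -m writes a "body file" and mactime turns it into a readable timeline. The sketch below scans the body file directly for crowded windows. It assumes the pipe-delimited TSK 3.x body format with the modification time in field 9, and it uses fixed rather than sliding windows, so a burst that straddles a window boundary is split in two. The function name burst_windows and its defaults are hypothetical.

```shell
# burst_windows BODYFILE [WINDOW_SECONDS] [MIN_FILES]
# Prints "window_start_epoch<TAB>count" for every fixed window that
# contains at least MIN_FILES modifications (defaults: 10 s, 30 files).
# Assumed body format:
#   MD5|name|inode|mode|UID|GID|size|atime|mtime|ctime|crtime
# File names that contain "|" will shift the fields and are not handled.
burst_windows() {
    awk -F'|' -v win="${2:-10}" -v min="${3:-30}" '
        $9 > 0 {
            start = $9 - ($9 % win)      # floor mtime to the window start
            count[start]++
        }
        END {
            for (start in count)
                if (count[start] >= min)
                    print start "\t" count[start]
        }
    ' "$1" | sort -n
}

# Usage against the image from this guide (commented so the helper can be
# sourced without the image present):
# fls -o 2048 -r -m / evidence_disk.img > body.txt
# mactime -b body.txt -d > timeline.csv
# burst_windows body.txt 10 30
```

Each reported epoch can be converted with date -u -d @EPOCH on GNU systems and then used to zoom the Autopsy timeline onto the same interval.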

The Golden Rules of Forensics

Successful investigations depend on discipline rather than just fancy tools. Follow these three rules:

Use Write-Blockers: If you must connect the physical drive, use a hardware write-blocker to ensure not a single bit is altered.
Maintain a Chain of Custody: Log every command you run. If you find a file, document exactly how you found it and its original inode.
Verify Results: If Autopsy flags a file as malicious, extract it with icat and inspect the raw bytes in a hex viewer to ensure it isn’t a false positive.
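The chain-of-custody rule is easier to follow when logging is automatic. One possible approach, sketched below, is a small wrapper that records a UTC timestamp and the command line before running it. The names logged and CASE_LOG, and the default log file name, are assumptions for illustration.

```shell
# Path of the case log; export CASE_LOG beforehand to override it.
# (Hypothetical variable and file name.)
: "${CASE_LOG:=case_commands.log}"

# logged CMD [ARGS...]
# Appends "UTC timestamp<TAB>command line" to $CASE_LOG, then runs the command.
# Shell redirections and pipes after the call are not part of the logged text.
logged() {
    printf '%s\t%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*" >> "$CASE_LOG"
    "$@"
}

# Usage (commented so the wrapper can be sourced without the image present):
# logged fls -o 2048 -r -d evidence_disk.img
# logged icat -o 2048 evidence_disk.img 45678 | xxd | head
```

Keep the log on the analysis workstation rather than on the evidence media, and hash it at the end of each session alongside the image.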

By moving from live analysis to disk-level forensics, you stop guessing and start proving. You’re no longer just a sysadmin cleaning up a mess; you’re a security professional uncovering the truth.