Carving Chaos: Building and Breaking Filesystems for Fun and Forensics

Posted on May 30, 2025 15 mins

Forensics Carving File-Systems Foremost Scalpel Loopback Linux Scripting Bash

Let’s do this, folks! I’ve been absent from digital forensics for quite a while. I enjoy penetration testing so much right now that I had to make a conscious decision to do this session.

Digital forensics is fun, but when something else is more fun right now, ADHD won’t let me take side missions for that long.

But this will be a fun one. Today’s all about improving my understanding, not only of forensics but also of Bash scripting.

🪵 Log Tooling & Prep

First of all, as I’ve made a fresh VM for this one, let’s install my Forensic Log Tracker.

# Clone the repository
git clone https://github.com/mev0lent/forensic-log-tracker.git
cd forensic-log-tracker

# adjust permissions
chmod +x setup.sh

# Run the setup script
./setup.sh

# After the setup script ran, make sure to EXECUTE THE COMMANDS PRINTED OUT FOR YOU AT THE END:
source ~/.zshrc # or ~/.bashrc, depending on your setup
source forensic-log-venv/bin/activate

BTW, I think I need to test the Windows setup; I don’t know how well that one holds up. Anyway, off to digital forensics!

🛠️ Task One: Bash Up a Disk

“Create a script that generates a storage device of any size and formats it with a file system of your choice.”

A great opportunity to get into Bash scripting. The whole session depends on this script being fairly good.

Honestly, I’ve wanted to dive into Neovim or a similar editor for a long time now, but today doesn’t seem to be that day. I’m quite tired, so let’s focus our motivation on actually getting stuff done. As a wise human once said, preparing to get stuff done is NOT getting stuff done!

📥 Input Gathering

nano generation.sh

#!/bin/bash

read -p "Filesize (e.g. 100M): " FILESIZE
read -p "Filesystem (e.g. ext4): " FILESYSTEM
read -p "Filename (e.g. virtual_disk): " FILENAME

Off to a good start, reading in user customization parameters.

💾 Disk Creation

Now we’ll initialize the file as empty with dd. I didn’t mention it in my earlier forensic guides, right? I’ll have to.

echo "[+] Creating empty disk image of size $FILESIZE..." 
dd if=/dev/zero of=${FILENAME}.img bs=1M count=$(echo $FILESIZE | sed 's/M//') status=progress

if=/dev/zero -> our input is a stream of null bytes (initializing empty, remember), the output file is of=${FILENAME}.img, and the block size is 1 MiB via bs=1M. For count, we pipe $FILESIZE through sed to strip the M, which leaves the number of MiB wanted.
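To see that bs times count really gives the image size, here is a tiny sketch on a throwaway file. The temp file and the SIZE_BYTES variable are my own illustration, and status=none assumes GNU dd (the same family that understands status=progress):

```bash
#!/bin/bash
# bs x count = image size: 2 blocks of 1 MiB each = 2 * 1048576 bytes.
IMG="$(mktemp)"   # throwaway file, stands in for ${FILENAME}.img
dd if=/dev/zero of="$IMG" bs=1M count=2 status=none

# Count the bytes that actually landed in the file
SIZE_BYTES=$(( $(wc -c < "$IMG") ))
echo "image size: $SIZE_BYTES bytes"

rm -f "$IMG"
```

No root is needed for this part; only attaching, formatting, and mounting require it.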

We might want to make the script smarter later so it also accepts G for gigabytes, but for now let’s go with MB.
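A possible way to get there, as a rough sketch: size_to_mib is a hypothetical helper (not part of the script in this post) that turns 100M or 2G into a plain MiB count for dd's count= and rejects anything else.

```bash
#!/bin/bash
# Hypothetical helper: convert "100M" or "2G" into a MiB count for dd's count=.
size_to_mib() {
    local input="${1^^}"           # uppercase, so "100m" works too (bash 4+)
    local number="${input%[MG]}"   # strip one trailing M or G
    case "$number" in
        ''|*[!0-9]*) echo "invalid size: $1" >&2; return 1 ;;
    esac
    case "$input" in
        *G) echo $(( 10#$number * 1024 )) ;;   # 10# forces base 10, even with leading zeros
        *)  echo $(( 10#$number )) ;;          # M or no suffix is taken as MiB
    esac
}

size_to_mib 100M   # prints 100
size_to_mib 2G     # prints 2048
```

In the script, the count could then become count=$(size_to_mib "$FILESIZE"). An input like 100Mb is refused instead of silently producing a broken count.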

🔗 Loop Device Linking

echo "[+] Attaching to loop device..." 
LOOPDEV=$(losetup --find --show "${FILENAME}.img")
echo "[+] Attached at $LOOPDEV"

losetup --find --show searches for the next free loopback-device (/dev/loopN) and connects it with our created image. The path, e.g. /dev/loop0 is saved in the variable LOOPDEV.
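A loop device stays attached until someone detaches it, so a script that dies halfway leaves /dev/loopN occupied. Here is a minimal sketch of a cleanup trap. It assumes the LOOPDEV variable from above and the MOUNT_DIR variable the script uses for its mount folder; the cleanup function itself is a hypothetical addition of mine.

```bash
#!/bin/bash
# Hypothetical helper: undo the mount and the loop attachment,
# skipping every step that never happened.
LOOPDEV=""
MOUNT_DIR=""

cleanup() {
    # Unmount only if MOUNT_DIR is set and is currently a mountpoint
    if [ -n "$MOUNT_DIR" ] && mountpoint -q "$MOUNT_DIR"; then
        umount "$MOUNT_DIR"
    fi
    # Detach only if a loop device was actually attached
    if [ -n "$LOOPDEV" ]; then
        losetup -d "$LOOPDEV"
    fi
    return 0
}

# Run cleanup when a command fails or the user hits Ctrl+C
trap cleanup ERR INT
```

I chose ERR and INT rather than EXIT on purpose: a normal exit should leave the image mounted if that's what you want.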

🧹 Filesystem Formatting

echo "[+] Formatting with $FILESYSTEM..." 
mkfs.$FILESYSTEM $LOOPDEV

Depending on the input, this expands to something like mkfs.fat /dev/loop0 or mkfs.ext4 /dev/loop0.
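Since the filesystem name is pasted straight into the command name, a typo only blows up at format time. A small sketch of checking first; have_mkfs is a hypothetical helper name, and it assumes the mkfs.* binaries are on the PATH of the user running it (for root they normally are):

```bash
#!/bin/bash
# Hypothetical helper: succeed only if a mkfs.<type> binary is on PATH.
have_mkfs() {
    command -v "mkfs.$1" >/dev/null 2>&1
}

FILESYSTEM="${FILESYSTEM:-ext4}"   # falls back to ext4 for this demo
if have_mkfs "$FILESYSTEM"; then
    echo "[+] mkfs.$FILESYSTEM is available"
else
    echo "[!] No mkfs.$FILESYSTEM found, pick another filesystem"
fi
```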

🔁 Recap

#!/bin/bash

# === Inputs ===
read -p "Filesize (e.g. 100M): " FILESIZE
read -p "Filesystem (e.g. ext4): " FILESYSTEM
read -p "Filename (e.g. virtual_disk): " FILENAME

# === Creating empty image from specs ===
echo "[+] Creating empty disk image of size $FILESIZE..." 
dd if=/dev/zero of=${FILENAME}.img bs=1M count=$(echo $FILESIZE | sed 's/M//') status=progress

# === Creating loop-back-device ===
echo "[+] Attaching to loop device..." 
LOOPDEV=$(losetup --find --show "${FILENAME}.img")
echo "[+] Attached at $LOOPDEV" 

# === Formatting the Filesystem ===
echo "[+] Formatting with $FILESYSTEM..." 
mkfs.$FILESYSTEM $LOOPDEV

Just to recap, this is our current script.

Code along, see to it that you understand what’s happening. Bash is fun!

📂 Mounting & File Injection

MOUNT_DIR="./mnt_${FILENAME}" 
mkdir -p $MOUNT_DIR

echo "[+] Mounting..." 
mount $LOOPDEV $MOUNT_DIR

What’s happening here? A local folder is created (recursively, which is what the -p flag of mkdir does: if we give X/Y/Z, every missing level is created). The loopback device is mounted on this folder, so files we write to it land directly in the image.
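One catch: if the mount fails, the folder is still just a folder, and anything copied there lands on the host disk instead of the image. A small guard for that, as a sketch. require_mounted is a hypothetical helper; mountpoint comes with util-linux.

```bash
#!/bin/bash
# Hypothetical guard: refuse to continue unless the directory is an active mountpoint.
require_mounted() {
    if mountpoint -q "$1"; then
        return 0
    fi
    echo "[!] $1 is not a mountpoint, files would land on the host disk" >&2
    return 1
}

# A freshly created directory is only a folder until something is mounted on it
DEMO_DIR="$(mktemp -d)"
require_mounted "$DEMO_DIR" || echo "[i] not mounted yet"
rmdir "$DEMO_DIR"
```

In the script, a call like require_mounted "$MOUNT_DIR" would go right before the copy step.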

You know what’d be great? Being able to provide a folder with the files that will fill the disk image. Let’s add that at the top.

#!/bin/bash

# === Inputs ===
read -p "Filesize (e.g. 100M): " FILESIZE
read -p "Filesystem (e.g. ext4): " FILESYSTEM
read -p "Filename (e.g. virtual_disk): " FILENAME
read -p "Payload directory (the folder with files you'll feed your newly created filesystem): " PAYLOAD_DIR

# === Creating empty image from specs ===
echo "[+] Creating empty disk image of size $FILESIZE..." 
dd if=/dev/zero of=${FILENAME}.img bs=1M count=$(echo $FILESIZE | sed 's/M//') status=progress

# === Creating loop-back-device ===
echo "[+] Attaching to loop device..."
LOOPDEV=$(losetup --find --show "${FILENAME}.img")
echo "[+] Attached at $LOOPDEV"

# === Formatting the Filesystem ===
echo "[+] Formatting with $FILESYSTEM..."
mkfs.$FILESYSTEM $LOOPDEV

# === Mounting ===
MOUNT_DIR="./mnt_${FILENAME}"
mkdir -p $MOUNT_DIR

echo "[+] Mounting..."
mount $LOOPDEV $MOUNT_DIR

# === Bringing in data ===
echo "[+] Copying files from $PAYLOAD_DIR to disk image..."
cp -r $PAYLOAD_DIR/* $MOUNT_DIR/

Let’s do something fun, shall we? Let’s build a command menu from here :D This part is our base setup; now we need to be able to decide dynamically what to do next.

Here is the full script with the menu, plus some adjustments for error handling, a root check, and such:

#!/bin/bash
set -e  # stop script on any error

# === checking root rights ===
if [ "$EUID" -ne 0 ]; then
  echo "[!] Please run this script as root (sudo)."
  exit 1
fi

# === Inputs ===
read -p "Filesize (e.g. 100M): " FILESIZE
read -p "Filesystem (e.g. ext4, fat): " FILESYSTEM
read -p "Filename (e.g. virtual_disk): " FILENAME
read -p "Payload directory (the folder with files you'll feed your newly created filesystem): " PAYLOAD_DIR

# === Ensure payload directory exists ===
if [ ! -d "$PAYLOAD_DIR" ]; then
  echo "[!] Payload directory '$PAYLOAD_DIR' does not exist. Creating it..."
  mkdir -p "$PAYLOAD_DIR"
  echo "[i] Payload directory created. Please add your files before running the script again."
  exit 0
fi

# === Creating empty image from specs ===
echo "[+] Creating empty disk image of size $FILESIZE..." 
DISK_IMAGE="${FILENAME}.img"
dd if=/dev/zero of="$DISK_IMAGE" bs=1M count=$(echo "$FILESIZE" | sed 's/M//') status=progress

# === Creating loop-back-device ===
echo "[+] Attaching to loop device..."
LOOPDEV=$(losetup --find --show "$DISK_IMAGE")
if [ -z "$LOOPDEV" ]; then
  echo "[!] Failed to create loop device."
  exit 1
fi
echo "[+] Attached at $LOOPDEV"

# === Formatting the Filesystem ===
echo "[+] Formatting with $FILESYSTEM..."
mkfs."$FILESYSTEM" "$LOOPDEV"

# === Mounting ===
MOUNT_DIR="./mnt_${FILENAME}"
mkdir -p "$MOUNT_DIR"

echo "[+] Mounting..."
mount "$LOOPDEV" "$MOUNT_DIR"

# === Bringing in data ===
echo "[+] Copying files from $PAYLOAD_DIR to disk image..."
cp -r "$PAYLOAD_DIR"/* "$MOUNT_DIR"/

# === Interactive menu after setup stage ===
while true; do
    echo ""
    echo "=== MENU ==="
    echo "1) Unmount and detach disk" 
    echo "2) Create dump/backup of image"
    echo "3) Check remaining space on mounted image"
    echo "4) Run Foremost"
    echo "5) Run Scalpel"
    echo "6) Generate SHA256 checksums of payload"
    echo "7) Exit"
    echo "==========="
    read -p "Choose an option: " option

    case $option in
        1)
            echo "[+] Unmounting and detaching..."
            sync
            umount "$MOUNT_DIR"
            losetup -d "$LOOPDEV"
            echo "[+] Done."
            ;;
        2)
            echo "[+] Creating image dump..."
            DUMP_DIR="./dumps"
            mkdir -p "$DUMP_DIR"
            TIMESTAMP=$(date +%Y%m%d_%H%M%S)
            cp "$DISK_IMAGE" "$DUMP_DIR/${FILENAME}_$TIMESTAMP.img"
            echo "[+] Dump saved to $DUMP_DIR/${FILENAME}_$TIMESTAMP.img"
            ;;
        3)
            echo "[+] Checking space..."
            df -h "$MOUNT_DIR"
            ;;
        4)
            echo "[+] Running Foremost..."
            mkdir -p output_foremost
            foremost -i "$DISK_IMAGE" -o output_foremost
            echo "[+] Foremost done. Output in output_foremost/"
            ;;
        5)
            echo "[+] Running Scalpel..."
            mkdir -p output_scalpel
            scalpel -c /etc/scalpel/scalpel.conf -o output_scalpel "$DISK_IMAGE"
            echo "[+] Scalpel done. Output in output_scalpel/"
            ;;
        6)
            echo "[+] Generating SHA256 checksums of payloads..."
            find "$PAYLOAD_DIR" -type f -exec sha256sum {} \; > "checksums_${FILENAME}.txt"
            echo "[+] Checksums saved in checksums_${FILENAME}.txt"
            ;;
        7)
            echo "[✓] Exiting. Have a productive day & drink water!"
            break
            ;;
        *)
            echo "[!] Invalid option. Try again."
            ;;
    esac
done

This script might change in the future, so check out its most current version here in my git.

Alright, we should be good to go, let’s install missing stuff and go!

sudo apt-get install foremost
sudo apt-get install scalpel

Working with our new script

From now on, I will be tracking everything with the forensic-log-tracker comments function; I recommend you do that as well. BE SURE TO CONFIGURE the log-tracker’s config!

sudo nano config/config.yaml # inside the repo

project:
  analyst: "Max Mustermann"
  timezone: "UTC"

execution:
  default_output_lines: 20
  dry_run_label: "[!] DRY RUN: Command not executed."

output:
  language: "de"              # "en", "de" – for future translation of explanations
  format: "md"                # "md", "html", "pdf" - NOT WORKING YET
  preview_lines: 20
  include_sha256: true
  hash_algorithm: "sha256"
  comment_type: "Comment"               # "Callout" or "Comment"

gpg:
  enabled: true
  auto_verify: true
  default_key: ""             # optional: GPG fingerprint

logging:
  level: INFO                 # DEBUG, INFO, WARNING, ERROR, CRITICAL

I will change the timezone to my sweet CET and the name to.. well my name.

Log extensively (if you like, with flt) — not just for grades, but because future-you will forget today’s genius.

# Switching back your current dir to where your script is
chmod +x generation.sh

sudo ./generation.sh
Filesize (e.g. 100M): 50M
Filesystem (e.g. ext4, fat): ext4
Filename (e.g. virtual_disk): task_1 
Payload directory (the folder with files you'll feed your newly created filesystem): ./payload_1 
[!] Payload directory './payload_1' does not exist. Creating it...
[i] Payload directory created. Please add your files before running the script again.

A bit dumb that the script only creates the directory and then exits, but hey, it was a good way to test the inputs haha.

Now let’s collect some files to use with this. My professor told us to collect these types:

.png
.jpeg, .jpg
.doc, .docx, .xls, .xlsx, .ppt, .pptx, .txt, .pdf
.mpeg
.mp3
.py, .java
.html, .xhtml, .xml

Just for good’ol security reasons, delete the folder and make it anew so we don’t have to create everything in it as root.

rmdir payload_1
mkdir payload_1
cd payload_1

Init: creating.. stuff..

# Text
echo "test" > test.txt

# Code
echo 'print("Hello")' > hello.py
echo '<html><body>Hello</body></html>' > index.html

# Image
wget -O image1.png https://upload.wikimedia.org/wikipedia/commons/4/47/PNG_transparency_demonstration_1.png
wget -O image2.jpg https://upload.wikimedia.org/wikipedia/commons/1/1f/Wikipedia_mini_globe_handheld.jpg

# Office-Dummy
touch file.doc file.docx file.xls file.xlsx file.ppt file.pptx

# Media
ffmpeg -f lavfi -i anullsrc=r=44100:cl=mono -t 2 test.mp3

# PDF
wget -O test.pdf https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf

You might have to install ffmpeg for this.

🧪 A Nice Little Run

sudo ./generation.sh 
Filesize (e.g. 100M): 50M
Filesystem (e.g. ext4, fat): ext4
Filename (e.g. virtual_disk): task1
Payload directory (the folder with files you'll feed your newly created filesystem): ./payload_1
[+] Creating empty disk image of size 50M...
50+0 records in
50+0 records out
52428800 bytes (52 MB, 50 MiB) copied, 0.0182856 s, 2.9 GB/s
[+] Attaching to loop device...
[+] Attached at /dev/loop0
[+] Formatting with ext4...
mke2fs 1.47.2 (1-Jan-2025)
Discarding device blocks: done                            
Creating filesystem with 51200 1k blocks and 12824 inodes
Filesystem UUID: 2f8a53f8-aadf-4607-8b36-188eecc92864
Superblock backups stored on blocks: 
        8193, 24577, 40961

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (4096 blocks): done
Writing superblocks and filesystem accounting information: done

[+] Mounting...
[+] Copying files from ./payload_1 to disk image...

=== MENU ===
1) Unmount and detach disk
2) Create dump/backup of image
3) Check remaining space on mounted image
4) Run Foremost
5) Run Scalpel
6) Generate SHA256 checksums of payload
7) Exit
===========
Choose an option:

Well, that looks good, doesn’t it?

Let’s:

hash the payload (that’s what we scripted option 6 for)
run Foremost and Scalpel
compare their outputs
unmount and detach, since task 1 is done after that

[+] Mounting...
[+] Copying files from ./payload_1 to disk image...

=== MENU ===
1) Unmount and detach disk
2) Create dump/backup of image
3) Check remaining space on mounted image
4) Run Foremost
5) Run Scalpel
6) Generate SHA256 checksums of payload
7) Exit
===========
Choose an option: 6
[+] Generating SHA256 checksums of payloads...
[+] Checksums saved in checksums_task1.txt

=== MENU ===
1) Unmount and detach disk
2) Create dump/backup of image
3) Check remaining space on mounted image
4) Run Foremost
5) Run Scalpel
6) Generate SHA256 checksums of payload
7) Exit
===========
Choose an option: 4
[+] Running Foremost...
Processing: task1.img
|*|
[+] Foremost done. Output in output_foremost/

=== MENU ===
1) Unmount and detach disk
2) Create dump/backup of image
3) Check remaining space on mounted image
4) Run Foremost
5) Run Scalpel
6) Generate SHA256 checksums of payload
7) Exit
===========
Choose an option: 5
[+] Running Scalpel...
Scalpel version 1.60
Written by Golden G. Richard III, based on Foremost 0.69.

Opening target "/home/kali/task1.img"

ERROR: The configuration file didn't specify any file types to carve.
(If you're using the default configuration file, you'll have to
uncomment some of the file types.)

See /etc/scalpel/scalpel.conf.

What’s that, scalpel fails with a complaint about the config file?

Fixing Scalpel config

By default, scalpel uses /etc/scalpel/scalpel.conf as config file. Unless changed, all file types there are commented out - nothing is carved unless we explicitly activate some.

Open the file (e.g. with sudo nano /etc/scalpel/scalpel.conf), search with CTRL+W for the formats you want, and remove the leading #. You’ll notice that this is already a big difference between Scalpel and Foremost: Scalpel, out of the box, does not support .docx, .xls, .py, and many others.
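If you'd rather not edit the system config by hand, the uncommenting can be sketched with sed on a copy. enable_scalpel_types and scalpel_custom.conf are hypothetical names of mine, and the pattern assumes that rule lines look like "# ext y/n maxsize header footer", so check the result before trusting it.

```bash
#!/bin/bash
# Hypothetical helper: uncomment the rule lines for the given extensions
# and write the result to a COPY of the config.
enable_scalpel_types() {
    local src="$1" dst="$2" types="$3"   # types like "jpg|png|pdf"
    # Drop only the leading '#' of lines that look like rules:
    # extension, then a y/n flag, then a numeric max size
    sed -E "s/^#([[:space:]]*(${types})[[:space:]]+[yn][[:space:]]+[0-9])/\1/" "$src" > "$dst"
}

# Only touch the real config if it is readable on this machine
if [ -r /etc/scalpel/scalpel.conf ]; then
    enable_scalpel_types /etc/scalpel/scalpel.conf ./scalpel_custom.conf "jpg|png|pdf|htm"
fi
```

Scalpel can then be pointed at the copy with -c ./scalpel_custom.conf. Running diff between the original and the copy shows exactly which lines were activated.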

Now let’s run Scalpel again by hand, with the same command the script uses for option 5:


Scalpel version 1.60
Written by Golden G. Richard III, based on Foremost 0.69.

Opening target "/home/kali/task1.img"

Image file pass 1/2.
task1.img: 100.0% |************************************************************|   50.0 MB    00:00 ETAAllocating work queues...
Work queues allocation complete. Building carve lists...
Carve lists built.  Workload:
gif with header "\x47\x49\x46\x38\x37\x61" and footer "\x00\x3b" --> 0 files
gif with header "\x47\x49\x46\x38\x39\x61" and footer "\x00\x3b" --> 0 files
jpg with header "\xff\xd8\xff\x3f\x3f\x3f\x45\x78\x69\x66" and footer "\xff\xd9" --> 1 files
jpg with header "\xff\xd8\xff\x3f\x3f\x3f\x4a\x46\x49\x46" and footer "\xff\xd9" --> 0 files
png with header "\x50\x4e\x47\x3f" and footer "\xff\xfc\xfd\xfe" --> 0 files
doc with header "\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1\x00\x00" and footer "\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1\x00\x00" --> 0 files
doc with header "\xd0\xcf\x11\xe0\xa1\xb1" and footer "" --> 0 files
htm with header "\x3c\x68\x74\x6d\x6c" and footer "\x3c\x2f\x68\x74\x6d\x6c\x3e" --> 1 files
pdf with header "\x25\x50\x44\x46" and footer "\x25\x45\x4f\x46\x0d" --> 0 files
pdf with header "\x25\x50\x44\x46" and footer "\x25\x45\x4f\x46\x0a" --> 1 files
txt with header "\x2d\x2d\x2d\x2d\x2d\x42\x45\x47\x49\x4e\x20\x50\x47\x50" and footer "" --> 0 files
Carving files from image.
Image file pass 2/2.
task1.img: 100.0% |************************************************************|   50.0 MB    00:00 ETAProcessing of image file complete. Cleaning up...
Done.
Scalpel is done, files carved = 3, elapsed = 0 seconds.

I ran it manually this time to avoid running the whole script again, which is also something I should probably automate.

Comparison of Scalpel and Foremost

Step 1: Listing my original Payload-Files

find payload_1 -type f | sort > original_files.txt

This will be my ground-truth to compare their results to.

Step 2: Creating Recover-Lists

find output_foremost -type f | sort > foremost_files.txt
find output_scalpel -type f | sort > scalpel_files.txt
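Carved files get new names like 00017412.htm, so file lists alone can't tell which originals came back. A hedged sketch of matching by content instead: matched_originals is a hypothetical helper that prints every original whose SHA256 also appears among the carved files. Keep in mind that carvers often cut a file too long or too short, so a missing hash match does not prove the file wasn't carved at all.

```bash
#!/bin/bash
# Hypothetical helper: list original files whose exact content
# shows up among the carved files.
matched_originals() {
    local original_dir="$1" carved_dir="$2"
    local carved_hashes
    # Collect the SHA256 of every carved file (first column of sha256sum output)
    carved_hashes="$(find "$carved_dir" -type f -exec sha256sum {} + | awk '{print $1}')"
    # Print each original whose hash appears in that list
    find "$original_dir" -type f -exec sha256sum {} + | sort -k2 | while read -r hash path; do
        if grep -qx "$hash" <<< "$carved_hashes"; then
            echo "$path"
        fi
    done
}

# Usage, with the directories from this post:
#   matched_originals payload_1 output_foremost
#   matched_originals payload_1 output_scalpel
```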

Step 3: Comparing Results

Foremost:

flt run "cat foremost_files.txt" --case Uebung_05                                
2025-05-30 10:58:45,657 [INFO] Executing command: cat foremost_files.txt
2025-05-30 10:58:45,674 [INFO] [+] Log written to: /home/kali/forensic-log-tracker/logs/Uebung_05/2025-05-30T16-58-45-657154+02-00_command.log
2025-05-30 10:58:45,674 [INFO] Command executed, logged: cat foremost_files.txt
[+] Signed logfile: 2025-05-30T16-58-45-657154+02-00_command.log.sig
2025-05-30 10:58:46,026 [INFO] Log file signed: /home/kali/forensic-log-tracker/logs/Uebung_05/2025-05-30T16-58-45-657154+02-00_command.log

[+] Command Output:
[STDOUT]
output_foremost/audit.txt
output_foremost/htm/00017412.htm
output_foremost/jpg/00018434.jpg
output_foremost/pdf/00017414.pdf
output_foremost/png/00016902.png

[STDERR]

Scalpel:

flt run "cat scalpel_files.txt" --case Uebung_05 
2025-05-30 10:59:24,403 [INFO] Executing command: cat scalpel_files.txt
2025-05-30 10:59:24,421 [INFO] [+] Log written to: /home/kali/forensic-log-tracker/logs/Uebung_05/2025-05-30T16-59-24-403533+02-00_command.log
2025-05-30 10:59:24,421 [INFO] Command executed, logged: cat scalpel_files.txt
[+] Signed logfile: 2025-05-30T16-59-24-403533+02-00_command.log.sig
2025-05-30 10:59:24,775 [INFO] Log file signed: /home/kali/forensic-log-tracker/logs/Uebung_05/2025-05-30T16-59-24-403533+02-00_command.log

[+] Command Output:
[STDOUT]
output_scalpel/audit.txt
output_scalpel/htm-7-0/00000001.htm
output_scalpel/jpg-2-0/00000000.jpg
output_scalpel/pdf-9-0/00000002.pdf

[STDERR]

ls payload_1
file.doc   file.ppt   file.xls   foremost_files.txt  image1.png  index.html  test.mpeg  test.txt
file.docx  file.pptx  file.xlsx  hello.py            image2.jpg  test.mp3    test.pdf

Whoa, a lot is missing there. Why?

From my research and understanding, there are a few main reasons this can happen. Just a small side quest for a better grasp of the concepts involved:

🔍 What Are File Signatures?

A file signature, also known as a magic number, is a fixed sequence of bytes at the start of a file (some formats also have one at the end, a footer). The signature identifies the file’s type. Some examples include:

File type	Hex-Signature	ASCII
`.docx`	`50 4B 03 04`	`PK\x03\x04`
`.pdf`	`25 50 44 46`	`%PDF`
`.jpg`	`FF D8 FF`	-
`.png`	`89 50 4E 47`	`.PNG`

Think of it as the fingerprint of a file. It lets you, or a piece of software, recognize the file type.
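You can look at that fingerprint yourself with od. A minimal sketch: magic_hex is a hypothetical helper, and the sample file is just a stand-in with a PDF-style header, not a real PDF.

```bash
#!/bin/bash
# Hypothetical helper: print the first N bytes of a file as hex (default 4).
magic_hex() {
    od -An -tx1 -N "${2:-4}" "$1" | tr -d ' \n'
    echo
}

SAMPLE="$(mktemp)"
printf '%%PDF-1.4\n' > "$SAMPLE"   # stand-in for a real PDF header
magic_hex "$SAMPLE"                # prints 25504446
rm -f "$SAMPLE"
```

Run against image1.png from the payload, it should print the 89 50 4E 47 from the table, written as 89504e47.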

How do Foremost and Scalpel work?

Tool	Method
Foremost	Uses a built-in database of common file signatures. It works automatically using hardcoded patterns.
Scalpel	Highly configurable. You must manually enable each file type you want to carve via `scalpel.conf`.

So now let’s try reasoning the results we’ve gotten..

Why most files were not recovered

If a file doesn’t have a consistent or unique signature, it can’t be carved.

1. No File Signature

→ .txt, .py, .java: No fixed byte pattern = not carvable.

2. Missing Scalpel Config

→ .docx, .pptx, .xlsx not enabled in scalpel.conf.

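3. Files Not Deleted

→ My first guess was that carvers only look for “lost” files. But they scan the raw bytes of the image and ignore the filesystem, and the PNG, JPG, PDF, and HTML files were carved while still intact in the FS. So this is probably not the cause.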

4. Too Small or Incomplete

→ Short .mpeg, .mp3 lacked headers/trailers or were below detection size.

5. Unsupported Format

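→ .doc, .ppt, .xls may not be recognized unless tools have signatures for them. Also note that all six Office dummies were created with touch, so they are empty (0 bytes) and contain no signature to find.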
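One of these is easy to check right away: a file with zero bytes has no content and therefore nothing to carve, and the Office dummies above were created with touch. A quick sketch for finding such files; list_empty_files is a hypothetical helper name.

```bash
#!/bin/bash
# Hypothetical helper: list files with zero bytes, which no carver can find.
list_empty_files() {
    find "$1" -type f -size 0 | sort
}

# Usage, with the payload directory from this post:
#   list_empty_files payload_1
```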

We’ll need to see later how to improve these results. These were mere speculations for now, and I think that’s what my professor is alluding to in the upcoming tasks! A quick recap:

✅ Wins

Created a dynamic disk image builder in Bash
Injected test files and understood mount/loop behavior
Ran Foremost and Scalpel to carve files from the image
Compared results and learned why some file types elude carving tools

⚠️ Lessons

Scalpel requires manual config to do anything
File signatures matter more than extensions
Deleted ≠ recoverable unless the OS forgets about the file

☕ Fuel The Forge

🔥 This post took quite a bit of digital alchemy, scripting, and the occasional bash-induced headache.

If it helped you understand forensic file systems or inspired your own toolkit — maybe toss a byte into the brew?

🛠 Your support helps me:

brew more walkthroughs like this
build better tooling
survive the occasional mount: unknown filesystem meltdown

→ ko-fi.com/niklasheringer