Carving Chaos: Building and Breaking Filesystems for Fun and Forensics
Let's do this, folks! I've been absent from digital forensics for quite a while. I enjoy penetration testing so much right now that I had to make a conscious decision to do this session.
Digital forensics is fun, but when something else is more fun right now, ADHD won't let me take side missions for that long.
But this will be a fun one. Today's all about improving my understanding, not only of forensics but also of Bash scripting, by working on a university task I got for my module.. well, Digital Forensics, lol.
Log Tooling & Prep
First of all, since I've made a fresh VM for this one (which I can only recommend doing from time to time), I install my Forensic Log Tracker. Feel free to check it out for yourself hehe. I won't post installation instructions here in case they change in the meantime, just look them up on GitHub :P.
Check out my Forensic Log Tracker
BTW, I think I need to test the Windows setup.. don't know how well that one holds up.. Anyway, off to Digital Forensics!
Task One: Bash up a Disk
"Create a script that generates a storage device of any size and formats it with a file system of your choice."
A great opportunity to get straight into Bash scripting. The whole session depends on this script being fairly good lol.
Honestly, I've wanted to dive into Neovim or a similar editor for a long time now, but it doesn't seem like that day is today. I am quite tired, so let's focus our motivation on actually getting stuff done.
Preparing to get stuff done is NOT getting stuff done!
Input Gathering
nano generation.sh
#!/bin/bash
read -p "Filesize (e.g. 100M): " FILESIZE
read -p "Filesystem (e.g. ext4): " FILESYSTEM
read -p "Filename (e.g. virtual_disk): " FILENAME
Off to a good start, reading in user customization parameters.
Disk Creation
Now we'll initialize the file as empty with dd. I didn't mention dd in my earlier forensic guides, right? I'll have to.
echo "[+] Creating empty disk image of size $FILESIZE..."
dd if=/dev/zero of=${FILENAME}.img bs=1M count=$(echo $FILESIZE | sed 's/M//') status=progress
if=/dev/zero -> our input file is null bytes (initializing empty, remember), the output file is of=${FILENAME}.img, and the block size is 1 MiB via bs=1M.
With the count and echo part, we extract the size ("the number of MB wanted") with sed, which strips the M from the input.
We might want to make the script smarter later so one can also use G for gigabytes, but for now let's go with MB.
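Here's a rough sketch of what that smarter parsing could look like. The function name size_to_mib is made up for this example, and it assumes we only ever want whole numbers followed by an optional M or G:

```shell
#!/bin/bash
# Hypothetical helper: turn "100M" or "2G" into a MiB count for dd's count=.
size_to_mib() {
    local input="$1" number unit
    number=${input%[MmGg]}      # strip one trailing unit letter, if present
    unit=${input#"$number"}     # whatever was stripped (may be empty)
    case "$number" in
        ''|*[!0-9]*)
            echo "invalid size: $input" >&2
            return 1
            ;;
    esac
    case "$unit" in
        G|g) echo $((number * 1024)) ;;
        *)   echo "$number" ;;   # M, m or no unit: already MiB
    esac
}

# Possible use inside the script:
# dd if=/dev/zero of="${FILENAME}.img" bs=1M count="$(size_to_mib "$FILESIZE")" status=progress
```

Anything that isn't digits plus an optional unit (like 100Mb) gets rejected instead of being passed on to dd half-parsed.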
Loop Device Linking
echo "[+] Attaching to loop device..."
LOOPDEV=$(losetup --find --show ${FILENAME}.img)
echo "[+] Attached at $LOOPDEV"
losetup --find --show searches for the next free loopback device (/dev/loopN), connects it to our created image, and prints the device name.
The path, e.g. /dev/loop0, is saved in the variable LOOPDEV.
Filesystem Formatting
echo "[+] Formatting with $FILESYSTEM..."
mkfs.$FILESYSTEM $LOOPDEV
Depending on the input, this expands to something like mkfs.fat /dev/loop0 or mkfs.ext4 /dev/loop0.
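Since the filesystem name is glued straight onto mkfs., a typo only blows up after the image has already been created and attached. A small pre-check could catch that earlier. This is just a sketch, and require_mkfs is a name I made up:

```shell
#!/bin/bash
# Hypothetical helper: succeed only if a matching mkfs.<type> binary exists.
require_mkfs() {
    local fs="$1"
    if ! command -v "mkfs.$fs" >/dev/null 2>&1; then
        echo "[!] No mkfs.$fs found on this system." >&2
        return 1
    fi
}

# Possible use right after reading the inputs:
# require_mkfs "$FILESYSTEM" || exit 1
```

command -v only looks at the PATH, so run it in the same context (root) as the rest of the script.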
Recap
#!/bin/bash
# === Inputs ===
read -p "Filesize (e.g. 100M): " FILESIZE
read -p "Filesystem (e.g. ext4): " FILESYSTEM
read -p "Filename (e.g. virtual_disk): " FILENAME
# === Creating empty image from specs ===
echo "[+] Creating empty disk image of size $FILESIZE..."
dd if=/dev/zero of=${FILENAME}.img bs=1M count=$(echo $FILESIZE | sed 's/M//') status=progress
# === Creating loop-back-device ===
echo "[+] Attaching to loop device..."
LOOPDEV=$(losetup --find --show ${FILENAME}.img)
echo "[+] Attached at $LOOPDEV"
# === Formatting the Filesystem ===
echo "[+] Formatting with $FILESYSTEM..."
mkfs.$FILESYSTEM $LOOPDEV
Just to recap, this is our current script.
Code along, see to it that you understand what's happening. Bash is fun!
Mounting & File Injection
MOUNT_DIR="./mnt_${FILENAME}"
mkdir -p $MOUNT_DIR
echo "[+] Mounting..."
mount $LOOPDEV $MOUNT_DIR
What's happening here? A local folder is created. The -p flag makes mkdir create missing parent directories too (if we give X/Y/Z, they'll all be created if they don't exist) and keeps it from failing if the folder already exists.
The loopback device is mounted on this folder, so files we write to that folder land directly in the image.
You know what'd be great? Being able to provide a folder containing the files we want to fill the disk image with. Let's add that at the top.
#!/bin/bash
# === Inputs ===
read -p "Filesize (e.g. 100M): " FILESIZE
read -p "Filesystem (e.g. ext4): " FILESYSTEM
read -p "Filename (e.g. virtual_disk): " FILENAME
read -p "Payload directory (the folder with files you'll feed your newly created filesystem): " PAYLOAD_DIR
# === Creating empty image from specs ===
echo "[+] Creating empty disk image of size $FILESIZE..."
dd if=/dev/zero of=${FILENAME}.img bs=1M count=$(echo $FILESIZE | sed 's/M//') status=progress
# === Creating loop-back-device ===
echo "[+] Attaching to loop device..."
LOOPDEV=$(losetup --find --show ${FILENAME}.img)
echo "[+] Attached at $LOOPDEV"
# === Formatting the Filesystem ===
echo "[+] Formatting with $FILESYSTEM..."
mkfs.$FILESYSTEM $LOOPDEV
# === Mounting ===
MOUNT_DIR="./mnt_${FILENAME}"
mkdir -p $MOUNT_DIR
echo "[+] Mounting..."
mount $LOOPDEV $MOUNT_DIR
# === Bringing in data ===
echo "[+] Copying files from $PAYLOAD_DIR to disk image..."
cp -r $PAYLOAD_DIR/* $MOUNT_DIR/
Interactive Toolkit menu
Let's do something fun, shall we?
Making a command menu from here :D - this part is our base setup, now we need to be able to dynamically decide what to do next.
Here's the full script with the menu, plus some adjustments for error handling, a root check, and such:
#!/bin/bash
set -e # stop script on any error

# === checking root rights ===
if [ "$EUID" -ne 0 ]; then
    echo "[!] Please run this script as root (sudo)."
    exit 1
fi

# === Inputs ===
read -p "Filesize (e.g. 100M): " FILESIZE
read -p "Filesystem (e.g. ext4, fat): " FILESYSTEM
read -p "Filename (e.g. virtual_disk): " FILENAME
read -p "Payload directory (the folder with files you'll feed your newly created filesystem): " PAYLOAD_DIR

# === Ensure payload directory exists ===
if [ ! -d "$PAYLOAD_DIR" ]; then
    echo "[!] Payload directory '$PAYLOAD_DIR' does not exist. Creating it..."
    mkdir -p "$PAYLOAD_DIR"
    echo "[i] Payload directory created. Please add your files before running the script again."
    exit 0
fi

# === Creating empty image from specs ===
echo "[+] Creating empty disk image of size $FILESIZE..."
DISK_IMAGE="${FILENAME}.img"
dd if=/dev/zero of="$DISK_IMAGE" bs=1M count=$(echo "$FILESIZE" | sed 's/M//') status=progress

# === Creating loop-back-device ===
echo "[+] Attaching to loop device..."
LOOPDEV=$(losetup --find --show "$DISK_IMAGE")
if [ -z "$LOOPDEV" ]; then
    echo "[!] Failed to create loop device."
    exit 1
fi
echo "[+] Attached at $LOOPDEV"

# === Formatting the Filesystem ===
echo "[+] Formatting with $FILESYSTEM..."
mkfs."$FILESYSTEM" "$LOOPDEV"

# === Mounting ===
MOUNT_DIR="./mnt_${FILENAME}"
mkdir -p "$MOUNT_DIR"
echo "[+] Mounting..."
mount "$LOOPDEV" "$MOUNT_DIR"

# === Bringing in data ===
echo "[+] Copying files from $PAYLOAD_DIR to disk image..."
cp -r "$PAYLOAD_DIR"/* "$MOUNT_DIR"/

# === Interactive menu after setup stage ===
while true; do
    echo ""
    echo "=== MENU ==="
    echo "1) Unmount and detach disk"
    echo "2) Create dump/backup of image"
    echo "3) Check remaining space on mounted image"
    echo "4) Run Foremost"
    echo "5) Run Scalpel"
    echo "6) Generate SHA256 checksums of payload"
    echo "7) Exit"
    echo "==========="
    read -p "Choose an option: " option
    case $option in
        1)
            echo "[+] Unmounting and detaching..."
            sync
            umount "$MOUNT_DIR"
            losetup -d "$LOOPDEV"
            echo "[+] Done."
            ;;
        2)
            echo "[+] Creating image dump..."
            DUMP_DIR="./dumps"
            mkdir -p "$DUMP_DIR"
            TIMESTAMP=$(date +%Y%m%d_%H%M%S)
            cp "$DISK_IMAGE" "$DUMP_DIR/${FILENAME}_$TIMESTAMP.img"
            echo "[+] Dump saved to $DUMP_DIR/${FILENAME}_$TIMESTAMP.img"
            ;;
        3)
            echo "[+] Checking space..."
            df -h "$MOUNT_DIR"
            ;;
        4)
            echo "[+] Running Foremost..."
            mkdir -p output_foremost
            foremost -i "$DISK_IMAGE" -o output_foremost
            echo "[+] Foremost done. Output in output_foremost/"
            ;;
        5)
            echo "[+] Running Scalpel..."
            mkdir -p output_scalpel
            scalpel -c /etc/scalpel/scalpel.conf -o output_scalpel "$DISK_IMAGE"
            echo "[+] Scalpel done. Output in output_scalpel/"
            ;;
        6)
            echo "[+] Generating SHA256 checksums of payloads..."
            find "$PAYLOAD_DIR" -type f -exec sha256sum {} \; > "checksums_${FILENAME}.txt"
            echo "[+] Checksums saved in checksums_${FILENAME}.txt"
            ;;
        7)
            echo "[✓] Exiting. Have a productive day & drink water!"
            break
            ;;
        *)
            echo "[!] Invalid option. Try again."
            ;;
    esac
done
This script might be subject to future changes, so check out my Git repo to be sure you have the most current version.
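One thing to keep in mind with set -e: if any step after losetup fails, or you leave via option 7 without choosing option 1 first, the image stays mounted and the loop device stays attached. A trap on EXIT is one way to tidy up. This is only a sketch of the idea, not what the script above does, and the cleanup function is my own naming:

```shell
#!/bin/bash
# Sketch: make sure the mount and loop device are released when the script ends.
LOOPDEV=""
MOUNT_DIR=""

cleanup() {
    # Unmount only if the directory is currently a mount point.
    if [ -n "$MOUNT_DIR" ] && mountpoint -q "$MOUNT_DIR"; then
        umount "$MOUNT_DIR"
    fi
    # Detach only if a loop device was attached; ignore "already detached".
    if [ -n "$LOOPDEV" ]; then
        losetup -d "$LOOPDEV" 2>/dev/null || true
    fi
    LOOPDEV=""
    MOUNT_DIR=""
}

# Runs on normal exit and when set -e aborts the script.
trap cleanup EXIT
```

With this in place, menu option 1 could simply call cleanup, and calling it a second time is harmless because both variables are emptied afterwards.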
Alright, we should be good to go, let's install missing stuff and go!
> sudo apt-get install foremost
> sudo apt-get install scalpel
Working with our new Script
From now on, I will be tracking everything with the forensic-log-tracker's comment function, and I recommend you do that as well.
Currently the log tracker's config works like this. Please check out the Git repo in case that changes:
> sudo nano config/config.yaml # inside the repo
project:
  analyst: "Max Mustermann"
  timezone: "UTC"
execution:
  default_output_lines: 20
  dry_run_label: "[!] DRY RUN: Command not executed."
output:
  language: "de" # "en", "de" – for future translation of explanations
  format: "md" # "md", "html", "pdf" - NOT WORKING YET
  preview_lines: 20
  include_sha256: true
  hash_algorithm: "sha256"
  comment_type: "Comment" # "Callout" or "Comment"
gpg:
  enabled: true
  auto_verify: true
  default_key: "" # optional: GPG fingerprint
logging:
  level: INFO # DEBUG, INFO, WARNING, ERROR, CRITICAL
I will change the timezone to my sweet CET and the name to.. well, my name.
Log extensively (with flt, if you like), not just for grades, but because future-you will forget today's genius.
# Switch back to the directory where your script is
> chmod +x generation.sh
> sudo ./generation.sh
Filesize (e.g. 100M): 50M
Filesystem (e.g. ext4, fat): ext4
Filename (e.g. virtual_disk): task_1
Payload directory (the folder with files you'll feed your newly created filesystem): ./payload_1
[!] Payload directory './payload_1' does not exist. Creating it...
[i] Payload directory created. Please add your files before running the script again.
A bit dumb that the script only creates the dir at that point and then quits, but damn it, it was a good way to test the inputs haha.
Now let's collect some files to use this with. My professor told us to collect these file types:
- .png
- .jpeg, .jpg
- .doc, .docx, .xls, .xlsx, .ppt, .pptx, .txt, .pdf
- .mpeg
- .mp3
- .py, .java
- .html, .xhtml, .xml
Just for good ol' security reasons, delete the folder and make it anew as your normal user, so we don't have to create everything in it as root.
> rmdir payload_1
> mkdir payload_1
> cd payload_1
# Text
echo "test" > test.txt
# Code
echo 'print("Hello")' > hello.py
echo '<html><body>Hello</body></html>' > index.html
# Image
wget -O image1.png https://upload.wikimedia.org/wikipedia/commons/4/47/PNG_transparency_demonstration_1.png
wget -O image2.jpg https://upload.wikimedia.org/wikipedia/commons/1/1f/Wikipedia_mini_globe_handheld.jpg
# Office-Dummy
touch file.doc file.docx file.xls file.xlsx file.ppt file.pptx
# Media
ffmpeg -f lavfi -i anullsrc=r=44100:cl=mono -t 2 test.mp3
# PDF
wget -O test.pdf https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf
You might have to install ffmpeg for this.
A Nice Little Run
> sudo ./generation.sh
Filesize (e.g. 100M): 50M
Filesystem (e.g. ext4, fat): ext4
Filename (e.g. virtual_disk): task1
Payload directory (the folder with files you'll feed your newly created filesystem): ./payload_1
[+] Creating empty disk image of size 50M...
50+0 records in
50+0 records out
52428800 bytes (52 MB, 50 MiB) copied, 0.0182856 s, 2.9 GB/s
[+] Attaching to loop device...
[+] Attached at /dev/loop0
[+] Formatting with ext4...
mke2fs 1.47.2 (1-Jan-2025)
Discarding device blocks: done
Creating filesystem with 51200 1k blocks and 12824 inodes
Filesystem UUID: 2f8a53f8-aadf-4607-8b36-188eecc92864
Superblock backups stored on blocks:
8193, 24577, 40961
Allocating group tables: done
Writing inode tables: done
Creating journal (4096 blocks): done
Writing superblocks and filesystem accounting information: done
[+] Mounting...
[+] Copying files from ./payload_1 to disk image...
=== MENU ===
1) Unmount and detach disk
2) Create dump/backup of image
3) Check remaining space on mounted image
4) Run Foremost
5) Run Scalpel
6) Generate SHA256 checksums of payload
7) Exit
===========
Choose an option:
Well, that looks good, doesn't it?
Let's:
- hash the payload, which is what we scripted option 6 for
- run foremost and scalpel
- compare their outputs
- unmount and detach, since task 1 is done after that.
[+] Mounting...
[+] Copying files from ./payload_1 to disk image...
=== MENU ===
1) Unmount and detach disk
2) Create dump/backup of image
3) Check remaining space on mounted image
4) Run Foremost
5) Run Scalpel
6) Generate SHA256 checksums of payload
7) Exit
===========
Choose an option: 6
[+] Generating SHA256 checksums of payloads...
[+] Checksums saved in checksums_task1.txt
=== MENU ===
1) Unmount and detach disk
2) Create dump/backup of image
3) Check remaining space on mounted image
4) Run Foremost
5) Run Scalpel
6) Generate SHA256 checksums of payload
7) Exit
===========
Choose an option: 4
[+] Running Foremost...
Processing: task1.img
|*|
[+] Foremost done. Output in output_foremost/
=== MENU ===
1) Unmount and detach disk
2) Create dump/backup of image
3) Check remaining space on mounted image
4) Run Foremost
5) Run Scalpel
6) Generate SHA256 checksums of payload
7) Exit
===========
Choose an option: 5
[+] Running Scalpel...
Scalpel version 1.60
Written by Golden G. Richard III, based on Foremost 0.69.
Opening target "/home/kali/task1.img"
ERROR: The configuration file didn't specify any file types to carve.
(If you're using the default configuration file, you'll have to
uncomment some of the file types.)
See /etc/scalpel/scalpel.conf.
What's that? Scalpel fails with a complaint about the config file.
Fixing Scalpel Config
By default, scalpel uses /etc/scalpel/scalpel.conf as its config file.
Unless changed, all file types there are commented out, so nothing is carved unless we explicitly activate some.
Open the file in nano, search with CTRL+W for the formats you want, and remove the leading #.
You'll notice that this is already a big difference between Scalpel and Foremost:
Scalpel, out of the box, does not support .docx, .xls, .py and many others.
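Instead of editing by hand, the uncommenting could also be scripted. A rough sketch, assuming the entries in your scalpel.conf look like a # followed by whitespace and then the extension (check your own file first). enable_scalpel_types is a made-up helper, and it writes to stdout so /etc/scalpel/scalpel.conf itself stays untouched:

```shell
#!/bin/bash
# Hypothetical helper: print a copy of a scalpel config with the given
# file types uncommented. Assumes entry lines look like "#<whitespace>jpg<whitespace>...".
# Caveat: a prose comment that happens to start with "# jpg " would be uncommented too.
enable_scalpel_types() {
    local conf="$1"
    shift
    local types
    types=$(printf '%s|' "$@")   # e.g. "jpg|png|pdf|"
    types=${types%|}             # drop the trailing "|"
    sed -E "s/^#[[:space:]]*((${types})[[:space:]])/\1/" "$conf"
}

# Possible use:
# enable_scalpel_types /etc/scalpel/scalpel.conf jpg png pdf > scalpel_custom.conf
# scalpel -c scalpel_custom.conf -o output_scalpel task1.img
```

Working on a copy also means the exact config used for a run can be kept next to the results.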
Now let's run it again:
Scalpel version 1.60
Written by Golden G. Richard III, based on Foremost 0.69.
Opening target "/home/kali/task1.img"
Image file pass 1/2.
task1.img: 100.0% |************************************************************| 50.0 MB 00:00 ETA
Allocating work queues...
Work queues allocation complete. Building carve lists...
Carve lists built. Workload:
gif with header "\x47\x49\x46\x38\x37\x61" and footer "\x00\x3b" --> 0 files
gif with header "\x47\x49\x46\x38\x39\x61" and footer "\x00\x3b" --> 0 files
jpg with header "\xff\xd8\xff\x3f\x3f\x3f\x45\x78\x69\x66" and footer "\xff\xd9" --> 1 files
jpg with header "\xff\xd8\xff\x3f\x3f\x3f\x4a\x46\x49\x46" and footer "\xff\xd9" --> 0 files
png with header "\x50\x4e\x47\x3f" and footer "\xff\xfc\xfd\xfe" --> 0 files
doc with header "\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1\x00\x00" and footer "\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1\x00\x00" --> 0 files
doc with header "\xd0\xcf\x11\xe0\xa1\xb1" and footer "" --> 0 files
htm with header "\x3c\x68\x74\x6d\x6c" and footer "\x3c\x2f\x68\x74\x6d\x6c\x3e" --> 1 files
pdf with header "\x25\x50\x44\x46" and footer "\x25\x45\x4f\x46\x0d" --> 0 files
pdf with header "\x25\x50\x44\x46" and footer "\x25\x45\x4f\x46\x0a" --> 1 files
txt with header "\x2d\x2d\x2d\x2d\x2d\x42\x45\x47\x49\x4e\x20\x50\x47\x50" and footer "" --> 0 files
Carving files from image.
Image file pass 2/2.
task1.img: 100.0% |************************************************************| 50.0 MB 00:00 ETA
Processing of image file complete. Cleaning up...
Done.
Scalpel is done, files carved = 3, elapsed = 0 seconds.
I ran it manually this time to avoid running the whole script again, which is also something I should probably automate.
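A possible shape for that automation is a small standalone helper that takes an existing image and runs whichever carver is installed. It's a sketch only: run_carvers and the timestamped output folder names are my assumptions, while the foremost and scalpel flags are the same ones the menu uses:

```shell
#!/bin/bash
# Hypothetical helper: carve an existing image without rebuilding it.
# $1 = image file, $2 = scalpel config (defaults to the system-wide one).
run_carvers() {
    local image="$1"
    local conf="${2:-/etc/scalpel/scalpel.conf}"
    local stamp
    if [ ! -f "$image" ]; then
        echo "[!] Image '$image' not found." >&2
        return 1
    fi
    # Fresh output folders per run, so earlier results are not mixed in.
    stamp=$(date +%Y%m%d_%H%M%S)
    if command -v foremost >/dev/null 2>&1; then
        mkdir -p "output_foremost_$stamp"
        foremost -i "$image" -o "output_foremost_$stamp"
    else
        echo "[i] foremost not installed, skipping."
    fi
    if command -v scalpel >/dev/null 2>&1; then
        mkdir -p "output_scalpel_$stamp"
        scalpel -c "$conf" -o "output_scalpel_$stamp" "$image"
    else
        echo "[i] scalpel not installed, skipping."
    fi
}

# Possible use:
# run_carvers task1.img
```

Since it only reads the image file, it doesn't need the mount or the loop device.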
Comparison of Scalpel and Foremost
Step 1: Listing my original Payload-Files
> find payload_1 -type f | sort > original_files.txt
This will be my ground truth to compare their results to.
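Carvers give the files they recover new, numbered names, so names alone can't be matched against the ground truth. One option is to compare SHA-256 hashes instead. The sketch below assumes a successful carve is byte-identical to the original, which is often not the case (a carver may cut a file short or include trailing bytes), so a MISSING here doesn't prove the file wasn't found. match_by_hash is a hypothetical helper:

```shell
#!/bin/bash
# Hypothetical helper: for every original file, report whether a carved file
# with exactly the same content exists.
# $1 = directory with originals, $2 = directory with carved output.
match_by_hash() {
    local orig_dir="$1" carved_dir="$2"
    local carved_hashes
    # Collect the hashes of everything the carver produced.
    carved_hashes=$(find "$carved_dir" -type f -exec sha256sum {} + | awk '{print $1}' | sort -u)
    # Hash each original and look it up in that list.
    find "$orig_dir" -type f -exec sha256sum {} + | sort -k2 | while read -r hash path; do
        if printf '%s\n' "$carved_hashes" | grep -qxF "$hash"; then
            echo "MATCH   $path"
        else
            echo "MISSING $path"
        fi
    done
}

# Possible use:
# match_by_hash payload_1 output_foremost
# match_by_hash payload_1 output_scalpel
```

Keep in mind that each output folder also contains the tool's own audit.txt, which simply never matches anything.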
Step 2: Creating Recover-Lists
> find output_foremost -type f | sort > foremost_files.txt
> find output_scalpel -type f | sort > scalpel_files.txt
Step 3: Comparing Results
Foremost:
> flt run "cat foremost_files.txt" --case Uebung_05
2025-05-30 10:58:45,657 [INFO] Executing command: cat foremost_files.txt
2025-05-30 10:58:45,674 [INFO] [+] Log written to: /home/kali/forensic-log-tracker/logs/Uebung_05/2025-05-30T16-58-45-657154+02-00_command.log
2025-05-30 10:58:45,674 [INFO] Command executed, logged: cat foremost_files.txt
[+] Signed logfile: 2025-05-30T16-58-45-657154+02-00_command.log.sig
2025-05-30 10:58:46,026 [INFO] Log file signed: /home/kali/forensic-log-tracker/logs/Uebung_05/2025-05-30T16-58-45-657154+02-00_command.log
[+] Command Output:
[STDOUT]
output_foremost/audit.txt
output_foremost/htm/00017412.htm
output_foremost/jpg/00018434.jpg
output_foremost/pdf/00017414.pdf
output_foremost/png/00016902.png
[STDERR]
Scalpel:
> flt run "cat scalpel_files.txt" --case Uebung_05
2025-05-30 10:59:24,403 [INFO] Executing command: cat scalpel_files.txt
2025-05-30 10:59:24,421 [INFO] [+] Log written to: /home/kali/forensic-log-tracker/logs/Uebung_05/2025-05-30T16-59-24-403533+02-00_command.log
2025-05-30 10:59:24,421 [INFO] Command executed, logged: cat scalpel_files.txt
[+] Signed logfile: 2025-05-30T16-59-24-403533+02-00_command.log.sig
2025-05-30 10:59:24,775 [INFO] Log file signed: /home/kali/forensic-log-tracker/logs/Uebung_05/2025-05-30T16-59-24-403533+02-00_command.log
[+] Command Output:
[STDOUT]
output_scalpel/audit.txt
output_scalpel/htm-7-0/00000001.htm
output_scalpel/jpg-2-0/00000000.jpg
output_scalpel/pdf-9-0/00000002.pdf
[STDERR]
> ls payload_1
file.doc file.ppt file.xls foremost_files.txt image1.png index.html test.mpeg test.txt
file.docx file.pptx file.xlsx hello.py image2.jpg test.mp3 test.pdf
Wooah, a lot missing there.. why?
From my research and understanding, there are a few main reasons this can happen.
Just a small side-quest for a better grasp of involved concepts:
What are File Signatures?
File Signatures: The Digital Fingerprint
A file signature, often called a magic number, is a fixed sequence of bytes found at the beginning (and occasionally the end) of a file. This signature is crucial because it identifies the file's type, allowing software and operating systems to correctly recognize and process the data.
You can think of the file signature as the digital fingerprint of a file.
For example, while the file extension (.docx or .pdf) can easily be renamed and falsified, the signature is a much more reliable indicator of the actual type.
Common File Signature Examples
- The signature for a DOCX file (.docx) begins with the hexadecimal sequence 50 4B 03 04, often represented in ASCII as PK\x03\x04. This is the generic ZIP signature, so .xlsx, .pptx and plain .zip files start the same way.
- A PDF file (.pdf) is immediately identifiable by its hexadecimal signature 25 50 44 46, corresponding to the ASCII characters %PDF.
- The JPEG image format (.jpg) starts with the distinctive hexadecimal sequence FF D8 FF.
- A PNG image (.png) begins with the sequence 89 50 4E 47, where the last three bytes are the readable ASCII string PNG.
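To see these signatures for yourself, the first bytes of a file can be dumped with od. A small sketch: magic_hex and guess_type are hypothetical helpers and only know the four signatures listed above.

```shell
#!/bin/bash
# Hypothetical helpers: peek at a file's first bytes and guess its type.

# Print the first 4 bytes as lowercase hex without spaces, e.g. 25504446.
magic_hex() {
    od -An -tx1 -N4 "$1" | tr -d ' \n'
}

guess_type() {
    local sig
    sig=$(magic_hex "$1")
    case "$sig" in
        89504e47) echo "png" ;;
        25504446) echo "pdf" ;;
        504b0304) echo "zip container (docx, xlsx, pptx, ...)" ;;
        ffd8ff*)  echo "jpeg" ;;
        "")       echo "empty" ;;
        *)        echo "unknown" ;;
    esac
}

# Possible use:
# for f in payload_1/*; do printf '%s: %s\n' "$f" "$(guess_type "$f")"; done
```

Plain text and source code end up as unknown here, simply because nothing fixed sits at the start of those files.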
How do Foremost and Scalpel work?
- Foremost uses a built-in database of common file signatures. It works automatically using hardcoded patterns.
- Scalpel is highly configurable. You must manually enable each file type you want to carve via scalpel.conf.
So now let's try to reason about the results we got..
Why most files were not recovered
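If a file doesn't have a consistent or unique signature, it can't be carved.
1. No File Signature
→ .txt, .py, .java: No fixed byte pattern = not carvable.
2. Missing Scalpel Config
→ .docx, .pptx, .xlsx not enabled in scalpel.conf.
3. Files Not Deleted
→ Carving is usually aimed at "lost" files, but the tools scan the raw image regardless of whether a file is still allocated. The .png, .jpg and .pdf were carved while fully intact, so this is probably not what held the others back.
4. Too Small or Incomplete
→ The Office dummies were created with touch, so they are zero bytes and contain nothing to carve. The short .mp3 and .mpeg may lack the headers/trailers the tools look for, or fall below the detection size.
5. Unsupported Format
→ .doc, .ppt, .xls may not be recognized unless the tools have signatures for them.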
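One cheap check before blaming the carvers: touch creates zero-byte files, and an empty file leaves no bytes in the image for any tool to find. A minimal sketch for spotting those in the payload (list_empty_files is a made-up helper name):

```shell
#!/bin/bash
# Hypothetical helper: list payload files that contain no data at all.
list_empty_files() {
    find "$1" -type f -size 0 | sort
}

# Possible use:
# list_empty_files payload_1
```

Anything it lists would have to be replaced with real documents before a carver has a chance at it.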
We'll need to see later how to improve these results. For now these are mere speculations, and I think that's what my professor is alluding to in the upcoming tasks!
Some quick recap:
Wins
- Created a dynamic disk image builder in Bash
- Injected test files and understood mount/loop behavior
- Ran Foremost and Scalpel to carve lost files
- Compared results and learned why some file types elude carving tools
Lessons
- Scalpel requires manual config to do anything
- File signatures matter more than extensions
- Deleted ≠ recoverable unless the OS forgets about the file
That's it for today, guys. Drink enough, stay healthy!