How to use this site.
You already know this bit
Return-oriented programming (ROP) is a mechanism that can be leveraged to bypass exploit mitigation schemes such as NX/DEP.
For some background on the subject you can check out the Wikipedia page.
For a little more depth take a look at Saif El-Sherei's ROP FTW paper.
Why you're here
ROP Emporium provides a series of challenges that are designed to teach ROP in isolation, with minimal requirement for reverse-engineering or bug hunting.
Each challenge introduces a new concept with slowly increasing complexity.
Follow the links on the homepage to see exercise descriptions and the challenge binaries.
Sometimes a few clues as to how you might go about solving a challenge are included, but there aren't any spoilers.
Architecture
The ROP Emporium challenges are available for 4 different architectures: x86, x86_64, ARMv5 & MIPS.
The idea is to help you understand the differences in ROP chain construction between them, and it's definitely worth giving different versions a try.
Get familiar with each calling convention so you know where to place your arguments before a call.
Under the assumption that you're working from an x86_64 Linux OS, the Working with foreign architectures section in Appendix B can help you get started exploiting the x86, ARMv5 & MIPS challenges.
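As a rough illustration of why the calling convention matters, the sketch below packs the same single-argument call for x86 and for x86_64 using only Python's struct module. Every address is a made-up placeholder, and chain_x86 and chain_x86_64 are hypothetical helper names, not part of any tool mentioned here. ARMv5 and MIPS pass arguments in their own registers and are not covered.

```python
import struct

# Placeholder addresses for illustration only; find real ones in your
# target binary with ropper, ROPgadget or radare2.
FUNC_32 = 0x08048430       # function to call in a 32-bit binary
ARG_32 = 0x0804A030        # its single argument, e.g. a pointer to a string
FUNC_64 = 0x00400560       # function to call in a 64-bit binary
ARG_64 = 0x00601060        # its single argument
POP_RDI_RET = 0x00400883   # address of a "pop rdi; ret" gadget

def chain_x86(func, arg, ret_after=0xDEADBEEF):
    """x86 (cdecl): the argument sits on the stack, just above the
    return address that the called function will eventually use."""
    return struct.pack("<III", func, ret_after, arg)

def chain_x86_64(pop_rdi_ret, func, arg):
    """x86_64 (System V): the first argument is passed in rdi, so a
    gadget has to load it before execution returns into the function."""
    return struct.pack("<QQQ", pop_rdi_ret, arg, func)

if __name__ == "__main__":
    print(chain_x86(FUNC_32, ARG_32).hex())
    print(chain_x86_64(POP_RDI_RET, FUNC_64, ARG_64).hex())
```

pwntools offers p32() and p64() for the same packing if you would rather not call struct directly.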
Below is a short list of tools you may find useful; it is by no means exhaustive. These tools all have their own documentation and tutorials online, but some one-shot uses that will allow you to quickly find information about your target binaries are covered later in this guide.
ropper
Standalone ROP gadget finder written in Python, can also display useful information about binary files. It has coloured output, interactive search and supports bad character lists.
Check out the github page for more information.
ROPGadget
Another powerful ROP gadget finder. It doesn't have the interactive search or colourful output that ropper features, but it has stronger gadget detection when it comes to ARM architecture.
It too has a github page.
pwntools
CTF framework written in Python.
Simplifies interaction with local and remote binaries which makes testing your ROP chains on a target a lot easier.
Check out the github page.
After solving this site's first "ret2win" challenge, consider browsing an example solution written by the developer/maintainer of pwntools.
radare2
radare2 is a disassembler, debugger and binary analysis tool amongst many other things.
It's absurdly powerful and you have more or less everything you need to complete the challenges on this site entirely within the radare2 framework.
It's actively developed and you can find more detail on their github page, which also hosts a cheatsheet.
pwndbg
pwndbg is a GDB plugin that greatly enhances its exploit development capability.
It can make it much easier to understand your environment when debugging your solution to a challenge.
The project has a homepage and is hosted on github.
Also worth checking out are the list of features and cheatsheet.
Bug hunting
The ROP Emporium challenges attempt to remove as much reliance on reverse-engineering and bug hunting as possible so you can focus on building your ROP chains.
Each binary has the same vulnerability: a user-provided string is copied into a stack-based buffer with no bounds checking, allowing a function’s saved return address to be overwritten.
Since this overflow occurs on the stack, you can just stick your ROP chain right there and the program will dutifully return through it.
The focus is on how you design your chains around the restrictions that the challenges put in place;
in some cases you may have to deal with bad characters, in others you may only have access to a limited or obscure set of gadgets.
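The layout described above (padding up to the saved return address, then the chain) can be sketched as a small template function. This is a minimal sketch under stated assumptions: build_payload is a hypothetical helper, and the offset, address and bad character values in the example are placeholders you would replace with values taken from your own analysis of a challenge.

```python
import struct

def build_payload(offset, chain, bad_chars=b"", word="<Q"):
    """Pad up to the saved return address, then append the packed chain.

    offset    -- bytes of padding before the saved return address
    chain     -- list of integers (gadget addresses and data)
    bad_chars -- bytes the target is assumed to mangle or reject
    word      -- struct format per entry: "<Q" for 64-bit, "<I" for 32-bit
    """
    payload = b"A" * offset + b"".join(struct.pack(word, v) for v in chain)
    # Refuse to return a payload that contains any forbidden byte.
    hits = sorted(set(payload) & set(bad_chars))
    if hits:
        raise ValueError("bad characters in payload: "
                         + ", ".join(hex(b) for b in hits))
    return payload

if __name__ == "__main__":
    # Placeholder offset and address, for illustration only.
    print(build_payload(40, [0x400756]).hex())
```

Checking for bad characters at build time is a design choice: it surfaces the problem before you spend time debugging a chain that was corrupted on the way in.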
Debuggery
Once you've planned your ROP chain, a debugger can make things much clearer and highlight any errors you've made early on.
pwntools can launch binaries via GDB and radare2 features a powerful debugger.
Debugging can give you a good indication of whether your chain will work in practice.
Automation
Using ROP chain generators such as angrop for these exercises is discouraged.
Learning the concepts of ROP by manually constructing your chains is the goal.
That said, don't shy away from writing your own automation tools if you find yourself always searching for the same set of gadgets in a binary, or writing some template functions for your exploits so you're not duplicating effort.
Needles in callstacks
The first issue many come across when developing their first ROP chain is gathering the information they need to begin crafting it.
What useful functions are present? What selection of gadgets are available?
This section contains some use cases of the tools mentioned above to help you find this information quickly.
Each challenge page gives a little more depth as to what's required and in some cases more specific tool-use tips as well.
Confirming protections
When approaching a challenge it's good practice to check the protections it has enabled (if any).
Bear in mind that some are dependent on both the flags the binary was compiled with and the OS.
The ROP Emporium challenges all implement NX, since that's the mitigation we're trying to bypass, and are designed to run on an OS with ASLR enabled.
You can confirm you're not wasting your time by using rabin2 or checksec:
$ rabin2 -I <binary>
$ checksec <binary>
rabin2 is one of many standalone binaries that make up the radare2 suite; it will be available to you if you’ve installed radare2.
checksec can be downloaded standalone from git but its functionality is also integrated into the pwntools framework which is highly recommended.
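If you're curious what an NX check looks at, the sketch below reads the PT_GNU_STACK program header directly using only Python's standard library. It is an illustration and not how rabin2 or checksec are implemented; nx_enabled is a hypothetical helper name, and treating a missing PT_GNU_STACK header as "not NX" is a simplifying assumption, since the real default depends on the architecture and kernel.

```python
import struct

PT_GNU_STACK = 0x6474E551  # program header type describing stack permissions
PF_X = 0x1                 # "executable" bit in p_flags

def nx_enabled(elf):
    """Return True if the ELF image in `elf` (bytes) carries a
    PT_GNU_STACK header without the executable flag."""
    if elf[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is64 = elf[4] == 2                   # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    end = "<" if elf[5] == 1 else ">"    # EI_DATA: 1 = little-endian
    if is64:
        phoff, = struct.unpack_from(end + "Q", elf, 0x20)
        phentsize, phnum = struct.unpack_from(end + "HH", elf, 0x36)
        flags_at = 4                     # p_flags follows p_type in ELF64
    else:
        phoff, = struct.unpack_from(end + "I", elf, 0x1C)
        phentsize, phnum = struct.unpack_from(end + "HH", elf, 0x2A)
        flags_at = 24                    # p_flags is the 7th field in ELF32
    for i in range(phnum):
        base = phoff + i * phentsize
        p_type, = struct.unpack_from(end + "I", elf, base)
        if p_type == PT_GNU_STACK:
            p_flags, = struct.unpack_from(end + "I", elf, base + flags_at)
            return not (p_flags & PF_X)
    # Assumption: with no PT_GNU_STACK header, do not claim NX.
    return False
```

A typical use would be nx_enabled(open(path, "rb").read()), where path points at a challenge binary.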
Function names
In unstripped binaries function names can sometimes be useful, especially if the programmer used helpful titles such as win_this_challenge().
You will still be able to learn the names of imported functions from stripped binaries.
Listing functions imported from shared libraries is simple:
$ rabin2 -i <binary>
$ nm -u <binary>
Listing just those functions written by the programmer is harder; a rough approximation could be:
$ rabin2 -qs <binary> | grep -ve imp -e ' 0 '
This leaves some lint but gives you something more readable than listing all symbols.
In the ROP Emporium challenges, keep an eye out for the usefulGadgets symbol, which marks the position of gadgets that were added to the binary to help make your life easier.
Strings
Counterintuitively, don't use the "strings" command to search for strings in the challenge binaries; use rabin2 instead.
For example running strings on the "split" challenge will yield many lines of output, most of which are irrelevant. On the other hand, running:
$ rabin2 -z split
will yield very few lines, all of which are strings the programmer purposefully placed in the binary.
Here are some issues you may encounter, the causes of which aren't immediately obvious.
The MOVAPS issue
If you're segfaulting on a movaps instruction in buffered_vfprintf() or do_system() in the x86_64 challenges, then ensure the stack is 16-byte aligned before returning to GLIBC functions such as printf() or system().
Some versions of GLIBC use movaps instructions to move data onto the stack in certain functions.
The 64 bit calling convention requires the stack to be 16-byte aligned before a call instruction but this is easily violated during ROP chain execution, causing all further calls from that function to be made with a misaligned stack.
movaps triggers a general protection fault when operating on unaligned data, so try padding your ROP chain with an extra ret before returning into a function or return further into a function to skip a push instruction.
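The extra ret trick can be sketched as follows. Each ret pops 8 bytes, so one additional ret gadget shifts the stack pointer by 8 and flips its 16-byte alignment. The x86_64 System V ABI expects rsp + 8 to be a multiple of 16 at a function's entry point, which is what needs_extra_ret checks. Both helper names are hypothetical and the addresses are placeholders.

```python
import struct

def needs_extra_ret(rsp_at_entry):
    """True if the stack pointer observed at the function's first
    instruction (e.g. in a debugger) breaks the alignment the
    x86_64 System V ABI expects there."""
    return (rsp_at_entry + 8) % 16 != 0

def call_aligned(ret_gadget, func, misaligned):
    """Pack the chain fragment that returns into `func`, preceded by
    a bare `ret` gadget when the stack would otherwise be misaligned."""
    words = ([ret_gadget] if misaligned else []) + [func]
    return b"".join(struct.pack("<Q", w) for w in words)

if __name__ == "__main__":
    RET = 0x40053E      # placeholder address of a lone "ret"
    SYSTEM = 0x400560   # placeholder address of the function to call
    print(call_aligned(RET, SYSTEM, misaligned=True).hex())
```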
/usr/bin/bash
When debugging a ROP chain with GDB, if you've successfully called system() but the string you've passed is not a valid program, GDB will still tell you that it has started a new process "/usr/bin/bash".
This can be particularly confusing when you're trying to drop a shell.
Check out the system() manpage for the reason behind this.
Too much
Consider the length of your ROP chains: each binary will only read a specific number of bytes before processing your input.
This is an attempt to guide players towards the desired solution to the challenge. If you're exceeding the input length for a challenge, you're probably making life harder for yourself.
If in doubt, check that your entire chain has made it into memory intact.
You can use ltrace to check the exact number of bytes a challenge binary is attempting to read.
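Once you know how many bytes a binary reads, the room left for your chain is simple arithmetic. The sketch below assumes you have already found the read size and the padding offset yourself; chain_slots is a hypothetical helper and the numbers in the example are placeholders rather than values from any particular challenge.

```python
def chain_slots(read_size, offset, word_size=8):
    """Number of whole chain entries that fit after the padding.

    read_size -- bytes the binary reads, e.g. as reported by ltrace
    offset    -- bytes of padding before the saved return address
    word_size -- 8 for 64-bit targets, 4 for 32-bit targets
    """
    return max(0, (read_size - offset) // word_size)

if __name__ == "__main__":
    # Placeholder numbers, for illustration only.
    print(chain_slots(read_size=56, offset=40))
```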
Stack alignment
If you've moved the stack pointer, ensure it's still correctly aligned for the architecture you're targeting; odd things can happen otherwise (see the movaps issue above).
Stack location
Think about where your chain resides as it’s being processed.
If you've pivoted and placed a chain in the .data section of process memory for example, then consider how much stack space is needed by a function you've called.
If that function requires a lot of space, it may move the stack pointer into non-writable memory or overwrite memory that will have a detrimental effect on program stability if corrupted.
Look ma No-eXecute
Get on your way by clicking the "ret2win" challenge link on the homepage.
It features a short intro on how to approach the binary, including the use of some of the tools mentioned above.
This section attempts to explain how lazy binding works. Examples are from a simple 64 bit ELF compiled with GCC for x86_64 architecture, running on Ubuntu Linux.
Lazy binding is a technique used by the dynamic linker to decrease program startup time, in which symbol lookups for function calls into shared objects are deferred until the first time a function is actually called.
Two program sections are used to achieve this effect: the procedure linkage table (.plt) and part of the global offset table (.got.plt).
Lazy binding may be disabled by setting the LD_BIND_NOW environment variable to a nonempty string or using the RTLD_NOW flag when calling dlopen().
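To experiment with lazy binding switched off, you could launch a program with LD_BIND_NOW set. The sketch below does this from Python's standard library; eager_env and run_eager are hypothetical helper names and the path you pass in is up to you.

```python
import os
import subprocess

def eager_env(base=None):
    """Copy an environment and set LD_BIND_NOW to a nonempty string."""
    env = dict(os.environ if base is None else base)
    env["LD_BIND_NOW"] = "1"
    return env

def run_eager(path):
    """Run the program at `path` with lazy binding disabled."""
    return subprocess.run([path], env=eager_env())
```

With the variable set, the dynamic linker resolves symbols at startup, so the .got.plt entries described below should already hold final addresses when you first inspect them.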
Calls
The first time an external function is called, it must be resolved. After that, all calls to it will be passed straight through to the desired function. The calling convention does not change.
The figure below shows a call from the example program. The call to puts() here lands in the .plt section, as do all external function calls.
The .plt section
The figure below shows the .plt section from the example program. In this case 3 function stubs reside here; they take the form jmp; push; jmp;.
Above the stubs a push; jmp; sits at the head of the .plt section.
Each stub jumps to the address residing at that function's .got.plt entry, which before resolution points straight back into the .plt one instruction below: the push; jmp;.
Each function stub pushes its .got.plt entry's offset, then jmps to the head of the .plt.
The push; jmp; at the head of the .plt pushes the 2nd entry of the .got.plt, which is the address of the linkmap head, then jmps to the 3rd entry: a resolved function named _dl_runtime_resolve_avx() which will patch the appropriate function's .got.plt entry with the correct address of the desired function, then call it.
After this first call, all future calls to the function's .plt stub will jmp straight to the appropriate function.
The .got.plt section
The .got.plt section shown below contains 6 entries; the first 3 will in all cases be the address of the program's .dynamic section, the address of the linkmap head, and the address of _dl_runtime_resolve_avx().
All entries after these are functions to be resolved at call time.
__libc_start_main() has been resolved and its entry points to the actual function address in the libc shared object.
puts() and __printf_chk() have not been resolved yet and their entries point back into their respective .plt stubs.
Once these functions have been called for the first time, their entries will be updated to reflect their actual addresses.
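The resolve-once behaviour described in this appendix can be mimicked with a toy model. This is only an analogy written in Python, not how the dynamic linker is implemented: ToyProcess and its methods are hypothetical names, a dictionary stands in for the .got.plt, and a closure stands in for each .plt stub.

```python
class ToyProcess:
    """Toy model of lazy binding through a .plt and .got.plt."""

    def __init__(self, library):
        self.library = library      # stands in for the shared object
        self.resolutions = 0        # how many times the resolver ran
        # Before resolution, every entry points back at its own stub.
        self.got_plt = {name: self._stub(name) for name in library}

    def _stub(self, name):
        def unresolved(*args):
            # Models "push <entry>; jmp <head of .plt>".
            return self._resolve(name, *args)
        return unresolved

    def _resolve(self, name, *args):
        # Models the resolver: patch the entry, then call the function.
        self.resolutions += 1
        real = self.library[name]
        self.got_plt[name] = real
        return real(*args)

    def call(self, name, *args):
        # Models a call via the stub: jmp to whatever the entry holds.
        return self.got_plt[name](*args)

if __name__ == "__main__":
    proc = ToyProcess({"puts": lambda s: len(s)})
    proc.call("puts", "first call goes through the resolver")
    proc.call("puts", "second call goes straight to the function")
    print(proc.resolutions)
```

The counter stays at one per function however many times you call it, which mirrors the patched entry bypassing the resolver after the first call.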
If you're new to ARMv5 & MIPS, this section can help you get started pwning the ROP Emporium challenges for those architectures. It assumes your host OS is an x86_64 Linux platform, and the examples are for Ubuntu 20.04 LTS. If you're using a Linux distro with a different package manager, the package names may differ.
Debugging ARMv5 challenges with this setup is slightly different to how you would normally do it.
You'll need the appropriate version of GDB:
$ sudo apt install gdb-multiarch
If you're launching GDB from within a pwntools script, it will detect that you're trying to debug for a foreign architecture and take the appropriate measures.
If you have pwntools installed but want to debug from the command line, you can use:
$ pwn debug --exec <path_to_challenge_binary>
If you don't have pwntools installed, then you'll have to do the following to debug the ARMv5 challenges:
$ qemu-arm -g 1234 <path_to_challenge_binary>
Where the -g flag is telling qemu to start a GDB debugging stub and 1234 is the port it should listen on.
Then from a 2nd terminal, start gdb-multiarch and attach to the session:
$ gdb-multiarch
(gdb) file <path_to_challenge_binary>
(gdb) target remote localhost:1234
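If you repeat the two-terminal steps above often, you could generate both command lines from one place. The sketch below only builds the argument lists; it assumes qemu-arm and gdb-multiarch are installed, uses GDB's -ex option to pass the same two commands shown above, and the helper names qemu_cmd and gdb_cmd are hypothetical. For the MIPS challenges the qemu parameter would need to name the matching qemu-user binary, which is left for you to fill in.

```python
import shlex

def qemu_cmd(binary, port=1234, qemu="qemu-arm"):
    """Command for the first terminal: run under qemu with a GDB stub."""
    return [qemu, "-g", str(port), binary]

def gdb_cmd(binary, port=1234):
    """Command for the second terminal: load symbols, then attach."""
    return ["gdb-multiarch",
            "-ex", f"file {binary}",
            "-ex", f"target remote localhost:{port}"]

if __name__ == "__main__":
    import sys
    target = sys.argv[1]   # path to a challenge binary
    print(shlex.join(qemu_cmd(target)))
    print(shlex.join(gdb_cmd(target)))
```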
Note that as of writing, pwndbg doesn't play nice with programs launched from qemu-user in this way. If you're debugging the ARMv5 or MIPS challenges, it's best to disable pwndbg for now.
Debugging the MIPS challenges involves the same steps documented above for the ARMv5 binaries using gdb-multiarch. The same issues currently exist with pwndbg so watch out.