From Patch To Exploit: CVE-2021-35029

A brief introduction
- The Target
Firmware Analysis
Joern - The Bug Hunter’s Workbench
- Modelling Complex Code Patterns With Joern
Conclusion
References

A brief introduction

This article explains the process of identifying and exploiting a known flaw on Zyxel USG devices, taking into consideration the following CVE:

CVE-2021-35029 - Authentication bypass & remote code execution, spotted in the wild in July 2021.

An authentication bypass vulnerability in the web-based management interface of Zyxel USG/Zywall series firmware versions 4.35 through 4.64 and USG Flex, ATP, and VPN series firmware versions 4.35 through 5.01, which could allow a remote attacker to execute arbitrary commands on an affected device. - CVE Mitre.

Currently, there is no published exploit available for this vulnerability, so we decided to delay publishing this blog post.

Furthermore, this blog post aims to show how to find such a vulnerability in two different ways:

With the standard approach, by diffing patched and unpatched firmware versions.
With Joern, a valuable tool for vulnerability discovery and research in static program analysis.

The Target

First, let’s introduce the target: the Zyxel USG.

According to the vendor’s website, it is a firewall solution designed for small and medium-sized businesses, with plenty of features¹. Under the hood, the device is powered by a Cavium (now Marvell) Octeon3 big-endian MIPS64 SoC.

Unfortunately, Ghidra and QEMU do not fully support this specific architecture. At least IDA Pro seems to support it.

Firmware Analysis

The firmware can be downloaded from Zyxel’s official website, but it cannot be extracted with binwalk alone: the image is encrypted, so you first need to find a way around this protection. There is currently no publication explaining the firmware decryption process, and explaining how we managed to unpack it is outside the scope of this document. Once extracted, you will have access to the classic Linux filesystem layout.
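
A quick way to check whether an image is encrypted (or at least compressed) rather than a plain container is to measure its byte entropy, which is roughly what binwalk's entropy mode reports. The following is a minimal sketch of that measurement; it says nothing about how this particular firmware is protected, and the block size is an arbitrary choice.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def block_entropies(data: bytes, block_size: int = 4096) -> list[float]:
    """Entropy of each fixed-size block, useful to spot plaintext headers."""
    return [
        shannon_entropy(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]
```

Values close to 8 bits per byte across the whole file suggest encrypted or compressed content, while noticeably lower values usually mean recognizable structure is present.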

The device contains many interesting components, such as geoblocking features, anti-botnet logic, Kaspersky antivirus, an HTTP parser implemented in the kernel, etc. Keep in mind that more features mean a larger attack surface, but for now, we will restrict our analysis to the web server, since the vulnerability seems to be there.

The Web Server

The installed web server is Apache httpd, with custom CGI binaries written in C and Python: that’s precisely the right place to look for weaknesses. First of all, let’s start by looking at the Apache httpd configuration file, /etc/service_conf/httpd.conf.

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%h %l %u %t \"%r\" %>s %b" common
LogFormat "%{Referer}i -> %U" referer
LogFormat "%{User-agent}i" agent

#LoadModule  auth_pam_module modules/mod_auth_pam.so
#LoadModule php4_module modules/libphp4.so
LoadModule rewrite_module modules/mod_rewrite.so
LoadModule auth_zyxel_module    modules/mod_auth_zyxel.so

Include /etc/service_conf/httpd_zld.conf

TypesConfig conf/mime.types

DefaultType text/plain

<IfModule mod_mime_magic.c>
    MIMEMagicFile conf/magic
</IfModule>

DirectoryIndex weblogin.cgi
AuthZyxelRedirect /
AuthZyxelSkipPattern /images/ /weblogin.cgi /I18N.js /language
AuthZyxelSkipUserPattern 127.0.0.1:10443 /images/ /I18N.js /language /weblogin.cgi /access.cgi /setuser.cgi /grant_access.html /user/ /cgi-bin/ /RevProxy/ /Exchange/ /exchweb/ /public/ /Socks/ /CnfSocks/ /cifs/ /uploadcifs/ /epc/ /CnfEpc/ /frame_access.html /dummy.html
ScriptAlias /cgi-bin/ "/usr/local/apache/cgi-bin/"

As we can see from the lines above, there is a custom Apache module, mod_auth_zyxel.so, which defines the AuthZyxelRedirect, AuthZyxelSkipPattern and AuthZyxelSkipUserPattern directives. From these lines we can deduce the core endpoint reachable before authentication: weblogin.cgi.
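
To enumerate the pre-authentication surface, the skip directives can be pulled out of the configuration programmatically. The sketch below parses a shortened copy of the configuration; the prefix-matching rule in `is_skipped` is an assumption about how mod_auth_zyxel.so compares paths, not something confirmed by the module's code, and `/cgi-bin/admin.cgi` is a made-up path used only for contrast.

```python
HTTPD_CONF = """\
AuthZyxelRedirect /
AuthZyxelSkipPattern /images/ /weblogin.cgi /I18N.js /language
ScriptAlias /cgi-bin/ "/usr/local/apache/cgi-bin/"
"""


def directive_args(conf: str, directive: str) -> list[str]:
    """Collect the whitespace-separated arguments of every matching directive."""
    args: list[str] = []
    for line in conf.splitlines():
        parts = line.split()
        if parts and parts[0] == directive:
            args.extend(parts[1:])
    return args


def is_skipped(path: str, patterns: list[str]) -> bool:
    """ASSUMPTION: a request skips auth when its path starts with a pattern."""
    return any(path.startswith(p) for p in patterns)


patterns = directive_args(HTTPD_CONF, "AuthZyxelSkipPattern")
print(patterns)
print(is_skipped("/weblogin.cgi", patterns))       # listed in the skip pattern
print(is_skipped("/cgi-bin/admin.cgi", patterns))  # hypothetical path, not listed
```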

Hunting for the vulnerability

The vulnerability we are looking for is an authentication bypass with command injection.

So let’s look for the entry points in the code with the following characteristics:

are reachable without authentication;
lead to exec calls with user-controllable input.

By looking at httpd.conf, we can quickly narrow down the search.

The flaw will likely be in:

mod_auth_zyxel.so: a custom Apache httpd module responsible for granting or denying access to endpoints based on the authtok cookie. This code runs on every request.

weblogin.cgi: the main binary that handles authentication.

Now, let’s understand how the login process is implemented by analyzing weblogin.cgi.

Analyzing weblogin.cgi

The first step is to load it into Ghidra; don’t forget to change the ABI to “n32”, otherwise the analysis engine will fail to recognize functions with more than four arguments.

The library responsible for parsing the CGI request coming from STDIN is libgcgi, an ancient library. You can find its sources on GitHub² and use the function signatures to improve the analysis.
Here is the list of functions that can execute commands with potentially user-controlled input:
```
// Library functions
execv
execl
popen
// Internal wrappers
__execv
__get_ret_exec
get_ret_exec
```
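
As a rough first pass, every call site of these functions can be listed from the decompiler output. The sketch below does this with a regular expression over C-like text; the sample snippet and the variable names in it are invented for illustration and do not come from the firmware.

```python
import re

SINKS = ["execv", "execl", "popen", "__execv", "__get_ret_exec", "get_ret_exec"]

# Match a sink name that is not part of a longer identifier, followed by "(".
SINK_CALL = re.compile(
    r"(?<![A-Za-z0-9_])(" + "|".join(map(re.escape, SINKS)) + r")\s*\("
)


def find_sink_calls(source: str) -> list[tuple[int, str]]:
    """Return (line number, sink name) for each call found in the source text."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in SINK_CALL.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


# Hypothetical decompiled fragment used only to exercise the scanner.
SAMPLE = """\
snprintf(cmd, sizeof(cmd), fmt, username);
fp = popen(cmd, "r");
rc = __get_ret_exec(cmd);
my_execv_helper(cmd);
"""

for lineno, sink in find_sink_calls(SAMPLE):
    print(f"line {lineno}: {sink}")
```

The look-behind keeps identifiers that merely contain a sink name, such as the invented `my_execv_helper`, out of the results.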
Now we can adopt two different approaches to find the vulnerability:
Manual code review: for each abusable function, follow the cross-references to find the command injection.
BinDiff:
- Export the binaries with BinExport³
- Load them into BinDiff
- Sort by similarity
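
The idea behind sorting by similarity can be illustrated without BinDiff. The sketch below is a stand-in, not BinDiff's algorithm: it compares the text of same-named functions from two builds with Python's difflib and lists the most changed ones first. The function bodies, and the `is_valid_username` helper in particular, are made up for the example.

```python
from difflib import SequenceMatcher

# Hypothetical decompiled bodies keyed by function name (old vs patched build).
OLD = {
    "main": "init(); parse(); run();",
    "need_twofa_admin": "if (strpbrk(user, BAD) == NULL) run(user);",
    "print_header": 'puts("Content-Type: text/html");',
}
NEW = {
    "main": "init(); parse(); run();",
    "need_twofa_admin": "if (is_valid_username(user)) run(user);",
    "print_header": 'puts("Content-Type: text/html");',
}


def rank_by_similarity(old: dict[str, str], new: dict[str, str]) -> list[tuple[str, float]]:
    """Pair functions by name and sort so the most changed ones come first."""
    scores = [
        (name, SequenceMatcher(None, old[name], new[name]).ratio())
        for name in old.keys() & new.keys()
    ]
    return sorted(scores, key=lambda item: (item[1], item[0]))


for name, score in rank_by_similarity(OLD, NEW):
    print(f"{score:.2f}  {name}")
```

Unchanged functions score 1.0 and sink to the bottom, so the handful of functions touched by the patch are the first ones to review.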

The vulnerability

With the diff in hand, the picture becomes clearer: something changed in these functions. Specifically, the regex filtering of user input was improved in the need_twofa_admin function.

Why is strpbrk not sufficient? Simply because it doesn’t check for spaces and double quotes in the username. To better visualize what might go wrong, let’s test this with Python:

def vulnerable_check(username: str) -> None:
    # Command template; the username replaces %s without any escaping.
    cmd = '/bin/zysh -p 110 -e "configure terminal _two-factor-auth admin-access _auth_need user %s"'

    # Blacklist mirroring the strpbrk check: space and double quote are missing.
    if any(ch in ";`'*!%^|&#$" for ch in username):
        print("FAIL, you used a forbidden character")
    else:
        print(f"OK looks good, i will execute: {cmd % username}")

Intended usage:

vulnerable_check('admin') -> OK looks good, i will execute: /bin/zysh -p 110 -e "configure terminal _two-factor-auth admin-access _auth_need user admin"
Close the current command by injecting a double quote:

vulnerable_check('admin"') -> OK looks good, i will execute: /bin/zysh -p 110 -e "configure terminal _two-factor-auth admin-access _auth_need user admin""
Inject a new command after closing the current one:

vulnerable_check('admin" -e "injection') -> OK looks good, i will execute: /bin/zysh -p 110 -e "configure terminal _two-factor-auth admin-access _auth_need user admin" -e "injection"
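
To see why the injected quote matters, it helps to tokenize the final string the way a POSIX shell would. The sketch below uses Python's shlex module for that; it assumes the command string ends up being interpreted by a shell (as popen would do), which is an assumption about the execution path rather than something shown in the decompiled code here.

```python
import shlex

CMD = '/bin/zysh -p 110 -e "configure terminal _two-factor-auth admin-access _auth_need user %s"'


def argv_for(username: str) -> list[str]:
    """Build the command like the vulnerable code and split it shell-style."""
    return shlex.split(CMD % username)


benign = argv_for("admin")
injected = argv_for('admin" -e "injection')

print(benign.count("-e"))    # number of -e options with a normal username
print(injected.count("-e"))  # number of -e options after the injection
print(injected[-1])          # last argument is attacker-controlled
```

With the crafted username, zysh receives a second `-e` option whose value is entirely chosen by the attacker, even though none of the blacklisted characters was used.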

Joern - The Bug Hunter’s Workbench

Well, ok, we found and exploited the vulnerability. Let’s move on now and understand how we could have found it in an automated way with Joern⁴.

Joern is a platform for analyzing source code, bytecode, and binary executables. It generates code property graphs (CPGs), a graph representation of code for cross-language code analysis. Code property graphs are stored in a custom graph database. This allows code to be mined using search queries formulated in a Scala-based domain-specific query language. Joern is developed with the goal of providing a useful tool for vulnerability discovery and research in static program analysis.

You can look at the documentation for more information⁵.

Modelling Complex Code Patterns With Joern

Let’s imagine we don’t know anything about the code injection vulnerability path. How hard is it to find it by modeling the vulnerable code pattern with Joern?

The first step is to import the decompiled code into Joern and run the interprocedural data-flow analysis:

Importing the code into Joern

importCode("./src/vuln-weblogin.cgi.c")
// ossdataflow: Layer to support the OSS lightweight data flow tracker
run.ossdataflow

Note that it is easy to forget the run.ossdataflow line. Without it, the data dependencies will not be populated, so all the reachability queries will return nothing!

Identifying sources and sinks

Next, we have to identify our inputs (the sources), and the functions which execute the command passed as input (the sinks).

The sources in this case are the GET and POST parameters set by the initCgi() function and retrieved with gcgiFetchStringNext:

// Our input will be copied in the buffer pointed by ret:
gcgiReturnType gcgiFetchStringNext(char *field,char *ret,int max);

We can easily select it in Joern with this single line of code:

def src = cpg.method("gcgiFetchStringNext").callIn.argument(2)

For the sinks the story is a bit different, but it’s still easy thanks to the facilities offered by Joern’s DSL. Several functions execute the command passed as input; we can group them like this:

def sink_exec = cpg.method.name(".*exec.*").callIn.argument // all the arguments
def sink_popen = cpg.method.name("popen").callIn.argument(1) // restrict to argument 1
def sink_system = cpg.method.name("system").callIn.argument(1) // restrict to argument 1
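
It is worth checking which of the functions listed earlier the `.*exec.*` pattern actually covers. The sketch below reproduces the check in Python, assuming the pattern has to match the whole method name (full-match semantics); that assumption should be verified against the Joern version in use.

```python
import re

# Function names taken from the list of command-execution candidates.
CANDIDATES = ["execv", "execl", "popen", "__execv", "__get_ret_exec", "get_ret_exec"]
EXEC_PATTERN = re.compile(r".*exec.*")

covered = [name for name in CANDIDATES if EXEC_PATTERN.fullmatch(name)]
missed = [name for name in CANDIDATES if name not in covered]

print(covered)
print(missed)
```

popen is the only candidate the pattern does not cover, which is why it needs its own sink definition.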

Finding vulnerable codepaths

At this point, we can find the paths that carry the sources into the sinks’ arguments with this simple query:
```
sink_exec.reachableByFlows(src).map( _.elements.map( n => (n.lineNumber.get,n.astParent.code) )).l
```
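
Conceptually, the query asks whether data can travel from a source node to a sink node along data-dependency edges. The toy model below illustrates that idea with a breadth-first search over a hand-written graph; it is not how Joern is implemented, and the node labels and edges are invented for the example.

```python
from collections import deque

# Hypothetical data-dependency edges: the value at a key flows into each listed node.
FLOWS = {
    "gcgiFetchStringNext(ret)": ["username"],
    "username": ["strpbrk(username)", "snprintf(cmd)"],
    "snprintf(cmd)": ["cmd"],
    "cmd": ["get_ret_exec(cmd)"],
    "getenv(REMOTE_ADDR)": ["log_line"],
}


def find_flow(graph, source, sink):
    """Return one shortest source-to-sink path, or None if the sink is unreachable."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == sink:
            return path
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(find_flow(FLOWS, "gcgiFetchStringNext(ret)", "get_ret_exec(cmd)"))
print(find_flow(FLOWS, "getenv(REMOTE_ADDR)", "get_ret_exec(cmd)"))
```

A returned path plays the role of one reported flow: each element is a step the tainted value passes through on its way to the sink, while `None` means the sink is not reachable from that source.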

Conclusion

This blog post shows how it is possible to identify a vulnerability in two ways: starting from its patch, and using new tools to identify vulnerabilities with a modern approach.

At CYS4, we care a lot about research; this is a small example.

If something was not clear enough, don’t hesitate to contact us.

References


Nicola Vella

Computer Engineering student at the University of Pisa. Passionate about exploitation and reverse engineering.

Alessio Dalla Piazza

Security consultant, always passionate about cyber security and technology. I am currently CTO of CYS4, where our activities aim to increase the safety of our customers. Some of my public findings include 0-days in commercial products such as Skype, Safari, VMware and IBM WebSphere.