Security is hard to get right. Between Cross-Site Scripting (XSS) and SQL Injection (SQL) alone, there are more ways to make mistakes than any developer can possibly be expected to keep track of manually -- and those are just the two most well-known types of vulnerabilities. Most developers have never even heard of more obscure attacks, like XML External Entity Injection (XXE), and yet a well-placed attack can be just as devastating as the most egregious XSS injection.
We’ll explain what exactly an XXE attack is later, but first it’s important to have a basic understanding of the anatomy of an XML document. XML, or Extensible Markup Language, is a format used to describe the structure of documents, such as web pages. For example, the following XML document might describe a blog post:
In the above document, there’s a few key pieces of terminology to keep track of. Firstly, a tag is a pair of angle brackets surrounded a name. Both <author> and </title> are examples of tags. More important are the logical components of the document, known as elements. One such element above is <author>Aleph One</author>.
A slightly more complicated document might look like the following:
In this case we’ve defined an entity: essentially a mapping from some name to a value. When this XML document is processed, any instances of “&author;” are going to be expanded to “Shane Wilton”. This is known as internal entity processing, and it is typically used to allow for the modular design of XML documents.
An XXE attack works by taking advantage of a little-known feature of XML -- external entities. The concept is the same as in internal entity processing, but the attack vector lies in being able to use external resources as the replacement text. For example, consider the following document:
When the above document is parsed, the “passwd” element is going to be expanded to contain the contents of “/etc/passwd”.
If a web application accepts user-created XML documents as input, or input which is otherwise used in the creation of XML documents, an attacker is able to use XML entity expansion to load files or other URI-referenceable resources into the web application. If this information is then displayed back to the attacker at a later point, then they’ll find themselves able to exfiltrate possibly privileged information.
Furthermore, by loading a stream of infinite data, like /dev/urandom, an attacker is able to consume all of a system’s resources, denying access to other users.
In some rare cases, it may be possible to gain remote code execution by loading executable code (Such as PHP), or by using the XXE attack as a beachhead to access other, more insecure, internal services. This was exactly the case last year, when a Brazilian engineer used an XXE attack to gain remote code execution against Facebook, earning their largest bug bounty payout to date. His impressive write-up can be read here.
XML External Entity Processing is by no means a complicated bug, but it is difficult to test for. There’s so many variables involved in launching a successful attack, that software engineers simply don’t have the time to invest in performing a full audit of their XML parsing capabilities, if they’re even aware of the possibility of XXE in the first place. That’s why we’re proud to announce that Tinfoil Security now supports automated scanning for XXE attacks, and for the next month, we'll also be scanning all of our free members, at no charge.
Sign up today, so your engineers can spend their time building your product, and we can spend our time worrying about the minutiæ of XML parsing.