Sandboxing JS and HTML: A Lesson Learned

Sandboxing HTML pages with API wrappers as a countermeasure to malicious 3rd-party code attacks. A lesson learned.

Stefano Di Paola, CTO & Chief Scientist Minded Security

$ whoami

Security since 1999

OWASP Italy Director of Research

Server Side (HPP, Expression Language Injection...)

Client Side (UXSS, SWFIntruder, DOMinator)

Work @ Minded Security: CTO & Chief Scientist

Agenda

Intro

The Problem

Attempts

Drawbacks

Conclusion

Browser Features History

Same Origin Policy: 1995

XMLHttpRequest: 2002

Subverting SOP – Traditional way

The solution is easy: encode ALL dangerous input as HTML entities, right? Not quite.

<html>..<script>evilJs</script>..</html>

taintedInput=<script>evilJs</script>
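As a minimal sketch of the "encode everything" idea, assuming a hypothetical helper named escapeHtml (it does not appear in the talk), the HTML-significant characters can be mapped to entities before a value is written into an HTML text context:

```javascript
// Hypothetical helper (not from the original slides): encodes the characters
// that are significant in an HTML text or quoted-attribute context.
function escapeHtml(input) {
  const entities = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;'
  };
  return String(input).replace(/[&<>"']/g, (ch) => entities[ch]);
}

const taintedInput = '<script>evilJs</script>';
console.log(escapeHtml(taintedInput)); // &lt;script&gt;evilJs&lt;/script&gt;

// A value with none of those characters passes through unchanged, so the same
// helper gives no protection if the value lands in a URL attribute.
console.log(escapeHtml('javascript:alert(1)')); // javascript:alert(1)
```

The second call hints at the "not quite": entity encoding only addresses one output context, and it is of no use when users are supposed to be allowed some markup.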

The Problem

Browser SOP

Industry needs to:

Safely allow external 3rd-party code (advertising et al.)

Let users customize pages (Facebook et al.)

Industry Needs - Server Side Filtering

Social portals want their users to be free to customize their home page.

SOP is too loose.

Solution? Create server-side filters that allow only a subset of HTML.

It's 2005: MySpace is the most used social site.

40 million unique visitors.

Server Side HTML Filtering - MySpace

MySpace approach:

Whitelisted tags/attributes: only img, embed, div (blocked <script>, on*, etc.)

Style allowed? Yes

Word blacklisting. E.g.: Javascript

MySpace Worm – Samy is my hero

Samy bypasses:

JavaScript: stripped from EVERYWHERE. Replaced with '..'

http://namb.la/popular/tech.html
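A sketch of why word blacklisting fails, assuming a hypothetical function blacklistFilter that mimics the "replace with '..'" behaviour described above (the real MySpace filter is not public). The write-up linked above describes splitting the word with a newline, which the filter did not match while some browsers of the time still treated the result as "javascript":

```javascript
// Hypothetical model of a word blacklist: every occurrence of "javascript"
// (case-insensitive) is replaced with "..".
function blacklistFilter(html) {
  return html.replace(/javascript/gi, '..');
}

const direct = '<div style="background:url(\'javascript:alert(1)\')">';
const split = '<div style="background:url(\'java\nscript:alert(1)\')">';

// The direct spelling is caught by the filter.
console.log(blacklistFilter(direct).includes('javascript')); // false

// The split spelling is not matched and passes through untouched.
console.log(blacklistFilter(split) === split); // true
```

The filter and the browser disagree about what counts as the word "javascript", and the attacker only needs one spelling on which they differ.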

Server Side HTML Filtering - OWASP Anti Samy

Attempts to solve the problems of the MySpace approach on the server side.

Author: Arshan Dabirsiaghi

Translates HTML to well-formed XML

Whitelist of tags and attributes

Everything else is encoded

One bypass so far (usual problem) – fixed:

<![CDATA[ .. ]]> ← AntiSamy expects this, but ...

<![CDATA[ .. ]> ← works on every browser.

https://www.owasp.org/index.php/Category:OWASP_AntiSamy_Project

Server Side JS/HTML Filter - Caja

~2008 - Author: Google

The 3rd-party code problem, a.k.a. sandboxed advertising

[...The Caja Compiler is a tool for making third party HTML, CSS and JavaScript safe to embed in your website. ..]

https://developers.google.com/caja/docs/about/

Google Attempt – Server Side Filter

Takes JS/ES5 strict mode, HTML, and CSS input and rewrites it into:

* a safe subset of HTML & CSS

* a JavaScript function with no free variables (code rewrite)

Google Caja – Some Bypasses

2009 issues:

Arbitrary code execution via DOM wrappers.

Flaw in JSON parsing.

JavaScript URLs in style attributes not sanitized (Samy, anyone?).

2013:

JavaScript parsers differ on whether they interpret escaped sequences of letters spelling a reserved word, such as "de\u006Cete", as an identifier or a reserved word.
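To illustrate the escaped-keyword problem, here is a sketch with hypothetical helpers (containsDelete and decodeUnicodeEscapes are not Caja functions and do not reflect how Caja is implemented). A source-level check that looks for the literal spelling of a reserved word misses the escaped spelling unless the escapes are decoded first:

```javascript
// Hypothetical rewriter check: flag source that uses the `delete` keyword.
function containsDelete(source) {
  return /\bdelete\b/.test(source);
}

// Decode \uXXXX escapes so escaped and plain spellings compare equal.
// Used only for the check: blindly decoding escapes would also alter
// the content of string literals.
function decodeUnicodeEscapes(source) {
  return source.replace(/\\u([0-9a-fA-F]{4})/g, (match, hex) =>
    String.fromCharCode(parseInt(hex, 16)));
}

const escapedSource = 'de\\u006Cete obj.prop';

console.log(containsDelete(escapedSource));                       // false
console.log(containsDelete(decodeUnicodeEscapes(escapedSource))); // true
```

Whether the escaped form is dangerous depends on how the target engine tokenizes it, which is exactly the part a server-side rewriter cannot observe.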

https://github.com/google/caja/wiki/SecurityAdvisories

Server Side Filtering – Lessons Learned

Negative security models are error-prone.

Different browsers behave differently: it is hard to make general assumptions.

Browsers do not always strictly implement RFCs and related specifications.

Filtering something that is going to be parsed by a different parser is hard (Models Impedance Mismatch)
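The difference between a negative and a positive security model can be sketched as follows. The tag lists and function names are hypothetical and chosen only for illustration:

```javascript
// Hypothetical tag lists, for illustration only.
const deniedTags = new Set(['script', 'iframe', 'object']);
const allowedTags = new Set(['b', 'i', 'div', 'img']);

// Negative model: anything not known to be bad passes.
function passesDenyList(tag) {
  return !deniedTags.has(tag.toLowerCase());
}

// Positive model: only what is known to be good passes.
function passesAllowList(tag) {
  return allowedTags.has(tag.toLowerCase());
}

// A tag the filter author did not think about.
console.log(passesDenyList('svg'));  // true
console.log(passesAllowList('svg')); // false
```

A positive model fails closed on anything unforeseen, but it still depends on the filter and the browser agreeing on what the tag name is.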

Client Side HTML Filtering - HTML Data Bindings

2007 - HTML Data Binding – Author: Stefano Di Paola

Tries to overcome the server-side problem.

HTML sanitizer using a JS + SQL Prepared Statements approach, and uses the browser's native parser.

http://www.wisec.it/sectou.php?id=46c5843ea4900

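// URL native parsing
var a = document.createElement('A');
a.href = url; // assumption: 'url' holds the untrusted value; the slide omits this step
return a.protocol;

// Binding a value as a text node
var el = document.getElementById('someid');
var val = document.createTextNode(binding['id']);
el.appendChild(val);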
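The URL snippet above relies on an anchor element inside a browser page. A comparable sketch that also runs outside a page, assuming the WHATWG URL constructor (available in modern browsers and Node.js) is an acceptable stand-in for the anchor element; the helper name hasAllowedProtocol and the allow-list are hypothetical:

```javascript
// Hypothetical allow-list of URL schemes.
const ALLOWED_PROTOCOLS = new Set(['http:', 'https:']);

// Let the platform's own URL parser decide what the scheme is, instead of
// pattern-matching on the raw string.
function hasAllowedProtocol(untrustedUrl, base) {
  let parsed;
  try {
    parsed = new URL(untrustedUrl, base);
  } catch (e) {
    return false; // unparsable input is rejected
  }
  return ALLOWED_PROTOCOLS.has(parsed.protocol);
}

const base = 'https://example.com/';
console.log(hasAllowedProtocol('/profile', base));              // true
console.log(hasAllowedProtocol(' JaVaScRiPt:alert(1)', base));  // false
console.log(hasAllowedProtocol('java\tscript:alert(1)', base)); // false
```

The parser strips the leading space, removes the tab, and lowercases the scheme before the check runs, which is the normalization a hand-written filter tends to get wrong.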

HTML Data Bindings - Bypass

Variable-width charsets could mess up every tag.

<plaintext> tag! Leads to a bypass of the binding area. (Half-fixed)

(Fixed... sort of...)

Had to face browser complexity + some wrong assumptions! Anyway, it was just a proof of concept... :)

http://www.wisec.it/ph/test.php

Client Side JS Filtering - JSReg.*

~2010 - Author: Gareth Heyes

JS sandbox. Uses a JS tokenizer based on complex RegExps.

Code:

https://code.google.com/p/jsreg/source/browse/trunk/JSReg/JSReg.js

Client Side Filtering - JSReg.*

Code rewriting approach + sandboxed checks

http://www.businessinfo.co.uk/labs/jsreg/jsreg.html

https://code.google.com/p/jsreg/

JSReg.* – Bypasses 2010-2011

(/[/]/)[/(\/\))\/+alert(top)+"\/"/i]

.. first is the failure to strip the single line comment which then fools the regex rule into thinking that the code is a regex object and not function calls ..

http://marc.info/?l=websecurity&m=126855547523766

http://www.thespanner.co.uk/2010/10/31/jsreg-bypasses/

Client Side JS Filtering - MentalJS

2011 - Author: Gareth Heyes

ES5 JS sandbox

Parser-based. No RegExp this time.

Rewrites the JS code.

https://github.com/hackvertor/MentalJS

MentalJS - Bypasses 2014-2015

Whitelisted attribute innerHTML for script elements.

insertBefore with null/undefined: bypass through unexpected browser behavior.

Other very interesting bypasses:

x=document.createElement('script');x.innerHTML='alert(location)';document.body.appendChild(x);

s=document.createElement('script');s.insertBefore(document.createTextNode('alert(location)'),null);document.body.appendChild(s);

http://www.thespanner.co.uk/2015/05/03/how-i-smashed-mentaljs/

Client Side JS Filtering – Evel (Secure Eval)

2013 - Author: Nathan Vander Wilt

JS sandbox using the browser parser and environment redefinition (runtime sandbox).

https://github.com/natevw/evel/

Evel - Bypasses 2013

All of them defeated it by reaching the window object. If one can reach Function(), the this object inside the function it creates is always the window. Function is the constructor of all functions.

http://perfectionkills.com/global-eval-what-are-the-options/#indirect_eval_call_theory
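The Function() escape can be sketched against a deliberately naive runtime sandbox. The function naiveSandbox below is hypothetical and is not Evel's implementation; it only shadows a few global names, which is enough to show the class of bypass:

```javascript
// Hypothetical runtime sandbox: hides globals by shadowing them with
// undefined parameters, then evaluates the untrusted expression.
function naiveSandbox(untrustedExpression) {
  const run = new Function(
    'window', 'globalThis', 'Function',
    '"use strict"; return (' + untrustedExpression + ');'
  );
  return run(undefined, undefined, undefined);
}

// The shadowed names look inaccessible from inside...
console.log(naiveSandbox('typeof window')); // "undefined"

// ...but every function's constructor is the real Function, and a function
// built with it runs in sloppy mode with `this` bound to the global object.
const leaked = naiveSandbox('(function () {}).constructor("return this")()');
console.log(leaked === globalThis); // true
```

Shadowing names does not help as long as any reachable object still leads back to Function through its constructor property.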

Client Side HTML Filtering - DOMPurify

2014 - Author: Mario Heiderich

Uses the browser's internal parser to create a DOM model and then sanitizes untrusted HTML:

https://github.com/cure53/DOMPurify

Client Side HTML Filtering - DOMPurify

Sanitizes over a whitelist of tags and attributes:

Uses JS to access the DOM: as seen in Data Binding, there are several browser subtleties.

https://github.com/cure53/DOMPurify

DOMPurify – Bypasses

DOM Clobbering checks bypass:

Attack:

https://soroush.secproject.com/blog/2014/04/how-did-i-bypass-everything-in-modsecurity-evasion-challenge/

http://www.thespanner.co.uk/2013/05/16/dom-clobbering/

AngularJS - Sandbox

2013 - Author: Google

AngularJS – Bypasses

Bypass of the blacklisted function call prevention:

https://code.google.com/p/mustache-security/wiki/AngularJS

JS Sandbox Approach - Attacks

Fool the parser into a wrong state (Impedance Mismatch) (different parsers) – code rewrite

Fool the sandbox via unexpected browser behavior. E.g. DOM Clobbering et al.

Access unsanitized members (constructor, prototype, etc.)

HTML Sandbox Approach - Attacks

Fool the parser into a wrong state (Impedance Mismatch) (different parsers) – different parser

Fool the native parser (bad assumptions), as browsers have 5+ parsers. E.g.: is createHTMLDocument implemented as expected?

Fool the sandbox via unexpected browser DOM behavior. E.g.: are attribute names and values correctly normalized?

Access unsanitized members (constructor, prototype, etc.)

Client Side – Lessons Learned

It may really be the right direction, but:

JS: code rewriting is hard in the JS context, as it requires a separate parser.

HTML: using the browser's parsers makes it possible to identify automatically, with no particular effort, the right context, at the right time, with the right charset.

HTML: browsers still have 5+ (!!) parsers (HTML, URL, CSS, JS, SVG, ...). They need to be applied at the right time!

HTML + JS: intricate ecosystem (DOM Clobbering)

Disclaimer

The bypasses of the sandboxes were all fixed.

Not all sandboxes are still maintained, but most of them are.

It's still unproven that the proposed solutions are completely safe

...but without the breakers' feedback, no bypass would have been found. (The world needs good brains to break things, and then fix them!)

Authors are brave and smart people trying to solve a complex problem with passion and reasoning

Conclusions

The filtering approach to sanitizing/sandboxing untrusted code is a hard problem.

Several attempts have been made over the years.

Using a different layer to sanitize code that will be interpreted in a complex environment is usually a bad idea. (E.g. Server Side → Client Side)

A fully functional, unbreakable solution is yet to be released

Browser vendors and Sandbox builders should join together to solve the problem

Questions?

Stefano Di Paola

Mail: [email protected]

Twitter: @WisecWisec

Blog: blog.mindedsecurity.com

Site: www.mindedsecurity.com

Thanks!
