Fuzzing CORB to Break Browser Side-Channel Defenses
Origin: In a browser, two pages are at the same origin if they have the same protocol, host, and port (if specified).
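The origin comparison can be sketched in a few lines. This is a minimal illustration, assuming the standard URL class available in browsers and Node.js; the helper name sameOrigin is hypothetical and not taken from any browser source.

```javascript
// Two URLs are same-origin when protocol, host, and port all match.
// URL.origin serializes that triple and omits the default port.
function sameOrigin(a, b) {
  return new URL(a).origin === new URL(b).origin;
}

// Port 443 is the default for https, so these two share an origin.
console.log(sameOrigin("https://ucla.edu/a", "https://ucla.edu:443/b")); // true
// A different protocol or a different host means a different origin.
console.log(sameOrigin("https://ucla.edu/", "http://ucla.edu/")); // false
console.log(sameOrigin("https://ucla.edu/", "https://my.sa.ucsb.edu/")); // false
```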
Side channel in this context: We are not talking about XS-Search or other higher-level timing attacks. Instead, we focus on side-channel attacks induced by low-level optimizations (e.g. Spectre and Meltdown).
0x01 Same Origin Policy to Site Isolation
In short, the same-origin policy (SOP) states that content from different origins cannot interact with each other. For instance, suppose UCLA's webmasters are evil and want to leak the schedules of students at UCSB. Without SOP, the evil webmaster could put the following code on ucla.edu, so that when a student opens ucla.edu in the browser after logging into UCSB's portal, their schedule is leaked.
    // At https://ucla.edu
    // add evil jQuery first so that less code needs to be written
    $.get("https://my.sa.ucsb.edu/gold/StudentSchedule.aspx")
    // deal with response
Luckily, in the real world, trying to read the response from UCSB here triggers the following error:
Access to XMLHttpRequest at 'https://my.sa.ucsb.edu/gold/StudentSchedule.aspx' from origin 'https://ucla.edu' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource.
UCSB could allow such access on its side (e.g. by setting CORS headers), but this is outside the scope of our discussion.
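To make the error above concrete, here is a rough sketch of the check that decides whether a script may read a cross-origin response. It is a deliberate simplification that ignores credentials and preflight requests; mayReadResponse is a hypothetical helper, and the headers are modelled as a plain Map with lowercase names.

```javascript
// Simplified read check: without an Access-Control-Allow-Origin header
// that matches the requesting origin (or "*"), the read is refused.
function mayReadResponse(requestOrigin, responseHeaders) {
  const allowed = responseHeaders.get("access-control-allow-origin");
  if (allowed === undefined) return false; // header missing, as in the error above
  return allowed === "*" || allowed === requestOrigin;
}

// UCSB's schedule page sends no CORS header, so ucla.edu cannot read it.
const scheduleHeaders = new Map([["content-type", "text/html"]]);
console.log(mayReadResponse("https://ucla.edu", scheduleHeaders)); // false

// A resource that explicitly opts in for ucla.edu would be readable.
const optInHeaders = new Map([["access-control-allow-origin", "https://ucla.edu"]]);
console.log(mayReadResponse("https://ucla.edu", optInHeaders)); // true
```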
Everything seemed perfectly secure until researchers discovered that low-level optimizations can leak information. This means that even with SOP in place, once the response from UCSB is loaded, UCLA can leverage a side-channel attack to exfiltrate data from UCSB. Moreover, higher-level components such as the cache, JSArray, etc. are not guaranteed to be safe from data exfiltration between sites at different origins (e.g. CVE-2020-6442). A sandbox for each site would therefore help a lot, so engineers at Google proposed Site Isolation (SI). Much like SOP, SI states that content from different origins should be loaded in different processes.
0x02 Cross-Origin Read Blocking (CORB)
However, there is a major obstacle to implementing SI: for legacy reasons, content from a different origin can still be loaded into a malicious process. Take evil UCLA and good UCSB again as an example:
    <!-- At ucla.edu -->
    <img src="//ucsb.edu/super-secret.json" />
    <script src="//ucsb.edu/super-secret.json"></script>
    <link rel="stylesheet" href="//ucsb.edu/super-secret.json" />
    <!-- etc. -->
Here, SOP does not prevent the content from UCSB from being loaded, since in the past nobody saw this as a threat. (The script case arguably is one: it is called XSSI, but that is outside the scope of this discussion.) SI is different: it should not allow these secret JSON files to be loaded into the evil UCLA process. As a result, engineers at Google introduced Cross-Origin Read Blocking (CORB). In short, it stops responses that likely contain secrets from a different origin from flowing into the requesting process. Some may ask why browsers do not simply ban these legacy uses of HTML. They can't: a ban would leave millions of websites without styling and millions of programmers spending days and nights fixing their sites.
Here is a simplified outline of how the Chromium project implements CORB:
    Block = replace the whole response with null in the process of the requesting origin

    // https://source.chromium.org/chromium/chromium/src/+/master:services/network/public/cpp/cross_origin_read_blocking.cc;l=251
    if MIME type in header is in this list (gzip, msexcel, etc.), then block

    // https://source.chromium.org/chromium/chromium/src/+/master:services/network/public/cpp/cross_origin_read_blocking.cc;l=461
    if MIME type in header is JSON, check with function if it is indeed JSON, then block;
    if MIME type in header is HTML, check with function if it is indeed HTML, then block;
    if MIME type in header is XML, check with function if it is indeed XML, then block;
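The outline above can be turned into a small runnable sketch. This is not Chromium's code: the MIME list is an illustrative subset, the three regex sniffers are crude assumptions standing in for Chromium's real sniffing functions, and every name here (shouldBlock, ALWAYS_BLOCKED, looksLikeJson, and so on) is hypothetical.

```javascript
// Illustrative subset of MIME types that are blocked without sniffing.
const ALWAYS_BLOCKED = new Set(["application/gzip", "application/msexcel"]);

// Crude stand-ins for the real sniffers: each only inspects the body prefix.
const looksLikeJson = (body) => /^\s*\{\s*"(?:[^"\\]|\\.)*"\s*:/.test(body);
const looksLikeHtml = (body) => /^\s*<(?:!doctype html|html|head|body|script)/i.test(body);
const looksLikeXml = (body) => /^\s*<\?xml/i.test(body);

// Returns true when the cross-origin response should be blocked.
function shouldBlock(contentType, body) {
  const mime = contentType.split(";")[0].trim().toLowerCase();
  if (ALWAYS_BLOCKED.has(mime)) return true;
  // For sniffed types, block only if the body confirms the declared type.
  if (mime === "application/json" || mime === "text/json") return looksLikeJson(body);
  if (mime === "text/html") return looksLikeHtml(body);
  if (mime === "application/xml" || mime === "text/xml") return looksLikeXml(body);
  return false; // anything else (images, scripts, styles) passes through
}

console.log(shouldBlock("application/json", '{"secret": 1}')); // true
console.log(shouldBlock("text/html", "<!DOCTYPE html><html></html>")); // true
console.log(shouldBlock("image/png", "\x89PNG")); // false
```

The design point worth noticing is that a sniffed type is blocked only when the header and the body agree, which is exactly where a mismatch can slip through.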
0x03 Breaking CORB
Note that all of this is still evolving and the implementation is largely heuristic. As a result, there are likely many secret responses that CORB fails to identify (i.e. CORB is blind to them). Here are a few examples I have discovered:
Response MIME type = JSON, but returned content is HTML. Discovered in:
- A Chinese government website’s weird news API
Response MIME type = JSON, returned content is also JSON, but PHP’s error_reporting function is called. Discovered in:
- WordPress under specific circumstances
- WeCTF 2020 web challenge
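Both cases can be reproduced against a toy sniffer. The sketch below assumes a simplified prefix-based JSON check (hypothetical, not Chromium's actual function) and an illustrative PHP warning string of the kind printed into the body when errors are displayed; the file path and variable name in it are made up.

```javascript
// Hypothetical prefix sniffer: JSON is confirmed only if the body starts
// with an object whose first key is a quoted string.
const looksLikeJson = (body) => /^\s*\{\s*"(?:[^"\\]|\\.)*"\s*:/.test(body);

const secret = JSON.stringify({ schedule: ["CS 101", "MATH 4A"] });

// All three bodies are served with a JSON MIME type.
const cases = {
  clean: secret,
  htmlBody: "<html><body>news item</body></html>",
  phpWarning:
    "<br />\n<b>Warning</b>:  Undefined variable $page in " +
    "<b>/var/www/api.php</b> on line <b>3</b><br />\n" + secret,
};

for (const [name, body] of Object.entries(cases)) {
  const verdict = looksLikeJson(body) ? "confirmed as JSON, blocked" : "not confirmed, CORB blind";
  console.log(name + ": " + verdict);
}
```

Under these assumptions only the clean body is confirmed as JSON; the warning-prefixed body still carries the full secret but no longer looks like JSON at its start.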
0x04 Attack Plan
Fuzz PHP applications to identify possible ways to blind CORB:
- Report each way of blinding CORB as a deficiency in CORB
- If a PHP application SELECTs from a database and the database content affects the response, then CORB being blind is not acceptable
- Since CORB is mostly blind in unexpected scenarios, prioritize the inputs with the most constraints
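The plan above could be prototyped roughly as follows. Everything here is a hypothetical sketch: mockEndpoint stands in for a real PHP application, the constraint counts and mutation list are invented for illustration, and the oracle reuses a simplified prefix-based JSON sniffer instead of Chromium's real one.

```javascript
// Simplified oracle: does the body start like a JSON object?
const looksLikeJson = (body) => /^\s*\{\s*"(?:[^"\\]|\\.)*"\s*:/.test(body);

// Stand-in for a PHP endpoint that declares JSON but leaks a warning into
// the body whenever `id` is not a plain numeric string.
function mockEndpoint(params) {
  const ok = typeof params.id === "string" && /^\d+$/.test(params.id);
  const warning = ok ? "" : "<br />\n<b>Warning</b>: bad id<br />\n";
  return { contentType: "application/json", body: warning + JSON.stringify({ secret: "s3cr3t" }) };
}

// Inputs with an assumed number of constraints each; more constraints first.
const inputs = [
  { name: "lang", constraints: 1, valid: "en" },
  { name: "id", constraints: 2, valid: "42" },
];

// Mutations meant to push an input outside what the application expects.
const mutations = [(v) => [v], () => "", (v) => v + "'", () => "A".repeat(64)];

function fuzz(endpoint) {
  const findings = [];
  const ordered = [...inputs].sort((a, b) => b.constraints - a.constraints);
  for (const input of ordered) {
    for (const mutate of mutations) {
      // Start from valid values and mutate exactly one input.
      const params = Object.fromEntries(inputs.map((i) => [i.name, i.valid]));
      params[input.name] = mutate(input.valid);
      const res = endpoint(params);
      // CORB is blind when the header says JSON but the body is not confirmed.
      if (res.contentType.startsWith("application/json") && !looksLikeJson(res.body)) {
        findings.push({ param: input.name, value: params[input.name] });
      }
    }
  }
  return findings;
}

const findings = fuzz(mockEndpoint);
console.log(findings.length + " blinding inputs found");
```

Against a real target, mockEndpoint would be replaced by an HTTP request, and each finding would then need manual confirmation in a browser before being reported.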