Monday, June 8, 2009

Browser architecture; Chrome;

Basic questions:
* Where does V8 live in Chrome?
How does it interact with the rest of the browser?
* Where do the JS engines (interpreters) live in the architecture
of IE and Mozilla Firefox?

The browser is divided into two functional units with different responsibilities and different trust levels. The rendering engine (which represents the web protection domain) contains the HTML parser, the JavaScript VM, and the DOM handler. It is sandboxed (has restricted access to the underlying OS) and treated as a black box whose input is HTML and whose output is "rendered bitmaps" (to be displayed on the screen). The browser kernel (which represents the user protection domain) manages persistent resources (e.g., cookies and the password database) and interacts with the OS to receive user input, draw to the screen, and access the network. The browser kernel is not sandboxed. Goal: if (or rather, when) the rendering engine is compromised, contain that compromise to the rendering engine; do not let it compromise the user's entire browser, much less their entire computer.

More on the Rendering Engine (RE):
  • Interacts with untrusted web content.
  • Converts HTTP responses and user input events (as received from the browser kernel) into rendered bitmaps.
  • Services calls to the DOM API.
  • Uses the browser kernel API to interact with the user, the underlying machine, or the network.
  • Does most of the parsing involved in browsing and interacting with web content, including: HTML parsing, image decoding, and JavaScript parsing.
  • Contains the largest and most complicated body of code; hence, likely to be the location of future security vulnerabilities.
  • A separate instance is used for each tab (or for each site, in the case that one tab is navigated from site A to site B). A separate instance is also used to display trusted content that is generated by the Browser Kernel or by another RE; this separate instance "does not handle content from the web." The exception to this rule is the Web Inspector, which displays trusted content but is rendered by an RE that also contains web content.
  • Proceeds in stages:
  1. Parse.
  2. Build in-memory representation of the DOM.
  3. Lay out document graphically.
  4. Manipulate document in response to script instructions.
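
To make the four stages concrete for myself, here is a minimal Python sketch of the pipeline. Everything in it is a toy of my own invention (the `Node` class, `DomBuilder`, the "bitmap" being a list of text rows); none of it reflects how the real rendering engine is implemented. It only illustrates the black-box shape: HTML goes in, something displayable comes out.

```python
from html.parser import HTMLParser


class Node:
    """A toy DOM node: a tag name, optional text, and child nodes."""

    def __init__(self, tag, text=""):
        self.tag = tag
        self.text = text
        self.children = []


class DomBuilder(HTMLParser):
    """Stages 1 and 2: parse HTML and build an in-memory tree.

    Simplification: void elements such as <br> are not special-cased.
    """

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self.stack[-1].children.append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the matching open element, if there is one.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        if data.strip():
            self.stack[-1].children.append(Node("#text", data.strip()))


def parse(html):
    """Run stages 1 and 2 and return the root of the toy DOM."""
    builder = DomBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def layout(node, depth=0):
    """Stage 3 (toy): one output row per text node, indented by tree depth."""
    rows = []
    if node.tag == "#text":
        rows.append("  " * depth + node.text)
    for child in node.children:
        rows.extend(layout(child, depth + 1))
    return rows


def run_script(node, old, new):
    """Stage 4 (toy): a 'script' that rewrites matching text nodes in place."""
    if node.tag == "#text" and node.text == old:
        node.text = new
    for child in node.children:
        run_script(child, old, new)


def render(html):
    """The black box: HTML in, 'bitmap' (here, a list of text rows) out."""
    return layout(parse(html))
```

Calling `render("<p>Hello <b>world</b></p>")` yields one row per text node; calling `run_script` on a parsed tree and then `layout` again shows stage 4 changing what stage 3 produces.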
More on the Browser Kernel (BK):
  • Manages multiple instances of the RE.
  • Implements the browser kernel API.
  • Manages persistent state (such as bookmarks, cookies, saved passwords).
  • Interacts with the network.
  • "Mediates between the RE and the OS's native window manager"
  • Keeps track of which privileges it has given to which RE; uses this to implement a security policy that defines how exactly the RE is sandboxed. (Is this saying that the BK implements the sandbox that supposedly contains the RE? Or merely that the BK keeps track of the values used in sandboxing the RE, i.e., provides parameters to whatever code actually performs the sandboxing?)
How were functions split up between the RE and the BK? It sounds as though the most vulnerable tasks (i.e., those functions that have historically been responsible for the largest number of security vulnerabilities) were assigned to the RE. So if some task or function has historically been error-prone, make the RE perform that task/function. Why? Because the RE is sandboxed whereas the BK is not, so a bug there does less damage.
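
One way to picture the "keeps track of privileges" bullet is a per-RE grant table that the BK consults on every request. The sketch below is a guess at the shape of such bookkeeping, not the actual mechanism: the class name, the method names, the `"upload"` privilege, and the file path are all hypothetical.

```python
class BrowserKernel:
    """Toy model: tracks per-RE grants and checks them on every request."""

    def __init__(self):
        # RE id -> set of (privilege, resource) pairs granted to that RE.
        self._grants = {}

    def grant(self, re_id, privilege, resource):
        """Record that one RE instance may use `privilege` on `resource`."""
        self._grants.setdefault(re_id, set()).add((privilege, resource))

    def is_allowed(self, re_id, privilege, resource):
        """Default deny: anything not explicitly granted is refused."""
        return (privilege, resource) in self._grants.get(re_id, set())

    def upload_file(self, re_id, path):
        """A hypothetical BK API call that is mediated by the grant table."""
        if not self.is_allowed(re_id, "upload", path):
            raise PermissionError(f"RE {re_id} may not upload {path}")
        return f"uploading {path}"
```

With this shape, a grant made to one RE instance does not carry over to any other instance, which is the property that matters if a single RE is compromised.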
Above is a picture of how the rendering engine is effectively treated as a black box; its input is some HTML page and its output is a bitmap which should be displayed on the user's screen. Presumably, the RE gets that HTML page via a call to the BK and provides the bitmap via another BK API method. In any case, the exact mechanics are less important than the idea that the BK actually reads from / writes to the network, provides any HTML pages to the rendering engine, then interacts with the OS to actually display the rendered bitmap. So all of the privileged stuff (network interaction, receiving user input, writing to the screen) is done by the BK, not the RE. This separation helps prevent an attacker who knows an unpatched vulnerability in the image decoder from taking control of the browser kernel.
One exception is noted: the BK parses HTTP headers in order to extract the HTML or zipped content. If the content is compressed, the BK invokes a decoder (e.g., gzip, bzip2) to decompress the HTTP response, then passes the decompressed HTML (or whatever) to the RE.
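
As a rough illustration of that exception, the sketch below splits a raw HTTP response into headers and body, decompresses the body when a `Content-Encoding` header says so, and returns the bytes that would be handed to the RE. The function names and the use of `bzip2` as the encoding token are my assumptions, and real HTTP parsing (chunked transfer, folded headers, and so on) is far more involved.

```python
import bz2
import gzip


def split_response(raw):
    """Split a raw HTTP response into (headers dict, body bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    headers = {}
    for line in head.split(b"\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(b":")
        key = name.strip().lower().decode("ascii")
        headers[key] = value.strip().decode("latin-1")
    return headers, body


# Assumption: encoding tokens as they might appear in Content-Encoding.
DECODERS = {"gzip": gzip.decompress, "bzip2": bz2.decompress}


def body_for_renderer(raw):
    """Return the decoded body that the BK would pass on to the RE."""
    headers, body = split_response(raw)
    encoding = headers.get("content-encoding", "identity").lower()
    if encoding == "identity":
        return body
    if encoding not in DECODERS:
        raise ValueError(f"unsupported Content-Encoding: {encoding}")
    return DECODERS[encoding](body)
```

The point of the sketch is only where the work happens: header parsing and decompression sit on the unsandboxed side, and the RE sees plain content.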

Plug-ins: e.g., Adobe Acrobat, Flash Player, RealPlayer, QuickTime, ...
* Each runs in a separate host process and runs outside of the sandbox
* Runs with the user's full privileges; e.g., Flash player can access the user's
microphone and webcam, can write to the file system, ...

The sandbox
The goal of the sandbox is to prevent any rendering engine process from interacting directly with the file system, including the registry (and presumably the network too, but perhaps some other code is responsible for restricting network access). We want all such accesses to go through the Browser Kernel, which exports an interface for doing these things. More generally, we want to funnel all interactions with the OS through the BK interface. The enforcement of the sandbox boundaries actually consists of several techniques:
  • Have RE run with a restricted security token (rather than the user's Windows security token): causes the Windows Security Manager to interpose on each access by the RE to a securable object and determine whether the RE has sufficient privileges to access that object in that way. In most cases, the RE's privileges are such that access will be denied.
  • Run RE in a "separate desktop": for accesses to non-securable objects, or for Windows API calls for which the privilege check is insufficient, this limits how much a given RE can affect. Rather than running the RE on the user's desktop (where the RE could call SetWindowsHookEx or broadcast a message to all windows, including windows unrelated to the browser), the RE runs in its own little desktop. It is a bit unclear whether each RE instance runs in its own desktop.
  • Run RE in a Windows Job Object: restricts its ability to create new processes, read or write the clipboard, or access user handles.
  • Have RE downgrade its privileges prior to "rendering web content" (which I'm taking to mean prior to receiving and parsing HTML content). This is somewhat reminiscent of how the setuid system call is used on *nix systems, though I'm not sure whether this downgrade is irrevocable or whether the RE can still upgrade its privileges back to the level it had before the downgrade.
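
To convince myself why the restricted-token bullet above denies "most" accesses, here is a toy model of an access check. It is not the Windows API and glosses over how real tokens and ACLs work; the `Token` and `SecurableObject` classes, the SID strings, and the example path are all made up. The one assumption doing the work is that the renderer's token carries no identity that any object's ACL grants rights to.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Token:
    """Toy security token: a name plus the identities (SIDs) it carries."""
    name: str
    sids: frozenset


@dataclass
class SecurableObject:
    """Toy securable object with an ACL mapping SID -> set of rights."""
    name: str
    acl: dict = field(default_factory=dict)


def access_check(token, obj, right):
    """Grant access only if some SID in the token is allowed `right`."""
    return any(right in obj.acl.get(sid, set()) for sid in token.sids)


# Hypothetical tokens: the user's normal token and a stripped-down one.
user_token = Token("user", frozenset({"alice", "Users"}))
renderer_token = Token("renderer", frozenset())  # assumption: no usable SIDs

# Hypothetical securable object owned by the user.
documents = SecurableObject(
    "C:/Users/alice/Documents", {"alice": {"read", "write"}}
)
```

In this model `access_check(user_token, documents, "read")` succeeds while the same call with `renderer_token` fails, so the RE has to ask the BK for anything it needs from the file system.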
The Browser Kernel Interface
If the sandbox above works properly then in order to access OS functionality (e.g., user interaction, persistent storage, and the network), the RE must go through the BK API.

* Chromium Developer Documentation (including design documents)

To do (to make this better/more thorough):
* Add notes about section 5, the BK interface
* Read Chromium Developer Documentation
* Talk about Google Gears and maybe more about the Plug-in architecture too
