Thursday, June 11, 2009

V8: JavaScript Virtual Machine

The following are some notes from talks given about V8, which is Google's JavaScript virtual machine. These talks focus on a few main features which led to V8's performance advantages over the competition: (1) hidden classes, (2) generation of native code, and (3) inline caching. Hidden classes are V8's way of enforcing some order on the chaos that is JS dynamism. These notes are offered as-is; any mistakes are most certainly mine.

The three talks are:
* Lars Bak on V8 (September 2008)
* Kevin Millikin on V8 (September 2008)
* Lars Bak on V8 (April 2009)

Miscellaneous facts about V8:
  • Functions compiled JIT-style; if never invoke a function, it will never be compiled (lazy compilation).

  • It generates machine code for two architectures: x86 and ARM (the latter of which is used in mobile phones).

  • It's open-source.

  • The V8 JavaScript library is implemented in JavaScript (in contrast to being implemented in the VM); this has several benefits: (1) easier to change and extend the library, (2) don't have to know VM internals to read/understand/modify JS library function implementations, (3) enables them to capitalize on the performance of the JS compiler (and on improvements therein). The drawback is that startup time is 30 ms (for a new tab); they have some solution which entails dumping the heap contents. Not sure how this works but it brings startup time to 4 — 8 ms.

  • In V8's hidden classes, the following two functions (or object prototypes) are not treated as identical:

    function Point( x, y ) { this.x = x; this.y = y; }
    function Point( x, y ) { this.y = y; this.x = x; }

    That is, the order in which properties are defined matters.
Lars Bak on V8 (September 2008)

V8 compiles JavaScript to native machine code, in contrast to previous implementations of JS (interpreters). Lars Bak talks about V8 in a couple different videos. Summarized here are the presented advances. V8...
  1. Uses hidden classes.

    In most JS implementations, if there are two objects of the same type, the representations of those objects are independent of one another; i.e., it's not clear that the objects share lineage. In V8, they introduce hidden classes, where a hidden class captures the property/method structure of an object so that all instances of that object point to that hidden class. Conceptually, this is not dissimilar from how classes with an inheritance relationship are represented in C++ (as discussed here).

    Their use of hidden classes enables them to generate native machine code for a JavaScript pointer dereference (i.e., accessing an object property) by regularizing the structure of how objects are stored. The alternative (as is done in precursor JavaScript implementations) is to resolve that property through a comparatively expensive run-time lookup.

  2. Generates native machine code.

    Their generation of native machine code to perform method and property lookup for an object results in fast execution. They compound the effects of this by caching such native machine code translation (of a line in a JavaScript program) and re-using the cached machine code rather than retranslating that line every time. This is the inline caching technique discussed here.

  3. Does garbage collection (memory management generally) more efficiently.

    Their goal was fast allocation and to eliminate garbage collection pauses, which can interfere with a program's interactivity. They achieved this by implementing a garbge collector that reclaims objects (from the global object heap) incrementally as the script executes. More on garbage collection here.
Kevin Millikin on V8 (September 2008)

Motivation for Google Chrome and creating a new JavaScript implementation: in 2006, JS implementation didn't scale in the # of lines of code or in the # of objects created. The basic design ideas discussed (the same as those introduced above):
* Hidden classes and hidden class transitions
* Inline caching
* Compilation to native code
* Efficient memory management system

The basic intuition on how to make JS faster was to bring order to chaos, in particular to the objects portion of JS, which enables objects to be created on the fly, properties/methods added/removed on the fly, etc. Another complicating factor was JS's support of duck typing, which meant that when a line of JS entailed an access of an object property (e.g., val.x), the actual object used as val (at run-time) could be any object at all (in contrast to languages which have stricter typing policies). This complete lack of information meant that every access of a property or method involved hash table lookup(s), which are expensive compared to array dereferences. So the idea was to find a way to categorize JS objects into classes; this would allow them to apply well-known performance optimization techniques from class-based language implementations.

In statically-typed object-oriented languages — those for which most type checking is done at compile time — (e.g., Java and C++), we know the layout of an object at compile time. So, accessing an object property value is just like an array access. Same is true for methods; the address of the actual method may be unknown (at compile time) but we know the location of a pointer which will hold the address of that method at run-time. By contrast, in JavaScript, there are no classes. So when see a particular object access at some point in a program, don't know (a) if that object even has that property, (b) if it does have the property, where that property is located. Resolving these questions in previous JS implementations entailed performing hash table lookup(s). Their solution: hidden classes.

V8's Hidden Classes (minutes 10:22 to 26:50)
We've discussed the motivations for representing the shared structure of objects (e.g., two instances of the same Point object). Now let's look at how this is done. Imagine we have the following code (explanation starts at minute 13:15):

function Point( x, y ) { // line 1
this.x = x; // line 2
this.y = y; // line 3
} // line 4
var p = new Point(0,1); // line 5
var q = new Point(2,3); // line 6

When V8 encounters line 5 — which is the first instantiation of a Point object — it will look for a hidden class associated with Point. If it doesn't find one (which it won't in this case), it will create one. V8 will create a structure representing p and it will create a series of hidden classes. The first hidden class (class 0) is created after executing just the first line of the Point class; hence, class 0 has no properties. Then V8 encounters line 2 and so it creates a new hidden class (class 1) which has property x; there is a transition (pointer) from class 0 to class 1. Then V8 encounters line 3 and so V8 creates a new hidden class (class 2) with properties x and y; then V8 creates a transition (pointer) from class 2 to class 3. Finally, the p object has its values for properties x and y populated in its own data structure; then there is a pointer from the p data structure to the hidden class representing Point (which is class 2).

Two natural questions arise when considering the above:
  1. Why create separate classes (class 0, 1, and 2) rather than just creating a single class and appending all properties to it?
  2. Why keep the earlier versions of the Point class (0 and 1) around?
(I don't have any answers for these questions, yet.)

V8's Inline caching
Every property access (reading a property, writing a property, invoking a function) "is governed by an inline cache stub." An inline cache stub is a little block of machine code; control is transferred to such a stub whenever there is a property access. My sense is that each of the stubs used is different from the others; i.e., the stub code is not generic but rather is tailored to the particular property accessed at some site. And for the actual story...

An inline cache stub is in one of four states:
  1. Uninitialized: this is a generic stub.

    We're in this state when we've never seen any object at a property access site (which is just another way of saying: when this particular property access at this particular code location has never been exercised). This stub does a dynamic lookup; once that lookup is complete, this stub rewrites the call to the inline cache stub to be a call to a different stub, the pre-monomorphic one.

  2. Pre-monomorphic: this is a generic stub.

    This also performs a run-time lookup then rewrites the call to be to a different stub: a customized monomorphic one, which is written just in time. Why even bother with this state? Because in JS, lots of code is executed exactly once (setup, initialization). Don't want to go to the trouble of generating customized, optimized code (the monomorphic stub) for a property access if it only occurs once.

  3. Monomorphic: this is a tailored/customized stub.

    This stub records the hidden clas of the object we saw there. If we ever see an object at this call site (i.e., property access) with a different hidden class then rewrite the stub to be megamorphic. By "customized," we mean that the stub will contain the address of the expected hidden class, which will be compared to the hidden class of the object being accessed, as well as the known offset (to the accessed property).

  4. Megamorphic: this is a generic stub.

    This always does a run-time lookup but, unlike the uninitialized and pre-monomorphic states, a megamorphic stub will never replace itself with anything. Always stays megamorphic (though note that the garbage collection engine will delete all stubs, effectively reverting them to the uninitialized state, periodically; so, it is possible to transition from megamorphic to some other state (namely, uninitialized)).
Recall that the backdrop here is that all previous JS implementations (i.e., interpreters) did a dynamic lookup for every property access in a program every time that property was accessed. Of course it's still possible to cause V8 to revert to that pathological state (by forcing the hidden class of the object accessed to change back and forth between different object types; in that case, one would need to do a dynamic lookup anyway or at the very least come up with a stub that was polymorphic but didn't do a lookup — and perhaps this doesn't happen enough or isn't expensive enough to warrant the additional complexity).

Other things that makes V8 fast
* They compile directly into native machine code thus avoiding the overhead of interpretation.
* They compile functions JIT; so if a function is never invoked, it will never be compiled.

V8's Efficient Garbage Collection
Minutes: 26:50 til 31:30 (approximately).

Lars Bak on V8 (April 2009)
Video of Lars Bak interview. Some high bits: V8 processes JavaScript 56X faster than "the most used version of IE." 2.5 years ago (when V8 project started), JS performance had stagnated. JS is very dynamic: properties and functions added on the fly as the program executes. With all of this dynamism, naive implementation would be to do a hash lookup whenever an object property (including a function) is accessed. But that's slow. In JS there are no classes; everything is object-based.

Discussed starting at minute 4:45. In order to get more structure, they create hidden classes as illustrated by pictures and descriptions below. In the below, the currently executing line of code is surrounded by a light blue box. Also, what is described below is the same as what Kevin Millikin described above, only here we've taken it a bit more slowly.

Slide 0:

Our initial code; prior to executing the variable declaration of p.

Slide 1:

As we begin to handle the object instantiation, we see keyword new which
causes an object to be allocated (the data structure for which is shown below).


Slide 2:
Here, since no Point object has been created yet,
we create a new class (C0) and we make p point to C0.
There will also be a pointer created from the Point function to C0.

Slide 3:
Now we execute the first line of the Point function,
which causes a few things to happen:
(1) create a "backing store" (in yellow) which will hold
p's values for the Point object's properties;
(2) create a new class C1 which notes that property x is stored at offset 0;
(3) change p to point to C1 instead of C0;
(4) Have C0 point to C1; what this reference says is: "If ever have a class
C0 and add the property x to it, transition to C1
."

Slide 4:
Now we execute the second line of the Point function which causes
the creation of a new class C2 which is C1 plus the y property.
p is changed to point to C2 and its value of y is written to the
backing store at offset 1 (as noted in C2). Then we add a pointer
from C1 to C2 (as we did above from C0 to C1 and for the same reason).

So at about minute 7:30 in the video, Erik Meijer asks whether hidden classes are canonicalized; e.g., whether the above would be equivalent to a class for which the two assignment statements were swapped (i.e., this.y's value set first). The answer is "no," two such classes would not be equivalent because it's part of the JS spec that the order in which properties are declared matters (referred to as "creation order"). Then another question: what if delete a property from an object? Lars replies that what happens in that case is not a reversion from C2 to C1 (for example) but instead — since deletion is very uncommon — the backing store is converted to a hash map; p's reference to the backing store is swizzled to instead refer to this hash map; and p's pointer to C2 is changed to point to "a sentinel class" Cx, which presumably encodes that accessing properties of the new object should be via a hash lookup. This lets the VM be simpler.

Another thing that improves performance: we discussed previously how the only numeric type in JS is "Number" (as contrasted to other languages which have int, float, double, ...), which is a double. It turns out that operations on floating point numbers are slower than operations on integers; in part because, when manipulating doubles, have to allocate a new heap object for every operation performed (to store the intermediate result). V8's improvement in this regard is two-fold: (1) a single bit is introduced which identifies whether the numeric value stored is a "small int" (i.e., can be represented with 31 bits) or not. If so, the property stores the actual integer value. Otherwise, the property stores a pointer to a heap object which contains a big integer, float, or double. (2) If operating on ints (using +, -, ...), will use integer operations, which don't allocate anything on the heap and hence run much more quickly than float operations.

One can run multiple JavaScript scripts on the same V8 instance by providing each such script with its own execution context. The motivation for this is that one script might modify the global object in a way that incorrectly affects another unrelated script's behavior. Practically, each window and iframe on a given HTML page (i.e., within a tab) is given its own JavaScript environment.

The behavior of each script is isolated from every other by using a separate execution context (or environment) for each. An execution context consists of built-in JS functions and objects, custom JS functions, and custom JS objects. (It's curious that each script must have its own version of built-in JS functions rather than all scripts being able to share this code; presumably, this is due to JS's dynamism wherein a script can redefine the behavior of a built-in function.)

Creating a context differs depending upon whether it's the first one or the 2nd through n'th one. For the first context, built-in objects must be created and the built-in JS code must be parsed and compiled. For the second through nth contexts, only built-in objects must be created (not sure why don't need to reparse and recompiled? because using version obtained in creating first context?). V8 contains an additional optimization which speeds up creating the first execution context. Namely, one can use a snapshotted version of the already-compiled built-in JS code. In particular, a snapshot contains a serialized heap which contains the already-compiled version of the built-in JS code. The implication is that code generation for the built-in JS code is deterministic; it results in the same compiled code each time.

V8 Origin
Recall that the Same-Origin Policy defines an origin as a combination of scheme (e.g., http or https), domain, and path. In V8, an origin is an execution context; where each execution context corresponds to a single iframe or window.

Other references


1 comment: