Monday, June 8, 2009

JavaScript; Interpreter; VM; etc.

Below are my quick and dirty notes on JavaScript (JS): how to construct basic programs using it, how people use the language, what is interesting about it. I looked at JavaScript to understand execution of gadgets and social-networking apps and, in particular, where that execution takes place (i.e., which part of app execution occurs server-side and which occurs client-side).

Topics covered:
JavaScript (JS) was introduced to enable "doing stuff" within HTML code. HTML is only nominally a programming language because it doesn't have functions or the ability to compute. Rather, HTML is mostly a formatting or presentation language (i.e., a markup language). Folks wanted to do interesting things while a user viewed a webpage (or after the user had submitted a form); this necessitated creating a scripting language that could be used within HTML and that enabled computing stuff, defining and invoking functions, etc. JS was designed to run in a host environment (e.g., the browser) which was supposed to provide mechanisms to communicate with the outside world.

The HTML code in a page can refer to a JavaScript function that should be invoked given some event. E.g., we might have an HTML tag: <body onload="f1( )">. Then in the head portion of the HTML page we might define the function f1():

<script type="text/javascript">
function f1( ) { ... }
</script>

So, as the HTML body is loaded, f1( ) — as defined in the above embedded JavaScript — will be invoked. This is a common idiom for form validation (which is a common use of JavaScript) as well; an HTML page will define a form and a JavaScript function to invoke when that form is submitted. By default, JavaScript code which appears in the body of an HTML page will be invoked as that page is loaded. In addition to embedding a JavaScript program within an HTML page, one can also include an external JavaScript program. E.g.,

<script type="text/javascript" src="http://www.scriptsRus.com/blah.js">

JavaScript can be used to do stuff in response to particular browser events, such as: a mouse click or key press, a webpage or image loading, mousing over a spot on the page, submitting a form, ... What kind of stuff can JS do? Invoke a function, present an alert or prompt window, calculate, compute, modify the given HTML page (via document.write(...) or interacting with the DOM), create cookies, etc. One particularly nifty thing one can do with JavaScript is call eval(...) on a String str which causes str to be executed as JavaScript code. A pretty common function invoked in JavaScript programs is document.write(...) which takes a single String argument and replaces the page's current content with that argument. Of course it's possible to have lots of JS scripts in the same HTML page. They'll all be invoked. If each script called document.write(...), what each wrote would be concatenated together on the same page in the order in which the scripts executed (as opposed to one script's text overwriting another's). Play with JavaScript here.

When you declare a variable, you don't provide a type; instead, you merely say "var " and optionally assign an initial value to the variable. In fact, you don't even have to provide the "var" part; an assignment to a previously-undeclared variable (such as z below) causes that variable to be implicitly declared. The type of a variable is the type of that variable's current value; hence, a variable's type can change as its value changes. This is referred to as dynamic typing.

var x = 5; // x is presently a Number
var y = "Hello"; // y is presently a String
z = 42; // implicit declaration; treated the same as: var z = 42;
var x; // redefine x; x still has the same value as above (5).

Weirdly enough, if you declare a variable and give it a value then subsequently redeclare the variable (providing no initial value), that variable's value will be the previously assigned one. The reason for this is that the redeclaration of x occurs in the same scope as the initial declaration. And so the name x still maps to value 5 in that scope. You might think that you could use the delete keyword to remove the mapping of x to 5 (e.g., between the initial declaration of x and the redeclaration of x) but this doesn't work; supposedly it works for variables that are implicitly declared but this depends on the JS engine (it doesn't seem to work that way for chrome).

There are standard arithmetic, logical, conditional, ... operators (to apply to variables/constants) and precedence rules across those operators. Also have standard control-flow constructs such as: if/else, if/else if/else, switch, for, while, do...while. Can use try ... catch to handle exceptions. Just as with other programming languages, if you want a string to contain special characters (e.g., quotes), use a backslash to escape the special character.

The built-in types are: Boolean, Number (a double), String. Above we saw that variables are declared without providing a type. The JS interpreter/VM figures out what a variable or value's type is automatically. JavaScript supports duck typing. In a non-duck-typed language, the type of each function argument is provided. Allowed arguments will either have that type or a subtype of that type. By contrast, for duck typed languages, function arguments have no specified type. For example, we might have the following function:

function calculate( a, b ) { return a + b; }

Then calculate can be called on any two objects for which the "+" operator is defined; the method's return value depends on the types of a and b (this is polymorphism: the same method can be applied to objects of different types). If a and b are numbers then calculate returns their sum; if a and b are strings, calculate returns their concatenation; and so on. There needn't be an inheritance relationship among classes that define the "+" operator; thus, there needn't be an inheritance relationship among types of arguments (to calculate) that are allowed. Hence, the observation that duck-typed languages (e.g., JavaScript) support polymorphism without inheritance.

There are really two types of polymorphism:
  1. The first type is referred to as parametric polymorphism or generic programming and occurs when the behavior of a method doesn't depend on the types of the method's arguments. For example, if you have a list of objects and you want to remove the first one, presumably this can be done without caring about what type the objects are. Hence, the remove method is generic — it could operate on a list of objects of any arbitrary type. By contrast, to sort this list requires some way of comparing objects which likely depends on the object's type.
  2. The other type of polymorphism is referred to as ad-hoc polymorphism, wherein there is one name that has multiple meanings or implementations associated. For example, the "+" operator above has one meaning for one type of operand (integers) and another meaning for another type of operand. (Strings) The implementation of "+" is not agnostic on the types of its operands (and hence not generic).
An object is a collection of name-value pairs (represented as a hash table), which are properties of the object; e.g., objBob = { name:"Bob", grade:"A", level:3 }. So, objBob.name == "Bob" and objBob.level == 3. Can also add to this collection via the "." as in, objBob.newProp = "new Property Value". An object also has methods. One can define an object via a template, which is just a function that takes arguments and assigns to object properties (think of it as akin to a constructor). Once you have a template of an object, you can instantiate it multiple times using keyword new. The new operator creates a new empty object then calls the specified function (person(...) in the below). The this value for that function call is the new empty object. E.g.

function person( fName, lName, bDay ) {
this.fName = fName;
this.lName = lName;
this.bDay = bDay;

// declare and define the method inside of the template
this.fullName = function() {
return this.fName + ' ' + this.lName;
}

// declare the method inside of the template and define it outside
this.newLastName = newLastName;
}

function newLastName( newName ) {
this.lName = newName;
}

// declare and define the method outside of the template
person.prototype.astroSign = function() {
if ( bDay >= December 22 && bDay <= January 22 ) { ... }
else if ( bDay >= January 23 && bDay <= February 22 ) { ... }
}

var sue = new person( "Sue", "Smith", "03/03/73" );
document.write( sue.lName + "\n" );
sue.newLastName( "Jones" );
document.write( sue.lName + "\n" );

Note that above we have associated three functions with person: fullName, newLastName, and astroSign (all in blue). They differ in where they are declared (i.e., associated with the person object) and where they are defined (i.e., where the function's body is located). When we declare and define a method inside of the template (as with fullName) that means that every instance of the object (i.e., every person) will have its own copy of that method (which is a waste of space!). When we declare a method inside of the template but define it outside (as with newLastName), that facilitates sharing the method with other objects. Finally, the third approach (declare and define the method outside of the template) as with astroSign causes the method to be added to the object's prototype, e.g., to the person prototype. All instances of the person object share the same prototype, and the prototype is consulted when trying to resolve an object's properties and methods. Hence, the prototype forms part of the lookup chain. The prototype syntax enables adding to any object, even a built-in one. E.g.,

> var s = "Simon";
> String.prototype.reversed = function() { ... // code for reversing a string ... }
> s.reversed()
nomiS

JS supports "first-class functions," which means that it treats functions as first-class objects, which means that functions can be used without restriction (pretty much in all the ways that objects can be used). We see this in JS with:
  • The ability to store functions within a data structure (e.g., a template)
  • Can pass a function as an argument to another function
  • A function's return value can be another function
  • The ability to define functions on the fly (during program execution)
  • The ability to have nested functions (define a function within another function)
  • Anonymous functions; e.g., (function (x,y) { return x+y; })
Can have anonymous functions, which might make sense to use as a callback; e.g., setTimeout( function() { do x, y, z; ... } ). The specified callback function has no name; instead, its body is just provided. A function can also take a variable number of arguments (i.e., JS supports variadic functions). The function can access its arguments via: functionName.arguments[index]. For example, sumAll() below has no arguments listed; it can take an infinite number of arbitrary arguments and prints each (yes, I am aware that the function is poorly named).

function sumAll() {
int numArgs = sumAll.arguments.length;
for (int i = 0; i < sumAll.arguments.length; i++)
document.write("The " + i + "'th argument is: " +
sumAll.arguments[i]);
}
}
sumAll( 7, "liz", 63.5, true, "randomString..." );
sumAll( 4, 3, 5 );

In JavaScript, an object is basically a hash table. The keys into the table are the names of the object's properties and methods. JS has late binding and so determining whether a given object has some property or supports some method occurs at the last second. Note that in JavaScript, a method that is "part of" an object (i.e., declared to be part of an object's template) isn't necessarily called in that object's context (i.e., with that object as the this value). In particular, if a method is invoked using the "." notation then the object preceding the dot becomes the this value. However, if a method is invoked without the "." notation then the global this object is used. An example (lifted verbatim from here):

function Car( brand ) { this.brand = brand; } // line 1
Car.prototype.getBrand = function() { println(this.brand); } // line 2
var foo = new Car("toyota"); // line 3
var brand = "not a car"; // line 4
var bar = foo.getBrand; // line 5; sets bar to be the getBrand function
bar(); // line 6

So in line 5, we set the function pointer bar to point to the function getBrand (as defined in line 2) but getBrand is not invoked in this line. That happens in line 6 where bar is invoked without the "." notation and hence without explicitly providing a this object. Consequently, the global this object will be used. And for that object, the brand is "not a car" as in line 4.

Another good point from Kevin Millikin: What this means is, the guy who writes some function with some particular object in mind may have that function applied to a very different object. By contrast, in Java, if a method takes an object argument, that object has a type. Knowing the object's type enables the body of the function to perform certain tasks with confidence, e.g., access certain of the object's fields. Now, the function body can no longer assume anything about an argument (or about the structure of the this operator).

  • void alert(String str): Creates a mini pop-up whose contents are the given str.
  • String prompt(String str): Creates a mini pop-up whose contents are the given str and which contains a line where the user can enter input, which the function returns.
  • Regular expression matching.
  • Navigator object which holds info about the browser or the application running this JS.
  • Getting cookies, setting cookies.
  • Timeouts: Ability to execute code after some period of time passes.
  • Modify the current page via accessing its DOM. Replace the current page via document.write(String str).
  • A new type of object that allows downloading content from a www URL: XMLHttpRequest. XMLHttpRequest allows the client and server to exchange data asynchronously without a full page reload; it's part of AJAX. Apparently there are no restrictions on the set of hosts with which one can communicate using XMLHttpRequest (that is, no restrictions based on the origin of this script vs. the target location and so on). Communication is asynchronous; the XMLHttpRequest object takes a callback method which will be invoked when the requested network action has completed (e.g., when the target page has been successfully downloaded). XMLHttpRequest enables:
web applications can retrieve data from the server asynchronously in the background without interfering with the display and behavior of the existing page
  • Create a pop-up using window.open(...). Can be used with arbitrary target URL. After opening pop-up window pointing to an initial URL, can change what page is loaded:
// The NY Times site is initially loaded in the window.
var myWin = window.open("http://www.nytimes.com/","win");

// Now replace that content with that at CNN site.
// Assigning to the location causes the window to reload (from CNN this time).
myWin.location = "http://www.cnn.com/";

Scope etc.
Variables defined in functions have local scope (per usual); i.e., such a variable can only be accessed in the function in which it is defined (or in any functions defined within that function, as appropriate). Can define a block within a pair of curly braces (as with other languages) but this does not define a separate scope (as in other languages).

Typically, a JavaScript script would be executed on a JS engine, that is, interpreted. Mozilla's SpiderMonkey (written in C) is one such interpreter; Rhino (written in Java) is another. The Google Chrome browser takes another approach; it uses a JS Virtual Machine (VM), called V8.

Taking a step back, most programs are executed in one of three ways:
  1. Compiled to native machine code then run directly on underlying hardware; e.g., C, C++. So there are instruction-set-specific compilers (e.g., for x86, Alpha, MIPS, SPARC, and so on) which take a C program and output a binary which uses the proper hardware instructions.
  2. Compiled to virtual machine code then run on a VM; e.g., Java and .NET. In this case, a translation is done to the program source in order to obtain byte code. At run-time, the VM will interpret the byte code or compile it to native machine code (as with just-in-time (JIT) compilation). The VM provides support for various run-time activities, such as dynamic dispatch and garbage collection.
  3. Interpreted by an engine which implements the language; e.g., Perl, Tcl, and most JavaScript implementations. In this case, the language engine runs and executes each line of the input program step-by-step. Note that some interpreters compile the source code to an intermediate representation then interpret that.
The basic jist is that any program written in a high-level language results in execution of some set of machine code instructions. There is a spectrum defining when each line of the input program is reduced (or converted or translated) to some set of native machine code instructions. At one end are programs which are compiled up-front to native machine code. On the other end are programs where each line of the program causes execution of some part of an interpreter which itself runs directly on the underlying hardware. So we can think of an interpreted language as taking a statement such as: var x = 3 + 5 and causing a routine within the interpreter called add(...) to be invoked with arguments 3 and 5. The interpreter keeps track that the variable x is assigned to with the output of that add operation.

There are tradeoffs between compiling and interpreting in terms of performance as well as in ability to perform run-time checks (and hence functionality). Clearly, with an interpreter, our language run-time can do anything we want it to (including dynamic dispatch, run-time type checking, etc.) however at the cost of increased (interpretive) overhead.

Run-time method binding or late binding is also referred to as dynamic dispatch and entails identifying the code that should be invoked in response to a given method invocation or, in programming language parlance, "mapping a message to code." The idea is that there may be more than one implementation of some method m() and the type of the object on which m() was invoked determines the correct implementation to call (this is single dispatch, which is contrasted to multiple dispatch below).

For example, consider a class Point, which defines a draw() method, and a class ColorPoint, which extends Point and redefines that method. If an object p invokes draw(), we must determine which version of that method should be invoked — Point's or ColorPoint's. That decision depends on p's type which is — for dynamically typed languages — determined at run-time. Hence, we usually see dynamic dispatch with dynamically typed languages; dispatch must be deferred til run-time since that's when we'll know an object's type. Languages which perform dynamic dispatch include: Simula, Smalltalk, C++ (for virtual methods), and JavaScript.

Both C++ and JavaScript do dynamic dispatch but there is a key difference. In particular:
When we wish to select from a set of classes at runtime, C++ requires that those classes be related by inheritance. From: DDJ
The consequence is, for C++ polymorphism, at compile time, we may not know the exact type of a given object but we'll know what family of objects it's part of. And we'll know which classes in that family redefine a given method. The upshot is that if an object in some family invokes a virtual method m() and m() is redefined by multiple classes in the family, we'll be able to identify all of the candidate implementations of m() at compile time. Virtual tables can only be used when the target method is one of a set of methods known at compile time; hence, vtables can be used for dynamic dispatch in C++. By contrast, JavaScript supports duck typing and hence cannot use vtables. Dynamic dispatch in JS consists of a table lookup. We noted above that there is actually a chain of tables that are consulted when attempting to resolve a property or method; that lookup chain includes the prototype's table.

As Kevin Millikin notes (in a 2008 talk), when one sees a property access in JavaScript, say val.x occurs at line Y, then that line may be executed multiple times and there are no constraints whatsoever on the kinds of objects (the types) that that are used as val on each execution of that line. In one case, val might be a Point object; in another case, it might be a Pizza, a ThingaMagigger, or a Blob object. Any object that provides the x property can be used as val. This is the essence of duck typing.

Single dispatch: which method is invoked depends on the type of the object on which the method is invoked. E.g., if invocation is x.foo(y) then x's run-time type determines which implementation of foo(...) is invoked. Multiple dispatch: which method is invoked depends on the type of the object on which the method is invoked as well as the type of arguments to the method. In this case, which implementation of foo to invoke also depends on y's type.


Simula: an object is an activation record (AR), in particular, it's the activation record (AR) produced by calling a class. A class then is a procedure which returns a pointer to its AR. An AR is just a data structure that contains values and pointers to methods. As far as inheritance goes, if we define a parent class and a child class then we create an instance of the child class then that instance will point to two activation records: it will point to the AR for the child class and on top of that will be the AR for the parent class. When instantiating an object of a child class, the parent class's initialization code will be executed then the child's. Still trying to answer: how does dynamic lookup work in Simula? Presumably, we follow the object's pointer to its AR then locate the target method there. Is the parent method also called? If the child doesn't redefine the method, we presumably go straight to invoking the parent method? And so on.

Smalltalk: everything is an object; all operations == messages to objects. An object in Smalltalk is a data structure that contains a pointer to the class object as well as the values of the class's variables or fields (e.g., for the Point class, the fields are x and y, which represent the Point's coordinates; for ColorPoint, there is an additional field which identifies the ColorPoint's color) . For example, in the figure below we see an instantiated Point object which has coordinates x = 2, y = 3 as well as an instantiated ColorPoint object with coordinates x = 4, y = 5 and value red for the color field. Every instantiated Point object contains a pointer to the same Point class object (akin to every instance of a prototype in JS containing a pointer to the Prototype). The class object contains a pointer to the object's method dictionary as well as to its parent class object (as appropriate). Hence, invoking a method foo() on object o entails following o's pointer to its class then following the class's pointer to the method dictionary and looking up foo(). If cannot find foo() in that dictionary then follow the class's pointer to its parent class and look in the parent class's method dictionary...

C++: For C++, run-time method binding is done for methods defined with the virtual keyword. Such methods might be defined in the parent then redefined in a child class. Hence, for any object that invokes a virtual method at run-time, that object will be one of a set of types where the types in that set have an inheritance relationship among them. At compile-time, a virtual function table is laid out for each class that defines or redefines a virtual method; this vtable is populated at run-time with the appropriate code addresses. Special care is taken that a particular method f() occurs at the same offset in each vtable in which f() occurs (given that the classes defining f have an inheritance relationship among them). So if f() is the 3rd entry in the Point vtable, f() will also be the 3rd entry in the ColorPoint vtable. Note that each object contains a pointer to its corresonding vtable, as depicted below. Consequently, at compile-time, if an object o invokes f() and we know that o is a Point or a ColorPoint then we can insert code that will make an indirect call through o's vtable without knowing which vtable o will point to (i.e., the Point vtable or the ColorPoint vtable) and without knowing the vtable's contents (i.e., the specific address pointer values).

JavaScript: For all intents and purposes, dynamic dispatch in JS entails a table lookup (and maybe multiple table lookups). So we're trapping to the VM which performs the table lookup. Hence, it's expensive.

The above sets the stage for conditions under which inline caching is helpful (in summary: where method binding happens at run-time but vtables can't be used). Inline caching entails remembering the results of a previous method lookup directly at the call site. So if the same call site is visited more than once during the program's execution, the cost of the initial lookup can be amortized over these repeat calls. The careful reader will wonder whether it's possible for a binding of a method (e.g., quack()) at line X to an implementation (e.g., codeXYZ) is necessarily the intended binding everytime quack() at line X is invoked? I.e., isn't it possible that quack() has one binding one time it's called from line X and another binding another time? Yes! It's possible. For this reason, when using inline caching, extra code is inserted into the preamble which is executed prior to transferring control to the resolved method. That code checks whether the current object on which the method is being invoked is of the expected type; if not, method resolution is performed anew and the new result is cached.

According to Kevin Millikin (of Google), around 90% of all property access sites only see objects that have the same class. That means, if at some property access site, we cache the output of looking up where a property lives within some object (i.e., at what offset), we'll be able to use that value in subsequent executions of this line of code since we'll be looking at objects of the same type.

The "same origin policy" (first introduced with Netscape Navigator 2.0) prevents a document or script loaded from one "origin" from getting or setting properties of a document from a different "origin". The term origin is defined here as a combination of domain name (www.example.com), protocol (http or https) and port (for example, www.example.com:81 is not the same as www.example.com). All three must match for two webpages to be considered to have the same origin. Without this protection, a malicious web page could compromise the integrity of another web page. Reference.
Scripts received from one website do not have access to usernames, passwords, cookies of another web site. There is a SOP for accessing the DOM and one for accessing cookies. The policy has to do with how we define "site." For example, if one website has the same domain as another website but a different path (e.g., site1="http://www.google.com/a", site2="http://www.google.com/ig"), do we let the second website access the first's cookies? For site2 to access site1's DOM, site1 and site2 must have the same scheme (the scheme is the first few letters of the URL and is most commonly http but could also be https, ftp, or file), domain, and port. For cookies, the policy is that the domain and the path must be the same in order for one site to access another site's cookies (ref).

Sidebar: Using frames
If want to host content from multiple providers on a single webpage, might use frames as an isolation mechanism. Content from site A loaded into frame 1, site B into frame 2, and so on. Same origin policy applied at frame level so site A JS cannot learn cookies for site B etc. A frame is given a portion of screen real estate which it cannot draw beyond. So a single webpage e.g., www.google.com, might contain multiple frames. The browser's address bar contains the URL for the container page. The address corresponding to each frame is not visible.

References
* JavaScript primer (w3 schools)
* More about binding -- CHRISTOPHE PORTENEUVE

Ways to improve this post
* Talk about prototypal inheritance from Self, also dynamic objects
* Be more clear and thorough on Same Origin policy; discuss security generally.
* Get clearer sense of dynamic dispatch (run-time binding) in JS, how exactly it works in various interpreters etc. (as I assume this is an implementation decision).


No comments:

Post a Comment