Last week I was looking at closures and continuations. I wanted to follow up and see exactly what the underlying model for closures is in the case of javascript. "Javascript closures" was really helpful in decoding the spec.
This post attempts to go one step further in explaining the execution model in an accessible way.
It doesn't aim at being 100% technically complete and accurate. Instead it tries to provide a good enough picture of the model to allow you to read and understand the spec if you need more details.
This is based on the ECMAScript Language Specification or ECMA 262 spec (revision 3). It corresponds to Javascript 1.5, which is available in Firefox and IE6. There are other implementations of ECMA 262, such as ActionScript (in Flash) and JScript, but I'll use javascript for this discussion.
I'll cover the concept of prototype and the role it plays in object creation and property lookup. We'll see what objects, scopes and execution contexts are, as well as the roles they play when a function is called as a function and when it is called as a constructor (with new). This will explain what the this keyword means, what closures are and how they work.
XXX table of content
A data and execution model is a specified set of rules and conventions that need to be implemented and respected by the runtime environment and the compiler (in the case of a compiled language).
In the case of x86 executables, the model includes having a stack and a heap, having the OS initialize them properly and loading the executable in memory; the compiler also needs to follow certain conventions for using the stack (to access the input parameters or calling functions).
ECMA 262 specifies its own set of rules, which define what objects are, how execution environments are created and stacked, as well as what the initial conditions for the execution are.
Objects in javascript are property bags. For example, if you have an object bar, its properties can be accessed via two syntaxes: bar.property or bar["property"].
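As a minimal sketch of the two syntaxes (the object and property names here are made up for illustration):

```javascript
// A plain object used as a property bag (names are illustrative only).
var bar = { property: 42 };

// Dot notation and bracket notation reach the same property.
var viaDot = bar.property;
var viaBracket = bar["property"];

// Bracket notation also accepts a property name computed at run time.
var key = "prop" + "erty";
var viaComputed = bar[key];

// Properties can be added at any time, with either syntax.
bar["other"] = "hello";
var added = bar.other;
```

All three reads return the same value, since the dot form is just shorthand for the bracket form with a literal name.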
There are also some internal properties that play a role in the execution model, but that are not visible to the programmer. The double bracket notation ([[...]]) is used to represent such internal properties.
For example, most objects have a [[prototype]] property, which can reference another object (which may in turn have a [[prototype]] reference). When you try to access a property, that property will be looked up not only in the object that you are querying, but also iteratively down its prototype chain (see figure below). That means that any property set on bar will hide the same properties set on the objects down bar's prototype chain.
If you create an object with var bar = {x: 1, y: 2};, its prototype chain will only contain a reference to one native object: the prototype of Object (Object.prototype). There are other built-in objects, such as String, Function or Array.
The interesting thing is that although the property is looked up the prototype chain of object bar when it is accessed, setting it would set it directly on object bar. And the next time it is looked up on bar, it will be found directly there, without having to walk the prototype chain.
This allows two objects to share a common prototype chain, but get modified independently.
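Here is a small sketch of that read/write asymmetry. It uses a constructor (covered later in this post) only as a convenient way to give two objects a shared prototype; Point and x are hypothetical names.

```javascript
// Hypothetical constructor, used only to give two objects a shared prototype.
function Point() {}
Point.prototype.x = 1;

var a = new Point();
var b = new Point();

// Reading: x is not found on a or b themselves, so the lookup
// continues on the shared prototype object.
var before = [a.x, b.x]; // [1, 1]

// Writing: the property is created directly on a; the prototype is untouched.
a.x = 10;

var after = [a.x, b.x]; // [10, 1]
var ownOnA = a.hasOwnProperty("x"); // true: x now lives on a itself
var ownOnB = b.hasOwnProperty("x"); // false: b still relies on the prototype
```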
See section 8.6 for details of what an object is, and sections 8.6.2.1 and 8.6.2.2 specifically for the difference between [[Get]] and [[Put]].
The internal [[prototype]] property can only be set indirectly, via the prototype property of a constructor. We'll see how that is done and how objects are created a bit later.
Execution starts with a global object and a first execution context, as illustrated in the above diagram.
The execution context holds a common set of references that are needed to run any code. For example, an execution context has a reference to a scope chain and a reference to a "this" object.
Execution contexts are stacked. Whenever a function is called, a new execution context is created, initialized with the references it needs and pushed on top of the stack. It is later popped off the stack, when the function exits.
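The stack itself is not visible from javascript, but its last-in, first-out behavior can be observed. A rough sketch, with a hypothetical countdown function that records when each call is entered and exited:

```javascript
var log = [];

// Hypothetical recursive function: each call gets its own execution
// context, and therefore its own n.
function countdown(n) {
  log.push("enter " + n);
  if (n > 0) {
    countdown(n - 1); // pushes a new execution context on the stack
  }
  log.push("exit " + n); // n is still this call's own value
}

countdown(2);
// log is now:
// ["enter 2", "enter 1", "enter 0", "exit 0", "exit 1", "exit 2"]
```

The contexts are exited in the reverse order of their creation, and each one still sees its own n after the inner calls have returned.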
Section 10 covers the functioning of execution contexts. In particular, section 10.2 (Entering An Execution Context) states: "Every function and constructor call enters a new execution context, even if a function is calling itself recursively. Every return exits an execution context. A thrown exception, if not caught, may also exit one or more execution contexts. When control enters an execution context, the scope chain is created and initialised, variable instantiation is performed, and the this value is determined. The initialisation of the scope chain, variable instantiation, and the determination of the this value depend on the type of code being entered."
The scope chain is a linked list of objects, each representing a different level of scope. The scope chain for a newly created execution context is composed of the scope chain that was in effect where the function was defined, extended with a new object representing the local scope for this execution context.
Accessing a property like alert (to run code like alert("test")) will trigger that property to be looked up down the scope chain. As each object in the scope chain gets looked up in turn (until that property is found), that object's prototype chain will be looked up iteratively as previously described (also until that property is found).
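A minimal sketch of a lookup walking the scope chain, using made-up nested functions. Each identifier is searched for in the innermost scope object first, then outwards:

```javascript
var x = "outermost";

function outer() {
  var y = "outer";

  function inner() {
    var z = "inner";
    // z is found in inner's own scope object, y one level further
    // along the scope chain, and x at the end of the chain.
    return [z, y, x].join(" ");
  }

  return inner();
}

var result = outer(); // "inner outer outermost"
```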
The scope chain for the initial execution context only lists the global object, which, in the case of javascript running in a browser, is the window object. In the words of the spec: "There is a unique global object (15.1), which is created before control enters any execution context. Initially the global object has the following properties: [...] As control enters execution contexts, and as ECMAScript code is executed, additional properties may be added to the global object and the initial properties may be changed."
Because of that, any property lookup in our example will occur on the global object. For simplicity, the global object's prototype chain was omitted from the figure though.
See section 10.1.5 (Global Object).
We'll see the details for creating execution contexts and manipulating the scope chain when we look at how functions get called.
var foo = function(x) {...} and function foo(x) {...} are two largely equivalent ways of defining a function, each creating a function object. A function object differs from a regular object in a few properties. Mainly, [[class]] is set to "Function", [[scope]] holds a reference to the scope chain of the running execution context at the time the function is created, [[prototype]] is set to the original Function prototype and prototype is set to a new object, as if created by new Object().
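Internal properties cannot be read directly, but some of them can be observed indirectly. The sketch below is only an illustration (foo and baz are placeholder functions): it uses Object.prototype.toString to reveal [[class]] and isPrototypeOf to check [[prototype]].

```javascript
// The two ways of defining a function.
var foo = function (x) { return x + 1; };
function baz(x) { return x + 1; }

var bothCallable = foo(1) === 2 && baz(1) === 2;

// Object.prototype.toString exposes the internal [[class]] value.
var kind = Object.prototype.toString.call(foo); // "[object Function]"

// The function's [[prototype]] is the original Function prototype.
var inheritsFromFunctionPrototype = Function.prototype.isPrototypeOf(foo);

// The visible prototype property is a fresh object that points back
// to the function through its constructor property.
var hasPrototypeObject = typeof foo.prototype === "object";
var backReference = foo.prototype.constructor === foo;
```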
In the following figure, we have a property foo in the global object, which references the newly created function object.
Things are now set up to illustrate the key mechanisms of the javascript execution model: calling a function as a function, and calling a function as a constructor.
See Section 13 for all the details on function definitions.
Let's look at how this code executes: foo(3);.
The foo property is first resolved in the scope of the current execution context. The resulting function object is then invoked as a function.
- activation object
- variable object (note that properties are set in the variable object at the beginning of the execution. Simple example)
- new execution context
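A simple example of the note on the variable object, as a sketch with hypothetical names: since var declarations and function declarations become properties of the variable object when the execution context is entered, a local variable already exists (with the value undefined) before the line that assigns it.

```javascript
var value = "outer";

function demo() {
  // The local value already exists on the variable object, so this
  // reads undefined rather than "outer".
  var early = value;

  // Function declarations are bound to their function object at entry.
  var helperType = typeof helper;

  var value = "inner";
  function helper() {}

  return [early, helperType, value];
}

var result = demo(); // [undefined, "function", "inner"]
```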
... XXX
XXX I'm still not sure how methods called on an object get access to that object ("this"). For example, if you do req.send().
The [[scope]] property is the key to having closures in Javascript.
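A classic illustration, sketched with a hypothetical makeCounter function: the returned function keeps, through its [[scope]], a reference to the scope object of the call that created it, so count survives after makeCounter has returned.

```javascript
function makeCounter() {
  var count = 0;
  // The inner function's [[scope]] references the scope chain of this
  // particular call to makeCounter, including its count variable.
  return function () {
    count += 1;
    return count;
  };
}

var c1 = makeCounter();
var c2 = makeCounter();

var first = c1();  // 1
var second = c1(); // 2
var other = c2();  // 1: c2 closes over a separate count
```

Each call to makeCounter creates a new scope object, which is why the two counters do not interfere with each other.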
11.2.3 Function Calls (foo.bar(x)).
Another way of calling a method is Function.prototype.call. Its first argument is used as the "this" value for the call; the remaining arguments are passed on to the function.
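A short sketch comparing the two forms of call, with made-up objects. In the method form the object to the left of the dot becomes "this"; with call it is chosen explicitly.

```javascript
// Hypothetical function that relies on "this".
function describe(greeting) {
  return greeting + ", " + this.name;
}

var alice = { name: "Alice" };
var bob = { name: "Bob", describe: describe };

// Method call: "this" is the object the function was looked up on.
var viaMethod = bob.describe("Hi"); // "Hi, Bob"

// call: the first argument is used as "this".
var viaCall = describe.call(alice, "Hello"); // "Hello, Alice"

// Even when looked up on bob, call overrides "this".
var borrowed = bob.describe.call(alice, "Hey"); // "Hey, Alice"
```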
XXX Section 10
XXX Section 13
XXX Also, when a method is added to an object, "this" doesn't seem to be closed over like other referenced properties. That is because it's actually not a property.
I mentioned built-in objects, such as Object or Function. These are actually constructors, set as properties on the global object, as listed in section 15.1.4 (Constructor Properties of the Global Object): Object, Function, Array, String, Boolean, Number, Date, RegExp, Error, EvalError, RangeError, ReferenceError, SyntaxError, TypeError, URIError. That is why they are always in scope.
Also, as with any other constructor, they can be used to extend the behavior of their instances. For example, you can add functionality to all strings by adding properties to String.prototype.
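For instance, as a sketch (shout is a made-up method name, not part of the language):

```javascript
// A string that exists before the prototype is extended.
var existing = "hello";

// Hypothetical method added to every string through String.prototype.
String.prototype.shout = function () {
  return this.toUpperCase() + "!";
};

// Strings find the new method through the prototype chain, including
// ones created before the method was added.
var loud = existing.shout(); // "HELLO!"
var alsoLoud = "bye".shout(); // "BYE!"
```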
XXX Convention that constructors' names start with an upper case letter.
XXX Does a created object get its [[prototype]] from the constructor's [[prototype]] or prototype?
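One way to probe this question is a quick experiment with isPrototypeOf. The sketch below uses a hypothetical Foo constructor and suggests that the new object gets its [[prototype]] from the constructor's prototype property, not from the constructor's own [[prototype]] (which is the Function prototype).

```javascript
function Foo() {}

var obj = new Foo();

// Is the object's prototype chain rooted at Foo.prototype...
var fromPrototypeProperty = Foo.prototype.isPrototypeOf(obj); // true

// ...or at Foo's own [[prototype]], i.e. Function.prototype?
var fromInternalPrototype = Function.prototype.isPrototypeOf(obj); // false

// Properties added to Foo.prototype are visible on the created object.
Foo.prototype.greeting = "hi";
var inherited = obj.greeting; // "hi"
```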
Prototype based language: no notion of class. Only objects/instances. "Inheritance" behavior only uses instances too.
Scope chain only contains the global object and activation/variable objects. But actually you can push any object there with the with keyword.
XXX question: How to wrap a function without knowing its interface?
XXX note: variables can be added to a scope after they are referenced/captured
xxx Section 4.2.1 Objects, prototype and prototype-based inheritance
Prototype-based language: no class, only object instances.
http://www.manageability.org/blog/stuff/prototype-based-programming/view
Prototype-based programming
http://en.wikipedia.org/wiki/Prototype-based_programming
Javascript 2.0 and ECMA 4
http://www.mozilla.org/js/language/es4/
http://www.mozilla.org/js/language/js20/
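setTimeout(f.call, 600); note that f.call only passes along a reference to Function.prototype.call: the link to f is lost, so the call has to be wrapped, as in setTimeout(function() { f.call(obj); }, 600).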
Practical advice
for loops don't create scopes
variables not declared with "var" go in the global scope!
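The point about for loops can be seen in a short sketch (all names are illustrative). Because the loop body does not get its own scope, every function created in the loop closes over the same i; wrapping the body in a function call gives each iteration a fresh scope object.

```javascript
// All three functions share the single i of the enclosing scope.
var callbacks = [];
for (var i = 0; i < 3; i++) {
  callbacks.push(function () { return i; });
}
var shared = [callbacks[0](), callbacks[1](), callbacks[2]()]; // [3, 3, 3]

// Calling a function on each iteration creates a new scope object,
// so each inner function captures its own k.
var fixed = [];
for (var j = 0; j < 3; j++) {
  fixed.push((function (k) {
    return function () { return k; };
  })(j));
}
var separate = [fixed[0](), fixed[1](), fixed[2]()]; // [0, 1, 2]
```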