July 30, 2005

Javascript data and execution model

Last week I was looking at closures and continuations. I wanted to follow up and see exactly what the underlying model for closures is in the case of javascript. "Javascript closures" was really helpful in decoding the spec.

This post attempts to go one step further in explaining the execution model in an accessible way.

It doesn't aim at being 100% technically complete and accurate. Instead it tries to provide a good enough picture of the model to allow you to read and understand the spec if you need more details.

What is covered?

This is based on the ECMAScript Language Specification, or ECMA 262 spec (revision 3). It corresponds to Javascript 1.5, which is available in Firefox and IE6. There are other implementations of ECMA 262, such as ActionScript (in Flash) and JScript, but I'll use javascript for this discussion.


I'll cover the concept of prototype and the role it plays in object creation and in property lookup. We'll see what objects, scopes and execution contexts are, as well as the roles they play when a function is called as a function and when it is called as a constructor (with new). This will explain what the this keyword means, what closures are and how they work.

XXX table of content

What is a data and execution model?

A data and execution model is a specified set of rules and conventions that need to be implemented and respected by the runtime environment and the compiler (in the case of a compiled language).
In the case of x86 executables, the model includes having a stack and a heap, having the OS initialize them properly and loading the executable in memory; the compiler also needs to follow certain conventions for using the stack (to access the input parameters or calling functions).

ECMA 262 specifies its own set of rules, which define what objects are, how execution contexts are created and stacked, as well as what the initial conditions for the execution are.

Objects and the prototype chain

Objects in javascript are property bags. For example, if you have an object bar, its properties can be accessed via two syntaxes: bar.property or bar["property"].

There are also some internal properties that play a role in the execution model, but that are not visible to the programmer. The double bracket notation ([[...]]) is used to represent such internal properties.
For example, most objects have a [[prototype]] property, which can reference another object (which may in turn have a [[prototype]] reference). When you try to access a property, that property will be looked up not only in the object that you are querying, but also iteratively down its prototype chain (see figure below). That means that any property set on bar will hide the same properties set on the objects down bar's prototype chain.

If you create an object with var bar = {x: 1, y: 2};, its prototype chain will only contain a reference to a native object: the original Object prototype (Object.prototype). There are other built-in objects, such as String, Function or Array.

The interesting thing is that although the property is looked up the prototype chain of object bar when it is accessed, setting it would set it directly on object bar. And the next time it is looked up on bar, it will be found directly there, without having to walk the prototype chain.
This allows two objects to share a common prototype chain, but get modified independently.
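
As a rough illustration of the read/write asymmetry described above, here is a minimal sketch. The Point constructor and the variable names are hypothetical, introduced only to get two objects that share a prototype; how constructors set up the chain is covered later in this post.

```javascript
// Hypothetical constructor, used only to get two objects sharing one prototype.
function Point() {}
Point.prototype.x = 0;

var a = new Point();
var b = new Point();

// Reading x walks the prototype chain: a has no x of its own yet.
var sharedBefore = a.x;                  // found on Point.prototype
var ownBefore = a.hasOwnProperty("x");   // false

// Writing x sets it directly on a; the prototype and b are untouched.
a.x = 5;
var ownAfter = a.hasOwnProperty("x");    // true
```

After the assignment, a.x is found directly on a, while b.x is still resolved through the shared prototype.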


See section 8.6 for details of what an object is, and sections 8.6.2.1 and 8.6.2.2 specifically for the difference between [[Get]] and [[Put]].

The internal [[prototype]] property can only be set indirectly, via the prototype property of the constructor used to create the object. We'll see how that is done and how objects are created a bit later.

Starting point for the execution

Execution starts with a global object and a first execution context, as illustrated in the above diagram.
The execution context holds a common set of references that are needed to run any code. For example, an execution context has a reference to a scope chain and a reference to a "this" object.

Execution contexts are stacked. Whenever a function is called, a new execution context is created, initialized with the references that it needs and pushed on top of the stack. It is popped off the stack when the function returns.

Section 10 covers the functioning of execution contexts. In particular, section 10.2

10.2 Entering An Execution Context

Every function and constructor call enters a new execution context, even if a function is calling itself recursively. Every return exits an execution context. A thrown exception, if not caught, may also exit one or more execution contexts.


When control enters an execution context, the scope chain is created and initialised, variable instantiation is performed, and the this value is determined.
The initialisation of the scope chain, variable instantiation, and the determination of the this value depend on the type of code being entered.

offers an overview of what happens upon entering an execution context.

The scope chain is a linked list of objects, each representing a different level of scope. When a function is called, the scope chain for the newly created execution context is composed of the scope chain saved in the function's [[scope]] property (the chain that was in effect when the function was created), extended with a new object representing the local scope for this execution context (see section 10.2.3).

Accessing a property like alert (to run code like alert("test")) will trigger that property to be looked up down the scope chain. As each object in the scope chain gets looked up in turn (until that property is found), that object's prototype chain will be looked up iteratively as previously described (also until that property is found).

Section 10.1.4

10.1.4 Scope Chain and Identifier Resolution


Every execution context has associated with it a scope chain. A scope chain is a list of objects that are searched when evaluating an Identifier. When control enters an execution context, a scope chain is created and populated with an initial set of objects, depending on the type of code. During execution within an execution context, the scope chain of the execution context is affected only by with statements (see 12.10) and catch clauses (see 12.14).


During execution, the syntactic production PrimaryExpression : Identifier is evaluated using the following algorithm:


  1. Get the next object in the scope chain. If there isn't one, go to step 5.
  2. Call the [[HasProperty]] method of Result(1), passing the Identifier as the property.
  3. If Result(2) is true, return a value of type Reference whose base object is Result(1) and whose property name is the Identifier.
  4. Go to step 1.
  5. Return a value of type Reference whose base object is null and whose property name is the Identifier.

The result of evaluating an identifier is always a value of type Reference with its member name component equal to the identifier string.

details how the scope chain is used for identifier resolution.
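
The algorithm quoted above can be modelled in a few lines. This is only a sketch under stated assumptions: the real scope chain is not accessible from a program, so here it is represented as a plain array (innermost scope first), and resolveIdentifier, globalLike and activationLike are hypothetical names. The in operator stands in for [[HasProperty]], which also consults the prototype chain.

```javascript
// Hypothetical model of 10.1.4: the scope chain is an array, innermost first.
function resolveIdentifier(scopeChain, name) {
  for (var i = 0; i < scopeChain.length; i++) {
    var scopeObject = scopeChain[i];
    // `in` plays the role of [[HasProperty]]: it also walks the
    // prototype chain of scopeObject.
    if (name in scopeObject) {
      return { base: scopeObject, propertyName: name };
    }
  }
  // Step 5: nothing found, the reference has a null base.
  return { base: null, propertyName: name };
}

// Stand-ins for a global object and a function's local scope object.
var globalLike = { alert: function (msg) { return "alert: " + msg; }, x: "global x" };
var activationLike = { x: "local x" };
var chain = [activationLike, globalLike];

var refX = resolveIdentifier(chain, "x");            // base is activationLike
var refAlert = resolveIdentifier(chain, "alert");    // base is globalLike
var refMissing = resolveIdentifier(chain, "nope");   // base is null
```

The local x hides the global one because the innermost object is consulted first.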

The scope chain for the initial execution context only lists the global object, which, in the case of javascript running in a browser, is the window object.
Because of that, any property lookup in our example will occur on the global object. For simplicity, the global object's prototype chain was omitted from the figure though.
Section 10.1.5

10.1.5 Global Object


There is a unique global object (15.1), which is created before control enters any execution context.


Initially the global object has the following properties:


  • Built-in objects such as Math, String, Date, parseInt, etc. These have attributes { DontEnum }.

  • Additional host defined properties. This may include a property whose value is the global object itself; for example, in the HTML document object model the window property of the global object is the global object itself.

As control enters execution contexts, and as ECMAScript code is executed, additional properties may be added to the global object and the initial properties may be changed.

provides an overview of the global object.


We'll see the details for creating execution contexts and manipulating the scope chain when we look at how functions get called.

Creating a function object

var foo = function(x) {...} and function foo(x) {...} are two ways of defining a function, both creating a function object; they are equivalent for the purpose of this discussion. A function object differs from a regular object by some properties. Mainly, [[class]] is set to "Function", [[scope]] copies the scope chain reference from the running execution context when the function is created, [[prototype]] is set to the original Function prototype and prototype is set to a new object, created as if by new Object().

In the following figure, we have a property foo in the global object, which references the newly created function object.

Things are now set up to illustrate the key mechanisms of the javascript execution model: calling a function as a function, and calling a function as a constructor.

See Section 13 for all the details on function definitions.
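
Some of the properties described in this section can be observed from code, although the internal ones ([[class]], [[scope]], [[prototype]]) cannot be read directly. The following sketch only checks what is visible; foo is the same illustrative name used above, with an arbitrary body.

```javascript
var foo = function (x) { return x; };

// [[class]] is not readable, but typeof reports function objects as "function".
var isFunction = typeof foo === "function";

// [[prototype]] is the original Function prototype.
var inheritsFromFunctionPrototype = Function.prototype.isPrototypeOf(foo);

// prototype is a fresh object whose constructor property points back at foo.
var prototypeIsPlainObject = typeof foo.prototype === "object";
var constructorPointsBack = foo.prototype.constructor === foo;
```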

Calling the function as a function

Let's look at how this code executes: foo(3);.
The foo property is first resolved in the scope of the current execution context. The resulting function object is then invoked as a function.
- activation object
- variable object (note that properties are set in the variable object at the beginning of the execution. Simple example)
- new execution context
... XXX
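
A simple example of the note above about properties being set on the variable object at the beginning of the execution: in this sketch (label and show are hypothetical names), the local label already exists, with the value undefined, before the line that assigns it runs, so it hides the outer one from the first statement on.

```javascript
var label = "outer";

function show() {
  // The local label was created when the function was entered,
  // so this does not read the outer variable.
  var seen = label;
  var label = "inner";
  return [seen, label];
}

var result = show();   // [undefined, "inner"]
```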


XXX I'm still not sure how methods called on an object get access to that object ("this"). For example, if you do req.send().


The [[scope]] property is the key to having closures in Javascript.
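
Here is a minimal sketch of what that means in practice. makeCounter is a hypothetical example: each inner function keeps, through its [[scope]], the local scope object of the makeCounter call that created it, so count survives after makeCounter has returned.

```javascript
function makeCounter() {
  var count = 0;
  // The returned function's [[scope]] references this call's local scope,
  // which is where count lives.
  return function () {
    count = count + 1;
    return count;
  };
}

var first = makeCounter();
var second = makeCounter();

var a1 = first();    // 1
var a2 = first();    // 2
var b1 = second();   // 1, second has its own count
```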


XXX section 10.1.6

10.1.6 Activation Object


When control enters an execution context for function code, an object called the activation object is created and associated with the execution context. The activation object is initialised with a property with name arguments and attributes { DontDelete }. The initial value of this property is the arguments object described below.


The activation object is then used as the variable object for the purposes of variable instantiation.


The activation object is purely a specification mechanism. It is impossible for an ECMAScript program to access the activation object. It can access members of the activation object, but not the activation object itself. When the call operation is applied to a Reference value whose base object is an activation object, null is used as the this value of the call.


XXX Section 13.2.1

13.2.1 [[Call]]


When the [[Call]] property for a Function object F is called, the following steps are taken:


  1. Establish a new execution context using F's FormalParameterList, the passed arguments list, and the this value as described in 10.2.3.

  2. Evaluate F's FunctionBody.

  3. Exit the execution context established in step 1, restoring the previous execution context.

  4. If Result(2).type is throw then throw Result(2).value.

  5. If Result(2).type is return then return Result(2).value.

  6. (Result(2).type must be normal.) Return undefined.


11.2.3 Function Calls (foo.bar(x)).
Another way of calling a function is Function.prototype.call. Its first argument will be used as the "this" value.
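
The two ways of providing "this" can be compared side by side. This is a small sketch with hypothetical names (describe, req, other): with the obj.method() form, the object before the dot is used as "this"; with call, it is the first argument.

```javascript
function describe() {
  return "I am " + this.name;
}

var req = { name: "req", describe: describe };
var other = { name: "other" };

// Called through a property reference: the base object req becomes this.
var viaMethod = req.describe();

// Called through Function.prototype.call: the first argument becomes this.
var viaCall = describe.call(other);

// The object the function was read from does not matter when call is used.
var viaBoth = req.describe.call(other);
```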


Calling the function as a constructor

Similar to 4) but an object will be created and returned
Section 13.2.2

13.2.2 [[Construct]]


When the [[Construct]] property for a Function object F is called, the following steps are taken:


  1. Create a new native ECMAScript object.

  2. Set the [[Class]] property of Result(1) to "Object".

  3. Get the value of the prototype property of the F.

  4. If Result(3) is an object, set the [[Prototype]] property of Result(1) to Result(3).

  5. If Result(3) is not an object, set the [[Prototype]] property of Result(1) to the original Object prototype object as described in 15.2.3.1.

  6. Invoke the [[Call]] property of F, providing Result(1) as the this value and providing the argument list passed into [[Construct]] as the argument values.

  7. If Type(Result(6)) is Object then return Result(6).

  8. Return Result(1).
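
The steps above can be approximated in code. This is a sketch, not what an engine does internally: construct and isObject are hypothetical helpers, and Object.create is assumed to be available (it is not part of revision 3 of the spec). It also shows that the new object's [[prototype]] comes from the constructor's prototype property.

```javascript
function isObject(value) {
  return (typeof value === "object" && value !== null) || typeof value === "function";
}

function construct(F, args) {
  // Steps 3 to 5: use F.prototype if it is an object, else Object.prototype.
  var proto = isObject(F.prototype) ? F.prototype : Object.prototype;
  // Steps 1 and 2: create the new object with that [[prototype]].
  var obj = Object.create(proto);
  // Step 6: call F with the new object as this.
  var result = F.apply(obj, args);
  // Steps 7 and 8: an object returned by F replaces the new object.
  return isObject(result) ? result : obj;
}

function Person(name) { this.name = name; }
Person.prototype.greet = function () { return "Hello, " + this.name; };

function Wrapper() { return { replaced: true }; }

var p = construct(Person, ["Ann"]);   // behaves like new Person("Ann")
var w = construct(Wrapper, []);       // behaves like new Wrapper()
```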


XXX Section 10
XXX Section 13


XXX Also, when a method is added to an object, "this" doesn't seem to be closed over like other referenced properties. That is because it's actually not a property.


I mentioned built-in objects, such as Object or Function. These are actually constructors, set as properties on the global object, as listed in section 15.1.4

15.1.4 Constructor Properties of the Global Object

Object, Function, Array, String, Boolean, Number, Date, RegExp, Error, EvalError, RangeError, ReferenceError, SyntaxError, TypeError, URIError. That is why they are always in scope.
Also, as with any other constructor, they can be used to extend the behavior of their corresponding instances. For example, you can add functionality to all strings by adding properties to String.prototype.
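
For instance, here is a minimal sketch of extending all strings. The method name shout is hypothetical and chosen only for illustration.

```javascript
// Hypothetical method name; a property added here is found through the
// prototype chain of every string value.
String.prototype.shout = function () {
  return this.toUpperCase() + "!";
};

var greeting = "hello".shout();   // "HELLO!"
```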

XXX Convention that constructors' names start with an upper case letter.
XXX Does a created object get its [[prototype]] from the constructor's [[prototype]] or prototype?


Conclusion and pointers

Prototype based language: no notion of class. Only objects/instances. "Inheritance" behavior only uses instances too.
Scope chain only contains the global object and activation/variable objects. But actually you can push any object there with the with keyword.

XXX question: How to wrap a function without knowing its interface?
XXX note: variables can be added to a scope after they are referenced/captured

xxx Section 4.2.1 Objects, prototype and prototype-based inheritance

Prototype-based language: no class, only object instances.
http://www.manageability.org/blog/stuff/prototype-based-programming/view

Prototype-based programming
http://en.wikipedia.org/wiki/Prototype-based_programming

Javascript 2.0 and ECMA 4
http://www.mozilla.org/js/language/es4/
http://www.mozilla.org/js/language/js20/

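setTimeout(f.call, 600); f.call only captures a reference to call itself (Function.prototype.call), not the specific one on f: the link to f is lost once the property has been read. To keep it, pass a wrapper function, as in setTimeout(function() { f.call(obj); }, 600), where obj is whatever object should be used as "this".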

Practical advice
for loops don't create scopes
variables not declared with "var" go in the global scope!
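
The first point can be checked with a short sketch (countTo is a hypothetical name): variables declared with var inside a for loop belong to the enclosing function's scope, so they are still visible after the loop.

```javascript
function countTo(n) {
  for (var i = 0; i < n; i++) {
    var last = i;
  }
  // Both i and last are still in scope here: the loop did not create one.
  return [i, last];
}

var afterLoop = countTo(3);   // [3, 2]
```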

Posted by Julien.