[Developer says] Gazing into the abyss of Javascript asynchrony pt.2

Click for part 1

Putting it all together

Here is a more complex example:

var nextTask = 0, outOfOrder = false;
for (var i = 0; i < 1000; i++) {
  (function (i) { // Per task specific scope
    setTimeout(function task () {
      if (nextTask++ !== i) {
        console.log("Out of order!", i, nextTask);
        outOfOrder = true;
      }

      // Report only once all 1000 tasks have actually run. Checking
      // right after the loop would happen before any task is executed.
      if (nextTask === 1000 && !outOfOrder) {
        console.log("All functions were called in the order they were scheduled");
      }
    }, 0);
  })(i);  // end of per task scope
}

What is going on here:

  • 1000 tasks are scheduled and each one gets assigned an id (i) that represents the sequence in which it was defined.
  • Each task compares a shared value nextTask with its id.
  • nextTask is then incremented, thus predicting which task will be executed next.
  • If all 1000 tasks predict the next task correctly, it is a strong indication that Javascript schedules tasks in a very deterministic manner (spoiler alert: it does).

There are multiple points to be made about this piece of code. Notice first the pattern (function (i) {...})(i) in the for loop. What happens here is that we create a bunch of functions, each of them named task. Due to lexical scoping they all have access to the variables nextTask, outOfOrder, and i because they all have the same outer scope. When a function is defined it doesn’t keep a copy of the outer scope, it keeps a reference to it. Therefore, not only are changes to the scope visible from inside the function, but the scope is also mutable by the function itself. This way we can read and write the variable nextTask from each task and have that change be visible to the rest of the tasks. The same applies to i.

Each function gets a reference to i and then i is changed. Therefore, by the time a task gets executed, the value of i is different from what it was when the function was defined. That is not what we want here. We want each function to have a separate value for i, but to share the value of nextTask. The way we get around this problem is by creating a scope specific to each one of the tasks. Whenever a function is called, a local scope is created to accommodate its arguments and locally declared variables. Thus we write function (i) { /*scope*/ } and then call it, copying the current value of i into the argument list. Within the context created this way, we define our task, which now has access to a copied version of i and a shared reference to nextTask. Once the task is defined and scheduled, the wrapper function returns, and the only references left to each copied i are the closures of the tasks. Voilà! Each task has its own personal copy of i.

Figure: The effect of closures on scope
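
To see the difference in isolation, here is a minimal sketch that leaves the timers out and only looks at what each closure captures. The function names captureShared and capturePerTask are made up for this illustration; they are not part of the example above.

```javascript
// Every closure shares the single loop variable i, so by the time
// the closures are called they all see its final value.
function captureShared() {
  var fns = [];
  for (var i = 0; i < 3; i++) {
    fns.push(function () { return i; });
  }
  return fns.map(function (f) { return f(); });
}

// Each closure is created inside its own wrapper scope, which holds
// a copy of i taken at the moment the wrapper was called.
function capturePerTask() {
  var fns = [];
  for (var i = 0; i < 3; i++) {
    (function (i) {
      fns.push(function () { return i; });
    })(i);
  }
  return fns.map(function (f) { return f(); });
}

console.log(captureShared());  // [ 3, 3, 3 ]
console.log(capturePerTask()); // [ 0, 1, 2 ]
```

The only difference between the two functions is the wrapper (function (i) {...})(i), which is exactly the pattern used in the scheduling example.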

Back to the scheduling issue: the output of this code snippet, as you may have guessed, is

All functions were called in the order they were scheduled

Whenever the timer of a task defined with setTimeout expires, the task gets pushed to the back of the message queue. When the currently running code block finishes, a message is taken from the front of the message queue and processed. In our case we push 1000 functions to the message queue and then each one of them gets executed in the order it was pushed there. The exact same thing happens when an event triggers a function: the function is pushed to the back of the queue and waits for its turn to be executed.
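
The queue behaviour described above can be observed with a small sketch. The log array and the function names are only there for illustration; the point is that the currently running block always finishes before any queued task starts, even with a delay of 0.

```javascript
var log = [];

log.push("sync start");

// Both timers expire immediately, but their callbacks still have to
// wait in the message queue until the current block has finished.
setTimeout(function first () { log.push("task 1"); }, 0);
setTimeout(function second () { log.push("task 2"); }, 0);

log.push("sync end");

// Scheduled last, so it runs after the two tasks above.
setTimeout(function report () {
  console.log(log.join(", "));
  // prints: sync start, sync end, task 1, task 2
}, 0);
```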

Wrapping up

Now onto some interesting implications and caveats of this model of concurrency:

Background tabs will clamp timers to 1s

If you don’t develop for the browser, or if you only depend on timer precision for animations, then you probably won’t even notice that once a tab becomes inactive, browsers clamp all setTimeout and setInterval delays to a minimum of about 1000ms. We will talk about how to get around that in a future post.

Scheduling tasks allows the system to be more responsive

When executing computationally intense tasks, it is usually better to break them up into different tasks and throw them individually into the queue to keep the rest of the system responsive. For example, you may want to consider replacing something like:

enormousText.replace(/[\.,?"':;!)( ]+/g, " ").split(' ').some(misspelledWord)

with something along the lines of

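// misspelledWord is assumed to return true for a misspelled word
// (and false for the empty strings that split may produce).
function checkText (enormousText, cb) {
  setTimeout(function checkFirstSentence () {
    // Separate the first sentence from the rest of the text
    var end = enormousText.indexOf('.');
    var sentence = end === -1 ? enormousText : enormousText.slice(0, end);
    var rest = end === -1 ? '' : enormousText.slice(end + 1);

    // If a word was misspelled stop
    if (sentence
        .replace(/[\.,?"':;!)( ]+/g, " ")
        .split(' ')
        .some(misspelledWord)) {
      cb(false);
      return;
    }

    // Check if that was the last sentence
    if (rest.length === 0) {
      cb(true);
      return;
    }

    // Continue with the remaining sentences in a new task
    checkText(rest, cb);
  }, 0);
}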

The second version checks the sentences of the text one by one, each in a separate task. It may be more complex, but if an event occurs during the computation (e.g. a mouse hover), it will be served after the current sentence has been spellchecked instead of waiting for the whole text to be processed. Another side effect is that checkText is now a higher order function, i.e. it accepts a function (cb) as an argument, which it uses to emit the result of the computation.
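
The same idea can be sketched more generally. The helper below, processInChunks, is a hypothetical name invented for this illustration: it handles a few items per task and puts itself back in the queue between chunks, so that any event that arrived in the meantime gets served first.

```javascript
// Handle `items` a few at a time, yielding to the message queue
// between chunks. `done` is called once every item was handled.
function processInChunks (items, chunkSize, handleItem, done) {
  var index = 0;

  function runChunk () {
    var end = Math.min(index + chunkSize, items.length);
    for (; index < end; index++) {
      handleItem(items[index]);
    }

    if (index < items.length) {
      setTimeout(runChunk, 0);  // more work left: schedule the next chunk
    } else {
      done();
    }
  }

  setTimeout(runChunk, 0);
}

// Usage: sum five numbers, two per task
var total = 0;
processInChunks([1, 2, 3, 4, 5], 2,
  function (n) { total += n; },
  function () { console.log("sum:", total); });  // prints: sum: 15
```

Note that nothing has been processed yet when the call to processInChunks returns; the result only becomes available inside the done callback.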

Prefer callbacks to return values

Javascript functions are first class citizens in the sense that they can be used as data in the same way integers, strings and objects can. Using return values instead of callbacks is usually a bad idea. Even if the current implementation of your function can do all its work within a single task, you never know when you will want to delegate the computation to the server or apply the above technique to improve responsiveness.
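
As a rough sketch of why this matters, consider two hypothetical implementations of a word counter (the names countWordsNow and countWordsLater are made up for this example). The first does all its work in the current task, the second defers it to a later one, yet the code that consumes the result is identical for both.

```javascript
var results = [];

// Version 1: does all the work within the current task
function countWordsNow (text, cb) {
  cb(text.split(/\s+/).filter(Boolean).length);
}

// Version 2: same interface, but the work happens in a later task
function countWordsLater (text, cb) {
  setTimeout(function () {
    cb(text.split(/\s+/).filter(Boolean).length);
  }, 0);
}

// The caller does not care which version it is talking to
function report (count) {
  results.push(count);
  console.log("words:", count);
}

countWordsNow("to be or not to be", report);   // prints: words: 6
countWordsLater("to be or not to be", report); // prints: words: 6
```

Had the first version returned the count instead of passing it to a callback, every caller would need to be rewritten when switching to the second one.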

fakedrake