WebAssembly in Action

Author of the book "WebAssembly in Action"
Save 40% with the code: ggallantbl
The book's original source code can be downloaded from the Manning website and GitHub. The GitHub repository includes an updated-code branch that has been adjusted to work with the latest version of Emscripten (currently version 3.1.44).

Friday, August 6, 2010

A Deeper Look at HTML 5 Web Workers


In this post we will have a deeper look at some of the aspects of HTML 5 Web Workers than we did in my previous article 'An Introduction to HTML 5 Web Workers'.

If you are new to HTML 5 Web Workers then I recommend that you read my previous post which details the different types of web workers that are available, how to create the worker objects and how to communicate between the worker and its creator.


Multiple Shared Workers

When you have a web site (or web application) with multiple windows each needing access to a worker thread you don't really want to create a new thread in each window because it takes time and system resources to create each worker thread.

The ability to share a single worker thread among each window from the same origin comes as a great benefit in this case.

The following is the simplest way to create a SharedWorker thread that multiple windows from the same origin can make use of:

// Window 1
var aSharedWorker = new SharedWorker("SharedWorker.js");

// Window 2
var aSharedWorker = new SharedWorker("SharedWorker.js");

The SharedWorker object accepts an optional 2nd parameter in the constructor that serves as the name of the worker.

When you don't specify the 2nd parameter, it's pretty much the same as specifying a name that is exactly the same each time. With that in mind, the above example can be rewritten as follows and will do the exact same thing:

// Window 1
var aSharedWorker = new SharedWorker("SharedWorker.js", "SharedWorkerName");

// Window 2
var aSharedWorker = new SharedWorker("SharedWorker.js", "SharedWorkerName");

Most of the time having one shared worker will give the needed functionality. If you simply have a desire to add more parallel processing, the shared worker can always spawn web workers of its own (dedicated web workers can also spawn additional worker threads if need be).

What if you run into a scenario where you have a need for several windows to share several workers rather than just the one?

That's where the 2nd parameter of the SharedWorker constructor comes into play.

You can create several different SharedWorker threads by specifying different names when creating the worker objects.

The following is an example of two windows each sharing two worker threads 'Worker1' and 'Worker2':

// Window 1 - Shared Worker 1 & 2
var aSharedWorker1 = new SharedWorker("SharedWorker.js", "Worker1");
var aSharedWorker2 = new SharedWorker("SharedWorker.js", "Worker2");

// Window 2 - Shared Worker 1 & 2
var aSharedWorker1 = new SharedWorker("SharedWorker.js", "Worker1");
var aSharedWorker2 = new SharedWorker("SharedWorker.js", "Worker2");

When it comes to giving shared workers a name it's important to be aware that the names are case sensitive.

If you give one worker the name Worker1 and the other worker the name worker1 they may look the same but the case difference with the 'w' causes there to be two worker threads created rather than simply creating and reusing a single worker thread.


Scope of Shared Workers

Something that you may be curious about is what happens when the window that created a shared worker is closed? Is the worker thread still accessible by other windows from the same origin?

A shared worker will remain active as long as one window has a connection to it. For example:

Window 1 creates a shared worker and Window 2 connects to it.

Later Window 1 is closed but, because Window 2 is still connected, the shared worker thread remains even though Window 2 didn't originally create it.



Importing JavaScript into a Worker Thread

Most modern web development now includes several JavaScript files and, in a lot of cases, JavaScript libraries/frameworks.

Fortunately, worker threads have access to a global function called 'importScripts' which allows you to pull the necessary JavaScript files into the worker thread's scope.

The importScripts function allows you to specify one or more JavaScript files to be imported.

If multiple files are specified, they are downloaded in parallel but will be loaded and processed synchronously in the order specified.

Also, the importScripts function does not return until the JavaScript files specified have been loaded and processed.

The following are some examples of how the importScripts function can be used:

// Import just one file
importScripts("file.js");

// Import multiple files with one call (careful of their order if they have
// global variables that depend on other js files being included in the
// request...the JS files will be downloaded in parallel but loaded and
// processed synchronously based on their order here).
//
// Note: I stopped at 3 files to import but you can include any amount

importScripts("file1.js", "file2.js", "file3.js");

Technically, the W3C specification allows for the importScripts function to be used with no parameters but doing so has no effect.


Attaching to 'onmessage' Within a Shared Worker

In my previous post, I had shown the following method of attaching to a shared worker's onmessage event (within the thread):

onconnect = function (evt) {
var port = evt.ports[0];
port.onmessage = function (e) { OnControllerMessage(e, port); }
}

function OnControllerMessage(e, port) {
var sReturnMessage = ("Hello from the sharedworker thread!...This is what you sent my way: " + e.data);

port.postMessage(sReturnMessage);
}

Unfortunately, the above example was accomplished by creating a closure and I'm not a fan of using closures.

I did some digging and found a workaround that avoids the needed for a closures by using the target property of the event object. The target property, in this case, is the port.

The following example demonstrates connecting to the onmessage event and posting data back to the caller via the event's target property:

onconnect = function (evt) {
evt.ports[0].onmessage = OnControllerMessage;
}

function OnControllerMessage(e) {
var sReturnMessage = ("Hello from the sharedworker thread!...This is what you sent my way: " + e.data);

e.target.postMessage(sReturnMessage);
}



Timeouts

Worker threads support timeouts (setTimeout, clearTimeout, setInterval, and clearInterval) which is useful if you only want to process information at set intervals rather than be constantly processing.

One scenario that comes to mind could be if you had a need to check for updates from a 3rd party site and didn't want to constantly hammer the site with requests.

You could set up a timer to check the site every five minutes or so and if new information is available then pass it up to the UI layer so that it can be included in the display information.


The XMLHttpRequest Object

The XMLHttpRequest object can be used in a worker thread and can be used synchronously or asynchronously. It's totally up to you if you use it asynchronously since a worker thread is already separate from the UI thread so waiting for a synchronous return won't hurt the UI's responsiveness.

Something to be aware of about using the XMLHttpRequest object, from within a worker thread, is that the 'responseXML' property will always be null.

The Gecko-specific 'channel' attribute in the XMLHttpRequest instance will also be null when the XMLHttpRequest object is used within a worker thread.

The return data from an XMLHttpRequest, when within a worker thread, will always be in the 'responseText' property as shown in the following example:

var xhrRequest = new XMLHttpRequest();
xhrRequest.open("GET", sURL, false);
xhrRequest.send();

return xhrRequest.responseText;



Error Handling

Unhandled exceptions can be caught within the worker thread by attaching a function to the global 'onerror' event.

There are some browser differences to note here:
  • Chrome 5 and Safari 5 both just pass the error as a string to the error handler in the thread
  • Firefox 3.6.8 and 4.0 beta 2 pass in an ErrorEvent object to the error handler in the thread


The ErrorEvent object's attributes are:
  • message - A human-readable error message
  • filename - The name of the script file in which the error occurred
  • lineno - The line number of the script file on which the error occurred


The following is an example of attaching to the onerror event of a dedicated worker thread (the example will also work for shared workers with the exception that with shared workers postMessage needs to be called on a port):

// Attach to the global error handler of the thread
onerror = OnErrorHandler;

function OnErrorHandler(e) {
// In Chrome 5/Safari 5, 'e' is a string for both dedicated and shared
// workers within the thread

if (typeof (e) == "string"){
postMessage("Error Message: " + e);
} else { // Dedicated worker in Firefox...(Firefox does not yet support
// shared workers)
postMessage("Error Message: " + e.message + " File Name: " + e.filename + " Line Number: " + e.lineno);
}
}

// to test the error handler, throw an error
throw "This is a test error";

You can also attach to an error event of the worker object instance itself so that the creating thread can also have access to the error information.

All browsers (Chrome 5, Safari 5, Firefox 3.6.8 / 4.0 beta 2) implement the dedicated worker instance error event in the same way by passing in the ErrorEvent object.

When it comes to shared workers, however, I have not been able to get the shared worker object instance to trigger the onerror event in Chrome 5 or Safari 5. From my research, it appears that for shared workers the onerror event will only be triggered for the shared worker instance if there was a network error while the worker thread was being created.


The following is an example of attaching to the onerror event of a dedicated worker instance (you set up a shared worker the exact same way):

var aWorker = new Worker("DedicatedWorker.js");
aWorker.onerror = OnErrorMsg;

function OnErrorMsg(e) {
alert("Error Message: " + e.message + " File Name: " + e.filename + " Line Number: " + e.lineno);
}



Closing a Worker Thread

The creator of a worker can close the thread by calling 'terminate' on the worker instance as in the following example:

var aDedicatedWorker = new Worker("DedicatedWorker.js");

...

// We're done with the tread so let the browser release the system
// resources

aDedicatedWorker.terminate();

An alternative to using the terminate method on the worker instance is that the thread has the ability to close itself as in the following example:

// JavaScript of the thread itself

...

close();



In Closing

Between my previous post 'An Introduction to HTML 5 Web Workers' and this one, I hope I have been able to help you in your understanding of web workers so that you can make use of all the possibilities this exciting new technology brings to web development!

If you would like more information on the HTML 5 Web Workers specification you can click on the following links:


New: I have recently written an article for DZone.com which combines the Web Worker information presented in my blog posts and introduces some new information like transferable objects, inline workers, and includes a link to a sample project stored on github.com. The article can be found here: http://refcardz.dzone.com/refcardz/html5-web-workers

No comments:

Post a Comment