WebAssembly in Action

Author of the book "WebAssembly in Action"

Tuesday, November 13, 2012

HTML5 - WebSocket API


Before we start digging into the HTML5 WebSocket API, we need a bit of history to understand why this technology is even needed...

Normally, when you visit a web page with your browser, an HTTP request is sent to the server asking for the page. The server then responds with a result.

Depending on the type of information requested, stock prices for example, the information might be stale by the time you receive the response from the server.

So what do you do?

One common approach has been to have the web page poll the server to check for new information at a set interval as illustrated in the image below:



In the image above, polling is done every 50 milliseconds.

Polling the server unnecessarily has problems...
  • You're wasting the server's resources as it has to handle your request even if it has no data for you. Depending on how many web pages are checking back for new information, you can actually reduce the scalability of your server.
  • You're wasting network bandwidth, as cookies and other header information have to be passed to the server and back with every request and response.
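
To make the cost of polling concrete, here is a minimal sketch of a poller that counts how many of its requests came back empty. The createPoller name and the requestFn callback (standing in for whatever HTTP request the page would make) are assumptions for illustration, not part of any API.

```javascript
// Hypothetical polling helper. requestFn stands in for an HTTP request and
// must call its callback with the new data, or with null if nothing changed.
function createPoller(requestFn, onData) {
    var stats = { requests: 0, emptyResponses: 0 };
    var timerId = null;

    function pollOnce() {
        stats.requests++;
        requestFn(function (data) {
            if (data === null) {
                stats.emptyResponses++; // a wasted round trip
            } else {
                onData(data);
            }
        });
    }

    return {
        stats: stats,
        pollOnce: pollOnce,
        // Poll at a fixed interval, for example start(50) for every 50 ms
        start: function (intervalMs) { timerId = setInterval(pollOnce, intervalMs); },
        stop: function () { clearInterval(timerId); timerId = null; }
    };
}
```

Every empty response counted in stats.emptyResponses is a full request and response, headers included, that carried no useful data.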


Workarounds like long-polling have been tried where the client makes a request but, unlike with normal polling, the server doesn't return an empty response if it doesn't have any information. Instead, the server holds onto the request until it has some information to return.

When the server has new information, it sends it to the client, along with a complete response.

The client would then immediately make a new request to wait for more information.

If the data changes frequently, long-polling ends up being basically the same as polling, because every update requires a new request rather than reusing one single connection.
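
The request, response, and immediate new request cycle can be sketched roughly as follows. The longPoll name and the requestFn and isActive callbacks are made up for this illustration; requestFn stands in for an HTTP request that the server holds open until it has data.

```javascript
// Hypothetical long-polling loop. requestFn calls its callback once the
// server finally responds with data. isActive lets the caller stop the loop.
function longPoll(requestFn, onData, isActive) {
    if (!isActive()) { return; }
    requestFn(function (data) {
        onData(data);
        // The response completed the request, so a new one is issued right away
        longPoll(requestFn, onData, isActive);
    });
}
```

Note that three updates from the server still cost three complete requests, which is why frequent updates make this look like ordinary polling.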


Another workaround approach was to use streaming but some proxy servers and firewalls would actually buffer the messages which slows down the responses.

The result was that apps would either fall back to long-polling when buffering was detected, or send the messages over TLS (SSL). The latter adds even more overhead to the process because the server and client now also have to encrypt and decrypt the messages.


Workarounds to try and achieve full-duplex communication were also attempted by using two connections: one for the client to talk to the server and one for the server to talk to the client but that can be difficult to set up and coordinate.


Introducing Web Sockets

Web Sockets give you full-duplex functionality with a single connection between the client and server.

Full-duplex allows messages to be passed in both directions at the same time.

HTTP is half-duplex where data can only be passed in one direction at a time. For example, the server cannot respond to a client request until it has finished receiving the request.

A Web Socket connection can also remain open for the life of the session.

Based on tests run by Kaazing Corp, who have been closely involved in the specification process, Web Sockets can provide a 500:1 (sometimes even a 1000:1) reduction in unnecessary HTTP header traffic, depending on the size of the HTTP headers, and a 3:1 reduction in latency.

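The other nice thing about Web Sockets is that they operate over ports 80 and 443, the same ports used by HTTP and HTTPS, so firewalls that already allow web traffic generally let them through. Some proxies can still interfere with non-secure (ws://) connections, in which case a secure (wss://) connection tends to fare better.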


To open up a Web Socket, a client starts off with an HTTP handshake asking the server for an upgrade to the Web Socket protocol.
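
The browser performs the client side of this handshake for you, but to give a feel for what is involved, the sketch below shows how a server derives the Sec-WebSocket-Accept response header from the client's Sec-WebSocket-Key request header, as described in RFC 6455. It assumes a Node.js environment (the crypto module used here is not available in a browser), and the computeAcceptKey name is made up for this illustration.

```javascript
// Assumes Node.js; the browser handles the client side of the handshake itself
var crypto = require("crypto");

// Fixed GUID defined by RFC 6455 for the opening handshake
var WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// Hypothetical helper: computes the value the server returns in the
// Sec-WebSocket-Accept header for a given Sec-WebSocket-Key
function computeAcceptKey(secWebSocketKey) {
    return crypto
        .createHash("sha1")
        .update(secWebSocketKey + WEBSOCKET_GUID)
        .digest("base64");
}
```

The client compares this value against what it expects, which lets it confirm that the server really understood the upgrade request.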

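Once the connection is established, messages are sent back and forth in frames. In the final version of the protocol (RFC 6455), each frame starts with a small header of 2 to 14 bytes. Earlier drafts of the protocol instead marked a text message with a 0x00 byte at the start and a 0xFF byte at the end.

Either way, there are only a few bytes of overhead per message compared to the large amount of header information that would usually exist with an HTTP request and response.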
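
As a worked example of the per-message overhead, the sketch below computes the size of a frame header under the final version of the protocol (RFC 6455): 2 bytes at minimum, 2 or 8 more bytes for longer payloads, and a 4 byte masking key on frames sent by the client. The frameHeaderSize name is made up for this illustration.

```javascript
// Hypothetical helper: size in bytes of an RFC 6455 frame header
function frameHeaderSize(payloadLength, isMasked) {
    var size = 2; // flags/opcode byte plus the mask bit and 7-bit length byte
    if (payloadLength > 65535) {
        size += 8; // 64-bit extended payload length
    } else if (payloadLength > 125) {
        size += 2; // 16-bit extended payload length
    }
    if (isMasked) {
        size += 4; // masking key, required on client-to-server frames
    }
    return size;
}
```

A short text message sent by the server therefore carries 2 bytes of framing, and the same message sent by the browser carries 6.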


The following illustration compares polling with Web Sockets. Rather than the client constantly having to ask the server if there are any changes, the server simply tells the client when it has new information.




WebSocket object

The following is the current HTML5 WebSocket specification (from the September 20, 2012 W3C Candidate Recommendation):

[Constructor(DOMString url, optional (DOMString or DOMString[]) protocols)]
interface WebSocket : EventTarget {
    readonly attribute DOMString url;

    // ready state
    const unsigned short CONNECTING = 0;
    const unsigned short OPEN = 1;
    const unsigned short CLOSING = 2;
    const unsigned short CLOSED = 3;
    readonly attribute unsigned short readyState;
    readonly attribute unsigned long bufferedAmount;

    // networking
    attribute EventHandler onopen;
    attribute EventHandler onerror;
    attribute EventHandler onclose;
    readonly attribute DOMString extensions;
    readonly attribute DOMString protocol;
    void close([Clamp] optional unsigned short code, optional DOMString reason);

    // messaging
    attribute EventHandler onmessage;
    attribute DOMString binaryType;
    void send(DOMString data);
    void send(Blob data);
    void send(ArrayBuffer data);
    void send(ArrayBufferView data);
};


The WebSocket object accepts a url string as the first parameter to the constructor.

The readyState attribute of the WebSocket instance created will hold one of the constants: CONNECTING, OPEN, CLOSING, CLOSED.

The bufferedAmount property holds the number of bytes (UTF-8 text or binary data) that have been queued by the send() method but have not yet been transmitted to the network.

The byte count does not include the framing overhead or any buffering the operating system or network hardware might do.
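
One way the readyState and bufferedAmount attributes might be used together is to avoid piling up data on a connection that is not ready or is falling behind. The following is only a sketch: the sendIfReady name and the maxBufferedBytes limit are assumptions chosen for illustration.

```javascript
// Hypothetical helper: only sends when the socket is open and the amount of
// queued data does not exceed maxBufferedBytes (a caller-chosen limit).
// Returns true if the data was handed to send().
function sendIfReady(socket, data, maxBufferedBytes) {
    var OPEN = 1; // value of the OPEN constant in the specification
    if (socket.readyState !== OPEN) { return false; }
    if (socket.bufferedAmount > maxBufferedBytes) { return false; }
    socket.send(data);
    return true;
}
```

A caller could retry later, or drop stale updates such as old stock prices, whenever this returns false.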

There are several event handlers that you can hook into: onopen, onerror, onclose, and onmessage.
  • onopen receives the standard DOM event object
  • onerror receives a plain DOM event object. The specification fires a simple event named error, which carries no error details, and that matches what I see in my testing. Some sample code on the internet shows a handler accessing a .data property for the error message, but nothing in the specification defines it.
  • onclose receives a CloseEvent object
  • onmessage receives a MessageEvent object

The extensions property is for future use where the server will indicate which extensions, if any, were selected. For now, this is always an empty string.

The protocol property returns the subprotocol selected by the server, if any.

It can be used in conjunction with the array, from the 2nd parameter of the constructor, to perform subprotocol negotiation.
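
A rough sketch of subprotocol negotiation is shown below. The openWithSubprotocols name is made up, and the constructor is passed in as a parameter only so the sketch can be tried with window.WebSocket or with a stand-in object.

```javascript
// Hypothetical helper. WebSocketCtor would normally be window.WebSocket.
// protocols is the list of subprotocol names the client is willing to use.
function openWithSubprotocols(WebSocketCtor, url, protocols, onNegotiated) {
    var socket = new WebSocketCtor(url, protocols);
    socket.onopen = function () {
        // protocol holds the server's choice, or "" if it chose none
        onNegotiated(socket.protocol);
    };
    return socket;
}

// Example call (the URL and subprotocol names are placeholders):
// openWithSubprotocols(window.WebSocket, "ws://example.com/chat",
//     ["chat.v2", "chat.v1"],
//     function (selected) { alert("Server selected: " + selected); });
```

The client can then branch on the selected value to decide which message format to use for the rest of the session.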

The close method allows you to manually close the Web Socket connection.

It can accept two optional parameters.

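If specified, the first parameter must either be 1000 or in the range of 3000 to 4999. Any other value causes an InvalidAccessError exception to be thrown.

The second parameter is the reason why the close method is being called. It can be at most 123 bytes long when encoded as UTF-8, and a longer value causes a SyntaxError exception to be thrown.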

Note: I haven't been able to get this to work in either Firefox or Chrome. Both seem to ignore the values I'm passing them.
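
The rules for the two close parameters can be checked up front with something like the following sketch. The function names are made up for illustration, and the sketch assumes TextEncoder is available (it is in current browsers and in Node.js, but was not in the browsers of 2012).

```javascript
// Hypothetical helpers mirroring the argument rules for close().
// isValidCloseCode assumes it is given an integer.
function isValidCloseCode(code) {
    return code === 1000 || (code >= 3000 && code <= 4999);
}

function isValidCloseReason(reason) {
    // The limit applies to the UTF-8 encoded length, not the character count
    return new TextEncoder().encode(reason).length <= 123;
}
```

Because the limit is in bytes, a reason containing non-ASCII characters can be rejected even when it has fewer than 123 characters.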

The binaryType property will hold a string with either "blob" or "arraybuffer" for a value.

"blob" is the default value for the binaryType property.

If the binaryType property is set to "blob" then the binary data is returned in Blob form.

If the binaryType property is set to "arraybuffer" then the binary data is returned in ArrayBuffer form.
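
Since the data that arrives can be a string, a Blob, or an ArrayBuffer, a message handler usually needs to check what it received. The sketch below shows one way that might look; the describeMessageData name is made up for illustration.

```javascript
// Hypothetical helper for use inside an onmessage handler
function describeMessageData(data) {
    if (typeof data === "string") {
        return "text, " + data.length + " characters";
    }
    if (data instanceof ArrayBuffer) {
        return "ArrayBuffer, " + data.byteLength + " bytes";
    }
    if (typeof Blob !== "undefined" && data instanceof Blob) {
        return "Blob, " + data.size + " bytes";
    }
    return "unknown";
}

// Example use with an existing WebSocket instance named socket:
// socket.binaryType = "arraybuffer";
// socket.onmessage = function (evt) { alert(describeMessageData(evt.data)); };
```

Text messages always arrive as strings regardless of the binaryType setting, which only affects how binary messages are delivered.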

There are several overloads for the send method available allowing you to send a string, Blob, ArrayBuffer, or ArrayBufferView to the server.


MessageEvent object

The following is the current definition of the MessageEvent object which is found in the HTML5 Web Messaging specification (from the May 1, 2012 W3C Candidate Recommendation):

[Constructor(DOMString type, optional MessageEventInit eventInitDict)]
interface MessageEvent : Event {
    readonly attribute any data;
    readonly attribute DOMString origin;
    readonly attribute DOMString lastEventId;
    readonly attribute WindowProxy? source;
    readonly attribute MessagePort[]? ports;
};

dictionary MessageEventInit : EventInit {
    any data;
    DOMString origin;
    DOMString lastEventId;
    WindowProxy? source;
    MessagePort[]? ports;
};

When you receive the MessageEvent object, in your WebSocket's onmessage event handler, the property you will be interested in is the data property.


CloseEvent object

When the Web Socket is closed, the onclose event is triggered and will be passed the CloseEvent object:

[Constructor(DOMString type, optional CloseEventInit eventInitDict)]
interface CloseEvent : Event {
    readonly attribute boolean wasClean;
    readonly attribute unsigned short code;
    readonly attribute DOMString reason;
};

dictionary CloseEventInit : EventInit {
    boolean wasClean;
    unsigned short code;
    DOMString reason;
};

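If you call the WebSocket instance's close method without specifying any parameters, the onclose event handler in my tests receives a CloseEvent object containing the values below. Note that 1006 is the code reserved for an abnormal closure. According to RFC 6455, a close that completes cleanly without a status code should instead report code 1005 with wasClean set to true.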
  • wasClean will hold false
  • code will hold 1006
  • reason will be an empty string

For some reason, in my tests using Firefox and Chrome, even if I specify values for the close event parameters, I still end up receiving the default values.
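
When experimenting with this, it helps to log all three CloseEvent values together. The following sketch does that; the describeClose name is made up for illustration.

```javascript
// Hypothetical helper that summarizes a CloseEvent for logging
function describeClose(closeEvent) {
    var summary = (closeEvent.wasClean ? "clean" : "unclean") +
        " close, code " + closeEvent.code;
    if (closeEvent.reason) {
        summary += ", reason: " + closeEvent.reason;
    }
    return summary;
}

// Example use with an existing WebSocket instance named socket:
// socket.onclose = function (evt) { alert(describeClose(evt)); };
```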


Example JavaScript Code

Before using any technology that might not exist in other browsers you should check to see if the feature exists first.

For Web Sockets, checking whether the browser supports them is as simple as an if statement that tests window.WebSocket.

Also, you may want to wrap your Web Socket code in a try/catch statement because the user agent (browser) is expected to throw exceptions if certain conditions are not met.

The following is some very simple JavaScript code showing how to create an HTML5 WebSocket object, how to wire up the events, and how to send a string to the server:

if (window.WebSocket) {
    try {
        // Create the WebSocket and wire up the event handlers
        var wsWebSocket = new WebSocket("ws://echo.websocket.org/");

        // Calling send() while the connection is still being established
        // throws an InvalidStateError, so wait for onopen before sending
        wsWebSocket.onopen = function (evt) {
            alert("Connection opened");

            // Send the server a message
            wsWebSocket.send("hello from the client");
        };
        wsWebSocket.onclose = function (ceCloseEvent) {
            alert("Connection closed");
        };
        wsWebSocket.onerror = function (evt) { alert("we had an error"); };

        // The onmessage event handler is called when we receive data
        // from the server
        wsWebSocket.onmessage = function (meMessageEvent) {
            alert("message from the server: " + meMessageEvent.data);
        };

        // You can manually close the WebSocket connection if it's no
        // longer needed
        // wsWebSocket.close();
    } catch (err) { alert("error: " + err.message); }
} else { alert("Sorry. WebSockets are not supported by your browser"); }


In Conclusion

There are several issues to be aware of when working with the HTML5 WebSocket API...
  1. Some browsers won't allow a mixed content environment
    • Can't use a non-secure Web Socket connection (ws://) if the page was loaded with https://
    • Can't use a secure Web Socket connection (wss://) if the page was loaded with http://
  2. Some server errors will be masked with the close code of 1006 and will not specify the actual error in order to prevent a script from being able to distinguish the actual error while probing the network in preparation for an attack. Some of the causes of the 1006 error are:
    • A server whose host name could not be resolved
    • A server to which packets could not successfully be routed
    • A server that refused the connection on the specified port
    • A server that failed to correctly perform a TLS handshake (e.g. the server certificate can't be verified)
    • A server that did not complete the opening handshake (e.g. because it was not a WebSocket server)
    • A WebSocket server that sent a correct opening handshake, but that specified options that caused the client to drop the connection (e.g. the server specified a subprotocol that the client did not offer)
    • A WebSocket server that abruptly closed the connection after successfully completing the opening handshake
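
For the mixed content issue, one possible approach is to pick the Web Socket scheme from the protocol the page itself was loaded with. The sketch below assumes that approach; the webSocketUrlFor name and the host and path values are placeholders for illustration.

```javascript
// Hypothetical helper: pageProtocol is expected to be window.location.protocol
// ("http:" or "https:"); host and path are chosen by the caller
function webSocketUrlFor(pageProtocol, host, path) {
    var scheme = (pageProtocol === "https:") ? "wss://" : "ws://";
    return scheme + host + path;
}

// Example use in a page (the host and path are placeholders):
// var url = webSocketUrlFor(window.location.protocol, "example.com", "/echo");
```

This only helps if the server actually accepts both secure and non-secure connections.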


Additional Resources

The currently published W3C Candidate Recommendation of the HTML5 WebSocket API specification can be found here: http://www.w3.org/TR/websockets/

The currently published W3C Candidate Recommendation of the HTML5 Web Messaging specification, which contains the definition for the MessageEvent object, can be found here: http://www.w3.org/TR/webmessaging/

The following is a very handy echo server that can be used when testing out the Web Socket technology: ws://echo.websocket.org/