February 19, 2015

HTTP/2 finalized - a quick overview

The HTTP/2 specification has been finalized and will soon be released as an RFC. Every major browser will get support for it soon.

This is a major new release of the specification, and builds upon earlier work such as Google’s SPDY protocol (which is now being deprecated). It’s the first major release of the protocol since 1999, which is 16(!) years ago.

While e-mail may still be the most widely used service on the internet, HTTP is certainly the protocol that is most visible and relevant to many developers’ day-to-day work.

Even the BBC is talking about it!

As many of you develop HTTP-based applications, here are a few things you should know:

HTTP/2 is a new way to transmit HTTP/1.1 messages

HTTP/2 does not make any major changes to how HTTP works. The biggest difference is in how the information is transmitted.

HTTP/1.1 (and 1.0, 0.9) sent everything in plain text; HTTP/2 uses a binary encoding.

HTTP/2 still has requests, responses, headers, status codes and the same HTTP methods.

Subtle differences

HTTP/2 encodes HTTP header names as lowercase. HTTP/1.1 header names were already case-insensitive, but not everybody adhered to that rule.
HTTP/2 does away with the ‘reason phrase’. In HTTP/1.1 it was possible for servers to send a human-readable reason along with the status code, such as HTTP/1.1 404 Can't find it anywhere! or HTTP/1.1 200 I like pebbles. In HTTP/2 this is removed.
A new status code! (HTTP geeks love status codes.) 421 Misdirected Request allows the server to tell a client that it received an HTTP request that it is not able to produce a response for. For instance, a client may have picked the wrong HTTP/2 server to talk to.
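To make the first two differences concrete, here is a minimal Python sketch that turns the head of an HTTP/1.1 response into the kind of header list HTTP/2 carries. The function name and the example response are made up for illustration; the one thing taken from the specification is that HTTP/2 carries the status code in a `:status` pseudo-header.

```python
def to_http2_headers(http1_head: str) -> list[tuple[str, str]]:
    """Convert an HTTP/1.1 response head into an HTTP/2-style header list.

    Hypothetical helper for illustration, not part of any library.
    """
    lines = http1_head.strip().split("\r\n")

    # Status line, e.g. "HTTP/1.1 404 Can't find it anywhere!"
    _version, status, *_reason = lines[0].split(" ", 2)

    # HTTP/2 keeps only the status code (in the ":status" pseudo-header);
    # the reason phrase is dropped.
    headers = [(":status", status)]

    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are lowercased, values are left untouched.
        headers.append((name.strip().lower(), value.strip()))
    return headers


example = (
    "HTTP/1.1 404 Can't find it anywhere!\r\n"
    "Content-Type: text/html\r\n"
    "X-Powered-By: Pebbles\r\n"
)
print(to_http2_headers(example))
# [(':status', '404'), ('content-type', 'text/html'), ('x-powered-by', 'Pebbles')]
```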

Upgrades to HTTP/2 can be invisible and transparent

Most browsers will start supporting HTTP/2 extremely soon. When a browser makes a normal HTTP/1.1 request in the future, it will include some information that tells the server it supports HTTP/2, using the Upgrade HTTP header. For HTTPS, this is done using a different mechanism (ALPN, a TLS extension).

If the server supports HTTP/2 as well, the switch will happen instantly and this will be invisible to the user.
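As a rough sketch of what that looks like on the wire for plain http://, the Python below builds an HTTP/1.1 request that offers an upgrade. It assumes the `h2c` token and the `HTTP2-Settings` header as described in the HTTP/2 specification; the host name, the helper names and the choice of setting are illustrative only.

```python
import base64
import struct

SETTINGS_ENABLE_PUSH = 0x2  # setting identifier defined by the HTTP/2 spec


def http2_settings_header(settings: dict[int, int]) -> str:
    """Encode settings the way the HTTP2-Settings header expects them."""
    # Each setting is a 16-bit identifier followed by a 32-bit value.
    payload = b"".join(
        struct.pack("!HI", ident, value) for ident, value in settings.items()
    )
    # The header value is base64url-encoded, with trailing padding removed.
    return base64.urlsafe_b64encode(payload).rstrip(b"=").decode("ascii")


def build_upgrade_request(host: str, path: str = "/") -> str:
    """Build an HTTP/1.1 request that offers to switch to HTTP/2 (h2c)."""
    # Illustrative choice: tell the server we do not want server push.
    settings = http2_settings_header({SETTINGS_ENABLE_PUSH: 0})
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: Upgrade, HTTP2-Settings\r\n"
        "Upgrade: h2c\r\n"
        f"HTTP2-Settings: {settings}\r\n"
        "\r\n"
    )


print(build_upgrade_request("example.com"))
```

A server that agrees answers with 101 Switching Protocols and continues in HTTP/2; a server that doesn't know about HTTP/2 ignores the offer and replies in HTTP/1.1 as usual.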

Everyone will still use http:// and https:// URLs.

If an HTTP client already knows the server supports HTTP/2, it can start speaking HTTP/2 right from the get-go.

Many server-side developers don’t have to think much about this. If you are a PHP developer, you can just upgrade to an HTTP server that does HTTP/2, and the rest will be transparent.

HTTP/2 is probably faster

There are a few major features that improve speed when switching to HTTP/2:

A lot of bytes in HTTP/1.x are wasted on HTTP headers, which are sent back and forth uncompressed with every request and response. In HTTP/2, HTTP headers can be compressed using the new HPACK compression algorithm.
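To give a feel for where the savings come from, here is a deliberately tiny sketch of one HPACK idea: very common header fields live in a predefined static table and can be sent as a single byte containing their index. The table below is only an excerpt, and `encode_indexed` is a made-up helper. Real HPACK also has a dynamic table, literal fields and Huffman coding, none of which are shown here.

```python
# A small excerpt of the HPACK static table (index -> header field).
STATIC_TABLE = {
    2: (":method", "GET"),
    3: (":method", "POST"),
    4: (":path", "/"),
    6: (":scheme", "http"),
    7: (":scheme", "https"),
    8: (":status", "200"),
}
INDEX_OF = {field: index for index, field in STATIC_TABLE.items()}


def encode_indexed(headers):
    """Encode header fields that appear verbatim in the static table.

    Each field becomes one byte: the high bit set, followed by the
    7-bit table index. Fields outside the excerpt raise KeyError.
    """
    out = bytearray()
    for field in headers:
        out.append(0x80 | INDEX_OF[field])
    return bytes(out)


request = [(":method", "GET"), (":scheme", "https"), (":path", "/")]
print(encode_indexed(request).hex())  # 828784
```

Three header fields end up as three bytes, where a text encoding would spend that much on a single method name.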

A big feature that came with HTTP/1.1 was ‘pipelining’. This is a feature that allows an HTTP client to send multiple requests on a connection without having to wait for a response. Because of poor and broken implementations, this feature has never really been enabled in browsers. In HTTP/2 a comparable feature, multiplexing, comes out of the box. Only one TCP connection is needed per client, and a client can send multiple requests and receive multiple responses in parallel. If one of the HTTP responses is stalled, this doesn’t block the rest of the HTTP responses.

So for applications this can mean:

Fewer HTTP connections open
Less data being sent
Less round-tripping

Server push

In HTTP/2 it’s possible to preemptively send a client responses to requests, before the requests were made.

This can seriously speed up application load time. Normally when an HTML application is loaded, a client has to wait until it has received all the <img>, <script> and <link> tags to know what else needs to be requested.

Server push allows the server to just send out those resources before the client even requested them.
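How a server decides what to push is up to the server. As one possible approach (an assumption for illustration, not something the specification prescribes), a server could scan the HTML it is about to send for the same tags a browser would otherwise have to discover on its own. A minimal sketch using Python's standard `html.parser`, with made-up class and function names:

```python
from html.parser import HTMLParser


class PushCandidateFinder(HTMLParser):
    """Collect URLs from <img>, <script> and <link> tags."""

    # Which attribute holds the URL for each tag we care about.
    URL_ATTR = {"img": "src", "script": "src", "link": "href"}

    def __init__(self):
        super().__init__()
        self.candidates = []

    def handle_starttag(self, tag, attrs):
        wanted = self.URL_ATTR.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                self.candidates.append(value)


def find_push_candidates(html: str) -> list[str]:
    finder = PushCandidateFinder()
    finder.feed(html)
    return finder.candidates


page = """
<html><head>
<link rel="stylesheet" href="/style.css">
<script src="/app.js"></script>
</head><body><img src="/logo.png"></body></html>
"""
print(find_push_candidates(page))
# ['/style.css', '/app.js', '/logo.png']
```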

In the case of a server push, the server actually sends back both the HTTP response, and the actual request that the client would have had to send in order to receive the response. The request is sent in a PUSH_PROMISE frame.

Streams and frames

HTTP/1.1 has a very distinct “request” and “response” flow in its messaging. A new response can only be sent over the wire after the last one has completed.

In HTTP/2 multiple messages can be sent at the same time. Every message that’s currently being sent gets its own “stream”.

The messages get sent ‘interleaved’ by splitting them up in multiple “frames”.

This is not unlike video formats, which are also split up into multiple streams (video, audio, subtitles), each of which is split up into interleaved frames.
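The interleaving idea can be sketched in a few lines of Python. This is a toy model: frames are just (stream id, chunk) tuples, the frame size is tiny, and frames are interleaved round-robin, which is an assumption made for simplicity (a real sender picks the order itself, for example based on priorities). All names are made up.

```python
from itertools import zip_longest


def split_into_frames(stream_id, message, frame_size):
    """Split one message into (stream_id, chunk) frames."""
    return [
        (stream_id, message[i:i + frame_size])
        for i in range(0, len(message), frame_size)
    ]


def interleave(*framed_messages):
    """Take one frame from each stream in turn until all are sent."""
    wire = []
    for round_of_frames in zip_longest(*framed_messages):
        wire.extend(f for f in round_of_frames if f is not None)
    return wire


def reassemble(wire):
    """Receiver side: glue chunks back together per stream id."""
    streams = {}
    for stream_id, chunk in wire:
        streams[stream_id] = streams.get(stream_id, b"") + chunk
    return streams


css = split_into_frames(1, b"body { color: red }", 8)
js = split_into_frames(3, b"alert(1)", 8)

wire = interleave(css, js)
print([stream_id for stream_id, _ in wire])  # [1, 3, 1, 1]
print(reassemble(wire))
```

The short message on stream 3 is complete after the second frame on the wire, even though the longer message on stream 1 is still in progress.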

There are a few different frame types:

The DATA frame carries a list of bytes, for example to encode HTTP request and response bodies.
The HEADERS frame carries a list of HTTP headers.
The PRIORITY frame allows the sender of a message (client or server) to indicate that a certain stream should get a higher or lower priority. This is useful in situations where there’s limited capacity available.
The RST_STREAM frame is used to terminate a stream in case of an error.
The SETTINGS frame is used for senders to indicate how they wish to communicate. For example, a client can indicate that it does not wish to get resources preemptively pushed by the server.
The PUSH_PROMISE frame is sent by the server to tell the client about a response it intends to push.
The PING frame allows a peer to measure the round-trip time between client and server, and is a simple way to find out if the other peer is still alive.
The GOAWAY frame allows a peer to tell the other side that no new streams should be started on the current connection, and that a new connection should be opened instead.
The WINDOW_UPDATE frame is used to implement ‘flow control’: a receiver uses it to tell the sender how much more data it is willing to accept, which helps peers with resource constraints, such as memory.
The CONTINUATION frame follows a HEADERS or PUSH_PROMISE frame and contains additional data that didn’t fit in the last frame.
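All of these frame types share the same 9-byte header: a 24-bit payload length, an 8-bit type, 8 bits of flags, and a reserved bit followed by a 31-bit stream identifier. The sketch below packs and unpacks that header in Python. The type codes are the ones from the specification; `pack_frame` and `unpack_frame_header` are made-up helper names, and nothing here validates payloads or enforces size limits.

```python
import struct
from enum import IntEnum


class FrameType(IntEnum):
    DATA = 0x0
    HEADERS = 0x1
    PRIORITY = 0x2
    RST_STREAM = 0x3
    SETTINGS = 0x4
    PUSH_PROMISE = 0x5
    PING = 0x6
    GOAWAY = 0x7
    WINDOW_UPDATE = 0x8
    CONTINUATION = 0x9


def pack_frame(frame_type, flags, stream_id, payload=b""):
    """Serialize one frame: 9-byte header followed by the payload."""
    if len(payload) >= 1 << 24:
        raise ValueError("payload does not fit in a 24-bit length")
    header = len(payload).to_bytes(3, "big")
    # The top bit of the stream id field is reserved and sent as 0.
    header += struct.pack("!BBI", frame_type, flags, stream_id & 0x7FFFFFFF)
    return header + payload


def unpack_frame_header(data):
    """Parse the first 9 bytes of a frame."""
    length = int.from_bytes(data[0:3], "big")
    frame_type, flags, stream_id = struct.unpack("!BBI", data[3:9])
    return length, FrameType(frame_type), flags, stream_id & 0x7FFFFFFF


# A PING frame: 8 bytes of opaque payload, sent on stream 0
# (the stream id used for connection-level frames).
ping = pack_frame(FrameType.PING, 0, 0, b"\x00" * 8)
print(ping[:9].hex())  # 000008060000000000
print(unpack_frame_header(ping))
```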

Typical HTTP requests and responses therefore usually consist of:

A HEADERS frame, followed by zero or more CONTINUATION frames
Zero or more DATA frames with the message body.
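As a sketch of that sequence, the Python below lists the frames a simple message would be split into. It uses the END_HEADERS and END_STREAM flag values and the default maximum frame size from the specification; the header block is treated as opaque bytes (in reality it would be HPACK-encoded), and `frames_for_message` is a made-up name.

```python
END_STREAM = 0x1    # last frame the sender will send on this stream
END_HEADERS = 0x4   # the header block is complete
MAX_FRAME_SIZE = 16384  # default maximum payload size per frame


def chunks(data, size):
    return [data[i:i + size] for i in range(0, len(data), size)]


def frames_for_message(header_block, body=b"", max_frame_size=MAX_FRAME_SIZE):
    """Return (type, flags, payload) tuples for one request or response."""
    frames = []

    # One HEADERS frame, then CONTINUATION frames if the block is too big.
    header_chunks = chunks(header_block, max_frame_size) or [b""]
    for i, chunk in enumerate(header_chunks):
        frame_type = "HEADERS" if i == 0 else "CONTINUATION"
        flags = END_HEADERS if i == len(header_chunks) - 1 else 0
        frames.append([frame_type, flags, chunk])

    # Zero or more DATA frames with the message body.
    for chunk in chunks(body, max_frame_size):
        frames.append(["DATA", 0, chunk])

    if body:
        frames[-1][1] |= END_STREAM
    else:
        # Without a body, END_STREAM goes on the HEADERS frame itself,
        # even if CONTINUATION frames follow it.
        frames[0][1] |= END_STREAM

    return [tuple(frame) for frame in frames]


# Tiny frame size so the splitting is visible.
for frame_type, flags, payload in frames_for_message(b"h" * 10, b"b" * 25, 10):
    print(frame_type, flags, len(payload))
```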

A response may have more than one HEADERS frame at the start, as sometimes a server will send back more than one HTTP status line (such as 100 Continue followed by 200 OK).

A response may also have additional HEADERS frames after the data. Yes, HTTP headers may sometimes be sent after the body, even in HTTP/1.1.

After all frames for a request or response have been sent, the stream is no longer open.