The HTTP protocol is all around us. Can we work within HTTP/1.1 to speed up large file delivery?
What is going on in HTTP when a page is requested?
Could this be sped up? HTTP/1.1 gives us two options: pipelining and byte serving. It is worth understanding each. In Fig. 1, the operations are presented serially, and in reality that is mostly the way they are carried out. If the server supports pipelining, requests can be sent without waiting for a response within the same TCP session.
In Fig. 2, the same requests have been made in rapid succession. But as defined, the pipelining feature of HTTP/1.1 returns the resources in the order they were requested. This is in essence a FIFO system and can suffer from head-of-line blocking. If the request for logo.png in Fig. 2 results in a 200 MB file, that resource will be delivered in full before any other resource can continue. Asynchronous delivery and other improvements are scheduled for HTTP/2.0, but they are not part of HTTP/1.1, and with Google's withdrawal of SPDY there aren't many pipelining improvements available in browsers.
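On the wire, pipelining simply means writing the next request before the previous response has been read. A minimal sketch of what that looks like (example.com and the paths are illustrative, and the actual network send is commented out since it needs a live server):

```shell
# Pipelining: both requests go out back-to-back on one connection, and the
# server must answer them strictly in request order (FIFO).
REQS=$(printf 'GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\nGET /logo.png HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n')

# Sending both over a single TCP session (requires network access):
# printf '%s' "$REQS" | nc example.com 80

printf '%s' "$REQS" | grep -c '^GET'   # counts the requests queued in one write: 2
```

If the first response (index.html) is slow or large, the bytes of the second (logo.png) cannot start flowing until it finishes; that is the head-of-line problem described above.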
Byte Serving is another feature of HTTP/1.1. Some content delivered over HTTP can be read progressively (like HTML) while other content needs to be delivered completely before your computer can do anything with it (like a Microsoft Word *.doc file). PDF files fit into the former category. A PDF viewer that plugs into your browser knows about Byte Serving; that is how PDFs appear to stream for some combinations of browsers and PDF viewers. This is accomplished by the server supporting Ranges and the client making Range requests.
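A Range request names an inclusive byte span, and a byte-serving server answers with 206 Partial Content carrying exactly those bytes. The curl line below shows the request shape (the URL is a placeholder); the dd line simulates locally what the server does with that range:

```shell
# On the wire (placeholder URL; needs a server that supports Ranges):
# curl -s -H "Range: bytes=100-199" --output slice.bin http://example.com/resource.bin

# Locally, "bytes=100-199" means: skip 100 bytes, take 100 (both bounds inclusive).
head -c 1000 /dev/zero > resource.bin                       # stand-in for the resource
dd if=resource.bin of=slice.bin bs=1 skip=100 count=100 2>/dev/null
wc -c < slice.bin                                           # 100 bytes, not 99
```

The inclusive bounds matter when splitting a file into ranges: adjacent chunks must meet at N and N+1, or a byte gets duplicated or dropped.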
If the resource can be used in a progressive manner, then byte serving can get chunks of a file and make them usable to the consumer. Byte Serving can be combined with pipelining, but for reasons already discussed, there are only marginal gains through this approach.
Let’s go back and look at that large file in a simple delivery.
If you need the entire file and can't do anything progressively, you will end up waiting for the entire payload to complete your request. Pipelining won't help much, nor will Byte Serving on its own, since you still need the whole file. What if you could make multiple parallel requests to the server, each asking for a portion of the file? We call this approach Parallel Byte Serving, or PBS.
HTTP/1.1 200 OK
Date: Mon, 01 Jun 2015 20:04:15 GMT
Server: Apache/2.2.3 (CentOS)
Last-Modified: Fri, 29 May 2015 14:12:18 GMT
Expires: Wed, 15 Apr 2020 20:00:00 GMT
Time to retrieve using a simple download.
time curl -s http://REDACTED/public/p/byte.serve.bin -XGET --output full.pdf
By making use of the HTTP/1.1 HEAD call, we know the file is 404380907 bytes. Now it's simply a matter of configuring four distinct agents, each with its own TCP + HTTP session to the server, to read four different ranges of the same file. Here's an example of one agent.
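The size comes from the HEAD request, which returns only the headers. A sketch of that step (the live call is commented out; the parsing runs against a captured copy of the response headers, with the Content-Length taken from the size stated above):

```shell
# Live HEAD request (curl -I sends HEAD, so no body is transferred):
# curl -sI http://REDACTED/public/p/byte.serve.bin

# Pulling the size out of the Content-Length header of a saved response:
SIZE=$(printf 'HTTP/1.1 200 OK\r\nContent-Length: 404380907\r\n\r\n' |
  awk 'tolower($1) == "content-length:" { sub(/\r/, "", $2); print $2 }')
echo "$SIZE"   # 404380907
```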
curl -s -H "Range: bytes=0-101095226" http://REDACTED/public/p/byte.serve.bin -XGET --output bin.part1
Three more agents are started in rapid succession, each with a different Range request header and a different output file. This was combined into a simple script.
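The script itself isn't reproduced in the article, but under the stated file size and a four-way split it might look roughly like this (the network calls are commented out so the range arithmetic can be checked on its own):

```shell
URL="http://REDACTED/public/p/byte.serve.bin"
SIZE=404380907                           # from the HEAD request
PARTS=4
CHUNK=$(( (SIZE + PARTS - 1) / PARTS ))  # ceiling division: 101095227 bytes per agent

i=0
while [ "$i" -lt "$PARTS" ]; do
  START=$(( i * CHUNK ))
  END=$(( START + CHUNK - 1 ))
  [ "$END" -ge "$SIZE" ] && END=$(( SIZE - 1 ))   # clamp the last range
  echo "agent $(( i + 1 )): bytes=$START-$END"
  # curl -s -H "Range: bytes=$START-$END" --output "bin.part$(( i + 1 ))" "$URL" &
  i=$(( i + 1 ))
done
# wait                                             # block until all agents finish
# cat bin.part1 bin.part2 bin.part3 bin.part4 > byte.serve.bin
```

The first agent's range comes out as bytes=0-101095226, matching the example request above; the final cat stitches the parts back together in order.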
From 22 minutes down to seven. That is promising, but this is a very naive setup: it assumes there is no authentication, TLS, or other expensive operations within the HTTP call. To be useful, PBS would need to be tested against our own production equipment. Inside our platform we've done a lot to optimize intra-machine communication, so faster large-file delivery would have to make big strides for us to want to change our software.
To test production resources, the scripts repeatedly requested a 40 MB file, both as a single download and as four separate agents each asking for 25% of the file. The PBS approach is faster, but not fast enough to take on the headaches of reassembling the parts and other changes to our agents. Perhaps if files were much larger, 100 MB or more, PBS would showcase its advantages more clearly.
The graph shows our average simple delivery took 0.9964 seconds while PBS took 0.9381 seconds. Where are the enormous gains of the large-file delivery outlined above? Well, this service is multiplexed behind a load balancer and handles authentication, TLS, and other pieces of code. The overhead for each agent session eats away at the gains of Parallel Byte Serving for smaller files.
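For perspective, a quick back-of-the-envelope comparison of the two experiments, using the numbers measured above:

```shell
# Relative gains of PBS in the two tests: the large-file run (22 min vs 7 min)
# and the production 40 MB run (0.9964 s vs 0.9381 s).
SPEEDUP=$(awk 'BEGIN { printf "%.1f", 22 / 7 }')
GAIN=$(awk 'BEGIN { printf "%.1f", (0.9964 - 0.9381) / 0.9964 * 100 }')
echo "large file: ${SPEEDUP}x faster; 40 MB file: only ${GAIN}% faster"
```

A roughly 3x win on the large file versus a single-digit-percent win on the 40 MB file is the whole story: per-session overhead dominates when the payload is small.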