Speeding Up Large File Transfer Over HTTP

The HTTP protocol is all around us. Can we work within HTTP/1.1 to speed up large file delivery?

What is going on in HTTP when a page is requested?

Fig. 1: Simplified TCP + HTTP overview of a typical request to view a homepage.

Could this be sped up?  In HTTP/1.1 we are given two options: pipelining and byte serving.  It is worth understanding each.  In Fig. 1, the operations are presented serially.  In practice, that is mostly how they are carried out.  If the server supports pipelining, the client can send several requests on the same TCP session without waiting for each response.

Fig. 2: Pipelining in HTTP/1.1

In Fig. 2, the same requests have been made in rapid succession.  But as defined, HTTP/1.1 pipelining returns the resources in the order they were requested.  This is in essence a FIFO system and can suffer from head-of-line blocking.  If the request for logo.png in Fig. 2 results in a 200 MB file, that resource must be delivered in full before the responses queued behind it can proceed.  Asynchronous delivery and other improvements are scheduled for HTTP/2, but they are not part of HTTP/1.1, and with Google’s withdrawal of SPDY, few pipelining improvements are available in browsers.
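
To make the FIFO behaviour concrete, here is a minimal sketch of what a pipelining client puts on the wire: several requests written back to back on one connection.  The host name, the /index.html path, and the use of nc are assumptions for illustration.  Many servers and intermediaries do not honor pipelining, so treat this as a picture of the protocol rather than a recipe.

```sh
# pipelined_requests HOST PATH...
# Print one HTTP/1.1 GET per PATH, back to back, ready to be written
# to a single TCP connection. (Hypothetical helper, not from the article.)
pipelined_requests() {
  host=$1
  shift
  while [ "$#" -gt 0 ]; do
    printf 'GET %s HTTP/1.1\r\nHost: %s\r\n' "$1" "$host"
    if [ "$#" -eq 1 ]; then
      # Ask the server to close the connection after the final response.
      printf 'Connection: close\r\n'
    fi
    printf '\r\n'
    shift
  done
}

# Example (placeholder host; assumes nc is installed):
#   pipelined_requests example.com /index.html /logo.png | nc example.com 80
# The responses come back in request order, so a large /index.html
# would delay /logo.png no matter how small it is.
```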

Byte Serving is another feature of HTTP/1.1.  Some content delivered over HTTP can be read progressively (like HTML) while other content needs to be delivered completely before your computer can do anything with it (like a Microsoft Word *.doc file).  PDF files fit into the former category.  A PDF viewer that connects with your browser knows about Byte Serving.  That is why PDFs appear to stream for some combinations of browsers and PDF viewers.  It works when the server supports ranges (it advertises Accept-Ranges: bytes) and the client sends Range requests.

Fig. 3: A simplified version of Byte Serving through the use of Ranges in HTTP/1.1
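
As a rough illustration of a single Range request, the sketch below asks for one slice of a resource using curl's -r/--range option.  The URL and file names are placeholders, and both function names are hypothetical helpers.  The main point is that both ends of a range are inclusive.

```sh
# Number of bytes covered by a "first-last" range; both offsets are inclusive.
range_length() {
  first=${1%-*}
  last=${1#*-}
  echo $(( last - first + 1 ))
}

# fetch_range URL RANGE OUTFILE
# Request one byte range with curl. A server that honors the request
# answers 206 Partial Content with a Content-Range header; a server that
# ignores it answers 200 with the full body.
fetch_range() {
  curl -s -r "$2" --output "$3" "$1"
}

# Example (placeholder URL):
#   fetch_range http://example.com/report.pdf 0-1023 first-kib.bin
#   range_length 0-1023    # 1024
```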

If the resource can be used in a progressive manner, then byte serving can get chunks of a file and make them usable to the consumer.  Byte Serving can be combined with pipelining, but for reasons already discussed, there are only marginal gains through this approach.

Fig. 4: Combining Byte Serving and pipelining is possible but doesn’t make material gains in performance.

Let’s go back and look at that large file in a simple delivery.

Fig. 5: A simple delivery of a large file.

If you need the entire file and can’t do anything progressively, you will end up waiting for the entire payload to complete your request.  Pipelining won’t help much nor will Byte Serving since you still need the whole file to finish.  What if you could make multiple parallel requests of the server asking for portions of the file?  We call this approach PBS or Parallel Byte Serving.

Fig. 7: Agent 1

Fig. 8: Agent 2

The file


The meta

HTTP/1.1 200 OK
Date: Mon, 01 Jun 2015 20:04:15 GMT
Server: Apache/2.2.3 (CentOS)
Last-Modified: Fri, 29 May 2015 14:12:18 GMT
ETag: 19e197-181a5ceb-517391052f480
Accept-Ranges: bytes
Content-Length: 404380907
Expires: Wed, 15 Apr 2020 20:00:00 GMT
Cache-Control: public
Connection: close
Content-Type: application/unknown
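
Two of these headers matter for what follows: Accept-Ranges: bytes says the server will serve byte ranges, and Content-Length gives the size to divide up.  A small sketch for pulling them out of a HEAD response is shown here.  header_value is a hypothetical helper, and the awk parsing is deliberately simplistic (first match only, no folded headers).

```sh
# header_value NAME
# Read raw HTTP response headers on stdin and print the value of the
# first header called NAME (case-insensitive).
header_value() {
  awk -v name="$1" '
    {
      sub(/\r$/, "")                      # HTTP header lines end in CRLF
      colon = index($0, ":")
      if (!found && colon > 0 && tolower(substr($0, 1, colon - 1)) == tolower(name)) {
        value = substr($0, colon + 1)
        sub(/^[ \t]+/, "", value)         # drop the space after the colon
        print value
        found = 1
      }
    }
  '
}

# Example: curl -I sends a HEAD request and prints only the headers.
#   curl -sI http://REDACTED/public/p/byte.serve.bin | header_value Content-Length
#   curl -sI http://REDACTED/public/p/byte.serve.bin | header_value Accept-Ranges
```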

Time to retrieve using a simple download.

time curl -s http://REDACTED/public/p/byte.serve.bin -XGET --output full.pdf

real 22m2.913s
user 0m3.413s
sys 0m12.991s

By making use of the HTTP/1.1 HEAD request, we know the file is 404380907 bytes.  Now it’s simply a matter of configuring four distinct agents, each with its own TCP + HTTP session to the server, to read four different ranges of the same file.  Here’s an example of one agent.

curl -s -H "Range: bytes=0-101095226" http://REDACTED/public/p/byte.serve.bin -XGET --output bin.part1
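
The first range ends at 101095226 because Range offsets are zero-based and inclusive: 404380907 bytes split four ways rounds up to 101095227 bytes per agent.  A minimal sketch of that division follows.  split_ranges is a hypothetical helper, not part of the original script, and rounding up is an assumption chosen because it reproduces the range shown above.

```sh
# split_ranges SIZE PARTS
# Print up to PARTS contiguous "first-last" byte ranges covering SIZE bytes.
# Offsets are zero-based and inclusive, as in the HTTP Range header.
split_ranges() {
  size=$1
  parts=$2
  chunk=$(( (size + parts - 1) / parts ))   # round up so no bytes are left over
  first=0
  while [ "$first" -lt "$size" ]; do
    last=$(( first + chunk - 1 ))
    if [ "$last" -ge "$size" ]; then
      last=$(( size - 1 ))                  # the final range is clipped to the file
    fi
    echo "$first-$last"
    first=$(( last + 1 ))
  done
}

# split_ranges 404380907 4 prints:
#   0-101095226
#   101095227-202190453
#   202190454-303285680
#   303285681-404380906
```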

Three more agents are started in rapid succession, each with a different Range request header and a different output file.  This was combined into a simple script.
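
The script itself is not reproduced here.  As a sketch only, and not the original multi.sh, something along these lines would start one curl per range in the background, wait for them all, and concatenate the parts in order.  The function names and the bin.partN naming are assumptions based on the single-agent command above.

```sh
# fetch_parts URL RANGE...
# Start one background curl per range, writing bin.part1, bin.part2, ...
# Returns non-zero if any transfer fails.
fetch_parts() {
  url=$1
  shift
  n=1
  pids=
  for range in "$@"; do
    curl -s -f -H "Range: bytes=$range" --output "bin.part$n" "$url" &
    pids="$pids $!"
    n=$(( n + 1 ))
  done
  status=0
  for pid in $pids; do
    wait "$pid" || status=1
  done
  return "$status"
}

# join_parts OUTFILE COUNT
# Concatenate bin.part1..bin.partCOUNT, in order, into OUTFILE.
join_parts() {
  out=$1
  count=$2
  : > "$out"
  n=1
  while [ "$n" -le "$count" ]; do
    cat "bin.part$n" >> "$out" || return 1
    n=$(( n + 1 ))
  done
}

# Example with the four ranges for the 404380907-byte file:
#   fetch_parts http://REDACTED/public/p/byte.serve.bin \
#     0-101095226 101095227-202190453 202190454-303285680 303285681-404380906 \
#     && join_parts full.bin 4
```

One caveat for a sketch like this: a server that ignores the Range header replies 200 with the whole file, which curl treats as success, so it is worth checking that the joined file's size equals the Content-Length reported by the HEAD request.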

time ./multi.sh

real 7m2.332s
user 0m3.722s
sys 0m11.659s

From 22 minutes to seven minutes.  That is promising.  But this is a very naive setup.  It assumes there is no authentication, TLS, or other expensive operations within the HTTP call.  To be useful, PBS would need to be tested against our own production equipment.  Inside our platform, we’ve done a lot to optimize intra-machine communication, so PBS would have to deliver large files much faster for us to want to change some of our software.

To test a production resource, the scripts repeatedly requested a 40 MB file, both as a single request and as four separate agents each asking for 25% of the file.  The PBS approach is faster, but not fast enough to justify the headaches of reassembling the parts and making other changes to our agents.  Perhaps if files were much larger, like 100 MB or more, PBS would showcase its advantages more clearly.

Fig. 9: Comparing PBS and regular HTTP delivery of a 40 MB file

Fig. 9: Comparing PBS and regular HTTP delivery of a 40 MB file

The graph shows our average simple delivery was 0.9964 seconds while PBS was 0.9381 seconds.  Where are the enormous gains of the large file delivery outlined above?  Well, this service is multiplexed behind a load balancer and handles authentication, TLS, and other pieces of code.  The overhead for each agent session eats away at the gains of Parallel Byte Serving for smaller files.
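
For a side-by-side view, each experiment can be reduced to a single ratio.  The helper names below are illustrative only; the inputs are the timings reported above.

```sh
# Convert a `time`-style duration such as 22m2.913s into seconds.
to_seconds() {
  echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# speedup BASE PBS
# Ratio of two durations given in seconds, rounded to two decimals.
speedup() {
  awk -v base="$1" -v pbs="$2" 'BEGIN { printf "%.2f\n", base / pbs }'
}

# Large file (404380907 bytes): simple download vs. four agents
speedup "$(to_seconds 22m2.913s)" "$(to_seconds 7m2.332s)"   # 3.13
# 40 MB production test: average simple delivery vs. PBS
speedup 0.9964 0.9381                                        # 1.06
```

That is roughly a 3x gain on the large file against about 6% on the 40 MB production test.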