I run SignalLeaf.com – a podcast audio hosting service. Recently, I detailed an issue and solution for SignalLeaf.com’s episode and podcast tracking reports. The gist of the problem is that mobile devices use byte-range request headers to download files in very small chunks. This is useful on unreliable networks, as a device can request a small amount of data, record where it left off and pick up from there when the network comes back online. Unfortunately, it also caused a problem for me. In the worst case, I had more than 63,000 download tracking entries created by a single mobile device that was making one request per second over an 11-hour period.
What should have been a single entry in the tracking for a podcast that averages 1,200 listens per episode showed up as 63,000 listens for that one episode!
Correcting The Problem
The solution to the problem is to track only one download entry per device, no matter how many requests that device makes for the episode. The easy way to make this happen is to look at the byte range request header in the HTTP request for the media file.
A byte range request generally looks like this: a Range header such as Range: bytes=0-1023, which asks for the first 1,024 bytes of the file.
There are variations on this: the header can name a unit other than bytes, or it can give a start with no end, meaning "give me everything after this point." Either way, the code is the same and is fairly simple. Grab the “range” HTTP header from the request. If that header is not there (no “range” requested), or if it is there and the range starts at 0 (zero), then track the download. Otherwise – if the range starts at anything other than 0 (zero) – don’t track the download.
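The rule described above can be sketched in a few lines. This is a minimal illustration in Python, not the code SignalLeaf runs; the function name should_track_download is hypothetical. Two behaviors here are my own assumptions: a malformed header is treated like a full request, and a suffix range such as bytes=-500 (the last 500 bytes) is not tracked because it has no zero start.

```python
def should_track_download(range_header):
    """Return True if this request should count as one download."""
    # No Range header: the client wants the whole file, so track it.
    if not range_header:
        return True

    # Split "bytes=0-1023" into the unit and the range spec.
    # The unit itself is ignored; only the start position matters.
    _unit, sep, spec = range_header.strip().partition("=")
    if not sep:
        # Assumption: a header without "=" is malformed; treat as a full request.
        return True

    # For a multi-range header ("bytes=0-99,200-299"), look at the first range.
    first_range = spec.split(",")[0].strip()
    start = first_range.partition("-")[0].strip()

    # Track only when the range begins at the first byte of the file.
    # An empty start (suffix range like "-500") is not tracked (assumption).
    return start.isdigit() and int(start) == 0
```

With this sketch, a missing header, bytes=0-1023 and bytes=0- all count as a download, while bytes=1024-2047 does not.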
The code is then called, passing in the “range” HTTP header:
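As a rough sketch of such a call site (not SignalLeaf's actual code), assume a Python WSGI application, where the Range request header arrives in the environ dictionary as HTTP_RANGE. The names handle_media_request, should_track and record_download are hypothetical; the tracking rule and the storage call are passed in as plain functions so the example stays self-contained.

```python
def handle_media_request(environ, should_track, record_download):
    """Decide whether a media request counts as a download, then record it.

    environ         -- WSGI environ dict for the incoming request
    should_track    -- hypothetical predicate taking the Range header (or None)
    record_download -- hypothetical callback that stores one tracking entry
    """
    # WSGI exposes the "Range" request header as HTTP_RANGE;
    # .get() returns None when the client sent no Range header.
    range_header = environ.get("HTTP_RANGE")

    tracked = should_track(range_header)
    if tracked:
        # Record one entry for the requested episode path.
        record_download(environ.get("PATH_INFO", ""))
    return tracked
```

The file itself is served either way; only the tracking entry is conditional.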
If you’d like to know more about the problem that I ran into, see the SignalLeaf blog post on correcting download tracking for mobile devices. If you want to know more about byte range requests… well, it seems to be somewhat difficult to find good info for this, other than the HTTP spec. I managed to piece together the info I needed from various blog posts, Stack Overflow questions and use of curl -r for range requests, to test it out.