On March 31st, 2018, Cloudflare pulled the ultimate April Fool’s prank. Cloudflare, one of the world’s largest content delivery networks, partnered with APNIC (Asia-Pacific Network Information Centre) to launch two new public DNS resolvers, on 1.1.1.1 and 1.0.0.1. Only thing is, this wasn’t an April Fool’s prank. This was real. The launch caught everyone by surprise, and Cloudflare is now in the same league as Google DNS, OpenDNS, and other public DNS providers. Cloudflare claims that its DNS service will speed up your internet, but let’s put that claim to the test.
What is DNS?
For the uninitiated, DNS, or the Domain Name System, is the service that translates a domain name (like google.com) into an IP address that your web browser can actually use to find the website. When you want to access a website whose address your computer does not already know, it has to ask its DNS server for the IP address before it can do anything else. This typically takes around 20-50ms, depending on your DNS server and internet connection. Thus, theoretically, if your DNS server were faster, your internet would be somewhat faster as well.
The reason consumers rarely think about DNS, however, is that most devices are already set to use either the ISP’s DNS server (typical) or an existing public DNS server, like those offered by Google (8.8.8.8). Now with Cloudflare on the scene, we’ll look at whether it is actually beneficial to switch your DNS server over.
By the Numbers
To test Cloudflare’s claims, I used the power of RIPE Atlas, a distributed network of over ten thousand “probes” around the globe, each capable of running its own independent internet benchmarks from real-world networks. Many of these probes are placed in residential networks, in people’s houses. There are also a great many in data centers around the globe. To gather data, I enlisted 1,000 random RIPE Atlas probes across the United States (sorry international folks!) to run their built-in DNS measurement against two DNS servers, 1.1.1.1 (Cloudflare) and 8.8.8.8 (Google). Each of these probes returned an RTT (round-trip time) for the DNS request, essentially the amount of time a typical DNS request would take.
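For readers who want a feel for what a single measurement looks like, here is a minimal sketch of timing one DNS query over UDP from your own machine. It is not the RIPE Atlas implementation; the function names `build_query` and `measure_rtt_ms`, the fixed query ID, and the use of `example.com` are all assumptions made for illustration.

```python
import socket
import struct
import time


def build_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record in wire format."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE = 1 (A record), QCLASS = 1 (Internet).
    return header + qname + struct.pack("!HH", 1, 1)


def measure_rtt_ms(server: str, hostname: str, timeout: float = 2.0) -> float:
    """Send one query to `server` on UDP port 53 and time the reply."""
    query = build_query(hostname)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.perf_counter()
        sock.sendto(query, (server, 53))
        response, _ = sock.recvfrom(512)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
    if response[:2] != query[:2]:
        raise ValueError("response ID does not match query ID")
    return elapsed_ms


# Example (requires network access):
#   for server in ("1.1.1.1", "8.8.8.8"):
#       print(server, round(measure_rtt_ms(server, "example.com"), 2), "ms")
```

A single run from one machine says very little, which is why the analysis here relies on roughly a thousand probes instead.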
Random Sample
Given the way RIPE Atlas structures its network of probes, the number of probes is weighted by population in each geographical location. This means that the sample used will have more probes in places like Los Angeles and New York than somewhere like Kansas. RIPE Atlas simply has more probes in more densely populated areas, and therefore a random sample within the United States will reflect that. As well, each probe may be hooked up to a different type of network (data center vs. residential, cable vs. fiber, etc.). Unfortunately, this cannot realistically be controlled for while still gathering enough data for statistical analysis. This, however, is offset by the sheer number of data points gathered. As well, this allows for a more holistic analysis, as opposed to merely looking at a single market segment.
Results
We can compare the results of these 1,000 probes below. Note that the totals do not add up to 1,000, as some probes failed to respond to the instructions given and were therefore excluded from the data set. Displayed below is a frequency distribution of RTTs for each DNS server, in intervals of 10ms. As well, major outliers (those with an RTT > 200ms) were eliminated for comparability. Both data sets (Google and Cloudflare) had a total of 963 observations after elimination.
The frequency distribution is broken down below in tabular form.
Lower Bound (ms) | Upper Bound (ms) | Cloudflare | Google |
---|---|---|---|
0 | 10 | 227 | 241 |
10 | 20 | 346 | 326 |
20 | 30 | 142 | 180 |
30 | 40 | 66 | 74 |
40 | 50 | 16 | 25 |
50 | 60 | 24 | 25 |
60 | 70 | 39 | 32 |
70 | 80 | 17 | 20 |
80 | 90 | 7 | 15 |
90 | 100 | 5 | 9 |
100 | 110 | 2 | 2 |
110 | 120 | 1 | 1 |
120 | 130 | 4 | 1 |
130 | 140 | 3 | 1 |
140 | 150 | 0 | 0 |
150 | 160 | 1 | 3 |
160 | 170 | 1 | 1 |
170 | 180 | 0 | 0 |
180 | 190 | 0 | 0 |
190 | 200 | 2 | 1 |
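A frequency table like the one above can be produced with a few lines of code. The following is a sketch only: the function name `frequency_table` and the sample values are hypothetical, and it assumes each bin is half-open, `[lower, upper)`, with anything at or above the 200ms cutoff dropped.

```python
from collections import Counter


def frequency_table(rtts_ms, bin_width=10, cutoff=200):
    """Count RTTs per [lower, upper) bin, dropping values at or above cutoff."""
    counts = Counter()
    for rtt in rtts_ms:
        if rtt < 0 or rtt >= cutoff:
            continue  # treated as an outlier and excluded
        lower = int(rtt // bin_width) * bin_width
        counts[lower] += 1
    # Emit every bin, including empty ones, so two tables line up row by row.
    return [
        (lower, lower + bin_width, counts.get(lower, 0))
        for lower in range(0, cutoff, bin_width)
    ]


# Hypothetical RTTs in milliseconds, not the measured data.
sample = [0.63, 9.9, 15.7, 15.9, 26.4, 199.5, 250.0]
for lower, upper, count in frequency_table(sample):
    if count:
        print(f"{lower} | {upper} | {count}")
```

Emitting the empty bins keeps the Cloudflare and Google columns aligned even when one resolver has no observations in a given interval.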
Further Breakdown
Looking at the data, we can see there are two “discrete” distributions at play: one in the range from 0 to 30ms, and another from 40 to 90ms. These distributions act somewhat independently and can be broken down further into their own frequency distributions. Specifically, we will focus on the lower end of the distribution, with RTTs less than 40ms, as that appears to be the bulk of our data. Note that the number of observations for Cloudflare is 781, whereas for Google it is 821.
Let’s equalize the sample sizes of Cloudflare and Google, using the number of observations from Cloudflare as the baseline. Essentially, we are taking the first (that is, the lowest) 781 observations from Google.
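As a rough sketch of this step, assuming that "first" means "lowest once sorted by RTT", the filtering and trimming could look like the following. The helper names `below_threshold` and `equalize` and the sample values are hypothetical.

```python
def below_threshold(rtts_ms, threshold=40.0):
    """Keep only RTTs under the threshold, sorted ascending."""
    return sorted(r for r in rtts_ms if r < threshold)


def equalize(sample_a, sample_b):
    """Trim both samples to the size of the smaller one, keeping the lowest values."""
    a, b = sorted(sample_a), sorted(sample_b)
    n = min(len(a), len(b))
    return a[:n], b[:n]


# Hypothetical RTTs in milliseconds, not the measured data.
cloudflare = below_threshold([5, 12, 38, 45, 70])
google = below_threshold([4, 11, 20, 33, 39, 80])
cloudflare_norm, google_norm = equalize(cloudflare, google)
print(len(cloudflare_norm), len(google_norm))
print(google_norm)
```

Note that trimming this way removes only the slowest values from the larger sample, which by construction pulls its mean and maximum down.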
Initial Insights
Looking at the initial 1,000 samples, there does not appear to be a clear difference between Google and Cloudflare, as their overall performance is roughly the same. The initial sample has frequency buckets of 20ms, and it was pretty unlikely that the performance difference between the two DNS servers would exceed 20ms. However, when we break the data down further, looking only at the interval from 0ms to 40ms, we see a very different story. From the number of observations alone, we see that Google had more observations below 40ms (821 versus 781, a gap of approximately 4% of all observations). This already suggests that Google’s DNS performs somewhat more consistently. Normalizing both data sets to 781 observations (the number of Cloudflare observations below 40ms), we finally get some great insight. Clearly, Google’s DNS had many more observations in the 18ms to 32ms range. This makes Google’s data set look compressed relative to Cloudflare’s and suggests that Google’s may have much less variability. However, Cloudflare did have more samples in the under-18ms range, so it may still be faster overall.
Descriptive Statistics
Now that we have laid out the initial frequencies of the results, we can dive deeper into the descriptive statistics. Looking first at the overall data set, we arrive at the following data.
Statistic | Cloudflare | Google |
---|---|---|
Observations | 963 | 963 |
Mean | 23.073 | 23.540 |
Standard Dev. | 23.772 | 23.104 |
Minimum | 0.629 | 0.627 |
1st. Quartile | 9.925 | 9.922 |
Median | 15.694 | 16.578 |
3rd. Quartile | 26.444 | 27.477 |
Maximum | 199.535 | 198.336 |
Skewness (Fisher) | 2.839 | 2.558 |
Kurtosis (Fisher) | 11.557 | 9.444 |
Next, looking at the results with an RTT < 40ms, we have the following summary, which is much more useful.
Statistic | Cloudflare | Google | Google Norm |
---|---|---|---|
Observations | 781 | 821 | 781 |
Mean | 15.349 | 15.790 | 14.717 |
Standard Dev. | 9.146 | 9.338 | 8.236 |
Minimum | 0.629 | 0.627 | 0.627 |
1st. Quartile | 8.594 | 8.977 | 8.653 |
Median | 14.201 | 14.437 | 13.912 |
3rd. Quartile | 20.926 | 21.965 | 20.639 |
Maximum | 39.790 | 39.913 | 33.011 |
Skewness (Fisher) | 0.556 | 0.481 | 0.264 |
Kurtosis (Fisher) | -0.296 | -0.433 | -0.759 |
Comparing Cloudflare with the normalized Google data, we see that Google’s has less variability (a standard deviation of 8.236 versus 9.146) and is faster overall (a mean of 14.717ms versus 15.349ms).
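The statistics in these tables can be reproduced without a spreadsheet. The sketch below uses only Python's standard library; it assumes that the "Fisher" skewness and kurtosis are the usual sample-adjusted estimators (the same ones Excel's `SKEW` and `KURT` use), which may not match XLSTAT exactly, and the `describe` helper is a hypothetical name.

```python
import statistics


def describe(values):
    """Summary statistics for a list of RTTs (needs at least 4 values)."""
    n = len(values)
    mean = statistics.fmean(values)
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    # Central moments (population form) used as building blocks.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    g1 = m3 / m2 ** 1.5          # population skewness
    g2 = m4 / m2 ** 2 - 3.0      # population excess kurtosis
    # Sample-adjusted estimators (assumed to correspond to "Fisher").
    skewness = g1 * (n * (n - 1)) ** 0.5 / (n - 2)
    kurtosis = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)
    return {
        "observations": n,
        "mean": mean,
        "stdev": statistics.stdev(values),  # sample standard deviation
        "minimum": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "maximum": max(values),
        "skewness": skewness,
        "kurtosis": kurtosis,
    }


# Hypothetical values, not the measured data.
summary = describe([1, 2, 3, 4, 5])
for name, value in summary.items():
    print(name, round(value, 3))
```

Quartile conventions differ between tools, so small differences in the first and third quartiles against the tables above would not be surprising.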
Inferential Statistics
While descriptive statistics may be fun, the real value from this data set comes from inferential statistics. We’ll be skipping inferential statistics on the overall data set, as I promise you, it is absolutely useless and gives no additional value. You are free to run it yourself if you wish. We will be focusing on the first 781 observations on both Google and Cloudflare instead.
To compare these two data sets, we’ll test the alternative hypothesis that Cloudflare is faster than Google (that is, has a lower mean RTT), $\mu_{cf} < \mu_{g}$, against the null hypothesis that Cloudflare is not faster than Google, $\mu_{cf} \geq \mu_{g}$. Performing this 2-sample t-test on the data sets, we obtain the following results.
Statistic | Value |
---|---|
Difference | 0.631 |
t (Observed) | 1.434 |
t (Critical) | -1.646 |
D.o.F. | 1560 |
p-value (one-way) | 0.924 |
Initially, the ridiculously high p-value jumps out, suggesting that we have actually inverted this test. We clearly cannot reject the null, $\mu_{cf} \geq \mu_{g}$. However, for insight, we ran the opposite test, with the alternative hypothesis that Google is faster than Cloudflare, $\mu_{cf} > \mu_{g}$, against the null $\mu_{cf} \leq \mu_{g}$. Results are displayed below.
Statistic | Value |
---|---|
Difference | 0.631 |
t (Observed) | 1.434 |
t (Critical) | 1.646 |
D.o.F. | 1560 |
p-value (one-way) | 0.076 |
This result should surprise no one versed in statistics. The p-value for this alternative hypothesis is merely $1 - p_{previous}$. We can see that the p-value is significant at $\alpha = 10\%$, though not at the more common $\alpha = 5\%$. This is a surprising result, given that Cloudflare claims to be faster. However, just looking at the data we have here, it appears that Google is actually doing better. Whether we attribute this to a fluke of the data or to Google actually being slightly faster is left as an exercise for the reader.
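The test statistic can be checked directly from the summary table for the sub-40ms data. The sketch below recomputes it from the rounded means and standard deviations, and approximates the p-values with a standard normal distribution in place of Student's t. That substitution is an assumption on my part, and a reasonable one only because 1,560 degrees of freedom make the two nearly indistinguishable. The function name `two_sample_t` is hypothetical.

```python
import math
from statistics import NormalDist


def two_sample_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """t statistic for the difference of two means (unpooled standard error)."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


# Rounded summary statistics: Cloudflare vs. Google (normalized).
t_observed = two_sample_t(15.349, 9.146, 781, 14.717, 8.236, 781)

# Normal approximation to the t distribution (assumption, see above).
standard_normal = NormalDist()
p_cf_faster = standard_normal.cdf(t_observed)          # H1: mu_cf < mu_g
p_google_faster = 1.0 - standard_normal.cdf(t_observed)  # H1: mu_cf > mu_g

print("t observed:", round(t_observed, 3))
print("p (Cloudflare faster):", round(p_cf_faster, 3))
print("p (Google faster):", round(p_google_faster, 3))
```

Because both samples have the same size, the pooled and unpooled standard errors coincide, so the statistic should land close to the reported 1.434, with p-values near 0.924 and 0.076. Small differences come from working with rounded inputs.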
Additional Comments
Given the popularity and general usage of Google’s DNS, it is clearly being hit many times harder than Cloudflare’s resolvers. This may be impacting and obscuring Google’s true speed, but even with the load, it is still ever so slightly faster. Unfortunately, this is really impossible to control for unless Cloudflare attains the same popularity, which is highly doubtful. Therefore, we can only speculate, based on intuition alone, that Google might be even faster under an equal load.
Conclusions
From all the examination of the data we did, we really arrive at an anticlimactic conclusion of sorts. In the frequency distributions, we saw both DNS resolvers performing similarly when looking at the entire spectrum. However, broken down, it appears that Google’s resolvers fare better and are more consistent on the lower end of the spectrum. Our 2-sample t-test supports this, though only at the 10% significance level, suggesting that Google is actually faster on the lowest 781 observations. However, our results are tempered by the fact that Google’s resolvers are only 0.6ms faster on average. This is an extremely negligible difference and is basically invisible to the user in the grand scheme of things.
Benefits of Cloudflare’s DNS
Now that we have concluded from the data that both resolvers are practically equally fast, we can examine the more qualitative aspects of each. Cloudflare’s DNS offers a few key “advantages” in that it:
- Keeps query logs for only 24 hours
- Offers DNS-over-TLS
- Offers DNS-over-HTTPS
While these benefits sound nice, most of them don’t really change much in practice. For those who are privacy conscious, the short log retention is nice. But if you are truly privacy conscious, you should be using a VPN anyway. Some VPN providers even offer their own DNS resolvers to increase the privacy of their users. Cloudflare’s DNS-over-something offerings are also nice to have. However, support for either of the two standards is poor or nonexistent on the popular operating systems. Currently, Windows and Mac both require third-party software to take advantage of those protocols, and even then, support is rather poor. Google’s DNS servers also offer DNS-over-HTTPS, so that feature is not exclusive to Cloudflare, though support for it is still poor. Finally, while Google has been the victim of DDoS attacks in the past, those are few and far between. As well, users typically have more than one DNS server configured, so a DDoS attack on one would not matter too much.
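To give a sense of what DNS-over-HTTPS looks like from an application's point of view, here is a small sketch that queries the JSON interfaces both providers expose. The endpoint URLs and the `application/dns-json` header reflect my understanding of those public interfaces rather than anything measured in this article, and the helper names `build_doh_request` and `resolve` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed JSON DNS-over-HTTPS endpoints for each provider.
DOH_ENDPOINTS = {
    "cloudflare": "https://cloudflare-dns.com/dns-query",
    "google": "https://dns.google/resolve",
}


def build_doh_request(provider, hostname, record_type="A"):
    """Build an HTTPS request asking `provider` to resolve `hostname`."""
    query = urllib.parse.urlencode({"name": hostname, "type": record_type})
    url = f"{DOH_ENDPOINTS[provider]}?{query}"
    return urllib.request.Request(url, headers={"Accept": "application/dns-json"})


def resolve(provider, hostname):
    """Return the record data from the JSON answer section, if any."""
    request = build_doh_request(provider, hostname)
    with urllib.request.urlopen(request, timeout=5) as response:
        payload = json.load(response)
    return [answer["data"] for answer in payload.get("Answer", [])]


# Example (requires network access):
#   print(resolve("cloudflare", "example.com"))
#   print(resolve("google", "example.com"))
```

This only shows that an application can speak the protocol on its own. It does nothing to change which resolver the operating system uses, which is where the third-party software mentioned above comes in.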
Cloudflare, why?
One thing we haven’t touched on yet is Cloudflare’s unique choice of IP address for one of their primary resolvers, 1.1.1.1. Previously, 1.1.1.1 was owned by APNIC (and still is, actually). However, nothing had run on 1.1.1.1 since the inception of the World Wide Web, and assuming the trend would continue, many systems and network administrators came to treat the IP as “unroutable.” This runs counter to the actual IP standard, though, which never precluded 1.1.1.1 from being routable.
This presents an interesting dilemma. Currently, there are large amounts of junk traffic being directed at 1.1.1.1, as admins all over use it in scripts or whatnot. Now that Cloudflare is actually using 1.1.1.1, some admins, and even equipment manufacturers, are scrambling to change their configurations. While the admins have technically violated the existing IP standard by assuming the address would never be used, should Cloudflare really have chosen that IP to work with? While the technical standard is there and set in stone, the effective and widely believed standard is that the address is unusable. Cloudflare just came and upturned this entire convention. Cloudflare could have chosen any other IP it wanted, but they opted for the one that would cause the most headaches. Cloudflare, why?
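One quick way to see that 1.1.1.1 is an ordinary public address is to ask Python's standard `ipaddress` module, which classifies addresses against the reserved and private ranges it knows about. This is a sketch for illustration; the `classify` helper is a hypothetical name, and 192.168.1.1 is included only as a familiar private address for contrast.

```python
import ipaddress


def classify(text):
    """Report how the ipaddress module classifies an IPv4 address."""
    address = ipaddress.ip_address(text)
    return {
        "private": address.is_private,
        "reserved": address.is_reserved,
        "global": address.is_global,
    }


for text in ("1.1.1.1", "1.0.0.1", "192.168.1.1"):
    print(text, classify(text))
```

Both Cloudflare addresses should come back as global and neither private nor reserved, while 192.168.1.1 is flagged as private.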
Disclaimer
For those who want to try their own hand at analysis, the results from the RIPE measurements can be downloaded at the following locations:
Cloudflare: Here
Google: Here
As well, you can download the Excel spreadsheet used for analysis here. The XLSTAT add-on for Office 365 was used for all of the analysis performed.