The rise and rise of “not provided” keywords
Privacy and Accuracy, SEO & Analytics
If you are active with search engine optimisation (SEO), then you will be aware of the issue of “not provided” showing in your Google Analytics reports for organic visits. To quickly summarise, in October 2011 Google implemented a change to how searches performed on their web properties can be tracked by website owners receiving the subsequent click-through traffic (see original post).
Essentially, the change was that if a visitor is logged in to their Google account when performing a search (for example, logged into Gmail or any other Google service), Google takes this as a signal to request privacy and therefore encrypts the session via SSL. The result is that when a visitor clicks on an organic result, no referral detail is passed to the receiving website, i.e. the keyword information is lost and “not provided” shows in your Google Analytics reports – see the technical note at the end of this post for more information.
Note: This affects ALL web analytics tracking tools – including Google Analytics.
Last updated 05-Aug-2013:
- Collected data up to 31-Jul-2013
- Added new data source (Global charity)
- Note the convergence at 46% for the control group (Figure 3)
- Now three inflection points are showing (i-1, i-2, i-3 of Figure 3).
The following chart plots the scale and growth of the “not provided” issue for 10 English language websites I work with. The total organic traffic analysed over the period is ~120 million visits. The legend is in order – to match the graphed lines.
Figure 1 - Percent of organic visits with “not provided” set
Figure 1 indicates there are 3 different zones, corresponding to 3 different types of users. I label these as A, B and C in Figure 2.
Figure 2 - same as above chart with three zones identified
- Zone A (>70%):
I consider zone A an outlier. It represents this website, which we know is targeted to Google users. Given that, I am surprised it is not 100%. That said, it does not represent other non-Google specific websites (yet!).
- Zone B (50-70%):
I consider these to be tech-savvy users that are most probably logged into a Google service, or search directly via the browser’s omnibox.
- Zone C (<50%):
This is the inverse of Zone B i.e. non-tech-savvy users.
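To place your own site against these zones, the calculation can be sketched as below. This is only an illustrative sketch: the function names and the sample visit counts are my own assumptions, and the thresholds are simply the zone boundaries listed above (with exactly 50% and 70% counted as Zone B).

```python
def not_provided_share(not_provided_visits, organic_visits):
    """Percent of organic visits whose keyword is "not provided"."""
    if organic_visits <= 0:
        raise ValueError("organic_visits must be positive")
    return 100.0 * not_provided_visits / organic_visits


def zone(share_percent):
    """Map a share to the zones above: A (>70%), B (50-70%), C (<50%)."""
    if share_percent > 70:
        return "A"
    if share_percent >= 50:
        return "B"
    return "C"


# Hypothetical example: 5,800 of 10,000 organic visits are "not provided".
share = not_provided_share(5800, 10000)
print(f"{share:.1f}% -> Zone {zone(share)}")
```

The two inputs correspond to the organic visits reported with the “not provided” keyword and the total organic visits for the same period.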
Figure 3 – adding a “control” to the data
In Figure 3, I add data from two highly respected universities – represented by the gold lines for US and UK (the higher historical line is the US university). For these universities, I am assuming visitors are a broad mix of tech-savvy users that are most probably logged into a Google service (or search directly via the browser’s omnibox), and non-tech-savvy users (the inverse group).
The golden lines fit nicely between zones B and C, hence I use them as my benchmark. Figure 3 shows that above the golden line(s) the audience can be described as more tech-savvy, while below it they are less tech-savvy. At present (31-July-2013), the “not provided” benchmarks have converged at 46% for both US-focused and UK-focused websites.
The inflection points (i-1 and i-2) correspond to the launch of “not provided” in the US and rest-of-world respectively.
The Latest Changes To Affect “not provided”
Clearly, losing the keywords visitors use to find your website is a big loss to any digital marketer. And although +Matt Cutts originally commented the change would only affect a small proportion of your traffic, this was always going to increase over time. After all, logging into a Google service is exactly what Google would like all its users to do…
Now the browsers themselves are also impacting “not provided”. In July 2012 Firefox announced a switch to SSL for all Google searches, and Safari followed in September 2012. On January 18th, 2013, Google announced the same change for Chrome. That is, even if the visitor is not logged into a Google service, the Chrome omnibox uses SSL for the visitor’s session. So it is another step towards seeing organic keyword detail all but disappear from your traffic reports. SEO is definitely getting harder…!
The inflection point (i-3) corresponds to Google’s change for the Chrome browser.
Figure 4 - Proportion of global web traffic by browser type
Technical Side Note
Although a browser moving from an SSL (https) page to a non-secure (http) page sends no referrer detail in its HTTP headers, the referring domain can still be passed by browsers that support meta referrer (currently only Chrome). That means https://www.google.com/ will show as the referrer, for example, though the query parameter containing the search term is not available. To cater for browsers that do not support meta referrer (currently IE, Safari and Firefox), Google redirects the visitor through an http URL that acts as the referrer, with the visitor's search terms removed (the q parameter is empty).
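To make the effect on tracking concrete, here is a minimal sketch of how a tool might read the keyword from a referrer. It is not how Google Analytics is actually implemented. The function name, the simple host check and the sample URLs are my own assumptions for illustration.

```python
from urllib.parse import urlparse, parse_qs


def organic_keyword(referrer):
    """Return the search term from a Google referrer.

    Returns "(not provided)" when the q parameter is missing or empty,
    and None when the referrer is not a Google search domain.
    """
    parsed = urlparse(referrer)
    host = parsed.hostname or ""
    # Assumption: a simple prefix check is enough to spot Google search hosts.
    if not (host.startswith("www.google.") or host.startswith("google.")):
        return None
    # keep_blank_values=True so an empty "q=" is still seen as present.
    params = parse_qs(parsed.query, keep_blank_values=True)
    term = params.get("q", [""])[0].strip()
    return term if term else "(not provided)"


# Hypothetical referrers for illustration:
print(organic_keyword("http://www.google.co.uk/search?q=web+analytics"))
print(organic_keyword("https://www.google.com/"))
print(organic_keyword("http://www.google.com/url?sa=t&q=&source=web"))
```

Both cases described above, the bare referring domain and the redirect with an empty q parameter, end up with the same label because the search term never reaches the receiving website.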
Privacy Side Note
What I find odd (and disconcerting) is Google’s approach to AdWords traffic. That is, AdWords visits to your website are not affected – you still receive the keyword detail. It is strange to me that Google considers privacy important for organic searches, but not for paid searches.