Google Analytics Data Sampling: How to Achieve Data Accuracy from Sampled Data?

Sampling is the technique of selecting a subset of your website's traffic data that represents the whole data set. This subset is used to find trends and derive the relevant metrics. When the subset is representative, analyzing it gives results and trends similar to those from analyzing the complete data set.

Google Analytics Data Sampling

Sampling helps web analytics tools in 2 important ways:

1. It reduces the computational burden on the web analytics tool for a query that covers a large volume of data over a long date range.

2. It reduces the processing time for the query.

Google Analytics samples the data broadly in 2 situations:

  1. When you generate a standard report for a long date range.
  2. When you query the Google Analytics server to generate an ad hoc report (a situation where Google Analytics has no report readily available, but generates one by extracting different pieces of information from other reports). Advanced Segments and Custom Reports fall under ad hoc reporting.

In the above situations, Google Analytics samples the data when the following thresholds are met:

1. If you query more than 1 million (10 lakh) unique dimension values for a particular report in Standard Reporting.

Imagine that you want to generate a landing page report from the Site Content reports in Google Analytics Standard Reports. You are asking GA to fetch all landing pages where visitors landed during a given time period. If Google Analytics finds more than 1 million (10 lakh) unique landing page URLs across sessions for your specified date range, it samples the data as follows:

If you have asked for 1 month of data, Google Analytics samples the data as 10,00,000 / 30 days.

If you have asked for 2 months of data, Google Analytics samples the data as 10,00,000 / 60 days.
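The division above can be worked through as a short sketch. It assumes, as the text describes, that the 1 million limit is spread evenly over the days in the date range. The function name and the even split are illustrative assumptions, not a documented Google Analytics formula.

```javascript
// Illustrative only: spreads the 1,000,000 (10 lakh) unique-value limit
// evenly across the days of the requested date range.
const UNIQUE_DIMENSION_LIMIT = 1000000;

// Hypothetical helper: returns the per-day share of the limit.
function perDayDimensionLimit(daysInRange) {
  if (!Number.isInteger(daysInRange) || daysInRange <= 0) {
    throw new RangeError("daysInRange must be a positive integer");
  }
  return Math.floor(UNIQUE_DIMENSION_LIMIT / daysInRange);
}

console.log(perDayDimensionLimit(30)); // 33333
console.log(perDayDimensionLimit(60)); // 16666
```

Under this assumption, doubling the date range halves the number of unique values kept per day.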

2. If you request 500,000 (5 lakh) or more sessions where the data is not readily available to fetch.

This happens only with Advanced Segments, Custom Reports, and secondary dimensions applied in Standard Reporting. In these cases Google Analytics has no precomputed data to return and has to calculate the result for the specific query.

3. Flow Visualization reports are sampled after 100,000 (1 lakh) sessions.

4. Multi-Channel Funnels reports are sampled after 1 million conversions.
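The four thresholds listed above can be collected into a small lookup, sketched below. The numbers are taken from this article and may change on Google's side; the report-type keys and the function name are hypothetical and are not part of any Google Analytics API.

```javascript
// Thresholds as listed in this article; treat them as assumptions.
// "inclusive" is true where the text says "N or more", false for "more than N".
const SAMPLING_THRESHOLDS = {
  standard: { unit: "unique dimension values", limit: 1000000, inclusive: false },
  adHoc: { unit: "sessions", limit: 500000, inclusive: true },
  flowVisualization: { unit: "sessions", limit: 100000, inclusive: false },
  multiChannelFunnels: { unit: "conversions", limit: 1000000, inclusive: false },
};

// Hypothetical helper: returns true when the count crosses the threshold
// for the given report type.
function isLikelySampled(reportType, count) {
  const threshold = SAMPLING_THRESHOLDS[reportType];
  if (!threshold) {
    throw new Error("Unknown report type: " + reportType);
  }
  return threshold.inclusive ? count >= threshold.limit : count > threshold.limit;
}

console.log(isLikelySampled("adHoc", 750000)); // true
console.log(isLikelySampled("flowVisualization", 80000)); // false
```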

How to Configure the Sample Rate in Your Google Analytics Tracking Code?

The sample rate for Google Analytics can be set in the tracking code. The JavaScript call used to configure the sample rate is as follows:

ga('create', 'UA-XXXX-Y', {'sampleRate': 100}); // All visitors are tracked; no visitor is sampled out

ga('create', 'UA-XXXX-Y', {'sampleRate': 75}); // About 3 out of every 4 visitors are tracked

ga('create', 'UA-XXXX-Y', {'sampleRate': 50}); // About every 2nd visitor is tracked

ga('create', 'UA-XXXX-Y', {'sampleRate': 25}); // About every 4th visitor is tracked

'sampleRate' specifies the percentage of visitors to be tracked. The default value is 100, where no visitors are sampled out. Values of 75, 50 and 25 mean that roughly 75%, 50% and 25% of visitors are tracked, respectively.
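To make the percentage concrete, here is a minimal sketch of one way a sample rate could decide which visitors are tracked, and how a count collected at that rate can be scaled back up. This is not how Google Analytics selects visitors internally; the hash, the bucket scheme and all function names are hypothetical.

```javascript
// Hypothetical helper: maps a client ID to a stable bucket from 0 to 99,
// so the same visitor always gets the same decision.
function bucketFor(clientId) {
  let hash = 0;
  for (const ch of clientId) {
    hash = (hash * 31 + ch.codePointAt(0)) >>> 0; // keep as unsigned 32-bit
  }
  return hash % 100;
}

// A visitor is tracked when their bucket falls below the sample rate (0-100).
function isTracked(clientId, sampleRate) {
  return bucketFor(clientId) < sampleRate;
}

// Scales a count observed at a given sample rate to an estimate for all visitors.
function estimateTotal(observedCount, sampleRate) {
  if (sampleRate <= 0 || sampleRate > 100) {
    throw new RangeError("sampleRate must be greater than 0 and at most 100");
  }
  return Math.round(observedCount / (sampleRate / 100));
}

console.log(isTracked("visitor-123", 100)); // true: a rate of 100 tracks everyone
console.log(estimateTotal(750, 75)); // 1000
```

With a sample rate of 75, 750 tracked visitors correspond to an estimated 1,000 actual visitors, which is why absolute counts collected at lower rates need scaling before they are compared.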

Data Sampling and Web Analytics Data Accuracy

Web analytics data accuracy has always been a challenge. The following are a few reasons why web analytics data is never fully accurate:

  1. Data sampling
  2. JavaScript disabled in the user's browser
  3. Cookie rejection options offered under PII guidelines
  4. Cookies deleted by users
  5. Inefficient page tagging
  6. One user, many computers
  7. Many users, one computer
  8. Online visits that convert offline
  9. Property ID (UA-XXXX-X) copied without permission and placed on another website to skew your data

Data sampling directly affects data accuracy. As long as you look only at vanity metrics like users, sessions, bounce rate, pageviews, unique pageviews and exit rate, data sampling does not affect you. But what is the use of these vanity metrics?

Data sampling matters when you look at mature metrics like goal conversions, ecommerce conversions and user engagement metrics.

Sampling is a self-imposed constraint that web analytics tools use to lessen the computational burden. As a statistical technique for finding a subset of data that represents the whole set, it should be used to identify trends and patterns and to derive insights about the population (in statistics, the whole set of data is called the population). Together with the unavoidable circumstances listed above, data sampling is the reason web analysts and digital marketers should look for metrics that throw light on patterns rather than on absolute numbers.

For example :

A 20% decrease in bounce rate for page 'X' after adding a video is a fairly sound interpretation and close to the actual truth.

30% more revenue generated after kick-starting PPC campaigns is a believable change in the pattern.

So, what is the workaround? How can we solve the "data sampling" constraints?

1. Upgrade to Google Analytics Premium and use the unsampled reports it offers. The Premium version does not eliminate sampling issues completely, but gives you data that is 200 times more accurate than the free Standard Google Analytics. The accuracy comes from the ability of Google Analytics Premium to handle websites that get 1 billion pageviews per month.

2. Try out 'Analytics Canvas', a framework for data analytics. This tool helps you eliminate data sampling. I like the way they have worked around the data sampling constraints. Visit the Analytics Canvas website to try the tool; it is free for 30 days.

3. The Google Analytics Standard (free) version can use the largest possible data set, or population, to generate sampled reports when you choose 'Higher Precision', as shown below.

Google Analytics Data Sampling Higher Precision option for more Data Accuracy.

Google Analytics Sampling Seek Bar

The larger the data set being sampled, the higher the data accuracy that can be achieved.
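To illustrate why a larger sample helps, here is a minimal sketch using the standard normal-approximation margin of error for a proportion such as a conversion rate. It assumes simple random sampling, which may not match how Google Analytics draws its samples; the function name and the 2% example rate are hypothetical.

```javascript
// 95% margin of error for an observed proportion p measured on n sampled
// sessions, using the normal approximation: 1.96 * sqrt(p * (1 - p) / n).
function marginOfError(p, n) {
  if (p < 0 || p > 1) throw new RangeError("p must be between 0 and 1");
  if (n <= 0) throw new RangeError("n must be positive");
  return 1.96 * Math.sqrt((p * (1 - p)) / n);
}

// Compare the same 2% conversion rate measured on growing sample sizes.
for (const n of [10000, 100000, 1000000]) {
  const moe = marginOfError(0.02, n);
  console.log(n + " sessions: 2% +/- " + (moe * 100).toFixed(3) + " percentage points");
}
```

Under these assumptions, each tenfold increase in the number of sampled sessions shrinks the margin of error by a factor of about 3.16 (the square root of 10), which is the effect the 'Higher Precision' setting aims for.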

If you know of any other tool or technique, please share it in the comments.

Google Universal Analytics moved from Public Beta to Primetime

Google Universal Analytics moved from public beta to prime time in April 2014. From now on, all websites and other digital devices will use Universal Analytics as the operating standard for data processing by default. Existing properties that use Classic Analytics must either upgrade to Universal Analytics or be upgraded automatically by Google through Auto Transfer.

At present, Universal Analytics has moved out of the 3rd phase, i.e. public beta, and has to sail through the last and final phase, i.e. Universal Analytics as the operating standard for Google Analytics.

In the 3rd phase, Universal Analytics has come out with 3 more interesting features:

1. User ID association with anonymous users, to better understand the customer's full journey across multiple devices.

2. Time-zone-specific, speedy and fresh data processing capabilities.

3. Proxy data transfer from internal servers to the Google Analytics application server.

Google Analytics has a long way to go before it competes head-on with enterprise-level web analytics tools like SiteCatalyst. But it is progressing towards capturing the lucrative enterprise-level web analytics market.

Universal Analytics is one such move towards head-on competition with SiteCatalyst. At present, Google stands on par with Adobe SiteCatalyst through features like custom dimensions and metrics, cross-device user journey analysis, quick data processing and User ID association for deeper insights into the customer journey.

The addition of proxy data transfer from internal servers to the Google Analytics server is another such move towards making Google Analytics an enterprise-level web analytics tool.

The Google Analytics Diagnostic Tool is another feature, recently rolled out in beta. It is your trusted assistant, working 24/7 to monitor your Google Analytics account and inform you about nitty-gritty account performance issues like tag malfunctions, improper filter implementations and goal conversion irregularities.

Google Analytics Diagnostic Tool to Improve Data Quality

Google has recently announced the beta release of the 'Google Analytics Diagnostic' tool. The tool acts as a trusted Google Analytics assistant that works for you 24/7 to monitor and report critical account performance issues in your Google Analytics account, to improve data quality.

Google's quest to make Google Analytics a feature-rich web analytics tool is never-ending. Indeed, web analytics tools need to be feature-rich to achieve reliable data quality, which is needed to formulate and implement a robust digital marketing measurement framework. Until now, Google Analytics did not have a tool like the one in AdWords to alert the user about configuration and account performance errors.

Account Performance and critical issues:

1. Specific pages missing GATC tags
2. Improper tagging
3. Improper filter implementation
4. Improper event tracking code implementation
5. Faulty ecommerce tracking code implementation
6. Reasons for discrepancies between clicks and visits
7. Goal conversion irregularities
8. Inaccurate data for high-cardinality dimensions due to Google Analytics system limits
9. Sampling issues
10. Any other warning related to your account performance

To get access to the 'Google Analytics Diagnostic tool', you need to sign up using the form at the following link:

Google Analytics Diagnostic Tool

Online Form for Google Analytics Diagnostic Tool Beta Signup

At present, Google can't guarantee availability of this tool to everyone who signs up. If you are lucky, Google may whitelist your request to use the 'Google Analytics Diagnostic tool'; otherwise you need to wait until the tool is released as a public beta.

5 Universal Analytics features compared to Classic Analytics

Universal Analytics is an advanced and sophisticated way of collecting and organizing data in Google Analytics reports. It is the best way (for small businesses that can't afford Google Analytics Premium) to get meaningful insights into how visitors engage with your various properties, such as websites, mobile apps and other digital devices like information kiosks, gaming kiosks, bank ATMs and ticketing kiosks. Check out 5 Universal Analytics features compared to Classic Analytics.