Google Analytics and Why It Is Inaccurate

Google Analytics is arguably the most popular analytics tool today. It can capture user activity across your website and turn it into detailed reports. Like any tracking tool, though, it only gives you numbers; what you conclude from them is up to you. The problem is that many people only partly understand what the data means, and that leads to wrong decisions. This article outlines the main reasons why Google Analytics data is inaccurate or misunderstood, and how to address each issue.

How does Google Analytics work?

First, we need to understand how Google Analytics works and collects data, because that explains why some metrics deviate the way they do. Here’s a diagram of how Google Analytics works:

cach-hoat-dong-google-analytics.png

Google Analytics Activity Flow – by Conversion.vn

(1) When setting up Google Analytics, you install a JavaScript code snippet on your website. Data is collected only on the pages where this snippet is present.

(2) When a user visits the website in a browser on a device such as a computer or a phone, information is collected such as where the user came from, which browser they use and which device they are on. The website will also set cookies on the user’s device (small files that help recognize the user and their activity on that site).

(3) The information collected from the user when they access the website is the raw data, which will be packaged and sent to the Google Analytics server.

(4) Once the information reaches the server, it is processed and analyzed. This step turns the raw data into information that is useful to you.

(5) The processed data is then written to the database with your filters and other user-defined settings applied. Once data has been written to the database, it cannot be changed. That is why changing filters or settings in Google Analytics does not alter old data; only data collected afterwards is affected.

(6) Finally, the processed data is made available in the Google Analytics reports, which is what you see at google.com/analytics.
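Step (5) is the one that surprises people most, so here is a toy Python sketch of it. The ReportingView class and the sample hits are invented for illustration and are not how Google Analytics is implemented; the only point is that filters run once, when a hit is processed, and never touch what is already stored.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Hit = Dict[str, str]


@dataclass
class ReportingView:
    """Hypothetical stand-in for a reporting view (not a real GA object)."""
    filters: List[Callable[[Hit], bool]] = field(default_factory=list)
    stored: List[Hit] = field(default_factory=list)

    def process(self, hit: Hit) -> None:
        # Only the filters that exist at processing time are applied.
        if all(keep(hit) for keep in self.filters):
            self.stored.append(hit)


view = ReportingView()
view.process({"page": "/home", "ip": "203.0.113.7"})      # stored: no filters yet

# A filter added later affects only hits processed from now on.
view.filters.append(lambda hit: hit["ip"] != "203.0.113.7")
view.process({"page": "/pricing", "ip": "203.0.113.7"})   # dropped by the new filter
view.process({"page": "/pricing", "ip": "198.51.100.4"})  # stored

stored_pages = [hit["page"] for hit in view.stored]
print(stored_pages)
```

The first hit stays in the stored data even though it would be rejected by the filter that was added afterwards.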

Not too hard to understand, is it?

So now what can make the data you see in Google Analytics inaccurate? There are many reasons, both subjective and objective:

Incomplete understanding of the meaning of the traffic sources in the report

This is the most common problem. The traffic source reports are set up by default to help webmasters see where visitors come from and how they reach the site. The main traffic channels in the Google Analytics report are:

kenh-traffic-google-analytics.png

The main traffic channels are classified in the Acquisition section of Google Analytics. Source: conversion.vn

As you can see, these channels are Display, Paid Search, Organic Search, Direct, Referral, Social, Email and (Other). Here’s how they are usually understood:

Organic Search: users coming to the website from the search engines (Google, Yahoo, Bing, etc.) through organic search results.

Paid Search: users visiting the website from the ads on the search results.

Display: users visiting the website from banner ads on websites in a display ad network.

Referral: users visiting the website by clicking on links from other sites.

Social: users coming to your website through social channels like Facebook, Google+, LinkedIn, etc.

Email: users visiting the website by clicking the link in the email.

Direct: users accessing the website by typing the web address into the browser or opening a bookmark.

(Other): users visiting the website from other traffic sources not included in the above channels.

You may be surprised to learn that if you stop at these definitions, you will misread what the data is telling you, and a half-truth is no truth at all. In practice, a lot of traffic in Google Analytics reports is misattributed, incomplete or mixed between channels.

san_dean_surprised.gif

Huh?

Yes. When you look at traffic channels in Google Analytics reports, this is how you should read them:

Organic Search: users coming from search engines through organic results that Google Analytics can identify. In fact, part of organic traffic is effectively direct traffic, and vice versa.

Paid Search: users coming from ads in the search results, provided the tracking tags are complete. Part of this traffic can end up in referral and direct.

Display: users coming from banner ads on websites in a display ad network, provided the tracking tags are complete. Part of this traffic can end up in referral and direct.

Referral: users who click links on other sites. This can also include traffic from social networks, email and advertising channels (even paid search and display).

Social: users coming to the website through social channels that Google can identify.

Email: users who click a link in an email that carries tracking tags.

Direct: all traffic whose source cannot be identified ends up here.

(Other): traffic that has a source / medium but that Google Analytics does not know how to classify ends up here.

Clearly, traffic sources are not exactly what they seem and do not always match their names. When you analyze and evaluate the numbers, keep in mind that there is always noise in the data. How much noise depends on many factors, which we look at below.
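To make the channel list more concrete, here is a simplified Python sketch of how a source / medium pair could be mapped to a channel. It is an approximation written for this article, not Google’s actual rules: the medium patterns follow the commonly documented default channel definitions, and KNOWN_SOCIAL_SOURCES is a tiny invented stand-in for the much longer list of social sites that Google maintains.

```python
import re

# Assumption: a short illustrative list, not Google's real list of social sites.
KNOWN_SOCIAL_SOURCES = {"facebook.com", "linkedin.com", "t.co"}

SOCIAL_MEDIUMS = re.compile(
    r"^(social|social-network|social-media|sm|social network|social media)$"
)
PAID_SEARCH_MEDIUMS = re.compile(r"^(cpc|ppc|paidsearch)$")
DISPLAY_MEDIUMS = re.compile(r"^(display|cpm|banner)$")


def classify_channel(source: str, medium: str) -> str:
    """Map a source / medium pair to a channel name (simplified sketch)."""
    if source == "(direct)" and medium in ("(none)", "(not set)"):
        return "Direct"
    if medium == "organic":
        return "Organic Search"
    if SOCIAL_MEDIUMS.match(medium) or (
        medium == "referral" and source in KNOWN_SOCIAL_SOURCES
    ):
        return "Social"
    if medium == "email":
        return "Email"
    if PAID_SEARCH_MEDIUMS.match(medium):
        return "Paid Search"
    if DISPLAY_MEDIUMS.match(medium):
        return "Display"
    if medium == "referral":
        return "Referral"
    # Has a source / medium, but none of the rules recognize it.
    return "(Other)"
```

With these rules, a click from a social site that is not on the list (for example a hypothetical classify_channel("zalo.me", "referral")) lands in Referral, and a mistyped medium such as "e-mail" lands in (Other). That is exactly the kind of noise described above.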

Organic Search

Organic search traffic appears in Google Analytics as the Organic Search channel (in the overview) or as medium organic (in source / medium). There are several things to be aware of with this traffic source.

1. Branded vs non-branded

Organic search traffic includes branded traffic, meaning visits from searches that contain the brand name. For example, if the brand is Wall Street English, then instead of searching for the generic keyword “learning English”, a user may search for “learning Wall Street English” and click an organic result to reach the website. Technically this is still search traffic, but in essence these users already know Wall Street English and are searching in order to reach its website, not to compare options. Branded traffic tends to grow with brand awareness (through advertising, branding and PR) and has little to do with keyword rankings or SEO work. For some of the clients and companies I have consulted for, organic traffic grew steadily, but on closer inspection the growth was all branded traffic, which sometimes accounted for 80% of total organic traffic. In those cases, the only keywords the company really ranked for were probably its own brand names.

Solution

For these reasons, branded traffic is better treated like direct traffic when you evaluate and analyze. In Google Analytics, set up a segment that separates branded from non-branded traffic, so you can assess the true state of organic traffic without the effect of users searching for the brand.

To set up a segment for branded traffic: open the Organic Search channel under Channels and click Add Segment above the graph. Click New Segment and name it Branded Organic Traffic or something similar. Under Traffic Sources, set Medium to “contains” and type organic. Then, under Conditions, choose “Keyword” in the first field and “contains” in the second, and enter your brand name in the last field. If the brand is written or typed in several ways, include the variants and common typos, e.g. “wall street english”, “wse”, “wallstreet english”, “wsenglish”. Then click Save.

branded-organic-traffic.png

You should set up a segment for tracking traffic related to branded keywords. Source: conversion.vn
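If you export your keyword report, the same branded / non-branded split can be reproduced offline. The following Python sketch assumes a simple dictionary of keyword to sessions; the pattern covers the brand variants from the example above, and the numbers in the usage note are made up.

```python
import re

# Brand variants from the Wall Street English example; adjust for your own brand.
BRAND_PATTERN = re.compile(
    r"wall\s*street\s*english|\bwse\b|wsenglish", re.IGNORECASE
)


def is_branded(keyword: str) -> bool:
    """True if the keyword contains one of the brand variants."""
    return BRAND_PATTERN.search(keyword) is not None


def split_branded(sessions_by_keyword: dict) -> tuple:
    """Return (branded_sessions, non_branded_sessions)."""
    branded = sum(n for kw, n in sessions_by_keyword.items() if is_branded(kw))
    total = sum(sessions_by_keyword.values())
    return branded, total - branded
```

For example, split_branded({"wall street english": 80, "learning english": 15, "wse": 5}) separates 85 branded sessions from 15 non-branded ones. The word boundaries around “wse” keep unrelated words such as “browser” from being counted as branded.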

2. Part of direct traffic can be organic traffic

Google Analytics usually identifies the source of a visit from the referrer. When the referrer is lost for some reason, the visit is classified as direct traffic. So how much organic traffic is being counted as direct? A study by Gene McKenna, a product manager at Groupon, found that up to 60% of direct traffic is actually organic. That number may be specific to Groupon, a deal / e-commerce site; depending on your website, the share of organic traffic hidden in direct may be anywhere from 20% to 80%.

Solution

Look at the landing pages in the direct traffic section of the report and pay attention to URLs that are long, hard to remember and unlikely to be typed into the browser or bookmarked. Visits to those pages are more likely organic than direct.

 long-tail-organic-traffic-trong-direct.png

This is probably organic traffic, not direct traffic. Source: conversion.vn
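This check can be roughed out in a few lines of Python on an export of direct landing pages. It is only a heuristic: the thresholds (a path of at most 20 characters and at most one level deep counts as “typeable”) are assumptions chosen for illustration, not values from Google Analytics.

```python
from urllib.parse import urlsplit


def looks_typed(landing_page: str, max_length: int = 20, max_depth: int = 1) -> bool:
    """Guess whether a user could plausibly type or bookmark this page."""
    path = urlsplit(landing_page).path
    segments = [part for part in path.split("/") if part]
    return len(path) <= max_length and len(segments) <= max_depth


def suspicious_direct_share(direct_sessions: dict) -> float:
    """Share of direct sessions that landed on pages unlikely to be typed."""
    total = sum(direct_sessions.values())
    suspicious = sum(
        n for page, n in direct_sessions.items() if not looks_typed(page)
    )
    return suspicious / total if total else 0.0
```

A high share does not prove the traffic is organic, but it tells you roughly how much of your direct number deserves a second look.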

3. Not all search engines are equal

Google Analytics by default recognizes organic traffic from Google (yeah, duh) and other major search engines such as Bing, Yahoo and Baidu. However, smaller search engines, especially local ones such as Coccoc.com, Wada.vn or Laban.vn, are not recognized as organic and end up in the Referral channel instead.

organic-traffic-in-referral.png

Even traffic from Google is sometimes, for various reasons, misclassified as referral. Source: conversion.vn

Solution

You can add search engines to the list of organic traffic sources. Go to Admin > Property > Tracking Info > Organic Search Sources in Google Analytics, click Add Search Engine and enter the corresponding parameters. The domain name is the domain of the search engine, and the query parameter is the part before the “=” sign in the URL of the search results page; for Wada.vn it is “q”.

search-engine-add.png

Add wada.vn as a search engine. Source: conversion.vn
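What the “domain name” and “query parameter” settings do can be shown with a small Python sketch. The SEARCH_ENGINES table is a hypothetical stand-in for the configured list (only the “q” parameter for Wada.vn comes from the example above), and the matching rule, “domain contains the name and the query parameter is present”, is a simplification.

```python
from typing import Optional, Tuple
from urllib.parse import parse_qs, urlsplit

# Hypothetical configuration: search engine domain fragment -> query parameter.
SEARCH_ENGINES = {"google": "q", "bing": "q", "wada.vn": "q"}


def organic_source(referrer: str) -> Optional[Tuple[str, str]]:
    """Return (search engine, search query) if the referrer is a known engine."""
    parts = urlsplit(referrer)
    host = parts.hostname or ""
    params = parse_qs(parts.query)
    for name, query_param in SEARCH_ENGINES.items():
        if name in host and query_param in params:
            return name, params[query_param][0]
    return None  # not recognized: the visit would be treated as a referral
```

Until an engine is on the list, a referrer such as http://wada.vn/search?q=learning+english is just another website linking to you, which is why it shows up in Referral.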

4. Most organic keywords are hidden

Google Analytics used to show which keywords people searched for to reach your site, but since 2011 Google has gradually hidden these keywords, citing user privacy and security. Now, when someone who is logged into a Google account searches for a keyword and reaches your website, the keyword is reported as “(not provided)”. The “(not provided)” group currently accounts for more than 90% of all keywords, which makes it much harder to analyze a website’s organic traffic.

organic-keywords-hidden.png

Currently over 90% of organic keywords are hidden in Google Analytics. Source: conversion.vn

Solution

There is no real fix for this, other than spending money on Paid Search advertising to find out which keywords bring people to your website. You can also connect Google Analytics to Google Search Console (formerly Webmaster Tools) to get some search query data in the Search Engine Optimization section. Note that keywords and search queries are fundamentally different, and search queries come from Search Console, so they are not tied to the other data in Google Analytics.

Paid Search and Display

Paid Search and Display are the two main advertising traffic sources that Google Analytics recognizes by default. Paid Search is traffic from ads on the results pages of search engines such as Google, Bing and Yahoo; Display is traffic from banner ads on the Google Display Network or other ad networks. You may run into the following measurement problems with these two sources:

1. Difference in the number between Google Analytics and the ad systems

If you compare them, you’ll see that the number of clicks or conversions in Google Analytics differs from the number in the ad systems, and AdWords is no exception. This can confuse advertisers, who do not know which figure is right or why the difference arises. The discrepancy usually comes from differences between how Google Analytics and the ad systems track. Here are the main reasons:

– Clicks and sessions are different metrics: ad systems usually count clicks, while Google Analytics counts sessions. For example, a person may click an ad twice within a 10-minute session. AdWords records 2 clicks, but Google Analytics records only one session.

– Attribution: Google Analytics usually attributes a conversion to the last click, while AdWords and other ad systems attribute it to their own ad click, identified through cookies. For example, if a person clicks a GDN banner, later returns to the website through organic search and then makes a purchase, Google Analytics credits the conversion to organic search while AdWords credits it to the GDN.

– Click filtering: each system has its own way to filter out invalid and duplicate clicks, based on click frequency, IPs, cookies or other factors, which can also cause differences.

– Cookie duration: different cookie lifetimes in the two systems can also cause the numbers to differ.

data-discrepancies.jpg

The difference in the metrics of the two systems usually comes from the different tracking mechanisms. Source: pinterest
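The clicks-versus-sessions difference is easy to reproduce. This Python sketch counts sessions from a list of ad click times using only an inactivity timeout of 30 minutes; that is a simplification (real session rules have more cases), and the timestamps are invented.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # assumption: default-style inactivity timeout


def count_sessions(click_times) -> int:
    """Count sessions, starting a new one after more than 30 minutes of inactivity."""
    sessions = 0
    previous = None
    for moment in sorted(click_times):
        if previous is None or moment - previous > SESSION_TIMEOUT:
            sessions += 1
        previous = moment
    return sessions


# One person clicks the same ad twice, five minutes apart.
clicks = [datetime(2017, 5, 1, 10, 0), datetime(2017, 5, 1, 10, 5)]
print(len(clicks), "clicks,", count_sessions(clicks), "session")
```

The ad system bills two clicks while the analytics side sees one session, and neither number is wrong.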

Solution

There is no way to eliminate the differences between the two systems. The usual approach is to agree on a reasonable maximum discrepancy between Google Analytics and the ad channels before the campaign starts, to avoid billing disputes after it ends (if you work with a third party).


2. Some advertising traffic may fall into the referral, direct sections

Paid search and display traffic is sometimes attributed to referral or direct for the same reasons as organic search: the referrer is lost (the reasons are covered below), or the UTM tracking tags are missing or wrong.

paid-traffic-to-referral.png

Paid search and display are mixed in referral sources. Source: conversion.vn

Solution

If you run AdWords, link AdWords to Google Analytics and enable auto-tagging. For other ad channels, use the URL Builder to add complete tracking tags to every ad link that points to your website. Note that this only reduces the problem; it does not solve it completely.
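If you tag many links, it can be easier to script what the URL Builder does. The sketch below is a minimal Python version using only the standard library; the function name add_utm and the example values are my own, and only the utm_source, utm_medium and utm_campaign parameter names are the standard campaign tags.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_utm(url: str, source: str, medium: str, campaign: str) -> str:
    """Return the URL with campaign tags added, replacing any existing utm_* tags."""
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.startswith("utm_")
    ]
    query += [
        ("utm_source", source),
        ("utm_medium", medium),
        ("utm_campaign", campaign),
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))
```

For example, add_utm("https://example.com/landing?ref=1", "facebook", "cpc", "spring sale") keeps the existing ref parameter and appends the three tags, with the space in the campaign name encoded as “+”. The same helper works for links in social posts, emails and documents.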

Referral

Referral contains, in principle, all traffic that arrives at the measured website from other domains. Referral traffic can therefore include advertising channels such as banners on websites, links on other websites, social networks, emails, search engines, and sometimes traffic from your own site.

referral-traffic.png

Referral has many sources mixed into it. Source: conversion.vn

1. Traffic from social networking sites

Social networks now play an important role in bringing traffic to your website, and you will want to analyze that traffic separately. However, social traffic without complete tracking tags still goes to the referral channel by default, or sometimes to direct. Google is working on categorizing social traffic better, but it’s not perfect yet.

Solution

Links posted on social networks should carry complete tracking tags. Some social media tools, such as Hootsuite, can automatically add tracking tags to all links when you schedule content.

You can also go to Social Settings to add your social properties and help Google categorize this traffic better.

social-settings.png

Setting up social channels helps Google Analytics segment channels more effectively. Source: conversion.vn

2. Traffic from email on the browser

Traffic from emails opened in the browser, in webmail such as Gmail, Yahoo or Hotmail, lands in the referral section by default, even though you may want to track email traffic separately.

Solution

Add tracking tags to all links to your website in your emails so that this traffic is attributed correctly. Some third-party email services, such as MailChimp and BenchmarkEmail, can tag these links automatically.

3. Self-referral

Sometimes you will see your own domain among the referrals. Something is obviously wrong, because referral should contain visits from other domains, not from pages within your own domain. This can make your referral data inaccurate. It usually happens in these cases:

  1. JavaScript redirect: if you have migrated your website and used JavaScript to redirect the old domain to the new one, this is likely to cause self-referrals.
  2. Missing Google Analytics cookies: if the Google Analytics cookies are lost for some reason while the user moves between pages, this can also cause self-referrals.
  3. Cross-domain: if your website has multiple sub-domains, for example abc.com has 1.abc.com and 2.abc.com, you may see these sub-domains appear as referrals.

Solution

Most of the above issues have a solution:

  1. Use a server-side redirect (301) instead of a JavaScript redirect.
  2. Find out on which page, and why, the Google Analytics cookies are lost: go to Behavior > Site Content > Landing Pages, add Source as a secondary dimension, filter for your own domain to find the page, and then fix it.
  3. Set up cross-domain tracking to follow traffic between domains. There are separate guides for setups with and without Google Tag Manager. Not simple, but it needs to be done.
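Before fixing self-referrals, it helps to find them in an exported referral list. Here is a small Python sketch of that check; is_self_referral is a name invented for this example, and abc.com is the placeholder domain from the list above.

```python
from urllib.parse import urlsplit


def is_self_referral(referrer: str, own_domain: str) -> bool:
    """True if the referrer is the site's own domain or one of its sub-domains."""
    host = (urlsplit(referrer).hostname or "").lower()
    own = own_domain.lower()
    return host == own or host.endswith("." + own)
```

With own_domain set to "abc.com", referrers on abc.com, 1.abc.com and 2.abc.com are flagged, while a look-alike such as notabc.com is not, because the comparison requires a dot before the domain.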

Direct

If you think the other channels are messy, wait until you see what is inside direct. Direct traffic is not just users typing the website address; it also includes all traffic whose source Google Analytics cannot identify.

direct-traffic-den-tu-dau.png

The big question for humankind: what exactly is inside direct traffic? Source: conversion.vn

1. Traffic from social networking sites

As mentioned above, some traffic from social networks can also end up in direct if the referrer is lost for some reason when users click through from social networking sites.

Solution

There is no complete solution: you can add full tracking tags to your own social posts, but you cannot make every follower, fan and other user do the same.

2. Traffic from mobile applications

Clicks from mobile applications are counted as direct if the links are not fully tagged. Some traffic from mobile applications also ends up in referral.

Solution

As with social, there is no complete solution: even if the links in your own application are fully tagged, there is no guarantee that links in other applications are.

3. Traffic from email clients like Outlook

Links clicked in email clients such as Outlook or Thunderbird are counted as direct traffic if they are not fully tagged.

Solution

Add full tracking tags to the links in your email templates.

4. Offline traffic such as PDF, PowerPoint and Word docs

Links in offline documents such as PDF, PowerPoint and Word files are also often counted as direct traffic.

Solution

If the files are yours, tag the links in them with full tracking tags.

5. Wrong campaign code

If the tracking tags are wrong for some reason (for example, a typo), Google Analytics cannot interpret them, and that traffic is also counted as direct.

Solution

Do not make mistakes when creating the tracking tags. Use the URL Builder to ensure accuracy.
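Typos in campaign tags are easy to catch with a quick script before a campaign goes live. The Python sketch below treats utm_source, utm_medium and utm_campaign as required, which is a convention assumed for this example rather than a documented rule, and flags any other parameter starting with “utm” that is not one of the five standard tags.

```python
from urllib.parse import parse_qs, urlsplit

REQUIRED = ("utm_source", "utm_medium", "utm_campaign")  # assumed convention
KNOWN = set(REQUIRED) | {"utm_term", "utm_content"}


def check_campaign_url(url: str) -> list:
    """Return a list of problems found in the campaign tags (empty if none)."""
    params = parse_qs(urlsplit(url).query)
    problems = [f"missing {name}" for name in REQUIRED if name not in params]
    problems += [
        f"unknown parameter {name}"
        for name in sorted(params)
        if name.startswith("utm") and name not in KNOWN
    ]
    return problems
```

For a link tagged with “utm_sorce” instead of “utm_source”, the function reports both the missing tag and the unknown one, which usually points straight at the typo.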

6. Google Analytics code is missing

If a page on your website is missing the Google Analytics tag for some reason, the traffic to that page, and from that page to other pages, will be counted as direct.

Solution

Make sure all pages have the tracking code. Screaming Frog SEO Spider can help you check all pages for missing code.
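If you already have the HTML of your pages (for example from a crawl), a rough check can also be scripted. This Python sketch counts tracker setup calls in a page; the pattern assumes the common ga('create', 'UA-…') and gtag('config', 'UA-…') snippet forms and will miss tags loaded through a tag manager, so treat it as a first pass only.

```python
import re

# Assumption: the tag is installed with one of the two common inline snippets.
TRACKER_CALL = re.compile(
    r"""(?:ga\(\s*['"]create['"]|gtag\(\s*['"]config['"])\s*,\s*['"]UA-\d+-\d+['"]"""
)


def tag_status(html: str) -> str:
    """Classify a page as 'missing', 'ok' or 'duplicate' by counting tracker calls."""
    count = len(TRACKER_CALL.findall(html))
    if count == 0:
        return "missing"
    return "ok" if count == 1 else "duplicate"
```

A “missing” result points at pages that will leak traffic into direct; a “duplicate” result is worth noting too, since a tag that fires twice distorts other metrics.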

7. Traffic from yourself

You may not have thought of it, but traffic from your own team and company can affect the accuracy of direct traffic. If your company has 500 people and each of them visits your website a few times a day, this distorts the numbers, and over time the deviation adds up.

Solution

Filter out traffic from your corporate network IPs. Go to Account > All Filters and create a new filter that excludes traffic from an IP address. If the company has multiple IPs, create a filter for each one, or use a custom filter with a regular expression that covers them all.

filter-ips.png

Filter the company’s internal IPs for more accurate data. Source: conversion.vn
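The same exclusion can be applied to raw logs or exported data. Here is a minimal Python sketch with the standard ipaddress module; the office networks listed are documentation-only example ranges, not real company addresses.

```python
import ipaddress

# Hypothetical office networks (reserved documentation ranges).
OFFICE_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),    # a whole office range
    ipaddress.ip_network("198.51.100.17/32"),  # a single fixed address
]


def is_internal(ip: str) -> bool:
    """True if the visitor IP belongs to one of the office networks."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in OFFICE_NETWORKS)
```

Describing an office as a network range rather than as individual addresses keeps the list short when the company has many IPs.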

Other factors that influence Google Analytics tracking

In addition to the traffic sources above, other issues affect how Google Analytics collects and tracks data. Let’s take a look at them:

Javascript disabled = no data

Google Analytics uses JavaScript to collect data, so users who disable JavaScript in their browsers are invisible to it. According to a recent study, about 2.4% of people browse the web with JavaScript disabled.

Solution

If you really want to push for that extra 2.4% of data from users with JavaScript disabled, there is a tutorial on how to set up tracking for this case.

Without cookies, there is no data

Google Analytics also uses cookies to recognize users. If the cookies are lost for some reason (the user clears them, or they are blocked by a firewall), the user appears to Google Analytics as a completely new user with a new session.

Solution

No.

Referral spam / ghost spam

Referral spam, or ghost spam, is fake traffic generated by spam systems that target Google Analytics. You do not actually receive any visits from these sites; the entries exist only to show up in your reports and make you curious enough to open them in your browser. If you do, you give those sites traffic and ad revenue or, worse, infect your computer with viruses, adware or malware if you are not careful.

referral-spam-ghost-spam.png

The purpose of these spammers is to get you to open these pages and to accidentally bring traffic to those pages. Source: conversion.vn

Solution

These spam hits usually carry an invalid hostname, so you can set up a filter that keeps only the traffic with a valid hostname for your website:

1. Go to Admin > View > Filters > New Filter, then select Custom > Include > Hostname

2. Enter the valid hostnames of your website as a regular expression. For example, for the conversion.vn website it would be: ^www\.conversion\.vn$|^conversion\.vn$

3. Verify the filter to confirm that it works properly, then click Save

Be sure to check the Real-Time report to confirm that traffic to your website still gets through this filter.

http://conversion.vn/wp-content/uploads/hostname-filter.png

Filter by only accepting traffic with valid hostname. Source: conversion.vn
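It is worth testing a hostname pattern before putting it into a filter, because an include filter that matches nothing will drop all of your traffic. The Python sketch below checks a pattern for the conversion.vn example; it assumes the filter behaves like an ordinary regular expression search, and the spam hostname is invented.

```python
import re

# Dots are escaped so they match a literal dot, and there are no spaces around "|".
VALID_HOSTNAME = re.compile(r"^www\.conversion\.vn$|^conversion\.vn$")


def keep_hit(hostname: str) -> bool:
    """True if the hit's hostname is one of the site's valid hostnames."""
    return VALID_HOSTNAME.search(hostname) is not None


for hostname in ["conversion.vn", "www.conversion.vn", "free-traffic.example", "(not set)"]:
    print(hostname, keep_hit(hostname))
```

Two details matter here. A pattern written with spaces around the “|” matches nothing, because spaces are literal characters in a regular expression. An unescaped dot matches any character, so it would also accept hostnames such as wwwXconversion.vn.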

* If you are not familiar with these settings, it is best to ask your technical or IT team to help set them up. You should also create a new View for testing filters, so that the data in your existing view is not affected.

Open but not actually being read

Nowadays it is very common to open many web pages in separate browser tabs and read them one after another. A user can open your website in a tab, leave it for 10 minutes, come back, read the content for only 30 seconds and then move on. Google Analytics measures time from the gap between hits, so if the user then clicks through to another page, the first page is credited with 10 minutes 30 seconds. The tool cannot tell whether the user was reading that tab or another one.
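The reason is that time is derived from hit timestamps, not from what the user is actually looking at. A simplified Python sketch of that calculation, with timestamps in seconds since the session started (the numbers are invented):

```python
def time_on_pages(hit_times: list) -> list:
    """Time credited to each page: the gap until the next hit, 0 for the last page."""
    hits = sorted(hit_times)
    if not hits:
        return []
    gaps = [later - earlier for earlier, later in zip(hits, hits[1:])]
    return gaps + [0]  # nothing follows the last hit, so its time is unknown


# Page opened at 0 s, next page clicked at 630 s (10 idle minutes + 30 s of reading).
print(time_on_pages([0, 630]))
```

The first page is credited with all 630 seconds, idle time included, while the last page of the session gets nothing at all, however long the user spent on it.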

Solution

No.

Multiple users on the same device

Imagine an internet café, a library or a school: one computer may be used by many people, and they may visit the same website. To Google Analytics, all of those people look like one user, and even like a single session if their visits fall within about 30 minutes of each other (or whatever session timeout you have set).

Solution

Hahaha. No.

Multiple devices, one user

Today one person can own several devices for accessing the internet, such as a desktop computer, laptop, tablet and phone. If they visit the same site on four different devices, Google Analytics records four sessions and four completely different users.

Solution

Not really. However, Google may solve this in the future if it can connect user activity across devices through logged-in Google accounts.

The bounce rate is extremely low?

If your website’s bounce rate suddenly drops to a single-digit number, do not celebrate too soon: that is practically impossible, and the usual cause is that the Google Analytics code is installed twice on your site. Another possible cause is a page that embeds, in an iframe, a different page carrying the same GA code.

bounce-rate-thap.png

Don’t celebrate too early, because something is wrong here. Source: conversion.vn

Solution

Remove the extra Google Analytics code from your website, and do not embed in an iframe a page that carries the same Google Analytics code.
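Why a duplicated tag wipes out the bounce rate can be shown with a few lines of Python. The sketch assumes the usual definition of a bounce, a session with exactly one hit, and uses invented session data.

```python
def bounce_rate(pageviews_per_session: list, tags_per_page: int = 1) -> float:
    """Share of sessions that sent exactly one hit."""
    hits = [views * tags_per_page for views in pageviews_per_session]
    bounces = sum(1 for count in hits if count == 1)
    return bounces / len(hits)


sessions = [1, 1, 1, 2, 3]  # pages viewed in each of five sessions

print(bounce_rate(sessions))                   # tag installed once
print(bounce_rate(sessions, tags_per_page=2))  # tag installed twice
```

With the tag installed twice, even a one-page visit sends two hits, so no session ever counts as a bounce and the rate collapses toward zero.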

Data is not real time

Another issue that can affect your analysis is the delay before data shows up. Data in Google Analytics normally lags by 6 to 12 hours, depending on the website. Some sections, such as the Search Engine Optimization data, can lag by up to 48 hours.

Solution

Please be patient.

Data sampling when there is too much traffic

If your website has a lot of traffic (500,000 sessions or more in the selected date range for free users, and 25,000,000 for Premium users), some of your Google Analytics reports may be based on sampled data.

data-sampling.jpg

Data is automatically sampled if the website has a lot of traffic. Source: conversion.vn

Sampling means that instead of loading all of your session data to build a report, which could take too long, Google Analytics takes only a portion of the data and extrapolates from it to give approximate figures. So if you pull a report and look at it again later and metrics such as sessions or conversions have changed, there is nothing wrong with your eyes; it may be due to data sampling.
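The effect of sampling can be imitated in Python. This is a sketch of the general idea (take a random subset, then scale the result back up), not Google’s actual sampling algorithm; the session data is synthetic, with exactly 2,000 conversions in 100,000 sessions.

```python
import random


def sampled_estimate(converted_flags: list, sample_rate: float, seed: int) -> int:
    """Estimate total conversions from a random sample of sessions."""
    rng = random.Random(seed)  # seeded so a given run is repeatable
    sampled = [flag for flag in converted_flags if rng.random() < sample_rate]
    return round(sum(sampled) / sample_rate)


# Synthetic data: every 50th session converts.
sessions = [1 if i % 50 == 0 else 0 for i in range(100_000)]
exact = sum(sessions)

print("exact:", exact)
for seed in (1, 2, 3):
    print("10% sample, seed", seed, "->", sampled_estimate(sessions, 0.1, seed))
```

Each sample gives an estimate near the exact figure but rarely equal to it, and a different sample gives a different estimate. That is the kind of movement you see when a sampled report is pulled twice.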

Solution

You can upgrade to Google Analytics Premium ($150,000 / year) to export unsampled data.

Summary

Google Analytics is a very good measurement tool, and it is constantly being updated and improved. Some of the issues above come from user behavior and technical limitations, while others come from carelessness in setup rather than from Google Analytics itself. Tracking traffic is challenging, and traffic sources are diverse and increasingly complex, so it’s important to understand why and where tools such as Google Analytics can be inaccurate. You need this understanding to interpret your advertising metrics correctly.

Finally, do not forget that with measurement tools like Google Analytics (or any other tool), it’s not “what you see is what you get” but “what you interpret is what you get”. In the end, whether you make the right decision from the numbers, perfect or not, still depends heavily on the human factor.

If you have any comments, leave them below.


About:

Currently Director of Marketing Store Vietnam, Ringier AG. Passionate about digital marketing and about sharing knowledge, I decided to create the conversion.vn blog. If you need to contact me, please email tu.bui@conversion.vn
