Ah, referral spam. A problem that affects everyone who has a Google Analytics account. These are the crazy sources you see in your analytics, such as get-traffic-now, darodar, and other junk sites that hit your site over and over again, making it difficult to distinguish real visitors. When you see one of these in your account, it is tempting to look at the referral source because you want to know about this new site sending traffic. Never go directly to these sites: they may try to sell you something or are simply trying to promote a page. If you are curious about a site sending you traffic, do an internet search first to see what other people say about it. Since this spam referral problem does not seem likely to go away any time soon, you will need to use filters and segments to remove this data from your account. Remember, a filter only affects your data moving forward, so you will also want to use a segment to clean up your historical data.
Filters – Include valid hostnames
Before you apply any filters, make sure you first create a new view of your data so you don’t permanently lose data you do want. If you are unclear on the difference between segments and filters, take a look at this summary article first. Spam is a tough problem to stay ahead of because new crawler spam appears constantly.
A valid hostname filter is one that will prevent future hits from ghost spam. To ensure you do not exclude good traffic, you first need a list of all valid hostnames for your site. You can find this under Audience > Technology > Network; make sure the primary dimension is set to Hostname. What you are looking for here is every place where you installed your Google Analytics tracking code, which would be the sites that you own. Treat any hostname you do not recognize as invalid.
From this point, create a filter that includes only the valid hostnames you discovered in the previous step. Make sure you verify this filter before you save it.
(An even more in-depth explanation of these filters is available from ohow.com, which is where I first learned of this solution.)
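To make the include filter concrete, here is a minimal sketch of how the filter pattern could be assembled and sanity-checked before you paste it into Google Analytics. The hostnames are hypothetical placeholders, and the check assumes the filter pattern is a regular expression that matches anywhere in the hostname, which is why Python's `re.search` is used to mimic it. Treat it as a dry run on your own machine, not as part of Google Analytics itself.

```python
import re


def build_hostname_pattern(hostnames):
    """Combine hostnames into one regex alternation with metacharacters escaped."""
    escaped = sorted(re.escape(name.strip().lower()) for name in hostnames)
    return "|".join(escaped)


def is_valid_hostname(hostname, pattern):
    """Return True when the pattern matches anywhere in the hostname (partial match)."""
    return re.search(pattern, hostname.strip().lower()) is not None


# Hypothetical hostnames; replace them with the ones from your own Hostname report.
VALID_HOSTNAMES = ["example.com", "blog.example.org"]
PATTERN = build_hostname_pattern(VALID_HOSTNAMES)

if __name__ == "__main__":
    # Dry run: see which hostnames the include filter would keep.
    for host in ["www.example.com", "blog.example.org", "darodar.com", "(not set)"]:
        print(host, is_valid_hostname(host, PATTERN))
```

Because the match is partial, a pattern built from `example.com` also keeps subdomains such as `www.example.com`. Escaping the dots matters: an unescaped `.` matches any character, which would loosen the filter.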
Filters – Exclude spam sources
With a campaign source filter, you can exclude all the known spam crawlers out there. Lone Goat had a running list of all the possible spambots, which unfortunately just grew too big. As of his last update in August of 2015, users needed 25 separate spam filters to exclude all the known sources! (Each filter can hold 255 characters, so needing 25 of them shows exactly how big this problem has become.)
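The 255-character limit is why one list of spam sources turns into many filters. As a rough sketch of that packing step, the function below greedily joins escaped source names with `|` and starts a new pattern whenever the next name would push the current one past the limit. The function name and the sample list are illustrative assumptions; only get-traffic-now and darodar come from the examples above.

```python
import re

MAX_PATTERN_LENGTH = 255  # per-filter character limit mentioned above


def pack_spam_patterns(sources, limit=MAX_PATTERN_LENGTH):
    """Greedily pack escaped source names into '|'-joined patterns within the limit."""
    patterns = []
    current = ""
    for source in sources:
        escaped = re.escape(source)
        if len(escaped) > limit:
            raise ValueError(f"source name is longer than {limit} characters: {source}")
        candidate = escaped if not current else current + "|" + escaped
        if len(candidate) <= limit:
            current = candidate
        else:
            # The current pattern is full; close it and start a new one.
            patterns.append(current)
            current = escaped
    if current:
        patterns.append(current)
    return patterns


if __name__ == "__main__":
    # Illustrative list; a real one would hold every known spam source.
    spam_sources = ["get-traffic-now", "darodar"]
    for number, pattern in enumerate(pack_spam_patterns(spam_sources), start=1):
        print(f"Spam filter {number}: {pattern}")
```

Each returned string would go into its own exclude filter on the Campaign Source field.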
He now refers people to a tool developed by Simo Ahava, who is an absolute genius in this space. With Simo’s tool, you “can automatically create and link referral spam filters to your Google Analytics profiles”. This is a huge time saver when you set up your filters to exclude these referrals moving forward. You will want to have one view of your Google Analytics data with filters to include the valid hostnames and exclude the spam sources.
New filters will clean up the data moving forward, but they do not fix historical data. With a segment, you have an immediate fix since you can instantly view your legitimate data with the spam removed. As with filters, you’ll include valid hostnames and exclude the sources of the crawler spam. However, instead of adding these as filters in the Admin section of your account, you will choose the option to Add Segment. This segment will include legitimate hostnames and exclude spam sources.
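If it helps to see the segment logic spelled out, the sketch below applies the same two conditions to a handful of rows, the way you might to data exported from a report. The row layout (`hostname`, `source`, `sessions`) and the sample values are assumptions made for illustration; they are not a Google Analytics export format.

```python
import re


def apply_spam_segment(rows, valid_hostname_pattern, spam_source_pattern):
    """Keep rows whose hostname is valid AND whose source is not a known spammer."""
    host_re = re.compile(valid_hostname_pattern, re.IGNORECASE)
    spam_re = re.compile(spam_source_pattern, re.IGNORECASE)
    return [
        row
        for row in rows
        if host_re.search(row["hostname"]) and not spam_re.search(row["source"])
    ]


if __name__ == "__main__":
    # Hypothetical exported rows.
    rows = [
        {"hostname": "www.example.com", "source": "google", "sessions": 120},
        {"hostname": "(not set)", "source": "darodar", "sessions": 95},
        {"hostname": "www.example.com", "source": "get-traffic-now", "sessions": 40},
    ]
    clean = apply_spam_segment(rows, r"example\.com", r"darodar|get-traffic-now")
    print(sum(row["sessions"] for row in clean), "legitimate sessions")
```

Both conditions are needed: the hostname check drops ghost spam that never touched your site, while the source check drops crawlers that did load your pages and so report a valid hostname.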
What is even better is that most of the work for a spam referral segment has already been done by other users and shared in the Solutions Gallery. The gallery holds user-generated segments for a number of things, and as of this writing, there’s a good one for spam referrals. The one I use in my Google Analytics accounts was created by Analytics Edge and last updated in August of this year. Once you import it into your account, you will want to modify it to include your hostnames. You may also need to add whichever spammers have popped up since August, but at least most of the work has been done already.
Both segments and filters are an absolute must for anyone reporting out on Google Analytics data. On some sites, I’ve seen traffic reduced by almost half once these sites are removed. If you do not use segments and filters, your reports will not provide an accurate picture of your marketing efforts.