ROBOT IMAGE CREDIT TO PNG TREE.
In July 2020, I installed Hotjar on my website. This tool lets me record the screens of my site visitors so I can see how people navigate my site. If I can see their behavior, I know what I need to improve.
Then I noticed something bad.
A robot from India was crawling my site, and it was ruining my Google Analytics stats.
At first, I thought I should not pay much attention.
After a few days, I observed that the robot was still crawling my site. This was bad bot traffic.
It took me a while to figure out what to do. When I did my research online, I followed many recommendations, but none of them worked.
As someone who is not a techie, I was desperate. But I fixed the problem, and this is what I will share with you today.
It is important to note that we may not use the same software, so my exact steps may not apply to you. Still, I am betting my bottom dollar that the solution I used will work for you if you operate a self-hosted website.
How I found out that it was a bad robot
Take a look at this:
My screenshot only shows two recordings because I had already deleted the rest by the time I thought about sharing this experience.
But in total?
This robot crawled the SAME PAGE 106 times!
Take a look at this screenshot from my Google Analytics:
I knew it was a robot because when I watched the recordings, the screen was barely moving. A person would have been scrolling.
On other occasions, the robot was scrolling too fast. The screen also kept scrolling down even when there was a pop-up. No human does that.
I also noticed that many of the recordings lasted for less than a minute. This blog post is long, and anybody who reads it would take longer than a minute to do so.
So, just to recap, it is a robot because:
- It does not scroll
- It scrolls too fast
- It scrolls even if there is a pop-up
- It visited the same page more than 100 times in less than a month
- It stayed on the page for less than a minute
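The checklist above can be written down as a small rule-based check. This is only a sketch: the `Recording` fields and the thresholds (5 scrolls per second, 20 repeat visits) are made up for illustration and are not part of Hotjar or Google Analytics.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Recording:
    """One session recording. Field names are hypothetical, not Hotjar's schema."""
    duration_seconds: float
    scroll_events: int
    scrolled_during_popup: bool
    visits_to_same_page: int


def bot_signals(rec: Recording,
                max_scrolls_per_second: float = 5.0,
                max_repeat_visits: int = 20) -> list[str]:
    """Return the bot-like signals from the checklist that this recording shows."""
    signals = []
    if rec.scroll_events == 0:
        signals.append("no scrolling")
    elif (rec.duration_seconds > 0
          and rec.scroll_events / rec.duration_seconds > max_scrolls_per_second):
        signals.append("scrolls too fast")
    if rec.scrolled_during_popup:
        signals.append("scrolls through pop-up")
    if rec.duration_seconds < 60:
        signals.append("under one minute")
    if rec.visits_to_same_page > max_repeat_visits:
        signals.append("same page repeated")
    return signals


# A recording like the ones described above: short, no scrolling, same page over and over.
suspect = Recording(duration_seconds=9, scroll_events=0,
                    scrolled_during_popup=False, visits_to_same_page=106)
print(bot_signals(suspect))
```

One signal alone proves little, since a real person can also leave in under a minute. It is the combination that made me sure.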
So, who was doing it and why?
I do not know. What I do know is that it came from India, as the flag in the recording indicated. Why was it doing this? Maybe it was scraping content. Or maybe it came from a malicious competitor who wanted to ruin my stats.
So, how do you stop a bot attack on a website?
Let us move on.
Why is this robot bad?
Now, why should you get worried if this happens to you?
Here are the reasons why you should pay attention:
- Increased bounce rate – because of this mother***er, my bounce rate increased from 80% during July 5 to 11 to 91% by July 25. A high bounce rate is bad for a blogger like me. It tells search engines that I am not providing value, because people who visit my site do not explore my other pages. If this happens, my pages will not get ranked.
- Skewed data – if I had not found out about this, I would have believed that my traffic had increased by 270%. Any blogger or e-commerce operator would think the same. When you see your traffic increasing, you assume your action plans are working. In reality, they are not: a robot is causing your traffic to spike, not your efforts.
- Average session duration – mine is f***ed. Google loves it when people stay on your page for a while. Your average session duration tells Google that people are staying because you are providing value, and Google is then more likely to show your pages to more users. As you know, more impressions mean more chances to get clicks.
See what this bastard did to my average session duration:
It reduced my average session duration from two minutes to nine seconds!
What is Google going to think of me now, that I am not providing value?!
I was furious! I really needed to know how to stop bots from crawling my site!
- Pages per session – this robot visited only one page, 106 times. The page it visited was my blog post where I compared Ecwid with Shopify. Because of this, my pages per session stat was also affected. Normally, you want your site visitors to view more than one page per session. This tells Google that your visitors loved your content so much that they wanted to read more.
And what did this robot do?
It dragged my pages per session from 2.67 to 1.21! Now, I need to do more work just to get that back. According to Spinutech, the ideal pages per session is 2, which I had already achieved.
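To see why single-page bot visits pull the number down so hard, here is a small sketch of the arithmetic. It assumes the reported figure is a plain average over all sessions and that every bot session views exactly one page. The count of 100 human sessions is made up for illustration; it is not from my Google Analytics.

```python
def blended_average(human_sessions: int, human_avg: float,
                    bot_sessions: int, bot_avg: float) -> float:
    """Weighted average of a per-session metric across human and bot sessions."""
    total_sessions = human_sessions + bot_sessions
    if total_sessions == 0:
        raise ValueError("need at least one session")
    return (human_sessions * human_avg + bot_sessions * bot_avg) / total_sessions


# Hypothetical mix: 100 human sessions at 2.67 pages each,
# plus 106 bot sessions that each view a single page.
print(round(blended_average(100, 2.67, 106, 1.0), 2))
```

The more bot sessions there are compared with real ones, the closer the average gets to 1. The same arithmetic applies to average session duration.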
As you can see, this bot was wreaking havoc on my website. If I did not learn how to stop spam bots on my website, my blog would not grow.
I needed to do something.
How to block robots from a website: the most common expert recommendations
Since I am no expert on this, I did my fair share of research. Below are the things I read about. I followed them, and they did not work.
- Install reCAPTCHA – what I specifically installed is reCAPTCHA v3. It is verification software from Google. The older v2 asks a user to check a box to confirm that they are not a robot, or to "select images that have a bus," or things like that. Version 3 is supposed to go further: it runs in the background and scores each visit without interrupting the user.
It did not work. A day after I installed reCAPTCHA, the robot from India was still at it.
- Plug-ins to block a country – the next thing I did was to install plug-ins that were supposed to block all users coming from a country of your choice. I tried the following:
- iQ Block Country
- IP2Location Country Blocker
- Antispam Bee
None of them worked. The problem with most of these plug-ins is that they depend on a database of IP addresses. That database is not free. Wordfence and MalCare have their own databases, but you have to pay for them, too.
- Cloudflare – since I am using SiteGround to host my website, I have access to free Cloudflare, but this version only supports CDN and caching to make the website load faster. I created an account on Cloudflare and reached out to customer service.
And guess what the solution is?
Pay $200 a month to protect my website.
I cannot afford that.
- Block IP – I contacted SiteGround support, and I was told that I can block IP addresses and IP ranges. Great!
Here is how you can block the IP address from your SiteGround tools:
Here is the problem: India does not have one IP address. It has millions! Also, when I looked up IP ranges for India, I found about 485 of them.
So, am I to input all IP addresses just so I can block this robot?
Here is the other problem: Hotjar and Google Analytics do not tell you the IP address of a visitor. So, I could not block the specific IP address of the robot.
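For what it is worth, checking whether one address falls inside a list of ranges is mechanical once you have the address. Here is a sketch using Python's standard `ipaddress` module. The ranges below are reserved documentation ranges used as placeholders, not real ranges for India or any other country.

```python
from __future__ import annotations

import ipaddress


def in_any_range(address: str, cidr_ranges: list[str]) -> bool:
    """Return True if the address falls inside at least one CIDR range."""
    ip = ipaddress.ip_address(address)
    networks = [ipaddress.ip_network(cidr) for cidr in cidr_ranges]
    return any(ip in network for network in networks)


# Placeholder ranges reserved for documentation, standing in for a real country list.
placeholder_ranges = ["192.0.2.0/24", "198.51.100.0/24"]

print(in_any_range("192.0.2.15", placeholder_ranges))   # inside the first range
print(in_any_range("203.0.113.7", placeholder_ranges))  # outside both ranges
```

The hard part is not the check. It is getting a complete, current list of ranges and, above all, knowing the visitor's IP address in the first place.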
There was also a recommendation to change the .htaccess file. Specifically, the instruction was to write code in .htaccess to block all bots except Google.
But I do not know how to do that.
So, back to square one.
I decided to unpublish that post.
And guess what.
The following day, the robot accessed the same page several times over!
Now, I can choose to redirect users to another page if that web page was accessed, but it is not going to change a thing. My stats will still suffer. So, I decided to re-publish the post.
I was desperate.
How to block robots from website: my solution
I kept on reading, and I found out that web hosts keep a log of the IP addresses of your site visitors. This time, I did not contact customer support. I decided to do it guerrilla style.
I found out that on SiteGround, there is a log of IP addresses of your users!
Take a look at this screenshot:
This log even shows what page the IP address accessed.
The first set of digits you see is the IP address. In this case, it is 18.104.22.168.
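If you download the log, you can also count the hits instead of reading it line by line. The sketch below assumes each line starts like the common Apache access log format (IP address, two fields, a timestamp in brackets, then the request in quotes). Your host's log may be laid out differently. The sample lines, the second address, and the page paths are made up for illustration.

```python
import re
from collections import Counter

# Start of an Apache-style access log line:
# IP, two fields, [timestamp], then "METHOD /path ..."
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<method>[A-Z]+) (?P<path>\S+)'
)


def hits_per_ip_and_page(lines):
    """Count how many times each IP address requested each page."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:  # skip lines that do not look like access log entries
            counts[(match["ip"], match["path"])] += 1
    return counts


# Made-up sample lines; the path is a hypothetical URL for the affected post.
sample_lines = [
    '18.104.22.168 - - [25/Jul/2020:10:00:01 +0000] "GET /ecwid-vs-shopify/ HTTP/1.1" 200 5120',
    '18.104.22.168 - - [25/Jul/2020:10:03:44 +0000] "GET /ecwid-vs-shopify/ HTTP/1.1" 200 5120',
    '192.0.2.15 - - [25/Jul/2020:10:05:12 +0000] "GET /about/ HTTP/1.1" 200 2048',
]

for (ip, path), hits in hits_per_ip_and_page(sample_lines).most_common():
    print(ip, path, hits)
```

An address that requests the same page far more often than anyone else is the one worth looking up.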
My next step was to confirm if this is really from India, so I went to https://whatismyipaddress.com/ and then typed that number.
Here is what it looks like:
If you use this, you have to click on GET IP DETAILS to get more data.
I did this, and here is what I got.
As you can see, the COUNTRY of this IP is INDIA.
Now, I have to check if this IP is blacklisted anywhere, so I clicked the orange button, and here is what I got:
Those that are green tagged this IP address as OK, or safe. As you can see, they are all dubious websites. Those that are marked with an exclamation point reported this IP address as a malicious IP address.
If you scroll down, you will see more results, but I will not show them here.
Once I had confirmed this IP address, I went back to my SiteGround tools > Security > Blocked IP addresses.
I typed this IP address and hit the block button.
From then on, this robot could no longer access my page. This is why you can only see two recordings of that robot. I took the screenshot BEFORE I blocked the IP address.
This is why I love SiteGround!
SiteGround is a web hosting provider. You pay as little as $7 per month to have unlimited websites. I love the support and the tools offered, which are easy to use even for non-technical people like me.
I wrote a blog about my experience with this provider, and I want you to read that.
You can read it here: SiteGround Review: My Experience with SiteGround
If you already have a website, I suggest you transfer to SiteGround. You can migrate one website for free. If you want to migrate several websites, the succeeding migrations are paid.
I strongly doubt you can do what I just did if you are on a hosted platform. I will not name names, but platforms that are not self-hosted generally do not give you access to any of this stuff.
Are there good robots that crawl websites?
Yes, there are good robots. Google's crawler is a robot, and you do not want to block it. Without good robots, search engines will never know the content of your website.
Here is a list of the top robots:
- Yandex Bot
- Soso Spider
- Google Feedfetcher
- ExaBot
- Baidu Spider
- Sogou Spider
- Google Plus Share
- MSN Bot/BingBot
- Facebook External Hit
Most of these robots crawl your web pages because they are search bots. Others, like Facebook External Hit, fetch a page when someone shares its link.
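Good robots usually identify themselves in the user agent string they send with each request. Here is a rough sketch of checking a user agent against a short list of well-known crawler names. The token list is partial, and matching on the user agent alone is an assumption that can fail: a bad bot can copy a good bot's name, so treat a match as a hint, not proof.

```python
# Lowercase tokens that well-known crawlers include in their user agent strings.
# This list is a partial illustration, not a complete registry.
KNOWN_CRAWLER_TOKENS = (
    "googlebot",
    "bingbot",
    "yandexbot",
    "baiduspider",
    "facebookexternalhit",
)


def looks_like_known_crawler(user_agent: str) -> bool:
    """Return True if the user agent names one of the crawlers listed above."""
    lowered = user_agent.lower()
    return any(token in lowered for token in KNOWN_CRAWLER_TOKENS)


googlebot_ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
browser_ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"

print(looks_like_known_crawler(googlebot_ua))
print(looks_like_known_crawler(browser_ua))
```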
Bad bots, on the other hand, can attack you and destroy your website. Experts say that about 66% of robots are malicious robots. They are either scraping your content, or they are trying to steal your passwords.
Some are even used by companies to make sure that they ruin your stats. In some cases, bad robots are there to click on ads—the more they click your ads, the faster your ad budget gets depleted.
Summary: How to Block Robots from Website
If not for Hotjar, I wouldn't have known this was happening. You see, I spent several months creating content, and now that the third trimester of my blogging is over, I have produced more than a hundred blog posts.
My next goal is to do some serious optimization, and that includes observing site visitors' behavior and seeing what I can improve on a page.
For example, I saw that people were trying to click on some of my images, but these images had no links. So, now I am going to add links to these photos to give a better user experience.