Hey, I launched my site two days ago and connected it to Google Search Console. However, the sitemap verification didn't go through (HTTP error). I specifically allowed it in the robots.txt and double-checked the head code, plus confirmed that all pages are included.
Eureka! I didn't put the https://www.yourdomain.com domain in the property, and there was a cached version that prevented it from running. Thank you so much for your help!!!
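For anyone hitting the same thing: a quick way to spot this is to check that the sitemap actually resolves on the exact host you added as the property (www vs non-www matters). A minimal Python sketch, assuming the sitemap lives at /sitemap.xml (swap in your own domain):

from urllib.request import Request, urlopen

# Hypothetical URLs; use the host that matches your Search Console property.
for url in ("https://www.yourdomain.com/sitemap.xml",
            "https://yourdomain.com/sitemap.xml"):
    try:
        with urlopen(Request(url, method="HEAD")) as resp:
            print(url, resp.status)
    except Exception as exc:  # 4xx/5xx responses and connection errors land here
        print(url, "error:", exc)

If one of the two hosts errors out or redirects oddly, that usually lines up with the HTTP error Search Console reports.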
Yes, Googlebot can be blocked using a robots.txt file. If you use this file on your site, you can decide which parts of your site can be crawled and which cannot.
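If you want to check how a rule will affect Googlebot before you publish it, Python's built-in robotparser gives a quick answer. A rough sketch, with made-up paths:

from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
# Hypothetical rules: block /admin/ for every crawler, leave the rest open.
rp.parse([
    "User-agent: *",
    "Disallow: /admin/",
])
print(rp.can_fetch("Googlebot", "https://www.yourdomain.com/admin/login"))  # False
print(rp.can_fetch("Googlebot", "https://www.yourdomain.com/blog/post"))    # True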
I thought I would share, as this happened to us, and this is how to resolve it.
When hosting on Webflow, your robots.txt starts out blank. We entered this to stop indexing whilst the site was being built:
User-agent: *
Disallow: /
We then just removed the text/content and republished, but the site was still being blocked. So we entered this to allow the crawl and republished, but it was still being blocked:
User-agent: *
Disallow:
Once you put any robots.txt in and publish, you can't just remove the text/code; you have to have the correct format in place. The problem is that Google can take up to 48 hours to pick up the change.
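Before forcing a recrawl, it's worth confirming what robots.txt your live site is actually serving. A quick sketch, assuming the standard /robots.txt location (substitute the site you just republished):

from urllib.request import urlopen

# Hypothetical domain; this just prints whatever the live site serves.
with urlopen("https://www.yourdomain.com/robots.txt") as resp:
    print(resp.read().decode("utf-8"))

If the output still shows "Disallow: /", the republish hasn't taken effect yet and re-indexing won't help.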
All you have to do is click the link below, force it to re-index with the correct robots.txt, and then re-submit to Search Console and all will be well!