Googlebot blocked (by robots.txt)

Constantin · March 2, 2020, 6:15pm

Hey, I launched my site 2 days ago and connected it to Google Search Console. However, the sitemap verification didn’t go through (HTTP error). I specifically allowed it in the robots.txt and double-checked the head code and that all pages are included.

https://www.keinemaklerei.at/robots.txt
https://www.keinemaklerei.at/sitemap.xml

It looks like the site is blocked for bots, but I cannot explain why. I would be really grateful if someone could help me figure this out.

Thank you so much, cheers, Constantin

JanneWassberg · March 2, 2020, 6:29pm

Your robots.txt is wrong. Try:

User-agent: *
Disallow:

Constantin · March 2, 2020, 6:40pm

Hi Janne, are you sure? Am I getting this wrong?

https://www.robotstxt.org/robotstxt.html

JanneWassberg · March 2, 2020, 6:42pm

I’m very sure, but do not add a / after Disallow:
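
For anyone who wants to see the difference locally, here is a minimal sketch using Python's standard `urllib.robotparser`. It uses Python's parser, not Googlebot's own, and the `example.com` URL and the `allowed` helper are placeholders made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# An empty Disallow value blocks nothing.
ALLOW_ALL = "User-agent: *\nDisallow:"
# A single slash blocks every path on the site.
BLOCK_ALL = "User-agent: *\nDisallow: /"

page = "https://www.example.com/some-page"  # placeholder URL

print(allowed(ALLOW_ALL, page))  # True
print(allowed(BLOCK_ALL, page))  # False
```

So the only difference between "crawl everything" and "crawl nothing" is that one slash.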

Constantin · March 2, 2020, 6:59pm

Ahh, thank you. I misunderstood that completely. I updated it and checked with the following URL, but still no luck :-/ Any ideas why?

https://search.google.com/test/mobile-friendly

JanneWassberg · March 2, 2020, 7:02pm

That is a mobile test. Go to Google Search Console and add your URL.

Constantin · March 2, 2020, 7:16pm

I did that too. In Search Console, I deleted the old sitemap, where the HTTP error shows. Then I added it again, but the same error shows up.

I did the live test in Google Search Console, and it states that the URL is not available to Google.

Screenshot: “URL not available”
Screenshot: “Crawling not allowed”

JanneWassberg · March 2, 2020, 7:23pm

Did you republish in Webflow after your change?

Constantin · March 2, 2020, 7:30pm

Yes, the new robots.txt was published before…

JanneWassberg · March 2, 2020, 7:34pm

Have you verified your site for Google Search Console?

Constantin · March 2, 2020, 7:42pm

Yes, I did that 3 days ago. The test states that the robots.txt is the problem, but I had already tried various things before this.

JanneWassberg · March 2, 2020, 7:46pm

I always register the sitemap.xml for each variant of the domain:
http://www.domain.com/sitemap.xml
https://www.domain.com/sitemap.xml
http://domain.com/sitemap.xml
https://domain.com/sitemap.xml
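
If you have several sites, the four variants can be generated rather than typed. This is only a rough sketch: `sitemap_variants` is a made-up helper name, `domain.com` is a placeholder, and it assumes the sitemap lives at `/sitemap.xml`.

```python
from typing import List


def sitemap_variants(domain: str, path: str = "/sitemap.xml") -> List[str]:
    """List the http/https and www/non-www sitemap URLs for one domain."""
    # Accept either "domain.com" or "www.domain.com" as input.
    bare = domain[4:] if domain.startswith("www.") else domain
    hosts = ["www." + bare, bare]
    return [
        f"{scheme}://{host}{path}"
        for host in hosts
        for scheme in ("http", "https")
    ]


for url in sitemap_variants("domain.com"):
    print(url)
```

Each printed URL belongs to a different protocol and host combination, which is why they get registered separately.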

Have you checked that the robots.txt is generated correctly?

Edit:
Check that the sitemap is turned on in Webflow.
Also, a blank robots.txt allows crawling.

Constantin · March 2, 2020, 8:57pm

Thank you for your help! I blanked the robots.txt now; it was correct before. The auto sitemap is on.

Constantin · March 2, 2020, 10:21pm

Heureka! I hadn’t put the https://www.yourdomain.com domain in the property, and there was a cached version that prevented it from running. Thank you so much for your help!!!
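
To illustrate why the exact property URL matters: assuming the property is a URL-prefix property, it only covers URLs that share its protocol and host. The sketch below is a simplified model of that rule, not Search Console's actual logic, and `covered_by_property` is a hypothetical helper.

```python
from urllib.parse import urlsplit


def covered_by_property(property_url: str, page_url: str) -> bool:
    """Rough check: does `page_url` fall under a URL-prefix property?"""
    prop, page = urlsplit(property_url), urlsplit(page_url)
    same_origin = (prop.scheme, prop.netloc.lower()) == (
        page.scheme,
        page.netloc.lower(),
    )
    # Treat an empty path as the site root.
    return same_origin and (page.path or "/").startswith(prop.path or "/")


prop = "https://www.yourdomain.com"
print(covered_by_property(prop, "https://www.yourdomain.com/sitemap.xml"))  # True
print(covered_by_property(prop, "https://yourdomain.com/sitemap.xml"))      # False
print(covered_by_property(prop, "http://www.yourdomain.com/sitemap.xml"))   # False
```

A sitemap submitted under the non-www or http variant would fall outside the https://www property in this model.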

bracknelson · March 3, 2020, 10:32am

Yes, Googlebot can be blocked using the robots.txt file. If you use this file on your site, you can decide which parts of your site can be scanned and which cannot.

Nosher · March 3, 2020, 8:02pm

Hi Guys,

I thought I would share, as this happened to us, and this is how we resolved it.

When hosting on Webflow, your robots.txt is blank. We entered this to stop indexing whilst the site was being built:

User-agent: *
Disallow: /

We then just removed the content and republished, but the site was still being blocked. So we entered this to allow crawling and republished, but it was still being blocked:

User-agent: *
Disallow:

If you put any robots.txt in and publish, you can’t just remove the text afterwards; you have to have the correct format in place. The problem is that Google can take up to 48 hours to correct the issue.

All you have to do is click the link below and force it to re-index with the correct robots.txt, then resubmit to Search Console, and all will be well!

https://www.google.com/webmasters/tools/robots-testing-tool?pli=1
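
Before forcing a refresh, it can help to confirm what the published site is actually serving, since Google may still hold an older copy. Here is a minimal sketch with Python's standard library. The helper names are made up, and the check uses Python's `urllib.robotparser`, which only approximates how Googlebot reads the file.

```python
import urllib.request
from urllib.robotparser import RobotFileParser


def fetch_robots_txt(site: str) -> str:
    """Download the robots.txt that `site` is serving right now."""
    url = site.rstrip("/") + "/robots.txt"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def googlebot_allowed(robots_txt: str, url: str) -> bool:
    """Return True if the given robots.txt text lets Googlebot fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)


def report(site: str) -> None:
    """Print the live robots.txt and whether it blocks the home page."""
    live = fetch_robots_txt(site)
    print(live or "(blank robots.txt)")
    print("Googlebot allowed:", googlebot_allowed(live, site.rstrip("/") + "/"))
```

Calling `report("https://www.yourdomain.com")` prints the live file and the verdict. If this says the page is allowed while Search Console still reports a block, the cached copy on Google's side is the likely culprit.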

Hope that helps

Benjamin4 · January 26, 2023, 8:20pm

Amazing
THANKS A LOT
