I looked at Google Search Console today and it shows almost 20,000 more crawled pages on my site that I didn’t create and that are not in my CMS. All of these new pages have long Chinese slugs and lead to a 404. Wondering if anyone else has experienced this. My site is completely hosted on Webflow.
Rest assured that this doesn’t necessarily indicate that your site was hacked. The fact that the requested URLs are returning 404s indicates that the pages do not exist on your site.
While I’m by no means an SEO expert, my understanding is that this just means a random site somewhere has created these links to your site on one of its own pages, which Google has then crawled. It doesn’t necessarily harm your site.
This is a fairly common occurrence (especially on high-traffic sites), and Google have some great insight into this over on their Search Central Support hub:
Q: Most of my 404 errors are for bizarro URLs that never existed on my site. What’s up with that? Where did they come from?
A: If Google finds a link somewhere on the web that points to a URL on your domain, it may try to crawl that link, whether any content actually exists there or not; and when it does, your server should return a 404 [...] If you see 404 errors reported in Webmaster Tools for URLs that don’t exist on your site, you can safely ignore them. We don’t know which URLs are important to you vs. which are supposed to 404, so we show you all the 404 errors we found on your site and let you decide which, if any, require your attention.
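If it helps with deciding which 404s need attention, here is a rough Python sketch for sorting an exported list of 404 URLs into "probably junk" and "worth a look". It assumes you already have the URLs as plain strings (for example, copied out of a Search Console export), and it treats any path containing Chinese characters as a junk link, which only makes sense if your real pages never use such slugs. The function names `has_cjk_slug` and `split_404s` are made up for this example, and the character range is a simple heuristic, not anything Google or Webflow provides.

```python
from urllib.parse import urlsplit, unquote


def has_cjk_slug(url: str) -> bool:
    """Return True if the URL path contains CJK ideographs.

    The path is percent-decoded first, so both raw and encoded slugs match.
    Heuristic assumption: only the basic CJK block (U+4E00 to U+9FFF) is checked.
    """
    path = unquote(urlsplit(url).path)
    return any("\u4e00" <= ch <= "\u9fff" for ch in path)


def split_404s(urls):
    """Split 404 URLs into (likely_junk, needs_review) lists."""
    likely_junk, needs_review = [], []
    for url in urls:
        if has_cjk_slug(url):
            likely_junk.append(url)
        else:
            needs_review.append(url)
    return likely_junk, needs_review


if __name__ == "__main__":
    # Hypothetical sample data; replace with your own exported URLs.
    sample = [
        "https://example.com/%E4%BD%A0%E5%A5%BD-%E4%B8%96%E7%95%8C",
        "https://example.com/blog/old-post",
    ]
    junk, review = split_404s(sample)
    print(len(junk), "likely junk;", len(review), "to review")
```

Anything that lands in the second list is a 404 for a normal-looking URL, so those are the ones to check for broken internal links or missing redirects.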
Thanks Josh! That’s good to know. Do you know of a way to block these pages from being crawled and associated with my site through Google? I only ask because these 20,000 extra pages are drastically skewing my metrics.