How Google crawls my site

    Using a robots.txt file
    • How do I request that Google not crawl parts or all of my site?
    • How do I use a robots.txt file to control access to my site?
    • How do I create a robots.txt file?
    • Where do I place my robots.txt file?
    • How do I block Googlebot?
    • I don't want to list every file that I want to block. Can I use pattern matching?
    • How do I test my robots.txt file?
    • If I change my robots.txt file or upload a new one, how soon will it take effect?
    • I don't want certain pages of my site to be indexed, but I want to show AdSense ads on those pages. Can I do that?

    Googlebot
    • Why doesn't Google index all of the pages of my site?
    • How often will Googlebot access my web pages?
    • Googlebot is crawling my site too fast. What can I do?
    • Why is Googlebot asking for a file called robots.txt that isn't on my server?
    • Why is Googlebot trying to download incorrect links from my server, or from a server that doesn't exist?
    • Why is Googlebot downloading information from our "secret" web server?
    • Why isn't Googlebot obeying my robots.txt file?
    • Why are there hits from multiple machines at Google.com, all with user-agent Googlebot?
    • Can you tell me the IP addresses from which Googlebot crawls so that I can filter my logs?
    • Why don't the pages of my site that Googlebot crawled show up in your index?
    • What kinds of links does Googlebot follow?
    • How do I prevent Googlebot from following links on my pages?
    • How do I tell Googlebot not to crawl a single outgoing link on a page?
    • Why is Googlebot downloading the same page on my site multiple times?
    • What is Feedfetcher, and why is it ignoring my robots.txt file?
    • What can I do if Google is creating too high a load on my server?
    • Why did my firewall report unauthorized access from Google?

    Feedfetcher
    • How do I add my feed to the search results for Google's personalized homepage or Google Reader?
    • How do I request that Google not retrieve some or all of my site's feeds?
    • How often will Feedfetcher retrieve my feeds?
    • Feedfetcher is retrieving my site's feeds too frequently. What can I do?
    • Why is Feedfetcher trying to download incorrect links from my server, or from a server that doesn't exist?
    • Why is Feedfetcher downloading information from our "secret" web server?
    • Why isn't Feedfetcher obeying my robots.txt file?
    • Why are there hits from multiple machines at Google.com, all with user-agent Feedfetcher?
    • Can you tell me the IP addresses from which Feedfetcher makes requests so that I can filter my logs?
    • Why is Feedfetcher downloading the same page on my site multiple times?
    • Why don't the feeds from my site that Feedfetcher requested show up in your index?
    • What kinds of links does Feedfetcher follow?
 

