
Robots.txt and not otherwise be

Posted on 2024-3-7 12:56:45
Search engines crawl the web: on each page, they follow the links they find, which lets them move from one site to another. But before browsing a web page, search engine robots such as Google's read the robots.txt file to find out whether they are authorized to visit it.

Where is the robots.txt file?

The robots.txt file must be located at the root of your website. For example, for the Web Horspiste site (https://www.webhorspiste.com), it is located here: https://www.webhorspiste.com/robots.txt. The file must be named robots.txt. Careful: the name is case sensitive, so do not put capital letters in it. No Robots.txt, ROBOTS.TXT, or anything else. If a crawler does not find a robots.txt file at the root of your site, it assumes the site does not have one and proceeds to crawl it.

The advantages and disadvantages of the robots.txt file

Advantages

Crawling robots (spiders) have a predefined allocation of time or resources for each website; this is called the crawl budget. If your site has crawl budget problems, you can tell search engines not to explore unimportant pages, such as the login page, the thank-you page, or the shopping cart of an e-commerce site, so the budget is not wasted on them. The file can also keep your site's internal search results pages from being crawled, and usually from being indexed or appearing in search results. It can also prevent duplicate content from being detected by Google.

Disadvantages

The robots.txt file cannot prevent indexing. In fact, Google can index a page without having crawled it: if a significant number of links point to the URL, Google will include it in the index and simply ignore the page's content. It builds a title from the link anchors and shows a placeholder in place of the meta description.

Disallow in robots.txt

If you want to block the indexing of a page, prefer the robots meta tag and remove the Disallow rule, because otherwise Google cannot crawl the page and so cannot know that it is marked noindex.

To block all crawlers: <meta name="robots" content="noindex">
To block Googlebot: <meta name="googlebot" content=...
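
To make the robots.txt points above concrete, here is a minimal sketch in Python using only the standard library. It derives the root-level robots.txt URL for a page and checks a few paths against a set of Disallow rules. The rules in EXAMPLE_ROBOTS_TXT and the paths (/login, /cart, /search, /blog/) are made up for illustration and are not the real rules of any site. Note also that Python's parser does simple prefix matching, so it may not agree with Google's own matcher in every case.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser

# Hypothetical rules for an e-commerce site; the paths are illustrative only.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /login
Disallow: /cart
Disallow: /search
"""


def robots_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the site hosting page_url."""
    parts = urlsplit(page_url)
    # The file name is lowercase and always sits at the root of the host.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content from a string, without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


if __name__ == "__main__":
    site = "https://www.webhorspiste.com"
    print(robots_url(site + "/blog/some-article"))

    parser = build_parser(EXAMPLE_ROBOTS_TXT)
    for path in ("/", "/login", "/cart", "/search?q=seo", "/blog/"):
        # can_fetch() answers: may this user agent crawl this URL?
        print(path, parser.can_fetch("Googlebot", site + path))
```

Keep in mind that this only models crawling permission. As the post says, a disallowed URL can still end up in the index if enough links point to it.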

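For the meta tag approach, a rough way to verify that a page really carries a noindex directive is to parse its HTML and look at the robots and googlebot meta tags. The sketch below uses Python's standard html.parser. The class and function names (RobotsMetaParser, is_noindex) are my own, and the googlebot example assumes a noindex value, since the tag in the post is cut off.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        # Maps the meta name ("robots" or "googlebot") to a set of directives.
        self.directives = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so normalise them.
        attributes = {key.lower(): (value or "") for key, value in attrs}
        name = attributes.get("name", "").strip().lower()
        if name in ("robots", "googlebot"):
            content = attributes.get("content", "")
            values = {v.strip().lower() for v in content.split(",") if v.strip()}
            self.directives.setdefault(name, set()).update(values)


def is_noindex(html: str, crawler: str = "robots") -> bool:
    """Return True if the page asks the given crawler not to index it.

    The generic "robots" tag applies to every crawler; a crawler-specific
    tag such as "googlebot" applies only when that crawler is named.
    """
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    generic = parser.directives.get("robots", set())
    specific = parser.directives.get(crawler.lower(), set())
    return "noindex" in generic or "noindex" in specific


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex"></head></html>'
    print(is_noindex(page))
```

A check like this only makes sense for pages that are not disallowed in robots.txt: a crawler has to fetch the page before it can see the tag.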