Notes on the robots.txt blocking rules used on my own site, with some code and examples.

What gets blocked, and why:

Admin directory: for security, the admin area is nested two levels deep as /a/xxxx/ and spiders are blocked from /a/. This keeps the real admin path out of robots.txt while still stopping spiders from crawling it.
Cache: stop spiders from crawling static cache files.
Downloads: stop spiders from crawling the download directory (delete the directory if it is unused).
Editors: stop spiders from crawling the editor directory, which also keeps it from being discovered and becoming a security risk.
Email: stop spiders from crawling static email templates.
Other pages: block pages that have no indexing value.
Images: stop spiders from crawling any image type other than JPG/jpg.
Core file directories: stop spiders from crawling include and its subdirectories directly (functions, class libraries, models, templates, and so on).
Media directory: stop spiders from crawling the playable media directory (delete the directory if it is unused).
Pages with extra parameters: stop spiders from crawling URLs that carry parameters.
RAR, ZIP, and GZ file types.
Useless and malicious spiders.
The location of sitemap.xml.

Directory blocking

User-agent: *
Disallow: /a/
Disallow: /cache/
Disallow: /download/
Disallow: /editors/
Disallow: /email/
Disallow: /extras/
Disallow: /images/
Disallow: /includes/
Disallow: /media/
Disallow: /pub/
Disallow: /nddbc.html
Disallow: /page_not_found.php
Disallow: /login.html
Disallow: /privacy.html
Disallow: /conditions.html
Disallow: /contact_us.html
Disallow: /gv_faq.html
Disallow: /discount_coupon.html
Disallow: /unsubscribe.html
Disallow: /shopping_cart.html
Disallow: /ask_a_question.html
Disallow: /popup_image_additional.html
Disallow: /product_reviews_write.html
Disallow: /tell_a_friend.html
Disallow: /pages-popup_image.html

Block spiders from non-jpg images (product images are restricted to the jpg format)

A path pattern has to begin with "/", so the extension rules are written in the form /*.ext$.

User-agent: Googlebot
Allow: /*.jpg$
Disallow: /*.jpeg$
Disallow: /*.gif$
Disallow: /*.png$
Disallow: /*.bmp$

Block spiders from archive files

User-agent: *
Disallow: /*.zip$
Disallow: /*.rar$
Disallow: /*.gz$
Disallow: /*.tar$

Specify the sitemap address

Sitemap: http://www.xxx.jp/sitemap.xml

Block other useless and malicious spiders

User-Agent: almaden
Disallow: /
User-Agent: ASPSeek
Disallow: /
User-Agent: Axmo
Disallow: /
User-Agent: BaiduSpider
Disallow: /
User-Agent: booch
Disallow: /
User-Agent: DTS Agent
Disallow: /
User-Agent: Downloader
Disallow: /
User-Agent: EmailCollector
Disallow: /
User-Agent: EmailSiphon
Disallow: /
User-Agent: EmailWolf
Disallow: /
User-Agent: Expired Domain Sleuth
Disallow: /
User-Agent: Franklin Locator
Disallow: /
User-Agent: Gaisbot
Disallow: /
User-Agent: grub
Disallow: /
User-Agent: HughCrawler
Disallow: /
User-Agent: iaea.org
Disallow: /
User-Agent: lcabotAccept
Disallow: /
User-Agent: IconSurf
Disallow: /
User-Agent: Iltrovatore-Setaccio
Disallow: /
User-Agent: Indy Library
Disallow: /
User-Agent: IUPUI
Disallow: /
User-Agent: Kittiecentral
Disallow: /
User-Agent: larbin
Disallow: /
User-Agent: lwp-trivial
Disallow: /
User-Agent: MetaTagRobot
Disallow: /
User-Agent: Missigua Locator
Disallow: /
User-Agent: NetResearchServer
Disallow: /
User-Agent: NextGenSearch
Disallow: /
User-Agent: NPbot
Disallow: /
User-Agent: Nutch
Disallow: /
User-Agent: ObjectsSearch
Disallow: /
User-Agent: Oracle Ultra Search
Disallow: /
User-Agent: PEERbot
Disallow: /
User-Agent: PictureOfInternet
Disallow: /
User-Agent: PlantyNet
Disallow: /
User-Agent: QuepasaCreep
Disallow: /
User-Agent: ScSpider
Disallow: /
User-Agent: SOFT411
Disallow: /
User-Agent: spider.acont.de
Disallow: /
User-Agent: Sqworm
Disallow: /
User-Agent: SSM Agent
Disallow: /
User-Agent: TAMU
Disallow: /
User-Agent: TheUsefulbot
Disallow: /
User-Agent: TurnitinBot
Disallow: /
User-Agent: Tutorial Crawler
Disallow: /
User-Agent: TutorGig
Disallow: /
User-Agent: WebCopier
Disallow: /
User-Agent: WebZIP
Disallow: /
User-Agent: ZipppBot
Disallow: /
User-Agent: Xenu
Disallow: /
User-Agent: Wotbox
Disallow: /
User-Agent: Wget
Disallow: /
User-Agent: NaverBot
Disallow: /
User-Agent: mozDex
Disallow: /
User-Agent: Sosospider
Disallow: /
User-Agent: Baidupider
Disallow: /

Reposted from: https://www.cnblogs.com/zsqx5e/p/3996553.html
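
The directory and file-extension rules in this post depend on how a crawler matches a path against a pattern: plain prefixes for directories, and `$` (end of URL) together with `*` (any run of characters) for extensions in the `/*.ext$` form. The following is a minimal Python sketch for sanity-checking such rules before deploying them. It assumes the common "longest matching pattern wins, Allow wins ties" convention; the names `pattern_to_regex`, `is_allowed`, and `RULES`, and the sample paths, are hypothetical and not part of the original post. Real crawlers may differ in details.

```python
import re

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into an anchored regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Without '$' the pattern is treated as a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))

def is_allowed(path: str, rules: list[tuple[str, str]]) -> bool:
    """Return True if `path` may be crawled under one group of rules.

    Assumed convention: the longest matching pattern wins, Allow wins
    ties, and a path that matches nothing is allowed.
    """
    best_len = -1
    allowed = True
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty pattern restricts nothing
        if pattern_to_regex(pattern).match(path):
            is_allow = directive.lower() == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len = len(pattern)
                allowed = is_allow
    return allowed

# Hypothetical rule group combining a few of the patterns discussed above.
RULES = [
    ("Disallow", "/a/"),
    ("Disallow", "/cache/"),
    ("Disallow", "/*.zip$"),
    ("Disallow", "/*.png$"),
    ("Allow", "/*.jpg$"),
]

if __name__ == "__main__":
    for sample in ["/a/xxxx/login.php", "/photos/p.jpg", "/files/backup.zip"]:
        print(sample, is_allowed(sample, RULES))
```

One thing the sketch makes visible: because `$` anchors the end of the URL, `/*.zip$` does not match `/files/backup.zip?x=1`, so archive URLs that carry a query string would need a separate rule.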