ego008

Many spiders do not obey robots.txt

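One of my GAE apps has recently kept exceeding its free quota for Datastore Read Operations. The logs show that the cause is a large number of spiders crawling the site.
So I wrote "No dogs or idiots allowed" in robots.txt, but it made no difference. All I can do is log the IPs and block them.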
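
The "log the IPs and block them" approach could look roughly like the sketch below, written as plain WSGI middleware. This is not the poster's actual code: `BlockIPMiddleware`, `hello_app`, and the addresses in `BLOCKED_IPS` are hypothetical names and documentation-range IPs chosen for illustration, and it assumes `REMOTE_ADDR` carries the real client address.

```python
# Hypothetical blocklist; in practice the addresses would come from the request logs.
BLOCKED_IPS = {"203.0.113.7", "198.51.100.23"}


class BlockIPMiddleware:
    """Reject requests from listed IPs before they reach the app (and the datastore)."""

    def __init__(self, app, blocked_ips):
        self.app = app
        self.blocked_ips = set(blocked_ips)

    def __call__(self, environ, start_response):
        # Assumption: REMOTE_ADDR is the client IP, not a proxy address.
        ip = environ.get("REMOTE_ADDR", "")
        if ip in self.blocked_ips:
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return self.app(environ, start_response)


def hello_app(environ, start_response):
    """Stand-in for the real application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello\n"]


app = BlockIPMiddleware(hello_app, BLOCKED_IPS)
```

Because the check runs before the wrapped app, a blocked request should not trigger any datastore reads in the handler.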

Discussion (2)

苦逼码农 2013-04-03 05:07

User-agent: Baiduspider
Disallow: /
User-agent: sogou spider
Disallow: /
User-agent: Googlebot
Disallow: /
User-agent: Slurp
Disallow: /
User-agent: ia_archiver
Disallow: /
User-agent: MSNBot
Disallow: /
User-agent: Robozilla
Disallow: /
User-agent: googlebot-image
Disallow: /
User-agent: *
Disallow: /
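
One way to sanity-check rules like these is Python's standard `urllib.robotparser`, which shows what a well-behaved crawler would conclude. A minimal sketch: `is_allowed` is a hypothetical helper, `ROBOTS_TXT` is a shortened copy of the rules above, and the result says nothing about crawlers that ignore the file.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the rules above, for illustration only.
ROBOTS_TXT = """\
User-agent: Baiduspider
Disallow: /
User-agent: Googlebot
Disallow: /
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt, user_agent, path):
    """Return True if a compliant crawler with this user agent may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


for agent in ("Baiduspider", "Googlebot", "SomeOtherBot"):
    print(agent, is_allowed(ROBOTS_TXT, agent, "/any/page"))
```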

ego008 2013-04-04 11:51

@苦逼码农 Some spiders never even look at robots.txt.
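
A rough way to spot such clients in the logs is to look for IPs that make many requests without ever fetching /robots.txt. The sketch below assumes the log has already been parsed into `(ip, path)` pairs. `suspicious_clients` and the `min_hits` threshold are hypothetical, and ordinary browsers never fetch robots.txt either, so the result is only a list of candidates to review.

```python
from collections import Counter


def suspicious_clients(requests, min_hits=100):
    """Return IPs with at least min_hits requests that never fetched /robots.txt.

    requests: iterable of (ip, path) pairs, assumed to be parsed from access logs.
    """
    hits = Counter()
    fetched_robots = set()
    for ip, path in requests:
        hits[ip] += 1
        if path == "/robots.txt":
            fetched_robots.add(ip)
    return sorted(
        ip for ip, count in hits.items()
        if count >= min_hits and ip not in fetched_robots
    )
```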
