Eclipse Community
https://board.eclipse.cx/

Crawlers
https://board.eclipse.cx/viewtopic.php?t=744
Page 1 of 1
Author:  Duke [ 30 Aug 2024, 10:42 ]
Post subject:  Crawlers

Considering the nature and content of this forum, it might be a good idea to block search engine crawlers such as Ahrefs [Bot], Bing [Bot], Google [Bot], and Semrush [Bot] from browsing this forum ;)

An example of robots.txt from another forum:
User-agent: Amazonbot
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: SemanticScholarBot
Disallow: /

User-agent: PetalBot
Disallow: /

User-agent: YandexBot
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: DotBot
Disallow: /

User-agent: SemrushBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: AhrefsBot
Crawl-delay: 10
Disallow: /ajax.php
Disallow: /attachment.php
Disallow: /calendar.php
Disallow: /cron.php
Disallow: /editpost.php
Disallow: /global.php
Disallow: /image.php
Disallow: /inlinemod.php
Disallow: /joinrequests.php
Disallow: /login.php
Disallow: /member.php
Disallow: /memberlist.php
Disallow: /misc.php
Disallow: /moderator.php
Disallow: /newattachment.php
Disallow: /newreply.php
Disallow: /newthread.php
Disallow: /online.php
Disallow: /poll.php
Disallow: /postings.php
Disallow: /printthread.php
Disallow: /private.php
Disallow: /profile.php
Disallow: /register.php
Disallow: /report.php
Disallow: /reputation.php
Disallow: /search.php
Disallow: /sendmessage.php
Disallow: /showgroups.php
Disallow: /subscription.php
Disallow: /threadrate.php
Disallow: /usercp.php
Disallow: /usernote.php

User-agent: *
Disallow: /ajax.php
Disallow: /attachment.php
Disallow: /calendar.php
Disallow: /cron.php
Disallow: /editpost.php
Disallow: /global.php
Disallow: /image.php
Disallow: /inlinemod.php
Disallow: /joinrequests.php
Disallow: /login.php
Disallow: /member.php
Disallow: /memberlist.php
Disallow: /misc.php
Disallow: /moderator.php
Disallow: /newattachment.php
Disallow: /newreply.php
Disallow: /newthread.php
Disallow: /online.php
Disallow: /poll.php
Disallow: /postings.php
Disallow: /printthread.php
Disallow: /private.php
Disallow: /profile.php
Disallow: /register.php
Disallow: /report.php
Disallow: /reputation.php
Disallow: /search.php
Disallow: /sendmessage.php
Disallow: /showgroups.php
Disallow: /subscription.php
Disallow: /threadrate.php
Disallow: /usercp.php
Disallow: /usernote.php
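
Editor's sketch: one way to sanity-check rules like the ones above is Python's standard `urllib.robotparser`. The rule text below is a shortened subset of the example (most paths omitted), and `forum.example`, `load_rules`, and `is_allowed` are hypothetical names used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Shortened subset of the rules quoted above; most paths are omitted.
EXAMPLE_RULES = """\
User-agent: GPTBot
Disallow: /

User-agent: AhrefsBot
Crawl-delay: 10
Disallow: /login.php
Disallow: /search.php

User-agent: *
Disallow: /login.php
Disallow: /search.php
"""

# Hypothetical host, used only to build absolute URLs for the checks.
BASE_URL = "https://forum.example"


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` according to the parsed rules."""
    return parser.can_fetch(agent, BASE_URL + path)


if __name__ == "__main__":
    rules = load_rules(EXAMPLE_RULES)
    for agent in ("GPTBot", "AhrefsBot", "Googlebot"):
        for path in ("/viewtopic.php", "/search.php"):
            verdict = "allowed" if is_allowed(rules, agent, path) else "blocked"
            print(f"{agent:10} {path:15} {verdict}")
    print("AhrefsBot crawl delay:", rules.crawl_delay("AhrefsBot"))
```

Note that in this example AhrefsBot is only slowed down and kept off the listed scripts, while agents with `Disallow: /` are asked to stay away entirely. robots.txt is advisory: it only affects crawlers that choose to honor it.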

Author:  the_r3dacted [ 30 Aug 2024, 11:27 ]
Post subject:  Crawlers

And kill discoverability? no lol

Author:  Duke [ 30 Oct 2024, 14:23 ]
Post subject:  Crawlers

Google [Bot] ok, but AdsBot [Google] and Amazon [Bot] :evil:

Are you sure you want those? They don't bring you any discoverability at all, just crap :thumbdown:

Author:  the_r3dacted [ 30 Oct 2024, 15:04 ]
Post subject:  Crawlers

too busy and too lazy

Author:  the_r3dacted [ 12 Nov 2024, 18:40 ]
Post subject:  Crawlers

Since the server likes to die every day, I wanted to try to tackle some of the potential causes. That included finally implementing some sort of robots.txt.

I didn't feel like outright blocking search engines like Yandex, or AI bots, other than those from companies I hate like Meta. Also, some SEO tools seem useless and could spam my shit, so I blocked them outright. For example, Ahrefs is now blocked both in robots.txt and at the firewall level.

So using your robots.txt as a base and with the help of a friend, I came up with this:
https://board.eclipse.cx/robots.txt
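
Editor's sketch: the live file at the URL above is not reproduced here. As a rough illustration of how a file with this shape (some agents blocked outright, everyone else kept off a shared list of scripts) could be generated without repeating the path list by hand, here is a small Python helper. The function name `build_robots_txt` and the sample agents and paths are assumptions, not the forum's actual configuration.

```python
def build_robots_txt(blocked_agents, throttled_agents, private_paths):
    """Build robots.txt text.

    blocked_agents:   agents asked to stay away from the whole site
    throttled_agents: mapping of agent name -> Crawl-delay in seconds
    private_paths:    paths that every remaining crawler should skip
    """
    groups = []

    # Agents that are asked not to crawl anything.
    for agent in blocked_agents:
        groups.append(f"User-agent: {agent}\nDisallow: /")

    # The shared path list is written once and reused for each group.
    path_rules = "\n".join(f"Disallow: {path}" for path in private_paths)

    # Agents that may crawl public pages, but slowly.
    for agent, delay in throttled_agents.items():
        groups.append(f"User-agent: {agent}\nCrawl-delay: {delay}\n{path_rules}")

    # Fallback group for every other crawler.
    groups.append(f"User-agent: *\n{path_rules}")

    return "\n\n".join(groups) + "\n"


if __name__ == "__main__":
    # Hypothetical sample values, loosely based on names used in this thread.
    print(build_robots_txt(
        blocked_agents=["AhrefsBot", "SemrushBot"],
        throttled_agents={"YandexBot": 10},
        private_paths=["/search.php", "/memberlist.php"],
    ))
```

Crawlers that ignore robots.txt are unaffected by any of this, which is presumably why a firewall rule is used in addition.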

Author:  Duke [ 12 Nov 2024, 23:13 ]
Post subject:  Crawlers

Well done! :thumbup:

Author:  Compa [ 13 Nov 2024, 04:12 ]
Post subject:  Crawlers

Duke wrote: 12 Nov 2024, 23:13
Well done! :thumbup:

It took me about an hour to convince him to do a proper job of it.
Thanks for providing a nice template for phpBB though, that really helped us. :)

Author:  Duke [ 20 Mar 2025, 23:23 ]
Post subject:  Crawlers

AI crawler attacks and abuse:
https://news.ycombinator.com/item?id=43422413

Author:  Duke [ 25 Feb 2026, 14:06 ]
Post subject:  Crawlers

About server slowness, downtime, and other issues, maybe you should really consider using some filter like Anubis:
https://github.com/TecharoHQ/anubis

It's been used on Mozillazine.org, but there are others.
Many forums are experiencing the same slowness or access issues these days because of AI crawlers, regardless of who hosts them or where.

Author:  the_r3dacted [ 27 Feb 2026, 15:09 ]
Post subject:  Crawlers

Duke wrote: 25 Feb 2026, 14:06
About server slowness, downtime, and other issues, maybe you should really consider using some filter like Anubis:
https://github.com/TecharoHQ/anubis

Not going to do that. https://github.com/Eclipse-Community/r3dfox/issues/30

Author:  Duke [ 27 Feb 2026, 20:27 ]
Post subject:  Crawlers

the_r3dacted wrote: 27 Feb 2026, 15:09
Not going to do that.

Your choice. But it really has helped keep many forums from being overloaded.

All times are UTC