Due to server slowness, downtime, and other issues, Eclipse will be moving to a more stable and efficient platform. There is no timeline for this yet; I just want you to know what's happening with all the downtime and that I have a plan to fix it.

Crawlers

Duke
Full Moderator
Posts: 544
Joined: 16 Mar 2024, 13:32
OS: Windows 8.1 x64
Has thanked: 98 times
Been thanked: 181 times

Considering the nature and content of this forum, it might be a good idea to block search engine crawlers like Ahrefs [Bot], Bing [Bot], Google [Bot], and Semrush [Bot] from browsing this forum ;)

An example of robots.txt from another forum:

User-agent: Amazonbot
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: SemanticScholarBot
Disallow: /

User-agent: PetalBot
Disallow: /

User-agent: YandexBot
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: DotBot
Disallow: /

User-agent: SemrushBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: AhrefsBot
Crawl-delay: 10
Disallow: /ajax.php
Disallow: /attachment.php
Disallow: /calendar.php
Disallow: /cron.php
Disallow: /editpost.php
Disallow: /global.php
Disallow: /image.php
Disallow: /inlinemod.php
Disallow: /joinrequests.php
Disallow: /login.php
Disallow: /member.php
Disallow: /memberlist.php
Disallow: /misc.php
Disallow: /moderator.php
Disallow: /newattachment.php
Disallow: /newreply.php
Disallow: /newthread.php
Disallow: /online.php
Disallow: /poll.php
Disallow: /postings.php
Disallow: /printthread.php
Disallow: /private.php
Disallow: /profile.php
Disallow: /register.php
Disallow: /report.php
Disallow: /reputation.php
Disallow: /search.php
Disallow: /sendmessage.php
Disallow: /showgroups.php
Disallow: /subscription.php
Disallow: /threadrate.php
Disallow: /usercp.php
Disallow: /usernote.php

User-agent: *
Disallow: /ajax.php
Disallow: /attachment.php
Disallow: /calendar.php
Disallow: /cron.php
Disallow: /editpost.php
Disallow: /global.php
Disallow: /image.php
Disallow: /inlinemod.php
Disallow: /joinrequests.php
Disallow: /login.php
Disallow: /member.php
Disallow: /memberlist.php
Disallow: /misc.php
Disallow: /moderator.php
Disallow: /newattachment.php
Disallow: /newreply.php
Disallow: /newthread.php
Disallow: /online.php
Disallow: /poll.php
Disallow: /postings.php
Disallow: /printthread.php
Disallow: /private.php
Disallow: /profile.php
Disallow: /register.php
Disallow: /report.php
Disallow: /reputation.php
Disallow: /search.php
Disallow: /sendmessage.php
Disallow: /showgroups.php
Disallow: /subscription.php
Disallow: /threadrate.php
Disallow: /usercp.php
Disallow: /usernote.php
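
A quick way to see what a file like the one above actually blocks is to feed it to a parser. Below is a minimal sketch using Python's standard `urllib.robotparser`. The trimmed `ROBOTS_TXT` string, the helper name `is_allowed`, and the sample path `/viewtopic.php` are assumptions for illustration, not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Trimmed-down copy of the example above (assumption: three groups are enough
# to show the three behaviours: full block, rate limit, default rules).
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: AhrefsBot
Crawl-delay: 10
Disallow: /search.php
Disallow: /memberlist.php

User-agent: *
Disallow: /search.php
Disallow: /memberlist.php
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(agent: str, path: str) -> bool:
    """Hypothetical helper: True if `agent` may fetch `path` under ROBOTS_TXT."""
    return parser.can_fetch(agent, path)


# Print, per bot: topic page allowed?, search page allowed?, crawl delay.
for agent in ("GPTBot", "AhrefsBot", "Googlebot"):
    print(
        agent,
        is_allowed(agent, "/viewtopic.php"),
        is_allowed(agent, "/search.php"),
        parser.crawl_delay(agent),
    )
```

Under these rules GPTBot is refused everywhere, while AhrefsBot is only slowed down and kept out of the listed scripts, so the example does not block Ahrefs outright. Keep in mind that robots.txt is advisory: only crawlers that choose to honour it are affected.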

the_r3dacted
Lazy Owner
Posts: 1273
Joined: 11 Jan 2021, 07:40
Location: ur dads house
OS: Windows 8.1 x64
Has thanked: 890 times
Been thanked: 502 times

And kill discoverability? no lol
k4sum1 who?

I might know what I'm doing not the hit album by brad sucks

Duke

Google [Bot] is OK, but AdsBot [Google] and Amazon [Bot]? :evil:

Are you sure you want those? They don't bring you any discoverability, just crap :thumbdown:

the_r3dacted

too busy and too lazy

the_r3dacted

Since the server likes to die every day, I wanted to try to tackle some of the potential reasons. That included finally implementing some sort of robots.txt.

I didn't feel like outright blocking search engines like Yandex or AI bots, other than those from companies I hate like Meta. Also, some SEO tools seem useless and could spam my shit, so I blocked them outright. For example, Ahrefs is now blocked both in robots.txt and at the firewall level.

So using your robots.txt as a base and with the help of a friend, I came up with this:
https://board.eclipse.cx/robots.txt
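
For anyone wanting to do something similar, the approach described above (block some bots completely, give everyone else a short list of off-limits scripts) can be sketched as a small generator. This is not the actual file at the URL above. The function name `build_robots_txt`, the agent tokens in `BLOCKED` (`meta-externalagent` is assumed here as the token for Meta's crawler), and the paths in `PRIVATE` are assumptions picked for illustration.

```python
def build_robots_txt(blocked_agents, private_paths):
    """Return robots.txt text: a full block per listed agent, then path rules for all others."""
    lines = []
    for agent in blocked_agents:
        # One group per blocked crawler, disallowing the whole site.
        lines += [f"User-agent: {agent}", "Disallow: /", ""]
    # Default group: every other crawler only loses access to the listed scripts.
    lines.append("User-agent: *")
    lines += [f"Disallow: {path}" for path in private_paths]
    return "\n".join(lines) + "\n"


# Hypothetical inputs, not the real Eclipse configuration.
BLOCKED = ["AhrefsBot", "meta-externalagent"]
PRIVATE = ["/search.php", "/memberlist.php"]

print(build_robots_txt(BLOCKED, PRIVATE))
```

Since robots.txt only works for crawlers that respect it, a firewall rule like the one mentioned for Ahrefs is still what actually keeps a misbehaving bot out.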

Duke

Well done! :thumbup:

Compa
Banned
Posts: 498
Joined: 13 Jan 2021, 08:09
Has thanked: 24 times
Been thanked: 6 times

Duke wrote (12 Nov 2024, 23:13): "Well done! :thumbup:"
It took me about an hour to convince him to do a proper job of it.
Thanks for providing a nice template for phpBB though, that really helped us. :)

Duke

AI crawler attacks and abuse:
https://news.ycombinator.com/item?id=43422413

Duke

About the server slowness, downtime, and other issues: maybe you should really consider putting a filter like Anubis in front of the forum:
https://github.com/TecharoHQ/anubis

It's used on Mozillazine.org, but there are other options.
Many forums are experiencing the same slowness or access issues these days because of AI crawlers, regardless of how or where they are hosted.
