I would get behind “click through before vote”
Seems like it has potential for abuse, though, i.e. forcibly driving traffic. It can be defeatable, though; it’s just meant to deter lazy human impulses.
I’ll make a complementary argument below in a sec, but “enforcing driving traffic” seems like a feature, not a bug.
For how testy people get about crawling copyrighted stuff for things like AI, everybody seems super chill about search engines and aggregators ripping off content at industrial scale with zero repercussions.
Tbh, I’d be less testy about bots scraping my sites for AI input IF they respected my robots.txt file and didn’t slam the server. They’re just rude and I don’t like it. Sometimes they’re so rude it’s effectively a DoS attack.
Tbh, my sites exist to get information out there and I don’t care if someone mirrors my sites, as long as the information is still accurate.
I mean, that’s great and you’re well within your rights, but that’s not what people generally say when they express outrage about AI scraping. People very often straight up call it theft and seem to consider using online content for training the equivalent of copying or distributing it.
Which stands out to me because that was not what happened when the EU decided that Google News was effectively piracy after a whole bunch of news outlets complained. The consensus there seemed to be that it was a bummer to lose the service despite all the scraping.