Okay, given how things are going, do we know if the Internet Archive has a backup plan for when these fucks attack it in earnest?
Was this previously public data? It’s not illegal to download a torrent, right?
from the linked page
Excludes corrupt datasets and data not publicly accessible.
Time to download all of it!
Same, especially before the inevitable attacks on the Internet Archive to come. Who knows what nonsense will be in the works to try and get this removed, or the whole project shut down in the coming years.
Because the feds didn’t already have it out for IA.
How would you recommend someone go about archiving important parts of the IA? Just external drives?
The Internet Archive is, and I really want to emphasize this, Fucking Huge. If you want to help archive it, every upload has an associated torrent you can download and help seed. Torrenting itself isn’t illegal, only torrenting illegal stuff like copyrighted movies. You can buy a relatively cheap refurbished HDD of whatever size you want, set up qBittorrent, and torrent the uploads that you want to make sure are available even if the Internet Archive has to take them down or has a critical data loss failure.
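If you want to queue up a bunch of items at once, here's a rough Python sketch that builds the torrent link for each one. It assumes the torrent sits at `https://archive.org/download/<identifier>/<identifier>_archive.torrent`, which is the pattern I've seen on item pages, so double check it against the item's own download list. The function name and the `example-item` identifier are made up for illustration.

```python
from urllib.parse import quote


def ia_torrent_url(identifier: str) -> str:
    """Build the torrent URL for an archive.org item.

    Assumption: the torrent is published under the item's download
    directory as <identifier>_archive.torrent. Verify on the item page.
    """
    if not identifier or "/" in identifier:
        raise ValueError("identifier must be a non-empty item name without slashes")
    safe = quote(identifier, safe="")  # percent-encode anything unusual
    return f"https://archive.org/download/{safe}/{safe}_archive.torrent"


def torrent_urls(identifiers):
    """Return one torrent URL per item identifier, in the same order."""
    return [ia_torrent_url(i) for i in identifiers]


if __name__ == "__main__":
    # "example-item" is a placeholder, not a real upload.
    for url in torrent_urls(["example-item"]):
        print(url)
```

Paste the printed links into qBittorrent's "add torrent link" dialog and it will fetch and seed them like any other torrent.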
i’m not smart enough for this but maybe look to communities like r/DataHoarder to get started
What’s CDC?
What a weird name…
I mean, not really. It is a research and policy center for controlling diseases.
well and truly based
it sounds like it’s only stuff that was already publicly available tho
key word was
Some of the publicly available data is disappearing under the new administration. Most notably information about COVID, long COVID, vaccines, and bird flu is disappearing. Presumably, this data dump contains the missing data.
Oh I hadn’t heard of that, I thought it was just stopping new data
Nope - they’re literally destroying data if it doesn’t align with their super regressive views on sexual identity stuff, amongst other things
Actual image of DOGE employee deleting CDC data
It’s funny because the Nazis themselves (the 1930s ones, not the 2020s ones) also started their book burning on literally the exact same topic.
And by “funny” I mean “not funny at all in the slightest, holy fucking shit!”
It’s kinda neat that you can nest spoiler tags, by the way.
Importantly they are also removing all mentions of climate change. I imagine they’ll be deleting data on that front as well.
That happened last admin, so def wouldn’t be a shocker.
Get ready to donate to their legal defense fund
you’re right and you should say it but it makes me sad
As long as money still means something after Elon is through with the Treasury…
It is long overdue for the Internet Archive to move to the EU or Switzerland or something.
Yep.
I wish they could also implement a decentralised hosting protocol, though I know that technology is currently in its infancy.
isn’t that just a torrent?
There are different protocols that attempt to work for things like web hosting, but yes, the BitTorrent protocol is a decentralised file sharing protocol.
Would be best if there were several mirrors in several countries. It’s unfortunately too large to realistically host via crowdsourcing. The best you could do is something à la Storj, where fragments are redundantly distributed across various hosts.
hi spujb. Only 98 GB? I can mirror that 🤷‍♀️
I suggest also mirroring on https://academictorrents.com/
sry i dont know what that is but once i have all the data ill post a link here. im hosting in france and i am also outside the us so i will not take down the data at tronald dumps request tyvm.
Incredible 🫡
Use his original last name. Drumpf. It pisses him off as much as being told that he has baby hands.
His father or grandfather changed it.
Based kate
Inb4 it gets DDoS’d again
We are screwed if the Internet Archive goes down, right?
Seems like a huge point of failure for one entity.
Agreed, though I think the biggest issue is just scale. It’s over 100 petabytes of data. That’s not outside the realm of what big cloud providers could mirror, but they don’t really give a shit. For the community to do it, you’d need some sort of significant distributed software solution. Not impossible, but as far as I know nobody’s taken up the mantle yet. I think it would need custom software just to work out how to distribute it as a sharded set of community mirrors, with different people each mirroring individual pieces.
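To make the "sharded set of community mirrors" idea a bit more concrete, here's a toy Python sketch of how shards could be handed out so that every piece lives on several volunteers. It uses rendezvous (highest-random-weight) hashing, which is my pick for illustration and not something any existing project is confirmed to use. The shard names, mirror names, and `copies` parameter are all hypothetical, and mirror names are assumed to be unique.

```python
import hashlib


def _score(mirror: str, shard: str) -> int:
    """Deterministic pseudo-random weight for a (mirror, shard) pair."""
    digest = hashlib.sha256(f"{mirror}:{shard}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def assign_shards(shards, mirrors, copies=3):
    """Map each shard to `copies` distinct mirrors via rendezvous hashing.

    Every participant can compute the same plan locally from the two
    lists, so no central coordinator is needed for the assignment itself.
    """
    mirrors = list(mirrors)
    if copies > len(mirrors):
        raise ValueError("need at least as many mirrors as copies")
    plan = {}
    for shard in shards:
        ranked = sorted(mirrors, key=lambda m: _score(m, shard), reverse=True)
        plan[shard] = ranked[:copies]
    return plan


if __name__ == "__main__":
    # Hypothetical shard and mirror names.
    plan = assign_shards(
        shards=[f"shard-{i:04d}" for i in range(5)],
        mirrors=["alice", "bob", "carol", "dave"],
        copies=2,
    )
    for shard, holders in plan.items():
        print(shard, "->", holders)
```

The nice property is that when a mirror drops out, only the shards it was actually holding need a new home; everyone else's assignments stay put. It says nothing about the hard parts, like verifying that volunteers really store what they claim.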
HexOS has a plan for shared encrypted data. With how simple it is to install and manage, it could go mainstream as personal NASes gain popularity, but it’s still in early development.
Interplanetary File System can do it
IPFS, GnuNet?
IPFS is the way to go IMO, it’s so perfect for archival that it pains me that it’s still pretty unknown
the fact that you don’t need any sort of central organization for everyone to help seed data is amazing, no more duplicate torrents splitting seeders, so long as you have identical data the network just figures it out.
If you have the hash for a piece of data, you can just set a computer to watch for someone to start seeding it. Even if the last time anyone saw the data was decades ago and some dude just found a CD in his recently passed dad’s basement: if that dude seeds it overnight and then his computer explodes, you’ve already downloaded it and it’ll remain available. It’s so fucking good.
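For anyone wondering why identical data "just works", here's a tiny Python sketch of content addressing plus a watch list. It is only a model of the idea: real IPFS identifiers are CIDs built from chunked data and a multihash, so a plain SHA-256 hex digest like the one below is not an actual IPFS address, and two people only end up with the same CID if they add the data with the same settings. The `Node` class and its method names are made up.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Address data by its own hash: same bytes, same ID, whoever has them."""
    return hashlib.sha256(data).hexdigest()


class Node:
    """Toy peer that waits for data matching hashes it already knows."""

    def __init__(self):
        self.blobs = {}      # content ID -> bytes we hold and can re-seed
        self.wanted = set()  # content IDs we are watching for

    def watch(self, cid: str) -> None:
        self.wanted.add(cid)

    def receive(self, data: bytes) -> bool:
        """Keep offered data only if its hash is one we are watching for."""
        cid = content_id(data)
        if cid not in self.wanted:
            return False
        self.blobs[cid] = data
        self.wanted.discard(cid)
        return True


if __name__ == "__main__":
    old_cd = b"contents of a CD found in a basement"
    watcher = Node()
    watcher.watch(content_id(old_cd))  # we only ever knew the hash
    print(watcher.receive(b"something unrelated"))
    print(watcher.receive(old_cd))
```

Because the ID is derived from the bytes themselves, the watcher doesn't care who offers the data or what they named the file, and it can verify the copy is genuine before keeping it.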
Good thing they’re based far from the US in… oh.