@kensanata "Save page now" on https://archive.org/web/
However, they automatically index pretty much everything that's truly public. Meaning no archiving of ANY Facebook posts etc., because of the login wall. So no, the corps will not remember you for the rest of history. The Internet Archive will.
@krozruch @raucao @kensanata They should open up torrents so that people can seed the data forever, this also stops them from erasing the past.
@siraben @krozruch @raucao Or you could create a federal mandate: make them part of the Library of Congress and make it a mandate that they save a copy of everything, with a copyright exemption. They can stop publishing contested material but they will keep a copy. And people can at least walk in and make their own copies.
@krozruch @siraben Somebody posted this, today: «Note there are also backups by public institutions http://www.bnf.fr/en/professionals/digital_legal_deposit/a.digital_legal_deposit_web_archiving.html If I die, the national library will have a copy of my blog :-)»
@kensanata @krozruch @raucao I don't know what the best compromise is for archiving the internet. Is more always better? Do we really need to remember every version of every single meme in the future? Maybe. Who knows if they'll be useful? But then again, maybe not.
@krozruch @raucao @kensanata Exactly. But do we want to have the power to arbitrarily delete content? Who decides what content is kept? Hard questions.
@siraben @krozruch @kensanata They're not that hard, if you accept that anything that is public can be archived by anyone anyway. Learn more about the Internet Archive, and if their FAQ and content aren't enough, ask them questions. This has all been talked about for decades.
@raucao @krozruch @kensanata And yet, by changing your robots.txt you can delete your entire site's history from archive.org. Look it up; it's happened before.
@krozruch @kensanata @siraben That's provably false. Anyway, have fun negotiating your breakfast choice with someone tomorrow. I'll keep living with minimal politics. Thank you.
@kensanata @siraben @krozruch The only things that sure as hell do not survive history are current nation states and their institutions. There are many civil organisations, companies, churches, and families that still maintain archives vastly larger than any national collection.
@kensanata @siraben @krozruch How is that relevant? The risk and dependence aren't worth it. We already have decent civil orgs who are literally doing this right now. With zero of the countless drawbacks of handing control to a central bureaucracy, and all of the benefits of aligned incentives with donors and patrons.
@raucao I was under the impression that they had multiple sites, including one in Canada specifically, but Wikipedia says: «The Archive has data centers in three Californian cities: San Francisco, Redwood City, and Richmond. To prevent losing the data in case of e.g. a natural disaster, the Archive attempts to create copies of (parts of) the collection at more distant locations, currently including the Bibliotheca Alexandrina in Egypt and a facility in Amsterdam.»
#archive
@raucao @kensanata Anecdotally, I see there are lots of gaps for public-facing websites in IA's crawled coverage, some of them years old. Using the "save page now" function helps not only grab stuff, but seed the crawler.
@raucao @kensanata oh cool. I wonder if there’s a way to trigger this automatically when we publish pages?
@n @kensanata I vaguely remember there being one, but if not, you could just set up a headless browser script.
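A lighter alternative to a headless browser might be a plain HTTP request from a publish hook. This is only a sketch: it assumes the "Save page now" form can also be reached by requesting `https://web.archive.org/save/` followed by the page URL, and the names `build_save_url` and `request_snapshot` are made up for illustration. Rate limits and response details are not covered here, so check the Internet Archive's own documentation before relying on it.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumption: the "Save page now" feature accepts requests at this prefix.
SAVE_ENDPOINT = "https://web.archive.org/save/"


def build_save_url(page_url: str) -> str:
    """Return the URL to request in order to ask for a snapshot of page_url."""
    # Keep URL delimiters and existing percent-escapes; escape spaces and the like.
    return SAVE_ENDPOINT + quote(page_url, safe=":/?&=%#")


def request_snapshot(page_url: str, timeout: float = 60.0) -> int:
    """Ask the archive to capture page_url and return the HTTP status code."""
    req = Request(
        build_save_url(page_url),
        headers={"User-Agent": "publish-hook-sketch/0.1"},  # hypothetical agent name
    )
    with urlopen(req, timeout=timeout) as resp:
        return resp.status
```

A CMS could call `request_snapshot("https://example.org/blog/new-post")` right after a page goes live, ideally in a background job so that a slow or failed capture does not block publishing.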
@raucao @kensanata thanks. This seems like a good feature for any cms.
@kensanata That said, I hope they have offsite backups of all their petabytes, because if The Big One hits SF, who knows what's going to happen.