We tell each other to self-host in order to escape the big corps. But when we die, all the self-hosted stuff is going to get wiped and history will only remember the archives that got stored by the big corps. And you know how history is written by the winners. Do we need a way to automatically hand over our sites to archive.org or similar? Or send them disc images? Nobody expects to get run over by a car but some of us will, today. I made no plans for this.

Râu Cao @raucao

@kensanata "Save page now" on archive.org/web/

However, they automatically index pretty much everything that's truly public. Meaning no archiving of ANY Facebook posts etc., because of the login wall. So no, the corps will not remember you for the rest of history. The Internet Archive will.

@kensanata That said, I hope they have offsite backups of all their petabytes, because if The Big One hits SF, who knows what's going to happen.

@krozruch @raucao @kensanata They should open up torrents so that people can seed the data forever; this also stops them from erasing the past.

@siraben @kensanata @raucao I have uploaded some community video to archive.org. They have pretty good support for torrents for most things. Upload a video and you can stream it on the website or torrent it. They also seem to automatically transcode into open formats. People don't seem to use it so much. It's not a replacement for e.g. YouTube, but for community video and audio content, it's great.
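
(For anyone who wants to script such uploads: a minimal sketch, assuming the third-party `internetarchive` Python library and credentials already configured with its `ia configure` command; the item identifier and filename below are hypothetical.)

```python
# Minimal sketch: upload a community video to archive.org using the
# `internetarchive` library (pip install internetarchive). Assumes
# credentials were set up beforehand with `ia configure`.
from internetarchive import upload

upload(
    "my-community-talk-2023",      # hypothetical item identifier
    files=["talk.mp4"],            # local file(s) to upload
    metadata={
        "title": "Community talk 2023",
        "mediatype": "movies",     # tells archive.org this item is video
    },
)
```

Archive.org then derives the torrent and the streamable/open-format versions of the item on its own, as described above.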

@siraben @krozruch @raucao Or you could create a federal mandate: make them part of the Library of Congress and mandate that they save a copy of everything, with a copyright exemption. They can stop publishing contested material but they will keep a copy. And people can at least walk in and make their own copies.

@kensanata @raucao @siraben I can see only one problem with that: it might then be argued they need federal funding, which could open them to pressures that challenge their independence. I don't know whether that would be the case, and I don't know how they are funded now, but it is one potential concern to anticipate.

@krozruch @siraben Somebody posted this, today: «Note there are also backups by public institutions bnf.fr/en/professionals/digita If I die, the national library will have a copy of my blog :-)»

@kensanata @krozruch @raucao I don't know what the best compromise is for archiving the internet. Is more always better? Do we really need to remember every version of every single meme in the future? Maybe. Who knows if they'll be useful? But then again, maybe not.

@siraben @raucao @kensanata Also, do we want to maintain every blog post where people are working their way through a nervous breakdown or personal crisis?

@krozruch @raucao @kensanata Exactly. But do we want to have the power to arbitrarily delete content? Who decides what content is kept? Hard questions.

@siraben @krozruch @kensanata They're not that hard, if you accept that anything that is public can be archived by anyone anyway. Learn more about the Internet Archive, and if their FAQ and content aren't enough, ask them questions. This has all been talked about for decades.

@raucao @krozruch @kensanata And yet, by changing your robots.txt you can delete your entire site's history from archive.org. Look it up; it's happened before.

@siraben @raucao @krozruch Yeah, but they are ignoring robots.txt now, exactly because of this.
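
(For context: the exclusion mechanism discussed here was just a robots.txt directive. A sketch of the historical pattern, assuming `ia_archiver` as the crawler user agent the Internet Archive used to honor; historical illustration only, not current policy.)

```
# Historical illustration: a robots.txt like this used to make the
# Wayback Machine retroactively hide a site's existing snapshots.
# As noted in the posts above, the Archive has since stopped honoring
# such retroactive exclusions.
User-agent: ia_archiver
Disallow: /
```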

@raucao @kensanata @siraben No harder, at any rate, than important social and political questions have always been, and must in some ways continue to be if we are to be involved in deciding our own fates. It's always negotiation and renegotiation.

@krozruch @kensanata @siraben That's provably false. Anyway, have fun negotiating your breakfast choice with someone tomorrow. I'll keep living with minimal politics. Thank you.

@siraben @krozruch @raucao Somebody will need to write a PhD thesis on the early Internet and memes and politics and all that in 100 years. Just like there are thousands and thousands of students scouring old newspapers counting instances of this and that, trying to figure something out.

@kensanata @siraben @krozruch The one thing that sure as hell does not survive history is current nation states and their institutions. There are many civil organisations, companies, churches, and families who still maintain archives that are vastly larger than any national collection.

@raucao @siraben @kensanata Intellectual property is one of the major barriers, as always, since without it as the default, citizens could often cover much of this by keeping copies using something that looks *roughly* like torrents / IPFS.

@raucao @siraben @krozruch they don’t all have to end in a fire, though – the successor state might take them, after all.

@kensanata @siraben @krozruch How is that relevant? The risk and dependence aren't worth it. We already have decent civil orgs who are literally doing this right now. With zero of the countless drawbacks of handing control to a central bureaucracy, and all of the benefits of aligned incentives with donors and patrons.

@raucao @siraben @kensanata Yeah, states tend to lose stuff they don't find convenient - even if only temporarily - while keeping enough to make it look plausibly completist. That's the danger with handing this stuff over to a centralised state bureaucracy or state-funded body.

@raucao I was under the impression that they had multiple sites, including one in Canada specifically, but Wikipedia says: «The Archive has data centers in three Californian cities: San Francisco, Redwood City, and Richmond. To prevent losing the data in case of e.g. a natural disaster, the Archive attempts to create copies of (parts of) the collection at more distant locations, currently including the Bibliotheca Alexandrina in Egypt and a facility in Amsterdam.»
#archive

@raucao @kensanata Anecdotally, I see there are lots of gaps for public-facing websites in IA's crawled coverage, some of them years old. Using the "save page now" function helps not only grab stuff, but seed the crawler.

@raucao @kensanata oh cool. I wonder if there’s a way to trigger this automatically when we publish pages?

@n @kensanata I vaguely remember there being one, but if not, you could just set up a headless browser script.

@raucao @kensanata thanks. This seems like a good feature for any CMS.
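
(A minimal sketch of such a publish hook, assuming the public "Save Page Now" endpoint at web.archive.org/save/ still accepts a plain GET request; that is simpler than driving a headless browser, though the authenticated SPN2 API would be the more robust option. The page URL below is hypothetical.)

```python
# Minimal sketch: ask the Wayback Machine to capture a freshly
# published page. Uses only the `requests` library.
import requests

def archive_url(url: str, timeout: int = 120) -> str:
    """Trigger "Save Page Now" for url and return the reported snapshot location."""
    resp = requests.get(f"https://web.archive.org/save/{url}", timeout=timeout)
    resp.raise_for_status()
    # The snapshot location is usually exposed via the final redirect URL
    # or a Content-Location header; fall back to the request URL otherwise.
    return resp.headers.get("Content-Location", resp.url)

if __name__ == "__main__":
    # Hypothetical page a CMS just published.
    print(archive_url("https://example.com/blog/new-post"))
```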