Author Topic: Preserving Ornery  (Read 2133 times)

Kelcimer

Preserving Ornery
« on: February 11, 2020, 01:44:59 AM »
I’ve spent a bunch of time this week rereading a lot of old threads. It has been a wonderful stroll down memory lane. The last time I did so was in 2013 after I met OSC. But OSC is in semi-retirement right now. How long is the site going to stay up? Once he passes, how long will his family keep it up? Play it out over a long enough time frame and at some point it will go away.

I was 23 when I joined and am presently 39. I can see myself at 80 wanting to take the same stroll. Such is my regard for a lot of the threads. While there are a couple threads that I have copied in their totality, there are just so many threads that are enjoyable to pass through.

I wonder if someone could persuade the Cards to put a copy of the forum in a separate FTP location where people who want to could download the whole thing. If not the whole site, then at least the present archives up to 2015, with the understanding that they’d allow the rest of the forum to be preserved the same way at some future date. That way, whenever the Cards shut down the site (or hand it over to someone else who may or may not be inclined towards preserving it), those of us who care would have a full copy for reference purposes.

Thoughts?

wmLambert

Re: Preserving Ornery
« Reply #1 on: February 11, 2020, 07:49:56 PM »
I wish all sites and forums would archive their posts. The preservation of truth and intelligent conversation is too often lost because someone who owns it would rather hide stuff than see it get out. I archive stuff that I see as imperiled onto my computer, so knowledge isn't lost. By experience, I have been asked to "Put up or shut up" too often, only to find the good stuff was taken down or deleted.

The catchphrase in computers for the last decade is that "Memory is cheap."

TheDeamon

Re: Preserving Ornery
« Reply #2 on: February 12, 2020, 05:10:42 AM »
I know there are Linux users who could tell you how to have a Linux box crawl through the entire site and preserve its file structure if so inclined.

As to sharing the "raw" data files themselves, I'm inclined to say that should be a hard no.

Between exposing MD5 password hashes (MD5 now having been compromised to some degree) and making e-mail addresses public, ones which the account holder never consented to being public, there are reasons why a full database dump should never happen. Now a "sanitized" version may be another matter, but the tech factor on doing that is a bit higher.
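To make "sanitized" concrete, here is a rough Python sketch of a whitelist copy: only columns explicitly listed as public get carried over, so password hashes and e-mail addresses never reach the shared file. Everything specific here is an assumption for illustration. The table and column names are hypothetical, and it uses SQLite because that ships with Python, not because the forum runs on it.

```python
import sqlite3

# Hypothetical schema: real forum software names its tables differently.
# Anything not listed here (e-mail, password hash, IP, ...) is left behind.
PUBLIC_COLUMNS = {
    "members": ["id", "display_name"],
    "posts": ["id", "member_id", "posted_at", "body"],
}


def sanitize(source, dest):
    """Copy only the whitelisted columns from source into dest."""
    for table, columns in PUBLIC_COLUMNS.items():
        column_list = ", ".join(columns)
        dest.execute(f"CREATE TABLE {table} ({column_list})")
        rows = source.execute(f"SELECT {column_list} FROM {table}")
        placeholders = ", ".join("?" for _ in columns)
        dest.executemany(
            f"INSERT INTO {table} VALUES ({placeholders})", rows
        )
    dest.commit()
```

A whitelist is used rather than a blacklist so that a column nobody thought about stays out by default. Post bodies can still contain addresses people typed in themselves, so this would be a first pass, not a guarantee.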

TheDrake

Re: Preserving Ornery
« Reply #3 on: February 12, 2020, 12:44:46 PM »
Have no fear. Use the Wayback Machine. It's fun to look at the capture from 2001.

https://web.archive.org/web/20190401000000*/ornery.org

wmLambert - did you mean storage? Because memory has nothing to do with it.

JoshuaD

Re: Preserving Ornery
« Reply #4 on: February 12, 2020, 01:38:52 PM »
I recently backed up and published a forum for an online D&D game I played years ago.  It's relatively easy to preserve the data with wget, but it's not trivial to publish it again.

These forums are backed by databases. What we see is the end product of a program fetching from that database, mixing it together with files and settings on disk, personalizing it for each user, and then generating an HTML page.

It's not easy for a database administrator to make the database available to us in a safe way.  I don't imagine the Cards and their admin will want to do that work.

It's relatively easy to use wget or curl to download all of the end-result html pages. The problem is that each page has a bunch of links embedded in them, and you have to change the target for each of those links if they target other pages you are archiving.  IIRC wget can do that for you as well with a little fiddling, but it still leaves a lot behind.
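For anyone who would rather script the download than fiddle with wget, here is a minimal Python sketch of the crawl step. It is an illustration only: `crawl`, `same_site_links`, and `fetch_over_http` are made-up names, it keeps pages in memory instead of writing files, and it makes no attempt at politeness (delays, robots.txt), which a real run against a live forum would need.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_site_links(html, page_url):
    """Return absolute, fragment-free links that stay on page_url's host."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(page_url).netloc
    links = []
    for href in collector.hrefs:
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        if urlparse(absolute).netloc == host and absolute not in links:
            links.append(absolute)
    return links


def fetch_over_http(url):
    """Default fetcher: download one page and decode it as text."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=fetch_over_http, max_pages=100):
    """Breadth-first crawl; returns {url: html} for every page fetched."""
    pages = {}
    queue = deque([start_url])
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        if url in pages:
            continue
        pages[url] = fetch(url)
        for link in same_site_links(pages[url], url):
            if link not in pages:
                queue.append(link)
    return pages
```

The `fetch` argument is there so the logic can be tried against a dictionary of fake pages before pointing it at a real site.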

When I backed up my gaming forum (which was 1/500th the size of the old ornery, and much smaller than the new site) I ended up writing a little script to do all the fetching, plus a lot of batch find-and-replace on the entire collection of files to strip out dynamic content, replace links, etc.
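The find-and-replace part looks roughly like the Python sketch below. This is not the actual script, just an illustration of the idea: links that point at pages in the archive get rewritten to local file names, per-visitor session ids get stripped, and everything else is left alone. The `PHPSESSID` pattern and the flat file-naming scheme are assumptions, and the regex only handles double-quoted `href` attributes.

```python
import re
from html import unescape
from urllib.parse import urljoin, urldefrag

HREF = re.compile(r'href="([^"]*)"')
# Assumed session-id format; adjust to whatever the forum really emits.
SESSION = re.compile(r"PHPSESSID=[0-9a-f]+[&;]?")


def local_name(url):
    """Turn a URL into a flat, filesystem-safe file name."""
    stripped = re.sub(r"^https?://", "", url)
    return re.sub(r"[^A-Za-z0-9._-]", "_", stripped) + ".html"


def clean_url(url):
    """Drop session ids so the same page always maps to one name."""
    return SESSION.sub("", url).rstrip("?&;")


def rewrite_links(html, page_url, archived_urls):
    """Point links at local copies when the target page was archived."""

    def replace(match):
        href = unescape(match.group(1))
        target, fragment = urldefrag(urljoin(page_url, href))
        target = clean_url(target)
        if target not in archived_urls:
            return match.group(0)  # leave external or unsaved links alone
        anchor = "#" + fragment if fragment else ""
        return 'href="' + local_name(target) + anchor + '"'

    return HREF.sub(replace, html)
```

The same `local_name` function has to be used when saving the pages, otherwise the rewritten links point at files that don't exist.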

Bottom line: it's definitely doable, but it's time consuming and requires some amount of expertise.  Anyone with the know-how and determination can get it done, but it's definitely work.

Kelcimer

Re: Preserving Ornery
« Reply #5 on: February 12, 2020, 03:02:37 PM »
TheDeamon,

Fair points. I suppose the ideal would be that they periodically (every five years?) make an updated sanitized version available to the members. Or just do one now and one before they close down the site. Perhaps "email the mod if you want to download the sanitized version" so that it is not just available to the public as well.


TheDrake,

That’s fantastic! If Ornery goes down tomorrow, then it's there, even if the Wayback Machine wouldn't have the most recent comments. It looks like the last time it backed up ornery was in July of 2019.

The only other problem is that there doesn’t seem to be a way to search Ornery through the Wayback Machine. You can search within the Wayback Machine for a particular website or webpage, but can’t search for a website AND a keyword within that website. Operating within the Wayback Machine, the Ornery search function doesn’t work because (I guess) it is only mapping the visual appearance of the page, and not the functionality. That search function is invaluable for sifting through 16,000+ threads.


JoshuaD,

Fair points. I have no expertise in this area, so I apologize if I ask stupid questions.

Quote
It's relatively easy to use wget or curl to download all of the end-result html pages. The problem is that each page has a bunch of links embedded in them, and you have to change the target for each of those links if they target other pages you are archiving.  IIRC wget can do that for you as well with a little fiddling, but it still leaves a lot behind.

Being able to preserve the links would certainly be handy and being able to have a search function is essential. Using wget or curl, how similarly would a search function work compared to Ornery’s existing search function?

I’m primarily concerned with threads from 2003-2008, which is a relatively narrow band of time, even as a lot of threads were generated in that space. Using Pete at Home as an example, he has 3,862 comments on the new forum and an astounding 44,193 comments on the old forum (which means he's responsible for approximately 1 out of every 14 comments on the old forum and 1 in 9 on the new forum).

Being able to quickly search the forum for particular keywords is huge.

TheDeamon

Re: Preserving Ornery
« Reply #6 on: February 12, 2020, 03:08:51 PM »
The search functionality is part of the database, which wouldn't be preserved in such a use case as you don't have the database. Which isn't to say you couldn't use other tools to create a new index of said collection of pages and use that to search the archive instead. Much like how you can use Google to search many internet forums if you're so inclined.
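As a rough idea of what "create a new index" means, here is a small Python sketch that builds a keyword index over a folder's worth of saved HTML and answers all-words queries. It is a toy under stated assumptions: the function names are made up, the pages are passed in as a plain dictionary of name to HTML, and there is no ranking, phrase search, or author filter like a real forum search would have.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser

WORD = re.compile(r"[a-z0-9']+")


class TextExtractor(HTMLParser):
    """Collects the visible text of a page, ignoring scripts and styles."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def words_in(html):
    """Return the set of lowercase words that appear in the page text."""
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    return set(WORD.findall(" ".join(extractor.chunks).lower()))


def build_index(pages):
    """pages maps a page name to its HTML; returns word -> set of names."""
    index = defaultdict(set)
    for name, html in pages.items():
        for word in words_in(html):
            index[word].add(name)
    return index


def search(index, query):
    """Return the sorted names of pages containing every word in query."""
    terms = WORD.findall(query.lower())
    if not terms:
        return []
    hits = set(index.get(terms[0], set()))
    for term in terms[1:]:
        hits &= index.get(term, set())
    return sorted(hits)
```

Build the index once with `build_index`, then call `search(index, "some keywords")` as often as you like; only the first step has to read every page.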

wmLambert

Re: Preserving Ornery
« Reply #7 on: February 12, 2020, 10:22:44 PM »
...WmLambert - did you mean storage? Because memory has nothing to do with it.

No, I meant memory. Storage is a synonym for what memory chips are made for. When I built my first computer from scratch, 24K was considered quite a large amount of memory. Just a few years ago we were talking Gigabytes, then Terabytes. The cost of storage hardware has dropped in price and size tremendously. In computing circles, the byword is that "memory is cheap."

LetterRip

Re: Preserving Ornery
« Reply #8 on: February 13, 2020, 11:17:01 AM »
The Wayback Machine doesn't preserve a lot of Ornery. Usually you will get at least the first page of most conversations, but anything longer is a crapshoot.
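One way to check whether a particular page made it in is the Wayback Machine's availability endpoint, which returns JSON describing the closest capture of a URL. Below is a small Python sketch. The helper names are made up, it assumes the usual response shape (`archived_snapshots` containing a `closest` entry), and `check` makes a live network request, so go easy when looping over thousands of thread pages.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp=None):
    """Build the query URL for one page."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp  # e.g. "20190401", optional
    return API + "?" + urlencode(params)


def closest_snapshot(payload):
    """Return the archived copy's URL, or None if nothing was captured."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def check(page_url, timestamp=None):
    """Network call: ask the Wayback Machine about one page."""
    with urlopen(availability_url(page_url, timestamp), timeout=30) as reply:
        return closest_snapshot(json.load(reply))
```

Running `check` on page 2, 3, and so on of a long thread would show where the captures stop.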

wmLambert

Re: Preserving Ornery
« Reply #9 on: February 13, 2020, 11:23:17 AM »
Yeah, we still need to help back up good stuff when it appears. Just yesterday, I archived Seriati's posts on tariffs.