A plan to immortalize I-War community heritage
7 years 9 months ago #20499
by palmer
A plan to immortalize I-War community heritage was created by palmer
Thank you SoupDragon and schmatzler for your efforts!
As an old I-War fan I'm so happy to see you keeping the heritage of I-War community online.
I just learned about the "The End of an Era" that was about to happen in Jan 2015. Despite the successful outcome in that case, I'm still frustrated that my favorite game's entire community website with years of accumulated user content could just disappear.
In this post I will analyze root causes of website fragility and suggest counter-measures.
First some examples of sad fate of websites for the games I played.
- Mu Online fansite muhq.com shut down around 2014. Today it is only browsable through the web archive.
- Armada Online had a "forum meltdown". Lots of useful forum threads were lost due to server malfunction (or a hack -- don't remember).
- Armada Online wiki just broke. Now it prints dozens of PHP errors and fails to display content. Compare what it is now and what it used to be.
- Warzone 2100 website had a security breach resulting in a huge data loss. I think they never fully recovered.
- torn-stars.com was offline for some time. Now it's back, but could go down again. I cannot easily download the site with wget because this PHP-based site is a total mess, archiving-wise.
- Some I-War related websites are dead. They are only available thanks to your tireless archiving efforts, and archive.org of course.
It is so disappointing to realize how many hours/days/weeks of creative work can vanish. Communities are often good at creating content and bad at preserving it for decades.
Problems
Some of the problems that lead to site extinction.
1. Lack of effortless replication
It is not trivial for a random visitor to create an exact copy of an entire public website.
Requirements for the replica:
1. First of all, it must be doable.
Some websites fail at this very first step by:
- Having junk in URLs, like session IDs "&sid=blabla" in forum URLs.
- Having poorly designed URLs. Compare:
www.torn-stars.com/index.php?option=com_...gory&id=13&Itemid=49
and
www.torn-stars.com/lore/01-the-sultanate-of-khalilistan
- Not using direct links for static files, relying on Content-Disposition.
- Having links generated by Javascript, which are not visited by the crawler.
- Having lots of "junk pages" like MediaWiki "Special" pages.
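To illustrate the session-ID problem: a mirroring script can sometimes work around it by normalizing URLs before deciding whether a page was already fetched. The following is a minimal sketch, not something the site or wget provides. The parameter names in `SESSION_PARAMS` and the function name `strip_session_params` are assumptions made for this example and would need adjusting per site.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: these query parameters carry only session state on the target site.
SESSION_PARAMS = {"sid", "phpsessid", "sessionid"}


def strip_session_params(url: str) -> str:
    """Return the URL with session-ID query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    # Rebuild the URL with the remaining parameters in their original order.
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

With this, two forum URLs that differ only in `&sid=...` map to the same key, so the page is stored once instead of once per session.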
Most websites fail here by:
- Not using relative links, which forces the mirroring tool to convert links, which in turn breaks synchronization.
- Hosting images on external domains which die a few years later.
3. It must be possible to completely recover a failed website from such replica.
Any website that does not expose the source of its content fails this requirement. For instance, any website engine that stores content in a SQL database, but does not expose a read-only connection to that database, fails. MediaWiki fails because it is not trivial to fetch the raw .mediawiki files that are rendered by the engine.
2. Lack of effortless incremental synchronization
Even if you manage to mirror a website with wget, how do you update your mirror when the origin changes? Re-downloading everything is not an option. Using wget as an example, incremental updates are only possible if the website takes care to:
- Properly serve Last-Modified HTTP header
- Use relative links, so wget does not have to convert the links, which alters file timestamps
- Avoid dynamically generated content (PHP, Javascript). Disqus comment section is a good example of horribly non-archivable content.
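The role of Last-Modified can be sketched as follows. This is only an approximation of a timestamp-based check, not wget's actual implementation; the function name `needs_download` is made up for this example. It also shows why a header that changes on every request (or is missing) forces a full re-download.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def needs_download(last_modified: Optional[str], local_mtime: Optional[float]) -> bool:
    """Decide whether a file must be fetched again.

    last_modified: value of the Last-Modified HTTP header, or None if absent.
    local_mtime:   modification time of the local copy (seconds since epoch),
                   or None if there is no local copy yet.
    """
    if local_mtime is None:
        return True  # nothing mirrored yet
    if not last_modified:
        return True  # server gives no timestamp, so skipping is not safe
    remote = parsedate_to_datetime(last_modified)
    if remote.tzinfo is None:
        # Treat a header without zone information as UTC.
        remote = remote.replace(tzinfo=timezone.utc)
    local = datetime.fromtimestamp(local_mtime, tz=timezone.utc)
    return remote > local
```

If the mirroring tool rewrites links inside a saved page, the local timestamp no longer matches the server's, which is why relative links matter for this check.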
3. Lack of integrity protection
How do you make sure your replica is not missing files? That no files were infected or damaged? How do you compare replicas made by Alice and Bob?
Replication and synchronization must be considered from the very first days of a website. Unfortunately they are completely ignored by most website authors today.
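The Alice-and-Bob question can be answered with hash functions. Below is a minimal sketch, assuming each replica is a plain directory tree on disk; the function names `sha256_of`, `manifest` and `compare_replicas` are made up for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large binaries do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest(root):
    """Map every file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in root.rglob("*")
        if path.is_file()
    }


def compare_replicas(root_a, root_b):
    """Return (only_in_a, only_in_b, differing) as sorted lists of paths."""
    a, b = manifest(root_a), manifest(root_b)
    only_a = sorted(a.keys() - b.keys())
    only_b = sorted(b.keys() - a.keys())
    differing = sorted(p for p in a.keys() & b.keys() if a[p] != b[p])
    return only_a, only_b, differing
```

Three empty lists mean the two replicas hold the same files with the same contents. A non-empty `differing` list does not say which side is damaged; that needs a third replica or a trusted list of hashes.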
Solution
Make it trivial to replicate the whole website. The more people mirror it, the higher the chances that our grandkids will play I-War with mods.
Tools to achieve this are distributed version control systems and hash functions.
Let's start with requirements for such a website:
- Distribute source files, not rendered pages. Source files do not include any "cruft" (navigation, sidebars, login or search forms, comment feeds, CSS, javascript, tracking, ads, etc).
- All visible content is generated from source files.
- Source files are in easily editable text formats (Markdown).
- Source files are stored in versioned repositories (Git).
- Dynamic content generation is avoided (Javascript, PHP). If used, source data is available in plaintext files (CSV or JSON database), which are also stored in repositories.
- The whole repository can be trivially replicated ("git clone"). If there are multiple, a script is provided to get them all at once.
- Incremental updates are trivial ("git pull").
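The "get them all at once" script from the list above could look roughly like this. It is a sketch under assumptions: the repository URLs are placeholders, and the names `plan_sync` and `run_sync` are invented for this example. It clones a repository that is not mirrored yet and pulls one that already is.

```python
import subprocess
from pathlib import Path


def plan_sync(repo_urls, dest_dir):
    """Return the git commands needed to bring a local mirror up to date."""
    dest = Path(dest_dir)
    commands = []
    for url in repo_urls:
        # Derive the local directory name from the last URL segment.
        name = url.rstrip("/").rsplit("/", 1)[-1]
        if name.endswith(".git"):
            name = name[: -len(".git")]
        target = dest / name
        if (target / ".git").is_dir():
            # Already cloned: incremental update only.
            commands.append(["git", "-C", str(target), "pull", "--ff-only"])
        else:
            commands.append(["git", "clone", url, str(target)])
    return commands


def run_sync(repo_urls, dest_dir):
    """Execute the planned commands, stopping at the first failure."""
    for command in plan_sync(repo_urls, dest_dir):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical repository list; the real one would live in the main repo.
    REPOS = [
        "https://example.org/iwar/knowledge-base.git",
        "https://example.org/iwar/server-scripts.git",
    ]
    run_sync(REPOS, "iwar-mirror")
```

Running the same script again later performs only the incremental "git pull" step for each repository.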
Implementation
Here I propose a specific implementation for i-war2.com.
The entire website is built from several repositories, each of which can be easily mirrored and updated.
1. Knowledge base repository
This is the primary repository with all knowledge and file metadata.
Contents include:
- One Markdown file per article, including news/blog [1]
- All images linked from articles
- JSON file database with metadata for Downloads [2]
- CSS styles to render pages
- Code and instructions to build the website
- Code and instructions to mirror all repositories necessary to fully replicate the website
[2] Pages in Downloads section have special requirements:
- They contain hashes of files.
- All binaries (see repository #3) can be verified against these hashes in one script, reporting any missing or broken files.
- A page is generated for every file.
Implemented as Git repo.
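The "verify in one script" requirement for Downloads can be sketched like this. The JSON layout (a list of objects with `path` and `sha256` keys) is an assumption made for this example, as are the names `sha256_of` and `verify_downloads`; the real metadata format is still to be decided.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large binaries do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(metadata_file, binaries_root):
    """Check every file listed in the metadata against its recorded hash.

    Returns (missing, broken): paths that are absent from the binary
    storage, and paths whose contents do not match the recorded SHA-256.
    """
    entries = json.loads(Path(metadata_file).read_text(encoding="utf-8"))
    root = Path(binaries_root)
    missing, broken = [], []
    for entry in entries:
        target = root / entry["path"]
        if not target.is_file():
            missing.append(entry["path"])
        elif sha256_of(target) != entry["sha256"]:
            broken.append(entry["path"])
    return missing, broken
```

Because the hashes live in the Git repository, anyone with a clone can check their copy of the binaries without trusting the server they downloaded them from.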
2. Source code repository for every mod or utility
It's always nice to have source code in addition to binaries.
Source code for your scripts to set up a game server also deserves a repository. "setting up DirectPlay on a Linux machine is the equivalent of hell" -- this work shall not be lost.
Implemented as Git repos.
3. Binary file storage
Stores all binaries (mods, utilities, screenshots, movies, etc).
Implemented as a directory structure with timestamped files. When served via HTTP or FTP, wget will only download new or changed files. Rsync is an option. GitHub has something for storing large files, which is worth investigating as well.
4. Recovery plan
To recover after failure (malfunction, site hacked, domain or hosting expired), several up-to-date replicas must be found to check against each other.
One way to find them is to "call for help" as you did on the forum [/forum/general-i-war-talk/3157-help-some-old-content-is-still-missing].
Another is to build a list of mirrors beforehand. This can be done as:
- a simple list of mirror URLs (example) or contact details of mirror maintainers
- in case of GitHub, list of people who forked a given repo (tracked automatically by GitHub)
Open questions
I have not yet deeply considered the following:
- How to store dynamic data? Things like download counters and file ratings.
- How to store user messages? Forum posts, replies, file comments, news comments.
Benefits
With these ideas implemented, dozens of tech-savvy I-War fans could replicate all available I-War knowledge, and, equally important, keep it up-to-date with trusted maintainers.
Using a platform like GitHub opens the door to contributions via pull request workflow. You could then review and merge big changes (like new FAQ articles) or small ones (like typo fixes) with a couple of clicks.
Conclusion
I think the long-term payoff is totally worth the effort and I wish all good old games communities strived for something like this.
P.S. Thanks for maintaining Tron! One of my all time favorites as well.
7 years 9 months ago - 7 years 9 months ago #20500
by schmatzler
Space. The final frontier.
Replied by schmatzler on topic A plan to immortalize I-War community heritage
While I appreciate that you took all the time to write this up, I think converting the Joomla-based installation, the forum and JDownloads to JSON data is total overkill for the community of an old game with not that many active users.
The whole website including databases is mirrored every day on a special backup machine. It has saved my ass countless times. In case of failure, I also store weekly backups on another drive. Your examples all looked like no backups had been made beforehand, so a data loss was fatal in the end.
A Git-based website is fine for projects like this, but I am hesitant to implement it here.
BTW, the sources and prerequisites for running the gameservers with wine are documented in these places:
www.ldso.net/tronforum/viewtopic.php?f=6&t=1242
appdb.winehq.org/objectManager.php?sClass=version&iId=7386
7 years 9 months ago - 7 years 9 months ago #20501
by palmer
Replied by palmer on topic A plan to immortalize I-War community heritage
An overkill indeed. I wanted to share a vision of a final ideal setup, while I perfectly realize that it may not be feasible here.
I agree about backups. Maybe the sites in my examples were just unlucky to have no backups. Intuitively I knew you do backups from reading the very first announcement.
And thanks for the links.
Somehow this site, and no other, triggered this writeup. Maybe I-War is too special for me. It turned out quite big, so thanks for taking the time to read!
Anyway, I wget-ed this site just in case. It mostly went well, with two minor issues. One was that files in "Downloads" were not saved with correct paths because of redirects from "/downloads/send/..." to the real location. But I figured out that with --trust-server-names wget uses the path after redirects. The other is that "Documents" have an always-changing Last-Modified, so updating them is not as nice as "Downloads" (which work fine).
7 years 9 months ago #20502
by Chessking
This is one tough navy, boy. They don't give you time off, even for being dead. -Clay
Storm Petrel
Replied by Chessking on topic A plan to immortalize I-War community heritage
I remember when I opened I-war2.com two years ago, and got a 404 not found error. I had considered backing up the site in the past, and was disappointed that I had not. Thankfully, the website was only down because Schmatzler was re-modeling it.
7 years 9 months ago #20503
by IronDuke
Very little about the game is not known to me. Any questions you got, throw them at me.
Replied by IronDuke on topic A plan to immortalize I-War community heritage
Did I seriously not check the forums for five days? :blink: I don't normally miss stuff... then again, this week has been overwhelmingly busy.
Skimmed yer wall o' text, and that plan sounds solid, I guess... I'm a gameplay coder, not a website coder. But what glimmers I do understand sound smart, so they must be good.
Welcome to the forums!
--IronDuke