All right, here's the extremely long story short: I made an archive for Discussions to put on the wiki, just as the other forums like this one have archives. Over the course of this discussion, I intend to create these archive pages under the Forum namespace and enlist everyone's help in perfecting their structure and making what was first just a dream a not-so-frightening reality.
Background
For years we've wanted to archive Discussions forum posts in a searchable database. The platform has always been limited in navigation, making old posts hard to find and maintain without serious community know-how and grassroots cross-linking efforts. Without a broad sense of what has already been discussed on the forum, many conversations get repeated to the point that they become cyclical and less engaging. For the quality of discussions on the forum and the sanity of those who use it, it's important to keep things fresh, or at least focused, by keeping our history alive.
Permanent threads have long been the way to go, but they too become hard to find and maintain after a while. Archive posts exist as well to bundle links, but we moderators are the only ones who can edit those old posts when the links need to be updated or moved around. I've experienced this firsthand with my own Community Projects & Permanent Threads archive, which we traded out for other link-tree threads later on.
Givinname changed the game completely with the ThreadHub project, compiling permanent threads in a network that could be easily navigated from a Master Hub root post. It's the best tool we have to this day, and nothing even comes close. Even that tree of links has its limitations, however, and click counts become a huge deterrent the further a search goes. The forum is actually searchable on a base level, but it's hard to come up with search terms for post titles, even with the bracketed titles and emoji characters we've systematized.
The dream continued of a true archive. Then I made one in under an hour.
I had been looking into web scraping for a long time, figuring the only way to really index the whole forum would be to prompt some server to scroll down to the literal bottom of the post feed. Then one day back in December I decided to "inspect element" and stumbled across the actual database pages the forum calls when the scroll feed loads more posts. All the URLs look more or less like this and load up JSON files. The only parameters I needed to mess with were the "pivot" number (which was taken from the URL of the most recent Discussions thread to be posted), the "limit" of posts to load at a time (default 20, maximum 1000), and the "page" number (page 2 loads the most recent bunch of posts made before those in page 1).
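As a rough illustration of those three parameters, here is a minimal Python sketch that assembles one such URL. The base address is a made-up placeholder, since the real endpoint isn't reproduced here, and `build_page_url` is a hypothetical helper name rather than anything from my actual program.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: the real endpoint is not given in this post.
BASE_URL = "https://example.invalid/discussions/posts"

MAX_LIMIT = 1000  # maximum posts per request, per the description above


def build_page_url(pivot, page=1, limit=20, base_url=BASE_URL):
    """Return the URL for one page of posts, counting back from the pivot."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    # Clamp the limit to the range the description above says is accepted.
    limit = max(1, min(limit, MAX_LIMIT))
    query = urlencode({"pivot": pivot, "limit": limit, "page": page})
    return f"{base_url}?{query}"


if __name__ == "__main__":
    # Page 2 holds the posts made just before those on page 1.
    print(build_page_url(pivot=123456, page=2, limit=1000))
```

Raising "limit" to the maximum keeps the number of requests low, which matters when walking back through the whole history.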
I wrote a program to load up the forum, find the most recent post, then loop through those database pages all the way until I hit one that didn't contain any post data at all. Along the way I had the program write to a wikitext file, throwing the basic post data into a table that I could simply split up and copy into pages. I created the {{FP}} "Forum Post" and {{FU}} "User Posts" templates for shorthand links, nabbed an existing JavaScript table for the ability to show and hide posts by category, and pretty much finished everything right there.
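The loop described above can be sketched roughly as follows. This is not my actual program, only an outline under assumptions: the "posts" key, the post field names, and the parameter order of {{FP}} and {{FU}} are all guesses for illustration, and `fetch_page` stands in for whatever function downloads and parses one JSON page.

```python
from typing import Callable, Iterable, Iterator


def iter_posts(fetch_page: Callable[[int], dict]) -> Iterator[dict]:
    """Yield posts page by page until a page comes back with no post data."""
    page = 1
    while True:
        data = fetch_page(page)
        posts = data.get("posts", [])  # "posts" is an assumed key name
        if not posts:
            break
        yield from posts
        page += 1


def to_wikitext_table(posts: Iterable[dict]) -> str:
    """Render posts as a wikitext table using the FP and FU shorthand templates."""
    lines = ['{| class="wikitable"', "! Date !! Category !! Post !! User"]
    for p in posts:
        # Template parameter order is an assumption, not the real template spec.
        post_link = "{{FP|" + str(p["id"]) + "|" + p["title"] + "}}"
        user_link = "{{FU|" + p["user"] + "}}"
        lines.append("|-")
        lines.append(f"| {p['date']} || {p['category']} || {post_link} || {user_link}")
    lines.append("|}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Stand-in data so the sketch runs without touching the network.
    fake_pages = {
        1: {"posts": [{"id": 1, "title": "Example thread", "user": "Someone",
                       "date": "2024-02-19", "category": "G"}]},
        2: {"posts": []},
    }
    print(to_wikitext_table(iter_posts(lambda n: fake_pages[n])))
```

Passing the fetch function in as a parameter keeps the loop separate from the download step, so the table output can be checked against sample data first.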
Archive Layout
Eight years and a hundred thousand posts were piled into a 20 MB text file. Since the activity of the forum has varied so greatly over time and the size limit for a wiki page is 2 MB, some pages could fit an entire year's worth of posts, while others could only fit the posts made in a single month. Below is the best arrangement I could come up with that minimized both the size and number of pages. This chart will double as a template placed at the bottom of each individual archive page.
[Chart: Discussions archive navigation template]
In this case, years 2015, 2017, and 2018 would all simply link to their respective archive pages, since the sets of Discussions posts made in those years fit nicely on a single page each. Busier years are split up, so instead of containing a table of posts themselves, pages like "Forum:Discussions archive 2016" would contain a directory of subpages for each two-month period, such as "Forum:Discussions archive 2016/January–February" and so on. It's easier to search by year first before narrowing down: there are eight Januaries throughout Discussions history, but only one 2016. The only exception is 2020, which has a subpage for every single month.
These page names derive from the older Trash compactor archive category pages, which are also split up by year. Note that in further alignment with the other forum pages, a main "Forum:Discussions" page should be created, which we mods as well as trusted community members would most likely use as a space to pin essential posts and spoiler discussions, continuing the existing work to maintain permanent threads. That can be worked out later since much of that page's content would be transcluded from the existing Discussions policy page. For now I'd like to present the archival content as-is.
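For anyone curious how the year, two-month, and single-month split falls out of the 2 MB limit, here is a small Python sketch of the idea. I arranged the real pages by hand, so treat this as an approximation: `plan_year` is a hypothetical helper, the byte counts are whatever the wikitext for each month weighs in at, and the single-month subpage names are assumed to follow the same pattern as the two-month ones.

```python
PAGE_LIMIT = 2_000_000  # approximate wiki page size limit (2 MB), in bytes

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def plan_year(year, month_sizes, limit=PAGE_LIMIT):
    """Return archive page names for one year, using the coarsest split that fits.

    month_sizes maps a month number (1-12) to its wikitext size in bytes.
    A single month larger than the limit is not handled in this sketch.
    """
    root = f"Forum:Discussions archive {year}"
    sizes = [month_sizes.get(m, 0) for m in range(1, 13)]

    # Quiet year: everything fits on the year page itself.
    if sum(sizes) <= limit:
        return [root]

    # Busier year: try two-month subpages.
    pairs = [sizes[i] + sizes[i + 1] for i in range(0, 12, 2)]
    if max(pairs) <= limit:
        return [f"{root}/{MONTHS[i]}–{MONTHS[i + 1]}" for i in range(0, 12, 2)]

    # Busiest year: one subpage per month (naming pattern assumed).
    return [f"{root}/{name}" for name in MONTHS]


if __name__ == "__main__":
    # Made-up sizes purely to show the three outcomes.
    print(plan_year(2015, {m: 100_000 for m in range(1, 13)}))
    print(plan_year(2016, {m: 300_000 for m in range(1, 13)})[0])
    print(plan_year(2020, {m: 1_500_000 for m in range(1, 13)})[0])
```

Trying the coarsest split first is what keeps the page count down while still staying under the size limit.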
Archive Pages
Archive Header
Here's the example archive page I've drawn up. First I created a header bar akin to {{Shtop-arc}} for these pages. Probably no elaboration needed there aside from the fact that I modified all these instances to remove redlinks.
Archive Sorting
Then comes the secondary header chart, which is a restructured version of the one on the Timeline of canon media article. Forum users once had a filtration tool to select which forum categories they wanted to see on their end, but Fandom changed the feature, and now it only allows for viewing a single category feed at a time. Thanks to this chart, filtration is back!
| Announcements (A) | Star Wars News (SWN) | Memes/Fun (M/F) | Community Projects (CP) | Polls/Favorites (P/F) |
|---|---|---|---|---|
| Featured (F) | Fan Creations (FC) | Theory/Analysis (T/A) | Hall of Fame (HOF) | General (G) |
Archive Content
Here's where it all goes down. This was the data my program grabbed and wrote in table format. I've only included a handful of posts for conciseness, easy testing of the filters, and some good reads.
The last point of note is that these pages shouldn't actually be "archived," only protected. This is really a live index for everyone rather than an archive, so someone needs to be able to update these lists every once in a while to reflect anything old that we mods might find to clean up, delete, or restore. That someone would be me for now, since it's my code that built the archive, but being able to improve the index itself will certainly give members of the forum more incentive to use it. I am extremely proud of this project and feel that it's ready to become the defining landmark in Discussions history. Y'know, because it's a record of Discussions history.
Discuss
If there are no major holdups to the idea and structure of these pages, I'd like to start putting them up since I already have them written. Eventually I may write to the Consensus track about locking in this archive's complete layout for posterity. Please pass along as many questions and suggestions as you can think of. I'm excited to see what old threads the community digs up. Long live Project Swarm! Jedi Sarith LeKit (talk) 06:14, 19 February 2024 (UTC)
- This sounds like a good idea. Rsand 30 (talk) 13:39, 19 February 2024 (UTC)
- There would need to be a CT before these could actually be implemented under the Forum namespace, but I can't imagine the archive will raise many objections. Fan26 (Talk) 14:00, 19 February 2024 (UTC)
- Good to know, thanks. Figured that was for the best, and that'll help me set up some lingering work so it's all ready to go at once when the time comes. Usually I just get a "nah, you're good" from the admins when I bring these ideas up to them—more their expression of support and simply expecting it from everyone else as you do. Jedi Sarith LeKit (talk) 15:42, 19 February 2024 (UTC)
- Sorry if I'm being dense, but why are Discussions threads being archived on Wookieepedia rather than in Discussions? This seems like valuable work, but work that has nothing to do with Wookieepedia or its mission. Asithol (talk) 16:42, 28 February 2024 (UTC)
- There isn't much overlap between the wiki and the forums, but Discussions are a part of the Wookieepedia community. Taking advantage of the formatting capabilities of our wiki side is the best way to keep an index of Discussions threads. OOM 224 (he/him) 00:27, 2 March 2024 (UTC)
- I don't really have the authority to say this, but as I see it, the idea that there's Wookieepedia and then there's Discussions off to the side has been done away with. It's more like Discussions has its own corner. That shift has mostly been spread over Discord—mostly by me—so the perspective from an editor outside of that space isn't dense at all. I honestly hadn't thought about it, but consider that an effect of my intent to implement this work quietly. I don't want to force it on anyone, since I know Discussions was tacked on by Fandom, but with that in mind, I'd much rather have Wookieepedia's Discussions forum under our authority than Fandom's. On paper it should be just another forum subspace like the others here on the wiki. In practice it can be left alone by people who are just here to edit. It's just impossible to put this archive in the Discussions space without rendering it completely unusable. Jedi Sarith LeKit (talk) 23:02, 8 March 2024 (UTC)
- I do like this. But, for "…so someone needs to be able to update these lists every once in a while to reflect anything old that we mods might find to clean up, delete, or restore"—only thing I'd note about this is to define how to maintain this with a small list of instructions somewhere so the process remains evergreen and so new people/admins know how to do it, if it ever becomes necessary.—spookywillowwtalk 06:04, 29 February 2024 (UTC)
- Sure, the easiest way to do that is just to put the code itself at the bottom of the "Forum:Discussions archive" root page, or maybe a subpage. It's just over a hundred lines long, and to document new/edited/deleted posts after some time has passed, all anyone needs to do is run it on their own computer, and it'll write a text file of every post in a wikitext table that can simply be split up and copied over to the archive pages. Jedi Sarith LeKit (talk) 23:02, 8 March 2024 (UTC)