igenlode: The pirate sloop 'Horizon' from "Treasures of the Indies" (Default)
[personal profile] igenlode
Following on from yesterday's labours, I think I have now also completed the task of going through and adding the tags film-review and british-film to all the entries between July 2014 and December 2007.

NB one weird Dreamwidth feature I discovered -- far too late to be of any use! -- was that posts which are accessed by clicking on their titles have a "previous entry"/"next entry" link at the bottom only, which means an awful lot of scrolling when scanning through years' worth of posts in consecutive order.
Posts which were originally accessed by clicking on a date from the archive view get a URL in the form of a date rather than a link to the specific post, and when viewed in that format they acquire a convenient "previous day"/"next day" link at the top of the post!

For example, the archive page for September 2010 gives you a clickable link on the post title "I Write Like...", but you can also click on the number 1 above that in order to view all posts written on 1 September 2010.
And you get two different URLs from those two links, which present the same entry in two very subtly different formats.

https://igenlode.dreamwidth.org/2521.html
https://igenlode.dreamwidth.org/2010/09/01/

So subtly different that I had never noticed them at the point when it would actually have been very useful to have the top-of-page link!
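
For anyone wanting to hop between the two views by hand, here is a minimal Python sketch of how the two URL forms are put together. The function names `entry_url` and `day_url` are made up for illustration, and the patterns are simply read off the two example URLs above rather than taken from any Dreamwidth documentation.

```python
from datetime import date


def entry_url(journal: str, post_id: int) -> str:
    """Link to one specific post (the form you get from clicking a title)."""
    return f"https://{journal}.dreamwidth.org/{post_id}.html"


def day_url(journal: str, day: date) -> str:
    """Link to everything posted on one day (the form you get from the archive date)."""
    return f"https://{journal}.dreamwidth.org/{day:%Y/%m/%d}/"


print(entry_url("igenlode", 2521))
print(day_url("igenlode", date(2010, 9, 1)))
```

Note that nothing here converts one form into the other: the day view is addressed purely by date, so the post ID still has to be looked up separately (for instance by hovering over the title).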


Although to be fair it wouldn't actually have been all that helpful to know, since I require the literal post URL in order to be able to edit it -- well, I don't absolutely require it, but it saves an awful lot of typing to plug in the "2521" to the standard edit-my-post URL, rather than going the "Edit Entries->Certain Day" route. Yet another side effect of the weird browser/site issue that means Dreamwidth doesn't treat me as being logged in unless I'm actually in the 'login' page and accessing posts via that sidebar!

I can get the actual URL by hovering over the post title text, though, and it still saves time overall to be able to flip quickly through the posts without having to scroll all the way down every single page; it's generally obvious whether the content is relevant to the subject I'm checking for or not. In fact it turns out that almost all the film content occurred in that beginning section anyhow -- my attendance at the IMDb dropped off dramatically over the years in question, as did my impulse to write about those films I did see :-(


At any rate, I think I've now reviewed, in many cases multiple times, every blog post I wrote over those seven years -- including a few that were originally set to Private but which after fifteen years or so I feel might as well be visible. (Since I don't have a paid-for account, I don't actually have any means of filtering my own content by security setting, so I have no means of listing all the Private entries to see which of them are still sensitive; one of the ones I happened to come across very definitely was, most of the rest in this time period weren't.)

I did a bit of tag hygiene in addition to putting in tags retrospectively; I removed/merged misspelt tags and those with zero usage, and deleted a few of the ones which were used only once and which could usefully be rolled into more general categories that had been established later on. (I don't need a separate tag for individual films, for example, which I was evidently doing for the first few I wrote about.) I did add a couple more general tags to help cover the various completely untagged posts, a number of which were just brief sentences of musing of the sort that I would probably post to Facebook rather than Dreamwidth these days.

Current tag count: 249.
Although of course, once you start having too many entries categorized under a single generic tag then Dreamwidth doesn't display the earlier ones when you click on it, so they effectively become non-searchable -- I still can't find this officially documented in the FAQ, but I seem to remember the silent limit was mentioned as being several hundred, and I got back to about 800 in my last mass-tagging attempt.



Random geeky observation: as a result of typing in all those post-ID numbers for editing purposes, I realised (unsurprisingly) that the ID numbers increase according to the order in which the posts were actually uploaded to Dreamwidth rather than the date order of the posts. So my very earliest DW posts around August 2010 have only 3 digits. The backdated ones from 2007 have five-digit IDs, and the latest batch I've been uploading into early 2008 have six-digit IDs; if you could be bothered, it would be possible to identify the previous dates on which I engaged in MySpace uploading by identifying the discontinuities in the numbering! (This sort of thing is what did for Harold Shipman.)
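
The "spot the discontinuities" idea could be sketched roughly as follows in Python. This is only an illustration: the function name `find_batches`, the `max_gap` threshold and every ID in the sample list are invented (chosen merely to have three, five and six digits), and nothing here relies on how Dreamwidth actually allocates IDs.

```python
def find_batches(post_ids, max_gap):
    """Group post IDs into upload batches.

    IDs are sorted, then a new batch is started wherever the jump
    between neighbouring IDs is larger than max_gap.
    """
    batches = []
    for pid in sorted(post_ids):
        if batches and pid - batches[-1][-1] <= max_gap:
            batches[-1].append(pid)   # small step: same upload session
        else:
            batches.append([pid])     # big jump: a new session begins
    return batches


# Invented IDs standing in for three separate upload sessions
ids = [310, 580, 845, 41200, 41460, 41733, 284100, 284367]
for batch in find_batches(ids, max_gap=1000):
    print(batch[0], "to", batch[-1], f"({len(batch)} posts)")
```

The threshold is a judgment call: too small and a single session splits into several, too large and separate sessions merge.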

I read somewhere on the Web that Dreamwidth reuses each 'post ID' about 40 times before moving on to the next one (because of course those IDs are all relative to my blog address, and it would be perfectly possible to have a post uploaded to https://igenlode.dreamwidth.org/284367.html and another one uploaded to https://pedanther.dreamwidth.org/284367.html -- those are completely independent URLs). Which would account for the fact that the multiple entries I posted within a few minutes/hours of each other yesterday all have consecutive IDs, but with the actual number increasing by 100 or so each time -- presumably other Dreamwidth users were busy making some four thousand posts (100 IDs at about 40 uses apiece) over the same interval!

It seems a slightly odd way of doing things, but I suppose it avoids the ridiculously long URLs (and the extra indexing space needed to store internal links) that would result from assigning every post from every user a globally unique ID. It also requires less individual tracking than incrementing each user's own post ID by 1 every time that user posts: since posts can be backdated, you can't safely look at the ID of the most recent single post and increment that, so you would have to dedicate storage to recording the current 'counter' for every single site user, including ones that have been dormant for years or never posted at all.

So apparently they generate them as a steadily-incrementing stream and assign each new post-ID a certain number of times before generating a new one. Probably time-based rather than usage-based, at a guess, as they need to avoid the possibility of creating duplicate IDs where a single user makes several posts within a short period. (An automated import process might create multiple posts a second on the same account, even... although they may have a special system for that, since otherwise it would look like spammy activity!)

[Edit: actually, that doesn't make sense, because it was *LiveJournal* that I joined in August 2010 -- those low post-IDs from that date were part of precisely such an automated import process from LiveJournal, and Dreamwidth itself had been in existence long before July 2015, when I actually created this blog (preceded by all those imported entries). In other words, Dreamwidth on 8 July 2015 cannot possibly have been generating post-IDs across all users with values as low as "439.html"!
I can't be bothered to bend my head around this any longer...]

Date: 2022-09-12 09:42 am (UTC)
pedanther: (Default)
From: [personal profile] pedanther
Strictly speaking, clicking on the date link gets you a subset of your journal containing only the posts with that date. The distinction might be difficult to spot if you choose a day with only one entry and that entry doesn't have any content cuts, but here's an instance where it's easier to see: https://igenlode.dreamwidth.org/2022/01/20/

Page generated 23 May 2025 02:07 pm