This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Category infrastructure (continued)
This thread continues at a successor thread.
I'm starting a new thread for this. I started the original thread just over a year ago; by making a clean break, the old section can be allowed to age and be archived. (Automatic archiving of this page is currently set for a "mere" four months.) Here's a summary of what's happened so far:
- The basic proposal was to use prefixes on most category names, so it's clear what they are; we've had confusion for many years now over whether a given category is a book category or a subject category, and the difficulty of keeping track of which category is of what sort is probably why we've never even tried to have topic categories that could be used to classify book pages as well as books as a whole. I suggested subject categories should start with `Subject:`, book categories with `Book:`, and those other categories with some other prefix such as `Keyword:`. The third part is the most unknown territory, but it would also be the last thing done, so I figured there was no urgency yet about figuring it out.
- I also hoped to get wikidialog running on Wikibooks, and perhaps use it to help with some of these other infrastructure changes.
- Within the first half year after starting the topic here, I did get the subject categories renamed, and took advantage of the more uniform structure to make the subject categories echo the classified book lists of the subject pages themselves, so that when you click on a link to a subject category, it's pretty much just as useful as being taken to the subject page itself. I also got the dialog tools installed here, but the real challenge of dialog turns out to be learning how to use them to build assistants ("wizards"), and that is an on-going exploration being done elsewhere, so not all that much has been done here with dialog even though it's available.
- Almost exactly three months ago as I write this, I started on the renaming of book categories. To start with there were about 2800 book categories with old-style names, and now there are 2100; so if we continued at this rate it'd take another nine months to complete that phase of the operation. Several other Wikibookians have helped in various ways, which has made a real difference in how fast things have gone. Rate of the current tedious renaming operations may decline over a sustained period, which would seem to make the projected completion of this phase more than nine months off; but then, if some dialog-based semi-automation comes on line at some point, it could greatly speed things up. There are some detailed notes on how to do book-category renaming yonder.
- The complete old thread is over thar.
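The nine-month estimate in the summary above follows from a simple linear projection. A minimal sketch of that arithmetic, assuming the renaming rate stays constant (the function name `months_remaining` is illustrative, not something used on the wiki):

```python
def months_remaining(start_count, current_count, months_elapsed):
    """Project how many months are left if the observed rate holds steady."""
    renamed = start_count - current_count
    if renamed <= 0 or months_elapsed <= 0:
        raise ValueError("need positive progress over a positive time span")
    # remaining / (renamed / months_elapsed), rearranged to a single division
    return current_count * months_elapsed / renamed


# Figures from the summary: about 2800 old-style book categories at the
# start, about 2100 left after roughly three months of renaming.
projection = months_remaining(2800, 2100, 3)
print(f"{projection:.1f} months at the current rate")
```

As the summary notes, the real rate may drop during a long tedious stretch or jump if semi-automation arrives, so this is only a baseline.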
Going forward:
- My running progress report has experienced a glitch, but I'm starting to get a handle on it again. As I'd originally envisioned the process, book categories would be moved one at a time, with all attendant checks and processing, template {{BOOKCATEGORY}} would make sure category references referred to the old name until the move was implemented for each book, and one category would list the books not yet done while another would list those completed. That worked through more than a third of the books, but then — with all good intentions — almost all the remaining categories got moved, without any of the attendant checks and processing. On 4 October 2017, shortly before that happened, there were 1600 book categories under the old naming convention, and 1259 under the new. Now I've got a list of 1430 book categories that were renamed but still need to be processed, plus another 143 still using the old naming convention, a total of 1573 to go, and 2714 under the new naming convention, indicating 1284 completed. I expect to process the ones in the old-naming-convention category first, and then do the ones on my list starting from the bottom. I don't really have a baseline for whether this will go faster or more slowly than before.
- Done, as of 23:45, 16 January 2018 (UTC).
- I'm still thinking about the third stage of this operation: keyword categories. The purpose of such categories is the same as the old library card catalogs, that were mostly destroyed around the turn of the century on the (incorrect) assumption that automatic string search would do the job better — customized help with finding what works in the collection are related to what topics, based on considering each individual work. In effect, the old card catalogs were crowdsourced.
- In the process of all these book category renamings, I find I'm getting a remarkably wide view of our whole collection, even though I rarely get more than fleeting glimpses of specifics. I've long perceived that Wikibooks is a confederation of thousands of ultra-tiny microprojects that have banded together to share a common administrative infrastructure. What they have in common, at least statistically, is more-or-less described by WB:WIW, but of course any given book may be not only different from all the others, but may be different in a way that none of the others are different from each other. If we want to provide help that will improve books across large parts of the whole collection, we have to do it in a way that admits endless variations (in contrast to the more usual technique used by programming platforms, where everyone is required to conform to some uniform structure for which some tools are then provided). A general problem I can see applies to most of the roughly-three-thousand books in our collection is that a contributor doesn't know how to contribute: the book is created with some sort of vision of how the whole is to be organized and what the various parts are to contribute to that whole, but since this is different for each book, how is a would-be contributor to know what to do? There are lots and lots of books in our collection that have a few sections written and then a bunch of red links, and in order to write those missing parts someone will have to create their own vision of what ought to go there. On Wikijunior, it seems to me, the most successful long-term-growth books have quite well-defined formats for each page, such as Wikijunior:World Religions which has a standard list of questions to address for each religion (in fact, an earlier version of the book basically stalled for years because of flaws in the list of questions). Most books don't have modules as uniformly structured as that, of course.
--Pi zero (discuss • contribs) 13:21, 21 July 2017 (UTC)
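The tallies in the progress report above can be cross-checked with a few lines of arithmetic. This is only a sketch; the variable names are made up for illustration, and the numbers are the ones quoted in the report:

```python
# Counts quoted in the progress report.
renamed_unprocessed = 1430   # renamed, but checks and processing still pending
old_convention = 143         # still under the old naming convention
new_convention = 2714        # all categories now under the new naming convention

# Everything that still needs work: pending renamed ones plus old-style ones.
to_go = renamed_unprocessed + old_convention

# Fully completed: new-style categories minus those still awaiting processing.
completed = new_convention - renamed_unprocessed

print(to_go, completed)  # 1573 1284
```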
- Thanks for the update. Several years ago I created some topic categories for pages, though not only for pages. They have been deleted here, but on the French Wikibooks you can admire the example of the database category, which contains not only the DBMS books but also the database management pages from the PHP and Python books. This structure suits me, and I can't see any reason to split it with different prefixes like `Books subject:` and `Pages subject:`. Moreover, `Keyword:` looks like a referencing concept, which would for example imply categorizing the page translating the term "database" from English to French into [[Category:Keyword:Database]], so the targeted readers would get many false positives (e.g. programmers vs. linguists).
- Concerning Wikidialog, I didn't test it but believe that we could adapt ourselves to it, like its alternative: the JS lib MW:OOjs UI usable in our gadgets. JackPotte (discuss • contribs) 16:19, 21 July 2017 (UTC)
- @JackPotte:
- Regarding "subject" and "keyword" categories, there is a messy problem of terminology mixed up in it, and because of that, it might not be clear what I'm trying to do.
Many years ago, en.wb organized its books using bookshelves. Like other centralized organization systems I've seen on the wikis, the bookshelves arrangement apparently didn't work out. We later set up a hierarchy of "subject pages", with their own namespace, using DPLs (dynamic page lists), which are updated automatically from categories specified on the main pages of the individual books; and apparently that arrangement, automatically self-updating from specifications on the individual books, caused the bookshelves to fall out of use. (Root subject.) Later I upgraded the subject pages so that they would automatically list not only books in a particular subject, but also all books in any of its descendants in the hierarchy. The whole subject hierarchy, though, is still a means of classifying books, not pages. It is much like sections of a library; you know, science books are on the second floor, physics books are on that floor in the three aisles closest to the north stairwell, etc. I've sometimes regretted that we use the name "subject" for these hierarchical groupings of books, because it means the word "subject" isn't available for us to use to refer to anything else. I wouldn't think it'd be practical to change the name now; it must be hardwired into page references all over en.wb, all over the wikimedian sisterhood, and all over the internet.
The fact that we don't have categories that correspond to Wikipedia articles (whereas en.wn does have categories corresponding, roughly, to Wikipedia articles) is limiting both to the ability of our readers to find things, and to the ability of our sister projects to provide incoming links to our material. I often find myself setting up sister links from a topic category on Wikinews, and I can't provide a Wikibooks link because there simply isn't any one page here to target. String searches are always lousy compared to searches based on human-generated information about the semantic content of individual resources. There is a problem about what to call these, though. We have several different functions performed by categories here on en.wb, and I find prefixes like `Book:` and `Subject:` are enormously effective in making it instantly clear what the functions of particular categories are (I'm extremely pleased by the results of these category renamings, so far), but the question becomes what prefix to use with these categories that could list any content page, not just the main page of a book. I thought of `keyword:` as a possibility because it's common for academic pages to provide a list of "keywords" describing topics to which the page is relevant; but those "keywords" are often really key phrases, of more than one word. The word "subject" might be an obvious choice, if it wasn't already taken, since the old library card catalogs were "subject catalogs". The word "topic" comes to mind, but, really, wouldn't it be awfully confusing to have two kinds of categories where one is called a "subject category" and the other a "topic category"?
- An important part of my purpose in developing the dialog tools is to put all page interactivity under the direct control of wiki markup. This is a very big deal for me, as a matter of wiki philosophy; I see the Foundation explicitly trying to shift focus away from wiki markup and I see that as uniformly bad for the future of the entire sisterhood. I somehow found time to write an essay on that (link).
- --Pi zero (discuss • contribs) 20:42, 21 July 2017 (UTC)
- @JackPotte: Perhaps a better choice than `Keyword:` would be `Tag:`. See w:Tag (metadata). --Pi zero (discuss • contribs) 15:58, 22 July 2017 (UTC)
- Why not? Now let's decide how to add them. I was thinking about {{BookCat|tag1|tag2}} because it's short and simple.
- Another, longer solution would be (as {{Tag}} already exists) a dedicated template, which would also be able to bring an anchor to the tag-related paragraph. JackPotte (discuss • contribs) 19:42, 22 July 2017 (UTC)
- @JackPotte: Some time ago when our working plan was `keyword:`, I see you suggested a template `{{k}}`. If we use `tag:`, I predict some people using a template called {{tag}} despite the fact it does the wrong thing; and I wouldn't care to use `{{t}}` since it's so close to `{{tl}}`. As for {{BookCat|tag1|tag2|...}}, that template already has a parameter, I forget what it does, and I've been thinking for some time I'd like to find and eliminate all uses of that parameter so I could remove it from the template and replace it with a parameter that works uniformly with our other related templates such as {{BOOKCATEGORY}}, which would still be incompatible with {{BookCat|tag1|tag2|...}}.
Another difficulty with piggy-backing on {{BookCat}} is that it's really not meant to be used on the main page of a book; it's designed to not cause a problem when used there, but the main page of a book is really the only page associated with a book where {{BookCat}} isn't meant to be used; one uses {{subjects}} there, instead (and {{subjects}} is designed to degrade gracefully when used elsewhere, but really isn't meant to be used except on a book main page). --Pi zero (discuss • contribs) 22:17, 22 July 2017 (UTC)
Template renamings
Checking here before doing the following renamings of some magic-word-like templates; unless I get early opposition, I mean to move ahead quite soon.
- rename `Template:FULLBOOKNAME` to `Template:NAIVEBOOKNAME`
- rename `Template:BOOKNAME` to `Template:NAIVEBOOKSTEM`
- rename `Template:ROOTBOOKNAME` to `Template:BOOKNAME`
This even looks strange to me, when put baldly like this.
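The order of the three renamings matters, because the old BOOKNAME has to be moved out of the way before ROOTBOOKNAME can take over its name. A minimal sketch of that constraint follows. It models templates as a plain set of names and ignores the redirects that a real page move leaves behind; `apply_renames` and the starting set are illustrative assumptions, not wiki tooling.

```python
def apply_renames(existing, renames):
    """Apply (old, new) renames in order, refusing to overwrite a live name."""
    names = set(existing)
    for old, new in renames:
        if old not in names:
            raise KeyError(f"{old} does not exist")
        if new in names:
            raise ValueError(f"target {new} is still occupied")
        names.remove(old)
        names.add(new)
    return names


# Illustrative starting state: the three templates to rename, plus one
# that stays as it is.
templates = {"FULLBOOKNAME", "BOOKNAME", "ROOTBOOKNAME", "CHAPTERNAME"}

# The proposed order: BOOKNAME is vacated before ROOTBOOKNAME moves onto it.
plan = [
    ("FULLBOOKNAME", "NAIVEBOOKNAME"),
    ("BOOKNAME", "NAIVEBOOKSTEM"),
    ("ROOTBOOKNAME", "BOOKNAME"),
]

result = apply_renames(templates, plan)
print(sorted(result))
```

Running the last step first would raise `ValueError` in this model, since BOOKNAME would still be occupied by the older template.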
For years we have had two sets of magic-word-like templates for decomposing page names, with clashing naming conventions.
The first set of templates were created in late 2007, with the names BOOKNAME, FULLBOOKNAME, CHAPTERNAME, and FULLCHAPTERNAME. Likely these were intended to provide more book-oriented decomposition of page names than the built-in magic words that are generic to any wiki. Imho the only one of these that really did quite what it said was CHAPTERNAME.
A second set of templates were created starting in mid 2009, based on the idea of templates that orient to the book associated with a page (not only if the page is in the book, but if it's an associated category or template). The first of these was called ROOTBOOKNAME, only because the name "BOOKNAME" was already in use. This set has continued to grow, a particularly important one recently being {{BOOKCATEGORY}}; but it has been a continuing thorn-in-the-side that the name BOOKNAME was occupied by an older-style template.
In recent times I worked out that FULLCHAPTERNAME was really a device for producing a book sort key, and renamed it to {{BOOKSORTKEY}}. Nothing needs changing about CHAPTERNAME. Over the past few days, I've eliminated all (but one) use of the old BOOKNAME. We generally don't delete old infrastructure, we just mothball it or in some cases history-merge it; and I wouldn't care to merge the histories of BOOKNAME and ROOTBOOKNAME as they overlap and would produce a confusing mess, so I figured to shunt the old BOOKNAME aside to a name that wouldn't do any harm, for which I have suggested NAIVEBOOKSTEM. I do think NAIVEBOOKNAME is a much more accurate description of what the old FULLBOOKNAME template does. And "BOOKNAME" really is a much more natural name for the thing currently being called "ROOTBOOKNAME". --Pi zero (discuss • contribs) 19:28, 21 January 2018 (UTC)
- For what it's worth, I certainly have no objection if you want to do the work. If you need help, please ping me. —Justin (koavf)❤T☮C☺M☯ 05:47, 22 January 2018 (UTC)
- This has now largely been Done; the main vestige of the old naming, at this point, is uses of redirect "ROOTBOOKNAME", which I am slowly changing to "BOOKNAME". --Pi zero (discuss • contribs) 15:55, 24 January 2018 (UTC)
- good work. Artix Kreiger (discuss • contribs) 16:14, 24 January 2018 (UTC)