Help talk:Citation Style 1


Add wayback-timestamp parameter

When an archive is added to a reference, the vast majority of the time it is just a Wayback Machine archive of the exact same URL. This bloats the source code of pages massively. It would be much simpler if a wayback-timestamp parameter were added, which would be set to the timestamp found in the archive's URL. This was mentioned seven years ago here but the discussion had no conclusion. Example: |wayback-timestamp=20200721125421 in {{cite web|url=http://example.com/page|title=Example page|website=Example.com|date=2020-08-04|wayback-timestamp=20200721125421|archive-date=2020-07-21}} as opposed to the bloated {{cite web|url=http://example.com/page|title=Example page|website=Example.com|date=2020-08-04|archive-url=http://web.archive.org/web/20200721125421/http://example.com/page|archive-date=2020-07-21}}. Implementation: if waybackTimestamp then archiveUrl = 'http://web.archive.org/web/' .. waybackTimestamp .. '/' .. url end.  Nixinova T  C   05:15, 4 August 2020 (UTC)
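The assembly step proposed above can be sketched in a few lines. This is only an illustration in Python, not the cs1|2 module code (which is Lua); the function name build_wayback_url and the 14-digit validation are assumptions made for the example.

```python
WAYBACK_PREFIX = "http://web.archive.org/web/"  # prefix as written in the proposal


def build_wayback_url(url: str, timestamp: str) -> str:
    """Assemble a Wayback Machine snapshot URL from |url= and a timestamp.

    Hypothetical helper: it assumes the timestamp is the 14-digit
    YYYYMMDDhhmmss value copied from an existing archive link.
    """
    if not (len(timestamp) == 14 and timestamp.isascii() and timestamp.isdigit()):
        raise ValueError("expected a 14-digit Wayback timestamp")
    return f"{WAYBACK_PREFIX}{timestamp}/{url}"


print(build_wayback_url("http://example.com/page", "20200721125421"))
```

Note that the result depends entirely on the current value of |url=, so any later change to that value changes the assembled archive link as well.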

You're not wrong, but the thing is, server space keeps getting cheaper and cheaper, and programmer time (paid or volunteer) keeps getting scarcer and more expensive. If you had to prioritize this against stuff that's either broken and needs fixing, or enhancements that would provide desired new functionality, well, you see the problem... Mathglot (talk) 04:00, 6 August 2020 (UTC)
I'm not saying to replace all archive-url's with this, just add it as an additional option.  Nixinova T  C   07:47, 27 August 2020 (UTC)
Bumping, I still think this is a good idea. Wayback is what most people use for archives, and this would save many kilobytes per page. Other archiving services could be used with archive-url without touching this syntax, but this would be very useful at minimising the size of references in a page's source.  Nixinova T  C   03:23, 2 October 2020 (UTC)
@Nixinova: Have you tried |wayb? For example, {{cite web |url=http://example.com/page |wayb=20200721125421}}? Note that when using |wayb you don't need to use |archive-date because it extracts the date from |wayb. I think that is an undocumented feature, I discovered it reading some page source to understand the inner workings. Joaopaulo1511 (talk) 08:05, 15 October 2020 (UTC)
@Nixinova: Sorry, |wayb works on Portuguese Wikipedia, but not on English Wikipedia. Check pt:ReactOS (page source) to see what I am talking about. The |wayb argument is documented here pt:Predefinição:Citar_web#URL and on other Portuguese citation templates. @Mathglot: The way I see it, one day of a coder's work can save editors many months by not having to repeat wiki code over and over, and also help them read (and edit) faster by uncluttering the pages' source. And the code is already there, on the Portuguese Wikipedia, just waiting to be copied. 😅 Joaopaulo1511 (talk) 08:55, 15 October 2020 (UTC)
I like the idea in general, but not the proposed user-interface. I am not too fond of the idea of adding a specialized parameter |wayback-timestamp= or |wayb= just for archive.org. Also, these parameter names would not fit well into our parameter naming scheme. An alternative proposal, which works without introducing a new parameter, is discussed here: Help_talk:Citation_Style_1/Archive_72#Smart_substitution_token_to_reduce_redundancy_among_input_parameters
It is slightly longer (which shouldn't matter, as in both cases the full archive link must be available for truncation before adding it to a citation - basically no one types in archive links or timestamps without utilizing copy & paste), but it is more flexible (also possible for some other archivers) and it would be embedded into a more general concept potentially reducing the necessary amount of typing also for a number of other citation template parameters. Of course, both could be implemented in parallel, but for reasons of consistency across citation templates (similar to our ((accept-this-as-it-is)) syntax) I would prefer the broader concept of a smart substitution token.
--Matthiaspaul (talk) 11:09, 15 October 2020 (UTC)

(edit conflict)

Previous discussions:
Those seem to focus on all archive sources, whereas |wayb= is specific to Internet Archive. Because we have InternetArchiveBot I would guess that the vast majority of |archive-url= parameters hold wayback urls. If that is the case then perhaps there is some sense in supporting |wayb= or similar. But, for me, it is easier to copy/paste an entire archive url than it is to highlight 14 digits in the middle of the archive url and then copy/paste that. So that suggests, if the goal is to make life easier for editors, when |archive-url= holds a properly formed Internet Archive url, cs1|2 can extract the date from the 14-digit timestamp and return a YYYY-MM-DD archive date to be formatted according to |df= or {{use xxx dates}}.
I'm not all that comfortable with automatically assembling an archive url from |url= and an editor-supplied timestamp. Any change that 'fixes' the url will likely break the assembled archive-url.
Trappist the monk (talk) 11:14, 15 October 2020 (UTC)
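The date extraction described above might look roughly like the following. It is a sketch in Python rather than the module's Lua; the function name archive_date_from_url is hypothetical, and the optional two-letter flag after the timestamp (for example id_) is an assumption about Wayback URL variants. Formatting the result according to |df= is left out.

```python
import re
from datetime import date
from typing import Optional

# 14-digit timestamp, optionally followed by a two-letter flag such as "id_"
# (the flag handling is an assumption, not taken from the discussion)
_WAYBACK_RE = re.compile(r"^https?://web\.archive\.org/web/(\d{14})(?:[a-z]{2}_)?/")


def archive_date_from_url(archive_url: str) -> Optional[date]:
    """Return the snapshot date embedded in a Wayback URL, or None."""
    match = _WAYBACK_RE.match(archive_url)
    if match is None:
        return None  # not a properly formed Internet Archive URL
    stamp = match.group(1)
    try:
        return date(int(stamp[0:4]), int(stamp[4:6]), int(stamp[6:8]))
    except ValueError:
        return None  # digits present but not a real calendar date


snapshot = archive_date_from_url(
    "http://web.archive.org/web/20200721125421/http://example.com/page"
)
print(snapshot.isoformat() if snapshot else "no date found")
```

With something like this, the editor pastes the whole archive URL unchanged and |archive-date= becomes optional for archivers whose links carry a timestamp.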
Moreover, this would make the bot's work more complex. Solid "we should not do this". --Izno (talk) 13:07, 15 October 2020 (UTC)
Assembling archived links from a prefix, a timestamp and a URL is hardly "complex", it's trivial to code. However, there is, as Trappist correctly wrote, a risk of breaking the archived link when the URL gets modified later on. So, this whole idea depends on such timestamps being adjusted or removed whenever |url= is touched, or on them being replaced by the expanded link in |archive-url= again. However, failing to update |archive-url= when modifying |url= is almost always an error, even without this proposal. What we'd lose is the "known good state" of an already existing |archive-url= when the |url= undergoes only minor tweaking (like removing unnecessary URL parameters). As bots not updated to take |wayback-timestamp= (or similar) into account would likely just add |archive-url=, the failure mode is on the safe side if the template gives |archive-url= priority over |wayback-timestamp=. In the case of the placeholder idea, an |archive-url= containing a * would not match and would likely be overwritten by the bot when it changes |url=. It's not 100% bullet-proof over the transitional phase, but little actual damage can be done, so this aspect alone should not invalidate the idea, IMO.
In general, we should not have "mercy" with bots. They are to make life easier for humans, not the other way around. Programs exist to code once, solve often. For as long as the work required to code a program is smaller than the accumulated amount of work that would be required to repeatedly solve a problem manually, the difficulties to code and maintain a bot are worth it.
--Matthiaspaul (talk) 09:58, 16 October 2020 (UTC)
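The "safe side" precedence mentioned above, where an explicit |archive-url= wins over a timestamp, can be illustrated as follows. This is a hedged Python sketch; resolve_archive_url and its argument names are hypothetical and only mirror the parameter names used in this thread.

```python
from typing import Optional

WAYBACK_PREFIX = "http://web.archive.org/web/"


def resolve_archive_url(
    url: str,
    archive_url: Optional[str] = None,
    wayback_timestamp: Optional[str] = None,
) -> Optional[str]:
    """Pick the archive link for a citation.

    An explicit archive_url always wins, so a bot that only knows
    |archive-url= can add one without the timestamp getting in the way.
    """
    if archive_url:
        return archive_url
    if wayback_timestamp:
        return f"{WAYBACK_PREFIX}{wayback_timestamp}/{url}"
    return None  # no archive information supplied


# A bot changed |url= and added a fresh |archive-url=; the stale timestamp is ignored.
print(resolve_archive_url(
    "http://example.com/new-page",
    archive_url="http://web.archive.org/web/20201001000000/http://example.com/new-page",
    wayback_timestamp="20200721125421",
))
```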
I think, the goal of these proposals, as far as archive links are concerned, is to reduce clutter in citation source code (URLs tend to be long and ugly), less so to save storage space (because it doesn't matter much) or reduce the amount of typing (as the parameter value would be crafted from a pasted archive link rather than typed in manually).
In the case of |archive-date=, the goal is actually to reduce typing and maintenance time. Although this is only addressing a minor aspect of both proposals, making |archive-date= optional for |archive-url= links from archivers known to include timestamps would be something I would support as well. Wikipedia:List of web archives on Wikipedia lists a number of archivers producing links with embedded timestamps.
--Matthiaspaul (talk) 09:58, 16 October 2020 (UTC)

last-author-amp=

This documentation edit reminds me that |last-author-amp= should be deprecated in favor of a new parameter with a better name. We do not have |last-contributor-amp=, |last-editor-amp=, |last-interviewer-amp=, or |last-translator-amp= parameters. When |last-author-amp=yes, any of the other name lists that have two or more names will use the ampersand separator between the last two names in the list.

What is the new parameter name? |last-name-amp= is problematic for obvious reasons. |last-sep-amp=? Or, something different, perhaps: |namelist-last-sep=<keyword> where <keyword> is & or amp or and; possibly other keywords? Still needs the new parameter name and keyword definitions.

Trappist the monk (talk) 19:53, 19 August 2020 (UTC)

How about |author-ampersand=, |editor-ampersand=, etc.? Spelling out "ampersand" is a bit awkward, but its meaning is clearer than "amp". The |xxx-ampersand= model is easily extensible to other parameters, such as those listed above. The documentation could make it clear that the parameter, when set to "yes" or "y", renders an ampersand between the final two author/editor/translator names. – Jonesey95 (talk) 21:16, 19 August 2020 (UTC)
|last-author-amp= applies to all name lists even when there are no names in the author name list:
{{cite book |title=Title |translator=Translator |translator2=Translator2 |last-author-amp=yes}}
Title. Translated by Translator & Translator2. Cite uses deprecated parameter |last-author-amp= (help)
This mechanism makes sense to me because the name lists in a citation should all render with the same style. A single parameter name not closely tied to a particular name list seems to me better than renaming |last-author-amp= and creating four aliases of that – I can imagine editors adding an (unnecessary) alias parameter for each name list in the citation...
Trappist the monk (talk) 21:37, 19 August 2020 (UTC)
I've been wondering of late whether this parameter is strongly needed at all. But that aside, I'd go for |namelist-last-sep=<keyword> or similar. --Izno (talk) 21:45, 19 August 2020 (UTC)
I confess to wondering the same, but it exists and were we to take it away, no doubt, no doubt, torches, pitchforks, ...
Trappist the monk (talk) 21:50, 19 August 2020 (UTC)
My mistake. I would support something like |name-list-ampersand= then. And I would not be excited about an open-ended var option for the separator. The last thing we need around here is more citation variation, let alone within CS1 templates. – Jonesey95 (talk) 21:56, 19 August 2020 (UTC)
There was this discussion: Help talk:Citation Style 1/Archive 44 § Is there any interest... I thought I remembered more than that one but it appears that my memory is faulty.
Trappist the monk (talk) 22:18, 19 August 2020 (UTC)
(edit-conflict) If we switch to use a different parameter, I think, it should be one not only allowing the feature to be enabled or disabled, but to actually specify the separator as well. That would be your proposed |namelist-last-sep=, although, I think, that name is too complicated (and contains an abbreviation not all people will understand). The {{catalog lookup link}} template uses |list-leadout= for this. Given that it would apply to all name lists, |leadout-separator= or just |leadout=/|lead-out= could work as well (but could be easily confused with the |postscript= parameter).
Is there a chance that we'd need to specify alternative leadouts also for other lists in the future? Then, the parameter name should be chosen in a way already taking such extensions into account, namewise. However, the only other lists at present are identifier lists and pages — I don't see any possible need to divert from the default separation schemes there, hence, no issue.
However, there are other options as well:
If, for example, we would want to get rid of a parameter, the functionality could be merged into one of the existing parameters
  • |name-list-format= (either through a new token such as "amp", or by just taking all string values except for "vanc" as the actual leadout string — however, in the latter case, the parameter name should be changed to become more meaningful again)
or
  • |display-<names>= (either using negative values -1, -2, etc. to use & instead of the default leadout, or any string values other than "etal" to define the leadout string — in the latter case, the feature could not be used in combination with actually display-truncated lists, and in both cases, the parameter name may need to be changed as well).
If the feature is only rarely used, it could even be emulated manually using |<name>-maskn=, but this would give more options than necessary including some undermining the feature, so it would only be an option for occasional use.
--Matthiaspaul (talk) 23:01, 19 August 2020 (UTC)
I don't think that I like |list-leadout= because leadout seems rather more jargon-ish than most cs1|2 parameters. I don't particularly care for |namelist-last-sep= for the same reason.
The language list uses <space>and<space> (two languages) and ,<space>and<space> (three+ languages). I see no reason to change that.
I do rather like |name-list-format=amp and |name-list-format=and because that parameter applies to all name lists. amp and and will not conflict with vanc because Vancouver style only supports comma separators between names.
I don't think that name-list separators have anything to do with the purpose |display-<name-list>= serves (and negative numbers are just too cryptic). As it works now, |display-<name-list>= causes cs1|2 to ignore |last-author-amp=. I think that this is probably the correct action to take when both parameters are present.
Trappist the monk (talk) 00:12, 20 August 2020 (UTC)
I forgot about the language list, but, like you, I don't see any need for a change there.
I mentioned |display-<names>= only for completeness and because it also deals in some way with the last name in a list, but I completely agree with you that semantically it has a very different purpose. (Talking about it, this reminds me that these parameters would be better named |authors-display=/|editors-display= than |display-authors=/|display-editors=, to follow the naming scheme of most of the other modern parameters, which differentiate on the left rather than the right side.)
I, too, find |name-list-format=amp[ersand]/and/vanc a good name for the purpose (and much better than |last-author-amp=yes), and also like the idea of limiting the choices to a few hardwired tokens instead of allowing this parameter to accept free text.
--Matthiaspaul (talk) 10:44, 20 August 2020 (UTC)
We really should rename |name-list-format= to something shorter, like |nf= (which is short for name format) in parallel to |df= (which is short for date format). Headbomb {t · c · p · b} 19:58, 23 August 2020 (UTC)
In the sandbox I have extended |name-list-format= to allow the additional keywords amp and and:
  • {{cite book/new |title=Title |author=Black |author2=Brown |name-list-format=amp}}Black & Brown. Title.
  • {{cite book/new |title=Title |author=Black |author2=Brown |name-list-format=and}}Black and Brown. Title.
  • {{cite book/new |title=Title |author=Black |author2=Brown |author3=Red |name-list-format=amp}}Black; Brown & Red. Title.
  • {{cite book/new |title=Title |author=Black |author2=Brown |author3=Red |name-list-format=and}}Black; Brown; and Red. Title.
|last-author-amp= still works:
  • {{cite book/new |title=Title |author=Black |author2=Brown |last-author-amp=yes}}Black; Brown. Title. Unknown parameter |last-author-amp= ignored (|name-list-style= suggested) (help)
I wonder about the punctuation for and. It looks odd to me without the name separator in the three-name list:
  • Black; Brown and Red
or
  • Black; Brown; and Red
Which is better? more correct?
Trappist the monk (talk) 10:59, 4 September 2020 (UTC)
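For readers comparing the variants, the sandbox renderings listed above can be reproduced with a short sketch. This is Python for illustration only, not the Module:Citation/CS1 code; format_name_list is a hypothetical name, and the three-name "and" form follows the "Black; Brown; and Red" rendering shown in the list.

```python
from typing import List, Optional


def format_name_list(names: List[str], style: Optional[str] = None) -> str:
    """Join names with '; ' and apply the 'amp' or 'and' keyword to the last pair."""
    if len(names) < 2 or style not in ("amp", "and"):
        return "; ".join(names)  # default: plain semicolon-separated list
    head, last = names[:-1], names[-1]
    if style == "amp":
        return "; ".join(head) + " & " + last
    if len(names) == 2:
        return head[0] + " and " + last  # two names: no separator before 'and'
    return "; ".join(head) + "; and " + last  # three or more: serial semicolon


for style in ("amp", "and"):
    print(format_name_list(["Black", "Brown"], style))
    print(format_name_list(["Black", "Brown", "Red"], style))
```

Dropping the "; " in the last return statement gives the other candidate, "Black; Brown and Red", so the choice between the two is a one-line decision.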
MOS has a preference for the Oxford/serial comma, which I think reasonably extends to our use of the semicolon. --Izno (talk) 14:35, 4 September 2020 (UTC)
The following links indicate that a serial-semicolon analogue to the serial comma exists, although it can't be exactly common (I cannot remember ever having seen this in the wild and it looks quite odd to me):
Given that our specific use case here is a list of names and the fact that corporate names may include the conjunction "and" as well, I nevertheless tend to prefer the second form to avoid ambiguities. This would also be consistent with the way the language list works at present.
Or go yet a bit further by generalizing the parameter |name-list-format= into |list-format= (also shorter per Headbomb), adding another token like "serial", and (despite what we both wrote above) apply the setting to both, name and language lists with "serial" being the default (also in the "vanc" case)?
--Matthiaspaul (talk) 15:54, 4 September 2020 (UTC)
Tweaked to use ; and for name-lists of three or more but your point about corporate names would also suggest the same tweak for two-name lists and also for name-lists that use the ampersand.
As part of this change, in ~/Configuration for i18n I created sep_nl_and and sep_nl_end in presentation {} and have renamed:
  • parameter-separator → sep_list
  • parameter-final-separator → sep_list_end
  • parameter-pair-separator → sep_list_pair
These were in messages{} but I have moved them to presentation {} where they more properly belong. This change applies to the |language= list and error-message lists. I had hoped that I could use a common function to handle the writing of name lists and language lists but |<name-list>-mask=<text> heaves a spanner into the works because the rendered value from text-masked names uses a space character as a separator. I may still write that function so that at least the language-name and error-message lists can share common code.
Also as part of this change, and unrelated to it, I added require('Module:No globals') which I'm pretty sure used to exist in one of the modules though I can't now find where that was ... This addition brought to light a handful of items that oughtn't to have had global scope so I have marked those items local.
This parameter is for name lists so its name should reflect that; vanc has no meaning for language or error-message lists.
Trappist the monk (talk) 22:14, 5 September 2020 (UTC)
In Module:Citation/CS1/Utilities/sandbox I have created list_make() as the common function that makes a comma-separated list (other separators possible) with selected coordinating conjunction. This function is now used to render certain error messages and to render the languages list:
{{cite book/new |title=Title |chapter=Chapter |section=Section}}
"Chapter". Title. More than one of |section= and |chapter= specified (help)
{{cite book/new |title=Title |page=1 |pages=23–24 |at=¶6}}
Title. p. 1. More than one of |pages=, |at=, and |page= specified (help)
and the language list:
{{cite book/new |title=Title |language=ale}}Title (in Aleut).
{{cite book/new |title=Title |language=cop, la}}Title (in Coptic and Latin).
{{cite book/new |title=Title |language=nv, chy, zun}}Title (in Navajo, Cheyenne, and Zuni).
This one illustrated here because the error message may be assembled in two modules:
{{cite book/new |title=Title |year=2002 |date=2001 Dec 2}} – assembled in Module:Citation/CS1/Date validation/sandbox and Module:Citation/CS1/sandbox
Title. 2001 Dec 2. Check date values in: |date= and |year= / |date= mismatch (help)
{{cite book/new |title=Title |date=2001 Dec 2 |url=//example.com |access-date=2001}} – assembled in ~/Date validation/sandbox
Title. 2001 Dec 2. Retrieved 2001. Check date values in: |access-date= and |date= (help)
Excepting the coordinating conjunction, date error messaging renders differently from the live messaging for the same errors (separator font):
Title. 2001 Dec 2. Retrieved 2001. Check date values in: |year=, |access-date=, |date=, and |year= / |date= mismatch (help)
Title. 2001 Dec 2. Retrieved 2001. Check date values in: |year=, |access-date=, |date=, and |year= / |date= mismatch (help)
Trappist the monk (talk) 17:33, 11 September 2020 (UTC)
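The behaviour described for list_make() (one item, a pair joined by the conjunction, three or more with a separator plus conjunction) can be sketched as below. This is a Python illustration under assumptions: the real function lives in Module:Citation/CS1/Utilities/sandbox and its signature is not shown in this thread, so make_list_sketch and its arguments are hypothetical.

```python
from typing import List


def make_list_sketch(items: List[str], sep: str = ", ", conjunction: str = "and") -> str:
    """Join items into a readable list with a coordinating conjunction."""
    if not items:
        return ""
    if len(items) == 1:
        return items[0]
    if len(items) == 2:
        return f"{items[0]} {conjunction} {items[1]}"  # pair: no separator
    # three or more: separator between all items, conjunction before the last
    return f"{sep.join(items[:-1])}{sep}{conjunction} {items[-1]}"


print(make_list_sketch(["Aleut"]))
print(make_list_sketch(["Coptic", "Latin"]))
print(make_list_sketch(["Navajo", "Cheyenne", "Zuni"]))
```

Passing sep="; " yields the serial-semicolon form discussed for name lists, which is why a single helper could serve both the language list and the error-message lists.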
Just for reference sake, deprecation will cause a change to about 36k pages. --Izno (talk) 17:04, 4 September 2020 (UTC)
Yep, know about that. I have a bot task pretty much ready to go. In testing that task I learned that it is almost never the case that all cs1|2 templates in an article that could make use of |last-author-amp= (those cs1|2 templates that have two or more names in a name-list) actually have |last-author-amp=. These came from the top of my article list from my testing a week or more ago:
Belarus – 1 use in 18 eligible templates
India – 2 uses in 88
Barack Obama – 1 use in 82
Australia – 1 use in 27
Ronald Reagan – 6 uses in 33
It will, I think be the rare case that every eligible template in an article uses |last-author-amp=.
Alas, BRFAs require test runs so until the deprecation goes live (which includes the new keywords for |name-list-format=), there isn't much progress to be made.
Trappist the monk (talk) 17:50, 4 September 2020 (UTC)
Given that so many pages need to be touched (but can be fixed up by a bot), I actually think we should change the parameter name |name-list-format= into |list-format= (regardless of if we add the "serial" token or not), so that we don't have to change them all again at a later stage.
Meanwhile I actually think we should add the "serial" token as well to allow citations to blend in perfectly with a pre-existing list style in articles.
--Matthiaspaul (talk) 15:24, 5 September 2020 (UTC)
Going through the parameter list, the term "format" is currently used for three different things:
* To specify the document format of URL links with |format= and variants like |archive-format=, |chapter-format=, |section-format=, |entry-format=, |article-format=, |conference-format=, |contribution-format=, |event-format=, |lay-format=, |transcript-format=
* In the |name-list-format= parameter above
* (Indirectly in the |df= ("date format") parameter)
Therefore, in our attempt to improve the consistency of parameter names, I think, we should change the |name-list-format= to something not containing the term "format" any more. Existing usage of |name-list-format=vanc amounts to some 6.5k citations, but if we have to run a bot on 36k entries anyway, before we hammer it into stone forever, another 6.5k edits doesn't really matter, if we thereby reach a higher level of consistency.
Probably the easiest choice would be |name-list=, but this might be misleading. We have |mailing-list= and |series-separator= already. |name-list-separator=? |list-separator=? |separator-style=? |name-list-style=? |list-style=? Opinions?
--Matthiaspaul (talk) 10:28, 9 September 2020 (UTC)
|series-separator= was apparently invented for an early lua version of {{cite episode}}. I can't find where it was actually used in the wikitext version of that template. When I migrated {{cite episode}} to the module suite, |series-separator= was not included. And then came the great separator purge with the invention of |mode=. I'm astonished that |series-separator= survived the purge (an indication of too damn many parameters?). I will remove it and its meta-parameter.
When we invented |mode=, my preferred name for that parameter was |style=. That was rejected, in part, because it would be the same as the html style= attribute.
And |df= is date format, not data.
Trappist the monk (talk) 11:37, 9 September 2020 (UTC)
I was wondering what that is - this explains why I didn't find anything regarding |series-separator=... ;-)
type is in use as well already.
|separator-mode=? |name-list-mode=? |list-mode=?
--Matthiaspaul (talk) 12:28, 9 September 2020 (UTC)
But if it can't be |name-list-type= because |type= then it can't be |name-list-mode= because |mode=, right?
Trappist the monk (talk) 12:35, 9 September 2020 (UTC)
Almost. ;-) It would have to be |cite-mode= then (another 6.9k hits)... (the old problem of too unspecific parameter names biting again) ;->
In the case of mode the two settings are at least both switching between different ways how citations are rendered, whereas in the case of type, the pre-existing usage of the parameter is to specify the media type ("Video") or formal document type ("Essay", "Report"), something not even remotely related to a list style in the citation itself.
I still like style; while it is true that we should try to maintain consistent parameter names across Wikipedia, I think it is even more important to at least reach a logical and consistent parameter naming scheme among the citation templates. So, if we don't find something linguistically and semantically more pleasing, I would still opt for something ending in -style - and if a temporarily confused editor were to accidentally throw HTML at it, this wouldn't cause harm but just return an error message.
BTW. The old thread was Help_talk:Citation_Style_1/Archive_7#Display_parameters:_do_we_need_them?
Any other suggestions?
--Matthiaspaul (talk) 20:40, 9 September 2020 (UTC)
Two-and-a-half weeks have passed without an answer. As we need to find a good new name for the parameter before the pending update of the template (because otherwise, the bot task would hammer the -format name into stone forever), I have continued to look for alternatives. Some remarks:
  • |name-list-format= is inadequate for our purpose, because semantically, -format implicitly deals with input data. Also, as detailed above, we have an otherwise consistent established use for this already, so we really should use something different here.
  • |name-list-mode= could be a good choice, but then we should move the existing |mode= to |cite-mode= or similar (and leave |mode= as an alias for it). Semantically, -mode affects some internal configuration of the template and possibly the output, so while it would fit into a future parameter class |-mode= for all kinds of mode settings, it is not a perfect match.
  • |name-list-style= is linguistically very pleasing and semantically a well-suited name, as -style implies that this parameter somehow deals with output data. The HTML argument against |style= does not really apply, as our parameter would be named |name-list-style= rather than just |style=.
  • |name-list-appearance= is, like |name-list-style=, linguistically and semantically well-suited, but quite long.
  • |name-list-display= might be a good choice as well, in particular if we also switch the semantically misleading |display-names= parameters to the |name-display= form, which are semantically better suited and in compliance with our parameter naming conventions to list the input "type" last and disambiguate on the left side. Switching these names (and keeping the older ones as aliases for now) would considerably improve the consistency in documentation and make it easier to remember the parameter names. |name-list-display=vanc/and/amp would fit in the group of |author-display=0/n/etal/|editor-display=0/n/etal, etc. parameters if we define the -display as a parameter class to change the appearance of a citation and not change the template's internal configuration.
Other synonyms I came up with were linguistically or semantically worse.
My order of preference is (in descending order): |name-list-display=, |name-list-style=, |name-list-mode=
Which one should we choose?
--Matthiaspaul (talk) 21:27, 27 September 2020 (UTC)
I'm perplexed. Here you complain that "the bot task would hammer the -format name into stone forever", yet elsewhere on this page you appear to anticipate that |title=none will be redefined in future. If the "[hammered] ... into stone" argument applies to the one, it must also apply to the other.
"semantically, -format implicitly deals with input data." Really? Where do you get that notion?
If we must choose another name (I'm not yet convinced that we must), I would choose |name-list-style= because this |<noun>-<verb>= parameter in combination with its assigned value, instructs cs1|2 how to style the name lists.
Trappist the monk (talk) 14:54, 29 September 2020 (UTC)
Trappist, thanks for taking the time to think about it and your answer. Having thought about the various parameter classes and their possible future extensions for another two days, I have also come to the conclusion that |name-list-style= is the best name, and that the argument regarding a possible clash with the HTML style= attribute can be ignored here.
Regarding format being associated with input data, I had hoped that my "implicitly" would make it clear that this was meant in the context of our usage in citation templates; all the other parameters using format describe input data, |name-list-format= is the sole exception. In general, format can be associated with output data as well, of course, but at least not with internal states such as mode. While the name is "bearable" and we are used to just use what is given, if, in our attempt to improve the user interface for normal users, we seek for the most-suitable parameter name fitting into our naming scheme, such nuances or subtleties are important to become aware of. Does this make things clearer? It is also possible that not all people have the same associations... ;-)
There is no reason to be perplexed: If we keep the |name-list-format= name, your bot task will hammer it into 36k articles since we merge |last-author-amp= into this parameter. The number would be much too high to carry out this change manually (and also non-negligible for a bot), but fortunately we have your bot task. Now, if we use |name-list-style= instead, your script will have to edit another 6.5k articles (not much of an addition for the bot, therefore acceptable), but in the end we'd have a parameter name which does not clash with other semantically considerably different uses of parameters of the -format class (as discussed above), and if we were to have other settings only affecting the output we could use the -style class for them as well. If we skip this chance to rename the parameter, and decide later that |name-list-format= needs to be changed, we would have to run a bot just for this task on 42.5k articles (which might be too much to be acceptable). So, doing it now, we can "save" 36k edits. That's why I think we should not skip the chance. (Even if we want to freeze the code now for the update and cannot come to a decision before it, I think we should include it in the update, because if we decide against it, we can still silently remove it again in the next update, whereas if we don't include it and then decide to use it, we would have to delay the deprecation of the |last-author-amp= parameter for another quarter.)
(Regarding redefining |title=none, that's a completely different case (best discussed in the other thread), but IIRC it only affects some 1k cites, so it is even possible to achieve manually.)
--Matthiaspaul (talk) 18:04, 29 September 2020 (UTC)

One comment regarding any change: remember that {{harv}} et al. use an ampersand. In articles that repeat references to the same book, I put the full citation on first reference and then use {{harvp}} for subsequent references, akin to how The Chicago Manual of Style shortens subsequent footnotes to a previously used source. If |last-author-amp= weren't available, I'd run into an inconsistency where full citations and shortened citations in the same reference list won't do similar things. (See footnotes 40 [full] and 51 [shortened] or footnotes 50 [full] and 55–57 [shortened] in Michigan State Trunkline Highway System for an example in just one article. Every eligible footnote should be using |last-author-amp= as well.) Imzadi 1979  00:46, 6 September 2020 (UTC)

I don't understand the point you are attempting to make here. It appears that you think that the |last-author-amp= functionality is going to go away because that parameter will be deprecated. Not true. |last-author-amp=yes shall be replaced with |name-list-format=amp. Writing your example citations using the sandbox:
{{cite web/new |url = http://www.michiganhighways.org/history.html |title = The History of Roads in Michigan |last1 = Pohl |first1 = Dorothy G. |last2 = Brown |first2 = Norman E. |name-list-format = amp |publisher = Association of Southern Michigan Road Commissions |date = December 2, 1997 |access-date = September 11, 2008 |page = 1 }}
Pohl, Dorothy G. & Brown, Norman E. (December 2, 1997). "The History of Roads in Michigan". Association of Southern Michigan Road Commissions. p. 1. Retrieved September 11, 2008.
{{harvp|Pohl|Brown|1997|p=3 }}
Pohl & Brown (1997), p. 3
How does that not give you what you want? Or are you silently complaining about the possible inclusion of a name separator with the ampersand: ; & so the {{cite web}} would render like this:
Pohl, Dorothy G.; & Brown, Norman E. (December 2, 1997). "The History of Roads in Michigan". Association of Southern Michigan Road Commissions. p. 1. Retrieved September 11, 2008.
My prospective bot task reports that all eligible cs1|2 templates in Michigan State Trunkline Highway System are using |last-author-amp=yes. Seems peculiar to me that the long-form cite is for page 1 but the short-form cite is for page 3.
Trappist the monk (talk) 01:27, 6 September 2020 (UTC)
Monkbot task 17; BRFA
Trappist the monk (talk) 16:07, 4 October 2020 (UTC)
Yes. I, too, am wondering about the circumstance in which a previous editor, for a citation containing three authors, invoked the name-list-style=amp parameter, producing this: Last1, First1; Last2, First2 & Last3, First3. It just looks weird to me. — Christopher, Sheridan, OR (talk) 06:28, 2 November 2020 (UTC)

Guidance about indexing by first name?[edit]

Is there any guidance about how to handle instances where authors should be indexed by first rather than last name? E.g. Chinese names where family name comes first, or Thai names where given name (which comes first) is the polite term of address? For example, should I call a Thai given name "last=" so the correct name comes first, as you would see in an index? Calliopejen1 (talk) 17:00, 17 September 2020 (UTC)

If you are uncomfortable using first/last in such cases, you may use |given= and |surname=. --Izno (talk) 17:50, 17 September 2020 (UTC)
What do you mean by indexed?
Whatever name you give |last= or |surname= will appear first in the rendered citation. |first= or |given= always follows and is separated from |last= or |surname= with a comma and a space character. The only way to get cs1|2 to render a person's names in a particular order with particular punctuation is to do it manually with |author=. The same applies to the other name lists (contributor names, editor names, interviewer names, translator names). But none of this has anything to do with indexing.
What do you mean by indexed?
Trappist the monk (talk) 18:00, 17 September 2020 (UTC)
I assume that an author name in a citation should be rendered in the way it would be listed in an index, which is what I'm referring to. There are plenty of external guidelines about this, e.g. Chicago Manual of Style 16.76-16.87. Thai names should appear in an index by first/given name. To respond to Izno, simply using given/surname doesn't work for Thai names because the given name is what they should be referred to by, though it comes first. I suppose I could just do author=, but then I would need to add ref={{harvid|first|year}} because short-form citations (which should use only the given name) wouldn't work properly. Calliopejen1 (talk) 18:10, 17 September 2020 (UTC)
Before electronic indexing this was important. Indeed, citation element order followed the indexing in printed reference works. The primary index often being published main-author-name with publish-date being a secondary index. Today though such reference works are electronic databases with flexible options regarding indexing and sub-indexing (the present discussion). Which makes the positioning of citation elements more of a presentation issue. There is however an existing guideline: present the author name the way you saw it published. Presumably, that would be the easiest way to find it. The parameter |author= fits the bill. 65.88.88.69 (talk) 18:38, 17 September 2020 (UTC)
I agree that it is a presentation issue, but I don't think that presentation is unimportant. For example, I wouldn't want us to be using the wrong part of the name in short-form citations because {{harvnb}} links to "last"/"surname" by default. That would be akin to doing a short-form citation with "Melissa" or "Jennifer" (i.e. inappropriate). And highlighting the wrong portion of the name through inversion is also odd, as is alphabetizing a work in the wrong place in a works cited list. I do think that "author" combined with ref= is probably the way to go. I'm not sure if any other cultures have this particular issue that can't be sorted out by doing given/surname. Possibly it's unique to Thai names.... Calliopejen1 (talk) 18:48, 17 September 2020 (UTC)
...existing guideline: present the author name the way you saw it published. Is there? Where?
Trappist the monk (talk) 18:46, 17 September 2020 (UTC)
It is in the same page where it is said that titles should render as published. We are not allowed to be creative with most citation elements if we want verification to be as easy as possible. There are presentation options with dates for example (within the given dating system). But when one is trying to present a date in a foreign system, it is better to do so verbatim. 65.88.88.69 (talk) 19:23, 17 September 2020 (UTC)
What page is this, out of curiosity? Also interested in the dating issue -- should we be giving Thai solar calendar dates for Thai sources? That seems pretty unhelpful to readers, who may want to know at a glance what year a work was published (i.e. is it an up-to-date source or not?). I checked two Thai works on Worldcat, and one had no date, while another had a Gregorian date. I assume the dates in Thai library catalogs are the usual Thai solar calendar dates though... Calliopejen1 (talk) 19:34, 17 September 2020 (UTC)
I was referring to the general guidelines re: verification. It was not my intent to be mysterious or snarky, and hopefully it will not be seen so. The question the way I understand it, is how to present foreign terms to an English-speaking audience for purposes of verification. Doesn't this answer itself? The technicalities of implementation (the parameter "author", custom short reference anchors etc) will then present themselves in the discussion. 65.88.88.69 (talk) 20:00, 17 September 2020 (UTC)
I don't have access to the on-line CMOS but a cursory look-through of this copy of "Indexes" 15th edition (different chapter number but apparently same title) seems to indicate that "Indexes" is about indexes, not about citation style. But, yeah, if the effect you are wanting to achieve is given name followed by surname and linkable from a short-form template, then |author=<given> <surname> and |ref={{sfnref|<given>|<year>}} will do that. You might want to leave <!--<hidden comments>--> so that editors who visit the article after you have finished with it know your intent.
Trappist the monk (talk) 18:46, 17 September 2020 (UTC)
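To illustrate why the custom |ref= is needed in the |author= case, here is a minimal Python sketch of how a short-form anchor might be composed. It assumes, as a simplification, that the anchor is the string "CITEREF" followed by up to four name parts and the year; the function name citeref_anchor is hypothetical and the real module applies more rules than this.

```python
def citeref_anchor(names, year):
    """Compose a CITEREF-style anchor (simplified assumption, not the module's code)."""
    # Assumption: only the first four names contribute to the anchor.
    return "CITEREF" + "".join(names[:4]) + str(year)


# Anchor that a short-form cite such as {{harvp|Pohl|Brown|1997}} would look for.
surname_anchor = citeref_anchor(["Pohl", "Brown"], 1997)

# With |author=<given> <surname>, a short cite by given name needs a matching
# anchor, which is what |ref={{sfnref|<given>|<year>}} supplies.
given_anchor = citeref_anchor(["Given"], 2020)
```

The point of the sketch is only that the short-form template and the full citation must agree on the same string, so whichever name part the short cite uses has to be the one fed into the anchor.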
I agree it is about indexes. But where we have works cited lists, I assume we want them alphabetized in the same way/order they would appear in an index, no? Isn't that implicit in our inversion of first/last names? Calliopejen1 (talk) 18:49, 17 September 2020 (UTC)
Yeah, generally, per WP:CITE we sort by surname – that guideline seems to be mute on the topic of non-western name order. But, this is Wikipedia; I have seen (western) given-name-first reference lists sorted by surname. Why would anyone do that? I don't know, but, as long as it is consistent in the article, WP:CITEVAR protects that style.
The topic of non-western-name-order comes up here periodically. We just haven't determined how-best to deal with it. It is complicated because transliterations of Chinese and Japanese names are apparently not reversible – it is possible to transliterate a name to Latin script but not possible to transliterate back to the original – so 'properly' supporting these kinds of names is more than just rendering the transliterated names without the inversion indicator (comma).
Trappist the monk (talk) 19:15, 17 September 2020 (UTC)
See also:
--Matthiaspaul (talk) 04:14, 22 September 2020 (UTC) (updated 19:33, 29 October 2020 (UTC), 02:45, 8 November 2020 (UTC))
(edit-conflict) I would also advise to use the |given= and |surname= parameter variants rather than the |first= and |last= ones. While the order of display for names is "last, first" or "surname, given" at present, this does not necessarily remain so forever. Our style guide may change or we may introduce an |af= ('author format') parameter (as suggested by Headbomb) in the future to control the display order. (See also: Help_talk:Citation_Style_1/Archive 71#First/last_or_given/surname_canonical_form?)
What is important for semantical reasons is that the part of the name that fits into the concept of a family/group name belongs into |surname= (or |last=) and the part of the name that fits into the concept of an individual name into the |given= (or |first=) parameter variant, regardless of their order of display in citations. I think, this is also important for proper meta-data creation.
If, by applying this rule, the current display order or punctuation does not look correct for some reason, the display can be overridden using the corresponding -mask parameter variant (like |author-given=Given |author-surname=Surname |author-mask=Given Surname or |author-mask=Surname Given). This is more complicated than just using |author=, but better (at least for as long as the concept of a family and an individual name applies - not sure if this holds true everywhere on this planet).
Now, the anchor is derived from what's in the |surname= or |author= parameter. If it is true that, in the case of Thai names, it should better be derived from what's in the |given= parameter, it might be worth considering a new option like |ref=thai (or |ref=given) for this. (Or, if this should still run under the |ref=harv moniker, a new parameter like |ref-mask=given could be used for this, or this could even be combined with the proposed |af= into something like |name-mode=Western/Eastern/Chinese/Japanese/Thai/Malay/Indian/Indian-surname/Icelandic/Hungarian/... to control the name display order and style as well as Harvard ref-ID composition and proper meta-data creation by a single parameter.)
--Matthiaspaul (talk) 19:36, 17 September 2020 (UTC) (updated 18:25, 24 September 2020 (UTC))
I may be missing something here. When "indexing" is mentioned, I understand it to mean bibliographic/citation reference indexing. As mentioned earlier, nowadays such databases can be searched via several indices, including combinations. So discovering a work with a "foreign" author name is much easier. But it seems that this is about how such works are indexed in "internal" Wikipedia lists, a presentation issue. I believe they should follow the published rendition. As stated above regarding {{harv}} a custom anchor could work in these cases. 65.88.88.69 (talk) 19:49, 17 September 2020 (UTC)
(edit conflict) Yeah, it's true that a custom anchor can be created manually using {{harvid}} (and is the way to do it now), but assembling an anchor this way is a bit like "open-heart-surgery". (Ideally, the whole information about what CITEREF anchors look like should be "internal" to template editors and no normal editor should have any need to deal with it, so that the implementation could be changed whenever a need would arise for this.)
If, however, this "given name thing" is a general concept for Thai names (I don't know), it would be worth encapsulating the assembly of these anchors away from the user and invoking the creation of suitable anchors by some kind of citation template option. This way, the given name and date would not have to be repeated as arguments for {{harvid}}, following the idea of having to provide one piece of information only once for traceability, to ease its maintenance, and also to save some storage space. (In my example above this principle isn't followed for the |author-mask= parameter as well, but this is another possible "shortcoming" of the current implementation, whereas in a hypothetical future version it might be possible to have the template create a suitable display mask automatically if it knows it's a Thai name (this proposal goes in this direction, although related to display styles not naming conventions in general). However, the problem is that on a global scale there are many different naming conventions and once we enhance the current implementation we should ideally find a solution that works well for all of them. Therefore, we are still in learning mode tinkering about possible solutions whenever such a topic comes up.)
--Matthiaspaul (talk) 20:24, 17 September 2020 (UTC)
@Matthiaspaul: One semi-related note in case the template's handling of these sorts of things comes up again in the future... My understanding is that Cambodian names are first=surname, last=given name, but the proper mode of address (or anchor) is given name. Right now it's fine just to use first/last for these. (That's what I did, after giving it some thought, in Ratanakiri Province.) But it's another instance where given is the proper term of address, but it falls in a different place in the name. You recommend doing given/surname variants, but that wouldn't work for Cambodian names unless you're also going to do author-mask and a custom anchor. Calliopejen1 (talk) 20:09, 17 September 2020 (UTC)
Thanks, this sort of info is always useful. Another example are Hungarian names. Eastern name order has a bit on this, but unfortunately does not name Thai names specifically. --Matthiaspaul (talk) 16:21, 19 September 2020 (UTC)
But we have Thai names... --Matthiaspaul (talk) 16:24, 19 September 2020 (UTC)
@Matthiaspaul: The thing is that Thai names don't use Eastern name order. They use Western name order, but the polite way to address someone is by their first name (i.e. given name). Calliopejen1 (talk) 04:47, 22 September 2020 (UTC)
BTW I assume the reason Thai names are how they are is that Thai people didn't use family names until relatively recently, and the family-name initiative was a reform to "modernize" Thai names. Perhaps the Western way of doing things was viewed as more "modern" at the time (?). This also may be why last names as a term of address didn't really catch on... Calliopejen1 (talk) 04:52, 22 September 2020 (UTC)
Providing an option to switch the composition of reference anchors from "last" to "first" (or other schemes, if necessary for some locales) would be very easy. Above, I suggested to add something like |ref=thai for this purpose or combine this into something like a |name-mode= parameter, which would allow us to select from a number of predefined combinations of display order and anchor composition styles.
However, meanwhile it occurred to me that a work may have more than one author and that different settings may be necessary for different authors, if, for example, an English, a Thai, a Hungarian, an Icelandic and a Chinese author collaborated.
In order to enter the names in their various forms (native script, transliterated and/or translated) we need the generic parameter prefixes |script-= and |trans-= to be available as prefixes to name parameters (this has been requested many times already, we "just" need to implement it at some point). This will ensure that the information can be provided accurately on a technical level.
Likewise, for reliable data entry on a semantical level, editors would choose from the parameter postfixes |-first=/|-last=/|-given=/|-family=/|-forename=/|-surname= the (one or) two that most accurately agree with the naming scheme present in an author's name. (This thread (Help_talk:Citation_Style_1/Archive 71#First/last_or_given/surname_canonical_form?) has a bit on selecting the most suitable postfixes during data entry.)
In this thread (Help_talk:Citation_Style_1/Archive_67#Possible_improved_treatment_of_title_parameters_and_language_attributes), I proposed how the scheme of language prefixes of the |script-= parameters could be expanded from only supporting a number of non-Latin scripts to all language codes without introducing any backward-compatibility problems. We could then use these language prefixes to control, (like that hypothetical |name-mode= parameter above, but) on a name-by-name basis, the various settings needed to display the name correctly (display order and possibly necessary text decoration), to generate the correct meta-data for it, and to derive the name parts for an anchor in Harvard style from it.
The current assignment of first=given=forename and last=family=surname would continue to hold true by default (also for backward compatibility with names provided without |script-=). However, if a name would be entered with f.e. the language prefix zh indicating a Chinese name, the internal assignments would become last=given and first=family, so that |script-author-given=zh:Given and |script-author-surname=zh:Surname would work just as well as |script-author-last=zh:Given and |script-author-first=zh:Surname and be rendered as "Surname Given" (no comma) (whereas the conventional |author-given=Given and |author-surname=Surname or |author-last=Surname and |author-first=Given would be rendered as "Surname, Given").
These settings could be implemented as properties in a table of language codes. Also, now, that the non-hyphenated name parameter forms will soon be gone, the code could take advantage of the symmetries in the parameter naming scheme to fold the name parameters into one "[prefix-][name[#]][-postfix[#]]" form (where name would be author/editor/contributor/translator/subject/interviewer). This would reduce redundancies in the code and avoid an endlessly long parameter whitelist.
I think, this extensible scheme would allow us to enter any kind of name in a semantically and technically correct way and process the data according to the rules necessary to be obeyed for each individual name for proper output on all ends.
--Matthiaspaul (talk) 19:31, 24 September 2020 (UTC)
This might look complicated, however, this is only because the example is for a Chinese name in Chinese script where the usage of the |script-= parameter variants would be mandatory. For Latin-based scripts, including translated Chinese names, things would be much simpler. As having to use the |script-= variants just to give language codes appears to be too cumbersome in these easier cases, what about supporting the language prefixes also for the non-script parameter variants? This would reduce something like
|script-author-given=hu:Given and |script-author-surname=hu:Surname
down to
|author-given=hu:Given and |author-surname=hu:Surname
or even to
|given=hu:Given and |surname=hu:Surname
Likewise
|script-author-last=hu:Given and |script-author-first=hu:Surname
to
|author-last=hu:Given and |author-first=hu:Surname
or even
|last=hu:Given and |first=hu:Surname
Assuming that Hungarian names (as indicated by hu:) would be internally configured to be rendered in "Eastern order", this would be rendered as "Surname Given" (without comma) (NB. This is only an example, the actual configuration for Hungarian names could be different, in fact, according to some style guides it is), whereas English names per
|given=en:Given and |surname=en:Surname
or
|last=en:Surname and |first=en:Given
or
|given=Given and |surname=Surname
or
|last=Surname and |first=Given
would be indexed/rendered as "Surname, Given" (with comma).
--Matthiaspaul (talk) 21:20, 16 November 2020 (UTC)
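The rendering rule proposed above can be sketched in a few lines of Python. This is only an illustration of the idea, not module code: the table EASTERN_ORDER, the helper names, and the choice to treat hu as "Eastern order" are assumptions taken from the example, and the sketch only covers the given/surname parameter variants.

```python
# Hypothetical per-language property table; the entries are illustrative only.
EASTERN_ORDER = {"zh", "hu"}


def split_prefix(value):
    """Split 'hu:Given' into ('hu', 'Given'); return (None, value) if no prefix."""
    prefix, sep, rest = value.partition(":")
    if sep and len(prefix) in (2, 3) and prefix.isalpha() and prefix.islower():
        return prefix, rest
    return None, value


def render_name(given, surname):
    """Render one name from |given= and |surname= values (sketch)."""
    given_lang, given_text = split_prefix(given)
    surname_lang, surname_text = split_prefix(surname)
    lang = given_lang or surname_lang
    if lang in EASTERN_ORDER:
        # Eastern order: surname first, no inversion comma.
        return f"{surname_text} {given_text}"
    # Default (also for unprefixed names): "Surname, Given".
    return f"{surname_text}, {given_text}"


hungarian = render_name("hu:Given", "hu:Surname")
english = render_name("en:Given", "en:Surname")
plain = render_name("Given", "Surname")
```

Because unprefixed values fall through to the default branch, existing citations would render as they do today, which is the backward-compatibility property the proposal relies on.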

Bump PMC to 8000000[edit]

PMC 7528258 is valid, but gets reported as an error. Headbomb {t · c · p · b} 18:42, 2 October 2020 (UTC)

bumped.
{{cite book/new |title=Title |pmc=7528258}}
Title. PMC 7528258.
Trappist the monk (talk) 19:46, 2 October 2020 (UTC)
Is the rate by which this increases predictable with reasonable certainty so that we could automatically increase the upper limit depending on the current date somehow? If so, the maintenance rate for these limits could be reduced significantly. Not that this would be much of a problem right now, but it requires monitoring. Let's think a couple of years into the future when we might no longer be around here any more - it's always better if things are set up in a way that does not need any or only very few updates.
--Matthiaspaul (talk) 21:51, 2 October 2020 (UTC)
See also:
--Matthiaspaul (talk) 21:30, 12 October 2020 (UTC) (updated 11:36, 5 November 2020 (UTC))

PMID limit[edit]

At Special:Permalink/982911547#PMID error, Nixinova was concerned that PMID 33022132 was outside the range specified at Help:CS1 errors#bad_pmid. This turns out not to be the case, as the limit specified there is 33100000. However, it's awfully close, which led me to investigate it.

  • #1426 @ 2020-10-10: last id 33038074
  • #1423 @ 2020-10-07: last id 33026741
    • 33038074 - 33026741 = 11333 ids / 3 days = 3778 ids/day
  • #1334 @ 2020-09-11: last id 32915410
    • 33038074 - 32915410 = 122664 ids / 29 days = 4230 ids/day
  • #1100 @ 2020-03-01: last id 32113198
    • 33038074 - 32113198 = 924876 ids / 223 days = 4147 ids/day

The PMIDs appear to be assigned sequentially and are documented to "not be re-used". Based on the highest numbers found in several daily files here, the rate is roughly 4000 per day. The latest PMID as of the 2020-10-10 file is 33038074, which means it will hit 33100000 in less than 16 days. Was there a reason for the (strangely specific) 33100000 limit, should it be increased (soon), and to what? —[AlanM1 (talk)]— 15:25, 11 October 2020 (UTC)
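For anyone who wants to re-run the arithmetic above, here is a small Python sketch. The snapshot dates and last ids are copied from the list above; the function names and the round figure of 4000 ids/day used for the projection are illustrative choices, not anything defined by PubMed or the module.

```python
from datetime import date

# (file date, highest PMID seen) taken from the daily files listed above.
snapshots = [
    (date(2020, 3, 1), 32113198),
    (date(2020, 9, 11), 32915410),
    (date(2020, 10, 7), 33026741),
    (date(2020, 10, 10), 33038074),
]


def ids_per_day(earlier, later):
    """Average number of new ids per day between two snapshots."""
    days = (later[0] - earlier[0]).days
    return (later[1] - earlier[1]) / days


def days_until(limit, last_id, rate):
    """Days until last_id reaches limit at a constant rate."""
    return (limit - last_id) / rate


latest = snapshots[-1]
rates = [round(ids_per_day(s, latest)) for s in snapshots[:-1]]
remaining = days_until(33100000, latest[1], 4000)

print(rates)      # rates measured over 223, 29 and 3 days
print(remaining)  # days left before the 33100000 limit is reached
```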

I see Trappist the monk has been maintaining Module:Citation/CS1/Configuration. —[AlanM1 (talk)]— 15:30, 11 October 2020 (UTC)
I picked 33100000 just to clear the error. The limit exists to catch simple typos: too many digits, most significant digits out of bounds. Alas, we can't catch too-few-digits or typos that produce in-bounds results... cs1|2 can't do much more to protect editors from these kinds of mistakes. The limit should be sufficiently tight that we catch typos but not so tight that we overrun the limit every few days.
We might set the limit at 33500000 which, at 4k/day, will last us 100+ days. Elsewhere on this talk page it is suggested that we automatically increment the limits for the various identifiers. I don't particularly like that as a solution because there is no way to automatically close the loop to reduce or increase the limit-deltas as conditions warrant.
Trappist the monk (talk) 16:24, 11 October 2020 (UTC)
Not without some arbitrary number like we have today, of course. --Izno (talk) 18:04, 11 October 2020 (UTC)
If someone has a general purpose bot, perhaps a job could be added to it, to be run monthly. It could retrieve the latest XML file from the FTP link above, find the highest PMID value, add 120,000 (30 days' use), round up to the next 100,000, and update the id_handlers['PMID'].id_limit value in the config file. Or someone could do it manually. While I do have a couple of things I do monthly manually, I don't have a foolproof system in place to ensure things get done and it would seem like this is too important for my casual approach. Are there other values here that can/should be updated, too? —[AlanM1 (talk)]— 06:05, 12 October 2020 (UTC)
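The limit calculation in that proposal (add 30 days of headroom, then round up to the next 100,000) might look like this in Python. Retrieving the XML file and writing the config value are left out; next_limit and its parameter names are hypothetical.

```python
def next_limit(highest_id, headroom=120_000, step=100_000):
    """Return highest_id plus headroom, rounded up to a multiple of step."""
    padded = highest_id + headroom
    # Integer ceiling division: -(-a // b) rounds a / b upwards.
    return -(-padded // step) * step


# Highest PMID in the 2020-10-10 file, as quoted earlier in this thread.
proposed = next_limit(33038074)
print(proposed)
```

A value that already sits exactly on a multiple of the step is returned unchanged; a bot could add one more step in that case if a strict "next" multiple is wanted.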
We could also define a bot task to scan for the highest identifier value used in an article while performing other tasks (or have a bot continuously loop over all articles), check this value against a value recorded in a new "/Limits" sub-page of the citation template, and update that value if the found value is higher. This sub-page would have to be unprotected to be easily accessible by bots and editors. The citation template could read this value and compare it against the value specified in its "/Configuration" module (which is protected), take the higher value, add some safety margin to it, and treat the result as the allowed upper limit in citations (with or without some extrapolation facility). Many variants of this are possible.
Using this approach would make it possible to more frequently update the limits while still ensuring that at least all values below the value specified in "/Configuration" are treated as valid. The limits in "/Configuration" would be updated whenever the template gets updated. By specifying a much too high value in "/Limits" vandals could temporarily disable the upper limit check but they could not cause the template to use much too low values in an attempt to invalidate (older) values in citations.
--Matthiaspaul (talk) 21:58, 12 October 2020 (UTC)
At least in theory Wikidata could also be used to retrieve some useful information instead or in addition to something like "/Limits": PMID (P698) has a property "number of records" P4876.
  • 30060294 @ 2019-08-01
  • 30178674 @ 2019-11-19
However, the info there is outdated.
The "number of records" is also defined for DOIs (P356) and JSTORs (P888); similarly outdated.
--Matthiaspaul (talk) 19:24, 14 October 2020 (UTC) (updated 13:25, 15 November 2020 (UTC))
Just to illustrate this a bit more, the unprotected "/Limits" subpage to be regularly kept up to date by bots or editors could be in a simple CSV format like:
pmc-limit=8000000,pmid-limit=33200000,ssrn-limit=4000000,s2cid-limit=230000000,oclc-limit=9999999999,osti-limit=23000000,rfc-limit=9000
The template would attempt to read this file and if present, check the identifier against either the internally defined limit or the limit defined in this file, depending on which one is larger.
Whenever the template would be scheduled to be updated, the internally defined limits would be updated to those from the "/Limits" file plus some margin.
Depending on the amount of overhead allowed, the format of the "/Limits" file could also be Lua source code instead of CSV.
--Matthiaspaul (talk) 23:54, 5 November 2020 (UTC) (updated 10:54, 10 November 2020 (UTC), 23:39, 13 November 2020 (UTC))
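A rough Python sketch of the "/Limits" idea, using the example line above as input. The internal values in INTERNAL_LIMITS are made up for the illustration, and parse_limits and effective_limit are hypothetical names; the actual template would do this in Lua.

```python
# Hypothetical stand-in for the protected limits in "/Configuration".
INTERNAL_LIMITS = {"pmc": 8000000, "pmid": 33100000}


def parse_limits(text):
    """Parse 'pmc-limit=8000000,pmid-limit=...' into {'pmc': 8000000, ...}."""
    limits = {}
    for item in text.split(","):
        key, sep, value = item.strip().partition("=")
        # Silently skip malformed entries, since the page is unprotected.
        if sep and key.endswith("-limit") and value.isdigit():
            limits[key[: -len("-limit")]] = int(value)
    return limits


def effective_limit(identifier, file_limits):
    """Use whichever limit is larger: the internal one or the one from the file."""
    candidates = [
        limit
        for limit in (INTERNAL_LIMITS.get(identifier), file_limits.get(identifier))
        if limit is not None
    ]
    return max(candidates) if candidates else None


example = ("pmc-limit=8000000,pmid-limit=33200000,ssrn-limit=4000000,"
           "s2cid-limit=230000000,oclc-limit=9999999999,osti-limit=23000000,"
           "rfc-limit=9000")
file_limits = parse_limits(example)
```

Taking the maximum is what gives the property described earlier: a too-low value in the file can never invalidate identifiers that the internal limit already accepts.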
That was here: Help_talk:Citation_Style_1#Bump_PMC_to_8000000.
This "auto-increment" would still require monitoring/updates/adjustments of the limits and factors, but less frequently.
--Matthiaspaul (talk) 21:58, 12 October 2020 (UTC)
Sounds more complicated and error-prone than using the latest XML change file at PubMed for the max value and adding enough headroom to get past the next anticipated run. I don't think it should try to be exact, since new IDs are constantly being assigned and the latest articles may not be cited for some time. The new increment could even be re-calculated on each run based on the current and previous months' max values and file dates, plus a fudge factor based on some stats I can get from the variance in the current history file set. —[AlanM1 (talk)]— 00:16, 13 October 2020 (UTC)
Yes, for as long as such an XML file exists as an external resource, but this does not seem to be the case for all identifiers which need to be bumped up frequently. --Matthiaspaul (talk) 00:58, 13 October 2020 (UTC)
Another approach would be to allow users to temporarily enter "too high" values using the accept-this-as-written markup, this would put them into special maintenance categories similar to invalid ISBNs, etc. (This could be implemented with minimal overhead.)
If bots would run into this markup in the |pmc=, |pmid=, |ssrn= or |s2cid= parameters, they would retrieve the currently configured limit for an identifier through
{{#invoke:Cs1 documentation support|id_limits_get|<identifier>}}
like
  • Current PMC limit: 8000000
  • Current PMID limit: 33200000
  • Current SSRN limit: 4000000
  • Current S2CID limit: 230000000
  • Current OCLC limit: No limit defined for identifier: oclc (will show after the next template update)
  • Current OSTI limit: No limit defined for identifier: osti (will show after the next template update)
  • Current RFC limit: No limit defined for identifier: rfc (will show after the next template update)
and compare it against the number specified in the citation. If the limit is larger, they would remove the markup, otherwise leave it as it is. This would have the advantage that the "fix" is trivially easy for editors, and that the templates would not have to read a "/Limits" file. However, bots would have to edit the citations.
Still, the bots should record the highest found numbers in some prominent place (for example in a "/Limits" file), so that the internally defined limits can be easily updated accordingly when a template update is scheduled. Otherwise, someone would have to manually go through the maintenance category to determine the new limits.
--Matthiaspaul (talk) 23:54, 5 November 2020 (UTC) (updated 10:54, 10 November 2020 (UTC), 23:39, 13 November 2020 (UTC))
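The bot step described above could be as small as the following Python sketch. It assumes the accept-this-as-written markup is the value wrapped in double parentheses, and that the bot already has the parameter value and the currently configured limit in hand; unwrap_if_within_limit is a hypothetical name.

```python
import re

# Assumption: accept-this-as-written markup looks like ((12345678)).
WRAPPED = re.compile(r"^\(\((\d+)\)\)$")


def unwrap_if_within_limit(value, limit):
    """Remove the markup when the configured limit is larger than the identifier."""
    match = WRAPPED.match(value.strip())
    if match and int(match.group(1)) < limit:
        return match.group(1)
    # Either no markup is present or the identifier is still above the limit.
    return value


current_pmid_limit = 33200000
cleaned = unwrap_if_within_limit("((33150000))", current_pmid_limit)
kept = unwrap_if_within_limit("((33250000))", current_pmid_limit)
```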

Error category names standardization[edit]

Could the error categories in the next version sync be standardized? Out of the 55 categories in Category:CS1 errors, 44 start with "CS1 errors". These are the ones that use a different style:

--Gonnym (talk) 15:11, 22 October 2020 (UTC)

See Help talk:Citation Style 1/Archive 71 § error category names standardization and the top of Module:Citation/CS1/Configuration/sandbox
Trappist the monk (talk) 15:17, 22 October 2020 (UTC)
In case that response is unclear, the sandbox version of the module has been updated to standardize the above category names (follow the Archive 71 link to see the new names). They will be updated the next time the sandboxes are copied to the live module (typically every couple of months). – Jonesey95 (talk) 15:59, 22 October 2020 (UTC)
Thanks both for the link (and Jonesey for saving me time reading that). --Gonnym (talk) 16:22, 22 October 2020 (UTC)
Since it's already archived, I'll comment here. I didn't see these 3 mentioned at the discussion. All sub-categories of Category:CS1 properties which use a colon: Category:CS1: long volume value, Category:CS1: Julian–Gregorian uncertainty and Category:CS1: abbreviated year range. --Gonnym (talk) 18:06, 23 October 2020 (UTC)
I think that it should be the other way 'round: all Category:CS1 properties cats should have a colon after the 'CS1' prefix just as all error and maintenance categories with the 'CS1 errors' and 'CS1 maint' prefixes have a colon. I don't know if it is really necessary but, we could go further and use 'CS1 prop' prefixes.
Trappist the monk (talk) 18:51, 23 October 2020 (UTC)

And Category:CS1 has been listed for renaming to Category:Citation Style 1; see Wikipedia:Categories for discussion/Speedy § Current requests.

Trappist the monk (talk) 19:27, 23 October 2020 (UTC)

Not particularly important, but if we are going to rename / streamline the CS1 category names anyway, perhaps we should also change
"maint" -> "maintenance"
in the category names. The rationale would be to avoid unnecessary abbreviations. Space is not an issue here. "Maint" is non-standard developer jargon, therefore pretty obvious for us. But I'm not sure if uninvolved readers (our target audience) will guess its meaning equally easy. --Matthiaspaul (talk) 14:22, 26 October 2020 (UTC)
It is most definitely an issue for people who use Timeless where the categories end up in the sidebar through no fault of their own ;). Uninvolved readers can't see the category on each page anyway since it is hidden (like CS1 errors for that matter), much less the maintenance message itself, so the only other place they might stumble upon the category name is the category page itself (which provides sufficient context) or the context of discussions about the categories, like this one (which also provides sufficient context). --Izno (talk) 15:25, 26 October 2020 (UTC)
I don't understand this argument. So what if categories are in a side bar? Here is an article using timeless skin that has three hidden categories. All are visible to me (I presume because I have enabled hidden category display in my preferences). All of those category names are readable. The maintenance messaging must be turned on by interested editors but our choice of category names has no bearing on that. So what is it that you are really complaining about?
Trappist the monk (talk) 13:42, 28 October 2020 (UTC)
I did not assert that I could not see them in Timeless. I did not assert clearly anything by what I did say, in fact... To make it clear now, I do not want longer category names because they will not wrap cleanly and/or will make an already often-long sidebar on the right much longer for no obvious gain. I honestly don't want longer message names either (as CS1 maintenance: is longer than CS1 maint:), which has a similar, though of lesser nature, concern associated with how long the word is.
I did argue that how long or what is in the category name is immaterial to the casual reader who cannot see the categories in any location whatsoever (cf. But I'm not sure if uninvolved readers (our target audience) will guess its meaning equally easily. by Matthias). Someone who can't see the category listing on a specific page won't care how long or what the names are, which means that only the following groups are of interest: a) casual people who have somehow navigated to the category page are there by happenstance and are provided an explanation in the rest of the page; b) casual people who see a discussion on a page like this one, in which they are provided sufficient context; and c) people who are not casual and have turned on hidden categories will need to learn what is going on, but that also is made obvious by the content of each named category. And then, those who see the structure once can probably figure out what is going on from thereon. (Do I presume too much?) In all cases regardless, someone can click and see what is on the category page and will see "maintenance" in some form or another on each.
I assert that the reason our error messages don't have CS1 in them is that abbreviation is just as much technobabble as the asserted shortening of "maintenance" is....
As for 'properties', I do think those should at the least be consistently at CS1:. I honestly haven't decided whether I like "props" or "properties" more, though you might tell which I lean toward. --Izno (talk) 21:15, 28 October 2020 (UTC)
Please, no, "props" is really cancer to the eyes.
In general, as Wikipedia is for readers, I wonder if we should support grammatical nonsense such as "maint" at all. If Timeless can't cope with "maintenance" in category names well, it will have problems with longer-than-average words and titles in general, that is, it is an issue that occurs all over the place. If so, it is a problem of the skin, not the contents, and consequently should be addressed at skin level, not by adjusting the contents. Looks very unprofessional to me.
--Matthiaspaul (talk) 10:24, 29 October 2020 (UTC)
You apparently continue to ignore what I said. Readers. Can't. See. These. Categories.
I happen to agree that the skin is doing something dumb here, but saying we must do X because of Y reasons and then ignoring the other Z reasons that we really don't need to do X isn't cool. "Looks very unprofessional to me." --Izno (talk) 00:50, 1 November 2020 (UTC)
I wonder how you come to that conclusion. I read your reasoning and value it as any constructive input into the discussion, but it didn't convince me much, in particular because addressing this in our narrow context by using abbreviated category names won't solve the problem anywhere else, so it's clear that the fix for this must be elsewhere. Also, while I originally wrote "uninvolved readers", editors and developers are readers as well. Since you more or less suggested to change the category names to "props" I provided my opinion on this.
Y and Z must be weighted. I think that, in general, skin issues should be solved on skin level. I would support special-casing something to improve the appearance in Timeless (although I don't know what that could be in this particular case). The extent of this support would stop where it would weaken the general appearance in other skins (including the default Vector skin).
Perhaps the solution would be to change the thresholds when categories move from the bottom of a rendered page into the sidebar and/or to blend in the categories only after pressing a special button, I don't know. (I don't use Timeless and given the many issues you reported with this skin in the past it does not appear to be very desirable to use.)
--Matthiaspaul (talk) 23:50, 1 November 2020 (UTC)
I'll leave the Timeless redesign discussion aside save to say that no, the (quantity of) issues I have reported do not reflect my happiness with the skin.
I'm not sure how much more productive this line of discussion between you and me will be, so I will leave that there also. Perhaps another editor or two will appear to discuss/give input. I will suggest something a little more off-the-wall down the way to see if that fits anyone's tastes. --Izno (talk) 04:15, 2 November 2020 (UTC)
From there, I think it's just a question whether we shall catch headgoblins. --Izno (talk) 21:19, 28 October 2020 (UTC)
This discussion has meanwhile been moved to Category_talk:CS1#Opposed speedy move request.
--Matthiaspaul (talk) 21:26, 3 November 2020 (UTC)

'CS1 maint:' → 'CS1 maintenance:' and properties: 'CS1' (with and without colon) → 'CS1 properties:'. See Module:Citation/CS1/doc/Category list. —Trappist the monk (talk) 15:30, 26 October 2020 (UTC)

And ... I've been reverted by some mechanism that doesn't notify editors that a revert has taken place and which also reverted an unrelated edit to Module:Citation/CS1/Configuration/sandbox.
Trappist the monk (talk) 13:16, 28 October 2020 (UTC) 13:20, 28 October 2020 (UTC)
Because of that unrelated edit, MediaWiki indicated there was an intervening edit that could not be reverted with 'undo', I needed to perform a manual revert i.e. opened the previous good version, copy-pasted the uncontested change in from the current version, and saved. (In retrospect, I suppose I could have undone all of the edits rather than the two I would have preferred to revert with 'undo', then reinserted the uncontested change in the one edit pane.) --Izno (talk) 21:15, 28 October 2020 (UTC)
Or not revert at all. --Matthiaspaul (talk) 10:24, 29 October 2020 (UTC)
Or not revert at all. This is a wiki. Get off that dumb horse there. Moreover, it isn't cool to deliberately misinterpret what I said to mean "I had to revert". You know that was not the intention. These modules are used on a couple million pages. Consensus is required for change. A revert makes it obvious you don't have consensus. --Izno (talk) 00:50, 1 November 2020 (UTC)
Izno, I questioned neither the technically necessary steps of your reversion nor your good intentions in general. Likewise, I can rule out any deliberate misinterpretation on my side. My remark was meant as a friendly reminder that reversion of perfectly good-faith contributions should remain restricted to cases where no other options for improvement exist. An occasional revert hardly harms, but frequent reversions do. Trappist edited the sandbox, not the live template. The sandboxed modules are not used on millions of pages. The sandbox is, by its very definition, open for experimentation by anyone, and while it makes sense not to wait until the next update to clean up edits for which there is no consensus also in the sandbox (at least for as long as we don't have a separate release stage area), there also is no reason to revert changes almost immediately in the middle of the discussion just because you don't agree with them.
--Matthiaspaul (talk) 23:50, 1 November 2020 (UTC)
By the same token, on Wikipedia, quick reverts save everyone time untangling good edits from bad. This is no less true in a community sandbox (that I personally treat as requiring consensus) than the module which is the live representation of that sandbox. You particularly, in the last module release, conflated many reasonable edits with many edits that I would have personally preferred not to have seen added to the module, but I could not easily revert them and gave you the benefit of the doubt that you would announce those changes proactively (like Trappist and I have done when we make changes to the sandbox).... T'was not to be. (I still don't understand a few of the changes that were made, and that disturbs me both from the user-perspective and the rationale perspective.) Instead, I will revert now, skip being worried that unnecessary complexity has been added, and ensure that what is in the sandbox is something that could be deployed tomorrow with the appropriate consensus (if we were interested in doing so).
If you (or anyone else) would like to "experiment" (rather than announce and propose actual changes), then, like the main page sandbox suggests for that area (see edit notice), your own personal copy of the modules might be preferable for experimentation or redesign. (Were it the case we could fork more easily... maybe there is a Javascript writer who could do that for us. :^) --Izno (talk) 04:15, 2 November 2020 (UTC)

Different suggestion: We should consider removing all CS1 'subtypes' from the category names, meaning that "CS1 errors:", "CS1 maint:", and "CS1(:)" would all become "CS1:", and then only the parent category name would need to match the category of message/property. This would have the side-benefit that maintenance messages promoted to errors, or properties to maintenance/errors, would not need to have their category name changed when promoted. --Izno (talk) 04:15, 2 November 2020 (UTC)

Meh. Now that the sandbox has been altered to normalize the category names, certainly the prefixes can be removed from the category names in Module:Citation/CS1/Configuration/sandbox error_conditions{}. We can then apply the prefixes in Module:Citation/CS1/sandbox where we actually create the category wikilink. Same side benefit and we retain the prefixes which I think we should do. I suspect that it isn't possible to apply some sort of css trick to category wikilinks so that individual editors could hide the prefixes in the hidden categories list...
We should not forget the non-English wikis of various types that also use these modules and the associated categories; for them, the prefixes may (or maybe not) be important.
Trappist the monk (talk) 14:10, 2 November 2020 (UTC)
Indeed, but they are also free to customize as they see fit; they are not beholden to precisely the same naming.
As for prefix removal to elsewhere if such occurs, my understanding is that it is both harder for i18n and more expensive for Lua to process two strings like that in separate places. Where necessary we should perform string manipulation, but I do not think this would be such a place, were we to go down that path. --Izno (talk) 15:00, 2 November 2020 (UTC)
Agreed, other wikis are not obliged to follow our lead. For those that do, fully spelled-out names may be important.
Of course. Any work that the module has to do costs time and processor resources. We already concatenate Category: with <category name> which we then hand off to utilities.make_wikilink() where we concatenate [[, the prefixed category name, and ]] to make the final result. Changing utilities.make_wikilink ('Category:' .. v) to utilities.make_wikilink ('Category:' .. cfg.presentation['<category prefix>'] .. v) isn't much extra work.
Maybe better would be to write something like:
utilities.substitute (cfg.messages['cat wikilink'], {cfg.messages['cat err prefix'], v})
where the messages{} table has:
['cat wikilink'] = '[[Category:$1$2]]'
['cat err prefix'] = 'CS1 errors: ',
['cat maint prefix'] = 'CS1 maintenance: ',
['cat prop prefix'] = 'CS1 properties: ',
For i18n, should probably do something like that anyway so that other-language wikis don't have to edit Module:Citation/CS1 as well as Module:Citation/CS1/Configuration.
Trappist the monk (talk) 16:32, 2 November 2020 (UTC)
I have added ['cat wikilink'] = '[[Category:$1]]' and a matching [':cat wikilink'] = '[[:Category:$1|link]]' to ~/Configuration/sandbox. To use those I have replaced the calls to utilities.make_wikilink() with utilities.substitute (cfg.messages['cat wikilink'], {v}) where we make category names and the similar call where we make the link in maint error messages.
Trappist the monk (talk) 18:00, 2 November 2020 (UTC)
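
For readers following the prefix discussion, here is a minimal sketch of the two-placeholder variant floated above, with the prefix in $1 and the category name in $2. It is written in Python purely for illustration. It is not the module's Lua code, and `MESSAGES`, `substitute` and `category_wikilink` are hypothetical stand-ins for the messages{} table and utilities.substitute() mentioned in the thread.

```python
import re

# Hypothetical stand-in for the messages{} table; the keys and values
# mirror the ones proposed in the discussion above.
MESSAGES = {
    "cat wikilink": "[[Category:$1$2]]",
    "cat err prefix": "CS1 errors: ",
    "cat maint prefix": "CS1 maintenance: ",
    "cat prop prefix": "CS1 properties: ",
}


def substitute(template, args):
    """Replace $1, $2, ... in template with the matching item of args.

    Placeholders with no matching argument are left untouched.
    """
    def repl(match):
        index = int(match.group(1)) - 1
        return args[index] if 0 <= index < len(args) else match.group(0)

    return re.sub(r"\$(\d+)", repl, template)


def category_wikilink(kind, name):
    """Build the category wikilink for kind in {'err', 'maint', 'prop'}."""
    prefix = MESSAGES["cat " + kind + " prefix"]
    return substitute(MESSAGES["cat wikilink"], [prefix, name])
```

With this shape, a wiki that wants different prefixes (or none) would only touch the table, and a message promoted from maintenance to error would only change the `kind` it is created with.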
Apparently, even experienced editors don't understand that properties categories are not error categories. I just stumbled on this discussion: Help talk:Citation Style 1/Archive 68 § Bogus long volume which suggests to me that were Category:CS1: long volume value renamed to Category:CS1 properties: long volume value then that discussion might not have been necessary or would have been about something else.
Trappist the monk (talk) 15:21, 6 November 2020 (UTC)

A parameter for open content licenses (CC BY) and automatic filling/parsing via reFill and Autofill and/or a bot

Could you add a parameter to indicate open content licenses of studies? Especially (or only) Wikimedia-compatible ones and mainly CC BY 4.0.

Such tags would have many advantages for readers and editors − for instance, they can indicate that the source may have relevant freely licensed images which could be used by the reader or be uploaded (and possibly added to the article) by an editor.

It could work similar to the |doi-access=free parameter and would complement it. In particular this parameter is not about access to the (full-text of the) reference/paper but about the license of the content (in particular whether or not it's an open/compatible license and if so which).

It would be best if this parameter was set automatically by the Autofill tool (the magnify icon in the RefToolbar) and reFill. It could also be set by a bot similar to User:OAbot or even that same bot. Here's an example of one of the bot's changes. However, the parameter could be added to the template before any of these is implemented.

The visual display should include the CC BY 4.0 (or similar) logo, similar to the icon that is displayed for |doi-access=free, so that it's quickly and clearly visible that the respective study is licensed that way. The respective reference could then look like this:

Kawaguchi, Yuko; et al. (26 August 2020). "DNA Damage and Survival Time Course of Deinococcal Cell Pellets During 3 Years of Exposure to Outer Space". Frontiers in Microbiology. 11. doi:10.3389/fmicb.2020.02050. S2CID 221300151. CC-BY icon.svg Text and images are available under a Creative Commons Attribution 4.0 International License.

(I'm currently adding it manually as in the above example to references at 2020 in science, which also helps in my, and possibly at some point others', efforts to upload relevant images from these studies to Commons. The above example is from that page.)

--Prototyperspective (talk) 13:49, 24 October 2020 (UTC)

This is a very bad idea - references should be chosen because they are the best, most suitable and reliable sources for the article, not whether they are released under a convenient license. This proposal would merely reinforce FUTON bias. Nigel Ish (talk) 14:03, 24 October 2020 (UTC)
This is not about which references to choose at all.
You could argue against pretty much all the other existing parameters like this; it's irrelevant to this proposal. --Prototyperspective (talk) 14:39, 24 October 2020 (UTC)
No. The purpose of a citation is to identify the source that editors consulted. Licensing of that source doesn't aid the reader in locating the source. Similar proposals have been rejected here before. You might want to troll through the archives of this talk page for those discussions.
Trappist the monk (talk) 14:49, 24 October 2020 (UTC)
This is not about the selection of which reference to use at all. Agree on The purpose of a citation is to identify the source that editors consulted.
You could argue against pretty much all the other existing parameters (except for the DOI/URL and including the author parameters or the |doi-access= parameter) like this; it's irrelevant to this proposal. --Prototyperspective (talk) 15:08, 24 October 2020 (UTC)
See also:
--Matthiaspaul (talk) 12:20, 5 November 2020 (UTC)
  • No, per all past discussions concerning this. Citations are there to verify the information, not to advertise, document, and promote whatever random license a specific article, chapter, book, webpage, etc. is published under. Headbomb {t · c · p · b} 17:33, 24 October 2020 (UTC)
  • Again: you could argue against most other existing parameters (except for the DOI/URL) like this. Why do we hyperlink the URL, add the date, specify whether or not its full text is open access and specify the language for references? These things have utility for readers (and editors). I'll try to find and read the former discussions. --Prototyperspective (talk) 17:59, 24 October 2020 (UTC)
    Because we follow the best practice as outlined by style guides. Those things identify a citation and help verify information. Free or not let readers know that when clicking on a link, what's on the other end isn't a paywalled article. Language helps the reader know if they can read what's on the other end if they gain access (or who to get to translate if they can't). That an article is published under the GPL, CC-BY-SA, CC-BY-NC, CC-BY, MIT License, or any of dozens of random licenses doesn't help anyone verify the information. Headbomb {t · c · p · b} 18:13, 24 October 2020 (UTC)
    Because we have not discussed at any length removing those, much less obtained consensus. That said, our citation system has grown out of offwiki ones, so information like location and authorship comes from there. Language particularly has been discussed here but never given too much thought (some musings like "why is that there"), the usual answer for which is "I wouldn't want to get this book because I can't read French, and I should know that before trying to verify the content in the article". YMMV on the value. If you'd like to start a discussion on removing one or another parameters, you are free to. (Good luck.) --Izno (talk) 18:16, 24 October 2020 (UTC)
    About the parameters. Modern citation systems didn't just spring up out of nowhere. They were tailored to the needs of rapidly expanding scientific research and high volumes of institutional publications and archives, joined shortly by concerns (both legal and commercial) for attribution. All these works had to be categorized before they were cited. Before electricity and telecoms this was not an easy or standardized endeavor. The simplest solution, to this day, was to classify (and eventually index) by author, title, and date, most often in that order. The place where the work was published and name of those who made the work public came next. The subject was also included but this is dicey, since people 200 years ago had very different ideas of what, say, chemistry is about. Modern citation systems when first devised, followed this model, with some variation. They still do. Because the emphasis of citations is on verifying something easily and quickly (and not to provide encyclopedic-like metadata analysis like a bibliographic reference), many more location parameters were added. "Location" meaning something that helps the reader locate the information. Including marketing identifiers such as ISBN, chapter names, content locators such as URLs, page numbers, and, when the work is in a language different than the citation language, the language, because this makes for more efficient and effective use of the readers' time and resources. Maybe I am missing something, but I don't see any value-add by the proposed parameter. 98.0.246.242 (talk) 01:22, 25 October 2020 (UTC)
    I explained the value added by the parameter. Also see WP:NOTPAPER.
The sincere considerations to remove even information like location and authorship from reference is a perfect showcase of the prevailing thinking related to these matters here whereby improvements, if they are possibly slightly unconventional, are immediately rejected on that basis.
If the utility for verifying the information is the only thing any changes to the templates are being judged by / considered here – for yet unexplained reasons – then information on the license could help with verifying information by third parties via direct inclusion/use of content contained in these references. This shows that at the other end is a reference whose content could be used to verify information to others. --Prototyperspective (talk) 10:38, 25 October 2020 (UTC)
How does knowing the license, if any, tell a reader how to find the cited source? How does knowing the license, if any, tell a reader if the cited source can be read or even understood? The various bibliographic details, like author/date/title/publisher would allow me to find a source. The free access icon lets me know if the source is freely available and not locked behind a paywall. The language notation lets me know if the source is written in a language that I can understand. The license does not help me in those regards. Imzadi 1979  13:43, 25 October 2020 (UTC)
  • A whole new parameter for this is overkill, but all of the |*-access= parameters (both sets of them) could accept a value of "open" indicating open content (free-as-in-freedom, not just -beer), without wallowing in exact specifics of any license, but perhaps with a particular icon. The preset behavior is that the *url-access group supports "registration", "limited", and "subscription" ("free", as in beer, is the default and isn't supported as an explicit parameter); but the bibcode/doi/hdl/jstor/ol/osti/s2cid-access group supports only "free" (as in beer), because some kind of paywall is the norm in those cases. A side bit is that arxiv/biorxiv/citeseerx/pmc/rfc/ssrn do not have a corresponding *-access parameter because they "are always free-to-read. For those named-identifiers there are no access-indicator parameters, the access level is automatically indicated by the template." Unless we really, really need to distinguish between beer-free and freedom-free in these cases, I'm skeptical that we'd need to add *-access parameters for that third group just to support =open. I think indicating "open" might be of some use. While the primary purpose of our citation is demonstrating that our article material isn't nonsense, we also know that one of the main purposes WP is put to by students and researchers, rather than just the casually curious, is "mining" WP for source material to reuse.  — SMcCandlish ¢ 😼  22:01, 26 October 2020 (UTC)
While I'm not convinced that we actually need this, I like your solution-oriented approach of attempting to think through a problem from a user's perspective and trying to actually address it instead of just telling him it was a bad idea. We should always keep in mind that all the work we are doing while we are developing these templates is to help editors solve their problems. If they use citation templates for things outside of the original narrow scope, they have a reason for this. In some cases, there are other solutions, but often enough such requests indicate some shortcoming in our templates or sensible new use cases. So, there is always something to learn from such requests. --Matthiaspaul (talk) 13:40, 5 November 2020 (UTC)
  • I'm not convinced yet that we actually need this, but it is undeniable that there have been many requests for such a parameter in the past, so there obviously are users who have a need for such information and/or parameter. I don't like the verbose and obtrusive
"CC-BY icon.svg Text and images are available under a Creative Commons Attribution 4.0 International License."
appendage suggested by the OP, but providing this information in a very subtle way through a machine-readable parameter and displayed by adding some tool-tip to the title would seem to be an acceptable solution to me. Adding this info as a new value "open" etc. to our |-access= parameters would be misleading, as the license of a work belongs to the work/title "as is", not some specific link. Offline works have licensing terms as well. (The individual webpages for the given identifiers may have usage terms as well, but they are irrelevant in our context aiming to cite a source for the article.) So, the info would have to be associated with the |title=. Such a parameter could be named |title-status= (in analogy to |url-status=) or just |license=. Related, many bibliographic entries in literature databases and metadata also have something like |rights=. For example, in the OpenURL/COinS profile info:ofi/fmt:kev:mtx:dc it is collected in the &rft.rights= tag. So, this could be a suitable parameter name for such kind of information as well.
--Matthiaspaul (talk) 13:40, 5 November 2020 (UTC) (updated 15:11, 14 November 2020 (UTC))

Nbsp in |author, |last, and equivalents for other contributors

Prior to the last release, the code that looks for multiple names in a single parameter looked for a count of more than 1 of either commas or semicolons. For example, |author=Last, First, Jr. or something like |last=Last; Last2; Last3 (unfortunately not contrived :( ) would have triggered the maintenance message, and both still emit a maintenance message today. (I am not sure if a mix of semicolon and comma would have done the same but think one semicolon and one comma would have.)

However, the behavior changed in the last release so that now commas and semicolons are counted separately, and if there are more than 0 semicolons, the module emits the maintenance message.

Due to an error on my part (perhaps the original code also contained the error, I haven't tested), it is now the case that any HTML entity encoding will be identified as needing maintenance. This is most common with the non-breaking space (i.e. &nbsp;), as in the last two cases of test_Mult_names on Module talk:Citation/CS1/testcases/errors. (Perhaps this is why the check was originally at least 1 semicolon, I do not know.) I noticed this because I had been working on the category for authors, which had been hanging around 13k, which is now some 30k pages (and I do not think there were that many semicolons... maybe there were and I have found a hidey hole of cleaning. :)

For a discrete example, a construction like |author=Tolkien, J.&nbsp;R.&nbsp;R. aka |author=Tolkien, J. R. R. (those are non-breaking spaces) emits the message currently.

  • Tolkien, J. R. R. Title.CS1 maint: multiple names: authors list (link) (JRR would probably have triggered this message before the last release since it has two non-breaking spaces.)
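
To make the false positive concrete, here is a small Python sketch of the post-release check as described above. It is an illustration only, not the module's Lua code. The function name is hypothetical, and the comma threshold ("more than 1") is assumed to have carried over from the earlier behavior, since only the semicolon threshold is stated explicitly.

```python
def needs_multiple_names_maint(name):
    """Naive check: commas and semicolons are counted separately.

    Assumption: more than one comma, or any semicolon at all, is taken
    to mean several names were put into one parameter.
    """
    commas = name.count(",")
    semicolons = name.count(";")
    return commas > 1 or semicolons > 0


# Each "&nbsp;" ends in ";", so an entity-encoded name is flagged even
# though it holds a single author.
flagged = needs_multiple_names_maint("Tolkien, J.&nbsp;R.&nbsp;R.")
not_flagged = needs_multiple_names_maint("Tolkien, J. R. R.")
```

Here `flagged` is true and `not_flagged` is false, although both values name the same single author.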

Is it worthwhile supporting HTML entities in |author=/|last=? It will come up in the |author= case most-often as we rarely abbreviate last names (and moreover almost-never have multiple last names to abbreviate), for which a 95% solution can be a conversion to |last= |first= as this check does not occur for |first= (we prefer the use of |last= |first= anyway for best metadata generation). Cases other than nbsp can be worked on if they occur, since nbsp is not the only kind of entity that could end up encoded this way in |author= (I am skeptical it would occur in most uses of |last=). By worked on I mean that we can create templates similar to {{ndash}}, or convert to the Unicode representation.

Aside: I don't know if it would be reasonable for the software to be checking |first=; I suspect so given some constructions in the wild I've seen.

Thoughts? --Izno (talk) 20:42, 24 October 2020 (UTC)

I would think non-breaking spaces (using any mechanism) may be important in situations where author names are separated by a hyphen? One could argue that some readers could be confused or misunderstand a citation that splits a compound last or first name into a newline. I haven't looked at the code to see how it handles such cases. 98.0.246.242 (talk) 01:31, 25 October 2020 (UTC)
Not non-breaking spaces, but dashes/hyphens/straight lines in the middle of names, for which we do already have other workarounds. --Izno (talk) 01:38, 25 October 2020 (UTC)
Right. I mixed up non-breaking with non-wrapping in my previous comment. So now I cannot think of any other use-case for such markup, but who knows. 98.0.246.242 (talk) 02:06, 25 October 2020 (UTC)
Multiple authors' names should never be separated by any type of hyphen or dash or slash or whatever, if that's what the IP is asking. They should be separated by entering them as separate parameters (or by a comma, but only when using "Vancouver style" in which both periods and spaces are omitted from authors' initials anyway, and therefore moot).
Adding non-breaking spaces to the output for any first-name input which matches (in whole or part) a pattern of multiple consecutive initials (spaced or unspaced) seems like an easy task for regular expressions inside the module. This would be more robust than encouraging or requiring users to include html entities in the input, or even think about doing so.
There are of course certain abbreviations which look like a person's initials but should remain unspaced according to the MOS. It's conceivable that some user might enter something other than a person's name, such as |author=U.S. Treasury but in reality that should be spelled out and moved to |publisher=United States Treasury.
Of course, an even lazier approach might be to just enclose the output for every firstname and every lastname in a span with some class CSS-styled as white-space: nowrap. This would (rather aggressively!) prevent wrapping when name parts contain (a) initials (e.g. |first=J. R. R. or |first=F. Scott), and/or (b) real or implied hyphens (e.g. |first=Mary-Kate or |last=Lloyd Webber), which would otherwise risk being wrapped in the middle. ―cobaltcigs 18:54, 25 October 2020 (UTC)
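
The regular-expression idea above (non-breaking spaces only between consecutive initials in the output) could look roughly like the following Python sketch. It is an assumption-laden illustration, not module code: `protect_initials` is a hypothetical name, an "initial" is taken to be one ASCII capital followed by a period, and only ordinary spaces that already separate two initials are touched.

```python
import re

NBSP = "\u00a0"  # Unicode NO-BREAK SPACE

# An ordinary space with a standalone initial such as "J." before it
# and another initial after it. ASCII capitals only; accented or
# non-Latin initials would need a broader character class.
INITIALS_GAP = re.compile(r"(?<=\b[A-Z]\.) (?=[A-Z]\.)")


def protect_initials(first_name):
    """Return first_name with spaces between initials made non-breaking."""
    return INITIALS_GAP.sub(NBSP, first_name)
```

Under these assumptions "J. R. R." gets two non-breaking spaces, while "F. Scott" and an unspaced abbreviation such as "U.S. Treasury" pass through unchanged, so the result is much narrower than wrapping every name part in a nowrap span.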
Please do not do that last, "aggressive" proposal. There are too many cases like |author=U.S. Select Foobarian Subcommittee of the International Committee of Bazquuxians for Global Widgetization-Dingusification Standards (where |publisher= has another long-winded thing that is the parent organization name[s]).  — SMcCandlish ¢ 😼  22:15, 26 October 2020 (UTC)
Only names divided into a |first= and |last= part would be affected. Between two nowrap spans would be a comma and a regular space, where wrapping would be permitted. Barring bad input, that should only be individual human authors. And not even all of them. ―cobaltcigs 01:29, 1 November 2020 (UTC)
|author=, used regularly for organizational authors, is a synonym for |last=. Nowrapping the output for that parameter is accordingly a no-go due to regular column size constraints. (I have said a few times now that we should have a |org-author=, but alas, it has not happened.) --Izno (talk) 01:36, 1 November 2020 (UTC)
In the long run, it might be necessary to decouple |author= from |last= to improve support for foreign names, anyway. But I agree that right now it would not be worth the trouble to change this just to improve some wrapping behaviour. Given that wrapping in the middle of initials looks particularly odd, adding automatic "anti-wrapping" to |first= could already improve the display of names somewhat.
--Matthiaspaul (talk) 02:02, 6 November 2020 (UTC)
As for the template behavior, it would be nice if it permitted &nbsp; and other entities, and excluded any &...; pattern from its counts of semicolons while trying to detect improper input. Now that I'm migrating back to Windows I'm remembering what a hassle it is to get various special characters inserted, though I think I will buy PopChar for Windows and hope that it works as well as the Mac one (esp. compared to Windows Character Map). Even if we wish people would always use the composed Unicode character, we know that they will not. And &nbsp; is actually desirable, since no one can visually tell the difference between a regular space and a non-breaking one otherwise.  — SMcCandlish ¢ 😼  22:15, 26 October 2020 (UTC)
Right, we don't want to use the invisible character. We have a separate test for the invisible control characters that will emit an error. Invisible very badddd.
However, as I said earlier, we can encourage the use of {{nbsp}} to meet this user desire. Or encourage the use of normal spaces. (Now that I look, there is the caveat at WP:NBSP regarding the use of the template with links. Maybe that is sufficient reason to support it until there is some consensus about whether non-breaking behavior is actually desirable in the citation templates.)
I do not think there is a general way to allow all HTML entities. We would need to add and check against some published list (perhaps of the most common), which seems like overkill for most, since most others (maybe all-of-interest) have a visible alternative.
Finally, though I disagree with permitting &nbsp;, I've tweaked the module to discount these markers. You can see the output for yourself at test_Mult_names in Module talk:Citation/CS1/testcases/errors, where the two affected test cases are orange as not-matching (because we just compare against the live version rather than the preferred output of course ;). --Izno (talk) 05:45, 2 November 2020 (UTC)
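For illustration only, the "discount these markers" idea can be sketched as follows. This is Python rather than the module's Lua, and the pattern and the function name `count_separator_semicolons` are assumptions made for this sketch, not the module's actual code: anything shaped like an HTML entity is stripped before semicolons are counted.

```python
import re

# Hypothetical pattern: named (&nbsp;), decimal (&#160;) and hex (&#xA0;) forms.
# It only checks the shape of an entity; it does not validate the name
# against any published list of entities.
_ENTITY = re.compile(r"&(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);")


def count_separator_semicolons(value: str) -> int:
    """Count semicolons that are not the terminator of an entity-shaped token."""
    return _ENTITY.sub("", value).count(";")
```

Because nothing is checked against a list of real entity names, a value such as "AT&T; Foo" would also have its semicolon discounted; that is the trade-off of not maintaining such a list.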
I guess I don't think that we should support the use of &nbsp; in the namelists. We have noted at Template:Citation Style documentation#COinS that html entities should not be used in parameters that contribute to the citation's metadata; we should not allow something on the one hand but disallow it on the other hand. {{nbsp}} is not appropriate for use in cs1|2 templates because it will cause the inclusion of all of this in the name's metadata:
<span class="nowrap">&nbsp;</span>
cs1|2 wraps some or all of the value assigned to |access-date= in <span class="nowrap">...</span> because |access-date= is rendered at the end of the citation. That was an experiment conducted quite long ago. Did anyone notice? We don't similarly wrap |isbn= which, because of the permitted (desired) hyphens and occurring at the end of a book citation rendering, can break oddly. Did anyone notice?
And beyond the first name or maybe three, who reads the namelists in a citation? Yeah, I know, wandering a bit off topic, but wouldn't it be better for us to set a default |display-<names>= value so that all cs1|2 templates show the default number of names (+ et al when there are more names in the template)? Do we really need to display 400 names? or even 29? (that's a popular number; why?) who is going to read or even need all of those names to locate the source?
Trappist the monk (talk) 13:03, 2 November 2020 (UTC)
Anecdotally I probably notice on occasion, but never something like "oh, I would miss this were it gone".
who reads the namelists in a citation I do not, but that might be something that varies by domain and no-doubt by personality. When I see namelists longer than 6 is when I personally add the et al in my gnoming.
I think some tool at some point was limited to 29, Citoid or refill perhaps. I've noticed a similar pattern (but again, maybe anecdotal).
It is interesting that there is a suggestion not to use nbsp in COinS parameters. I am not sure what opinion I have of that, but our typical implementation has been to categorize and remove metadata problems. Consider that we perform a substitution in page handling of dashes; we could do the same for author lists. --Izno (talk) 15:08, 2 November 2020 (UTC)
I added that suggestion, and though I don't remember exactly why, I suspect it was to avoid having to add support to translate every html entity into its unicode character form. The page handling of &mdash; and &ndash; was necessary to resolve a technical issue because editors will use a semicolon between individual page numbers when a comma should have been used.
Trappist the monk (talk) 16:56, 2 November 2020 (UTC)
While I sometimes study author lists, including longer ones of a few dozens, I have yet to see and study a list of 400 names. Nevertheless, I don't think we should set a default limit for |display-name=. This should remain up to the editorial judgement of the article editors, not us. Setting a too low limit would also make it more difficult to enter longer lists, as one would first have to add |display-name=some-high-value to see the remaining names. Courtesy dictates to try to list all authors of a work and limit the display for practical reasons only where necessary. Depending on the "house standard" for author lists (alphabetical order, chronological increasing/decreasing order, increasing/decreasing importance by amount of contribution or "status", no order, etc.) being followed, the first author is not necessarily the main author. The editors of an article probably have the best insight into a source and context to set the display limit to an appropriate value for a citation, if necessary.
Regarding methods to avoid orphans, yes, I have occasionally recognized this (I'm one of those editors who sometimes inserts &nbsp; between the last two words of a paragraph to improve text flow appearance on some browsers).
As an aside regarding <span class="nowrap">, does someone know if it is possible to define a kind-of-strength for nowrapping so that the browser tries not to wrap (for as long as possible), but would start wrapping anyway when it could otherwise not avoid a horizontal scrollbar or truncation on narrow windows?
--Matthiaspaul (talk) 02:02, 6 November 2020 (UTC)
DIEP flap – 380 names. Setting the default to say, six, accomplishes the purpose of helping readers locate the source. In other discussions, editors have stated that identifiers and their links are not reader friendly (I disagree, readers are not stupid) but if that argument maintains, then surely an excessive number of authors will also dissuade readers from seeking the source. If by Courtesy dictates to try to list all authors you mean courtesy to the authors, then I disagree. Citations are not here to serve as 'credits'; that is the duty of the publisher. Courtesy plays no part in citations except as a courtesy to the reader to provide consistent, understandable, identification of the sources used by en.wiki editors to create and maintain the encyclopedia's articles.
I don't know of any citation style that requires name-lists in citations to be ordered in any way except the order in which they are named in the source; to sort the names in a list any other way is to do a disservice to readers. Of course, editors of an article probably have the best insight into a source and context to set the display limit to an appropriate value; setting a default limit does not take away from that editorial discretion.
Isn't the insertion of &nbsp; between the last two words of a paragraph discouraged because what you see on your screen doesn't necessarily mean that any other viewers will see the same thing?
I am not sufficiently versed in the minutiae of css so I can't answer the kind-of-strength for nowrapping question, though there is the html tag <wbr />. It is my understanding that using &nbsp; to prevent line breaks is discouraged in favor of css: <span class="nowrap">$1</span>. That would add 28 characters to every |firstn= that is breakable. We could minimize that to some extent by nowrapping the entire name list and inserting <wbr /> between authors and between last and first names:
<span class="nowrap">Black, <wbr/>A; <wbr/>Brown, <wbr/>B; <wbr/>Red, <wbr/>C; <wbr/>Orange, D.</span>
Trappist the monk (talk) 14:50, 6 November 2020 (UTC)
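A minimal sketch of the nowrap-plus-<wbr/> idea above, written in Python rather than the module's Lua. The function name `nowrap_name_list` and the (last, first) tuple input are assumptions for illustration; the sketch places a break opportunity after every comma and semicolon.

```python
from html import escape


def nowrap_name_list(names):
    """Render (last, first) pairs inside a single nowrap span.

    A <wbr/> break opportunity follows each comma and each semicolon,
    so a line can break between name parts but never inside one.
    """
    parts = [f"{escape(last)}, <wbr/>{escape(first)}" for last, first in names]
    return '<span class="nowrap">' + "; <wbr/>".join(parts) + "</span>"
```

With this layout the fixed cost of the span is paid once per name list rather than once per |firstn=.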
Thanks for the link. :-) I don't find this list intimidating, except for that this is a clear case for list-defined references rather than inline references. If there would be many citations with such long lists of contributors in an article, it would probably start to look strange at some point - that's when |display-authors= would come in handy at the editor's discretion. Interestingly enough, in this example citation, the authors are listed in alphabetically ascending order (actually, in this particular example, two such lists), so, when you would cut the list at some arbitrary point, you'd risk missing main contributors. I don't find this desirable at all.
Regarding Courtesy dictates to try to list all authors, yes, I meant the authors of the cited work. If they are listed in the published work, we should specify them as well (by default). If those authors wouldn't have published their work, we could not cite it, so I see us as a continuation in a line of works built upon each other in an environment where it is common to list the sources (to avoid circular references and help identify false information).
Citations are not only used to locate a source, but also to evaluate the significance of statements ("Is there a widely known and trustable expert on a field among the authors?") or as a starting point for new research ("Let's check the authors' affiliations for related works and other publications."). Also, having a pre-defined default limit makes entering larger lists somewhat more difficult if there are more authors than what is defined as a limit. So, it's better to leave it to the article authors to set an upper limit (if necessary at all).
Regarding "house styles", here I meant publishing standards, not citation standards. Of course, we should list authors in the order given in the source, if such an order is determinable (as is the case most of the time). In many cases, the most important authors for a work are listed among the first, but unfortunately different organizations have different publication conventions to the effect that it is also possible that significant or even the most important contributors happen to be listed in the middle or the end.
Regarding concatenating the last two words of a paragraph with &nbsp;, this is quite common among web designers. It dates back to times long before the introduction of CSS and works even with the oldest browsers. As you wrote, the text of a page will flow differently depending on the width of the window (and other things). However, the &nbsp; will have no effect except for when the browser would otherwise move the last word of a paragraph into a new line, whereas with &nbsp; in place the browser would ensure that there will be at least two words in the last line of a paragraph, thereby preventing the last word from becoming an orphan. This might no longer be necessary with browsers supporting CSS, but also can't harm (unless the two words would be long and force the browser to go into an otherwise unnecessary horizontal scrolling mode, which, of course, would be counter-productive).
--Matthiaspaul (talk) 12:47, 7 November 2020 (UTC)

Nbsps in MediaWiki

(Slightly off-topic to nbsps in citation templates, so I split Cobalt's reply from SMC's comment at 22:15, 26 October 2020 (UTC) in #Nbsp_in_|author,_|last,_and_equivalents_for_other_contributors into its own thread.)

Assuming your browser renders the above character U+1F63C CAT FACE WITH WRY SMILE in a colorful way (as mine does), you can determine which font said browser is using, then edit that font to fill the glyph bounds for U+00A0 NO-BREAK SPACE with some light pastel color (instead of transparent nothingness). This will give every nbsp on the page a subtle glow (much like shining a UV flashlight across a motel room), without being too distracting to read. I use a technique similar to this myself. Among other things, you'll become aware of situations where the MediaWiki software replaces regular spaces with nbsp on its own for no apparent reason (e.g. before ! or ?). Perhaps if the rules for doing this could be configured on a per-wiki basis, everyone would be happy. ―cobaltcigs 01:40, 1 November 2020 (UTC)

P.S. Never pay for software. I use BirdFont (what, no article?) for editing and gucharmap for character lookup by name. Brief research suggests both have been ported to Windows—where they probably work equally well, but I can't confirm that. ―cobaltcigs 01:40, 1 November 2020 (UTC)

I assume you are referring to the edit window when discussing the novel (distinguishing color) markup for non-breaking spaces. The user (reader) version should be free of any visible formatting notation or artifact - transparent nothingness is best, in this case.
I haven't verified the MediaWiki soft-space replacement before punctuation that you indicated above. If true, it is odd. Normally, space of any kind is considered erroneous practice if it is before most punctuation - it breaks continuity between the punctuation mark and the text the punctuation is supposed to apply to. For similar semantic (and esthetic) reasons, sentences should not wrap until after punctuation marks. Adding a hard space is compounding an error. 98.0.246.242 (talk) 04:13, 1 November 2020 (UTC)
I know of one place where MediaWiki will insert non breaking spaces (before %), but that should only occur on French Wikipedia. --Izno (talk) 14:26, 1 November 2020 (UTC)
Not so. See example screenshot (enwiki edit preview, with highlighting hacks enabled). ―cobaltcigs 19:18, 1 November 2020 (UTC)
And another, lol. ―cobaltcigs 19:26, 1 November 2020 (UTC)
@Cobaltcigs: I went to verify on phab, apparently having researched this 3 years ago.... phab:T181441#3798402. --Izno (talk) 20:54, 1 November 2020 (UTC)
The exclamation-mark french-space case is problematic in enwiki, but I am not aware of the percent sign being used for punctuation in any language? Not to say there should be a leading space there. It seems that the situation hasn't been resolved, or if it was, there was a parser update or html update that messed up things again. 98.0.246.242 (talk) 21:28, 1 November 2020 (UTC)
The task linked directly in my comment discusses the % sign directly. --Izno (talk) 21:49, 1 November 2020 (UTC)
I went through it before my previous comment. I was just wondering why the behavior persists in enwiki. That is, why is french-spacing applied in situations where French terms are not used. Without examining the related patches (some of them are still in beta, I believe) it seems to me that the parser interprets a space before certain punctuation marks and other characters as attempts at french-spacing and applies the "correct" space format, i.e. a nbsp. However, this seems to be done indiscriminately, it should be done only when French-language terms include such syntax. In English, any such spacing would very likely be wrong. It seems that in the attempt to fix a special case, the general case was botched. 98.0.246.242 (talk) 23:14, 1 November 2020 (UTC)
In case you're confused about the functionality, &nbsp; is not inserted arbitrarily before these symbols. MediaWiki replaces a plain-ol' space with the non-breaking space, where such a space is present. However, English doesn't use the plain-ol' space (indeed, as you suggest, such spacing would very likely be wrong), so it is usually not an issue here. One might argue that that code should not be executed in our locale at all, but I don't see the fundamental harm, since we wouldn't want those symbols, were they to be separated by a space, to have the same issues with wrapping that originally caused the behavior to be added to the software so long ago (... or at least, I presume that was why). --Izno (talk) 02:36, 2 November 2020 (UTC)

|nocat=

I have cleared Category:CS1 maint: nocat. I have also removed support for |nocat= and Category:CS1 maint: nocat from the sandbox module suite. After the next update, |nocat= will cause cs1|2 templates to emit the unknown parameter error message:

{{cite book/new |title=Title |nocat=yes}}
Title. Unknown parameter |nocat= ignored (|no-tracking= suggested) (help)

Trappist the monk (talk) 13:36, 29 October 2020 (UTC)

Thanks, Trappist. I haven't gone through all citations using |no-tracking= yet, but in the past weeks I cleaned up some of the |nocat= parameters as well and among them did not run into any citations for which we would actually need this feature in mainspace any more (for broken DOIs we now have a much better option). Did you?
If all uses in mainspace would have been removed, and categorisation would be disabled outside mainspace, the parameter could be removed completely or reduced to a pure debug option (possibly with reversed logic to optionally enable categorisation outside mainspace).
Questions from Help_talk:Citation_Style_1/Archive_71#no-cat_parameter_cleanup:
  • Do we actually need this in mainspace? Should we disallow the feature in mainspace?
  • What should be the default behaviour in other namespaces? Should the behaviour be changed to populate categories only when a special option is given?
  • If it would be always disabled in mainspace and enabled elsewhere, do we need a parameter to control it at all?
  • Should we change the temporary nocat category into a permanent maint category for the feature as a whole?
  • Find a better parameter name based on the resulting functionality and use case of the feature. If we don't keep a maint category the parameter name needs to be unique to also serve as a good search pattern.
Opinions?
--Matthiaspaul (talk) 23:05, 4 November 2020 (UTC)

Hinting on Citation Bot's duplicate parameters

When a citation contains duplicate parameters the Mediawiki software will display a yellow warning at the top of the page:

This is only a preview; your changes have not yet been saved!
Warning: xxxx is calling Template:yyyy with more than one value for the "zzzz" parameter. Only the last value provided will be used.

However, this warning will only be shown in edit preview.

When Citation Bot finds duplicate parameters in citations it renames them by adding a "DUPLICATE_" prefix to them. Our citation template then throws a red error message:

Unknown parameter |DUPLICATE_zzzz= ignored

Since our citations templates can optionally issue a parameter suggestion, I added a rule so that the template would display instead:

Unknown parameter |DUPLICATE_zzzz= ignored (|zzzz= suggested)

However, I was reverted by Izno stating that the parameters should be removed or merged. While this is correct in general, for users to select or merge into one of the duplicate parameters and remove the others, they first need to know the name of the underlying parameter in question. While this can be guessed from the DUPLICATE_* name, this is a private convention used by Citation Bot, and I think it is more user-friendly to name that parameter explicitly in the error message and for consistency to use our own established message system for this (hence my addition of that rule). (The "suggested" in our standard "(|zzzz= suggested)" message does not mean that the suggested parameter is necessarily a direct 1:1 replacement (although it often is), only that it is the (most likely) parameter target that needs to be dealt with to fix the issue and that additional changes may still be required in such a parameter transformation/merge.)

Opinions?

--Matthiaspaul (talk) 23:50, 1 November 2020 (UTC)

Fix Citation Bot so that it doesn't act like a template editor? Such function is way beyond its scope. 98.0.246.242 (talk) 00:11, 2 November 2020 (UTC)
Such function is exactly in its scope because finding duplicate parameter names is something that MediaWiki prevents all templates and modules from doing.
Trappist the monk (talk) 00:49, 2 November 2020 (UTC)
I thought it was supposed to edit incorrect citations, not the code they are based on. That part, the new parameter class |DUPLICATE_(anything)=, and the accompanying terminology should be discussed, and here. If the bot has to do something it would do better to apply the error message of the preview, which follows longstanding practice in wiki (and many other coding environments). As I think you state below, the current bot action makes for a convoluted situation. 98.0.246.242 (talk) 01:17, 2 November 2020 (UTC)
Were this a serious problem, by which I mean lots of these kinds of error messages in Category:Pages with citations using unsupported parameters attributable to duplicate parameter names, I might be inclined to agree with you.
I don't know for sure, but a quick look into the Citation bot source (line 3271 et seq.) seems to suggest that the bot creates a single |DUPLICATE_zzzz= parameter name for each duplicated parameter name. I don't know if the bot applies this only to valid cs1|2 parameter names. If it doesn't and there are, for example, |blue=yellow and |blue=orange in a cs1|2 template, then the bot will rename one of these, perhaps the first it encounters, perhaps the last, I don't know, to say |DUPLICATE_blue=orange. Then, when cs1|2 sees that, your hint would cause cs1|2 to emit Unknown parameter |DUPLICATE_blue= ignored (|blue= suggested). Not much good to be gained by that.
Certainly, this ought to be mentioned at Help:CS1_errors#Unknown_parameter_|xxxx=_ignored.
Trappist the monk (talk) 00:49, 2 November 2020 (UTC)
Citation Bot flags the one that is not used. Often the data is good stuff, just in the wrong place. For example the publisher might be set to Reuters and then some one else adds Fox News and fails to convert Reuters to agency. The bot makes the error apparent. AManWithNoPlan (talk) 01:06, 2 November 2020 (UTC)
I like the idea of suggesting a way to fix the problem, either at the category page or on the help page (or both; do they use the same text?), but as AManWithNoPlan says, the solution is usually to fix one of the labels, not simply reinstate the duplicated label. If there are two |publisher= or |last2= parameters, the solution is usually to change one of them (to e.g. |work= or |via= in the first case, or e.g. |last3= or |first2= in the second case). – Jonesey95 (talk) 02:09, 2 November 2020 (UTC)
Yes, the text at the help page is section-transcluded to the category. --Izno (talk) 02:24, 2 November 2020 (UTC)
Yeah, as I wrote the "(|zzzz= suggested)" should not imply that the solution is to just replace the parameter name (or even to just reinstate the duplicate parameter - that would be counter-productive). It just hints that the zzzz parameter is what (most likely) needs to be dealt with.
Still, it might be possible to further improve the hinting system by allowing the right sides of the rules to contain more than one word (or move those into a separate list of rules). The template's code could then issue this text instead of the preformatted "(|zzzz= suggested)" message. There are other cases, where this could be useful to give a few more hints what to do (for example in the case of |editors=, see Help_talk:Citation_Style_1/Archive_72#support_for_|editors=_withdrawn_(in_the_sandbox)).
['^DUPLICATE_(%w+)$'] = '$1'
could become
['^DUPLICATE_(%w+)$'] = 'merge into <code>|$1=</code>'
to display
"(merge into |zzzz=)"
Or
['ignore-isbn-error'] = 'isbn'
could become
['ignore-isbn-error'] = 'use <code>|isbn=((...))</code>'
to display
"(use |isbn=((...)))"
Or
['editors'] = 'editor'
could become
['editors'] = 'split into <code>|editor''n''=</code>'
to display
"(split into |editorn=)"
--Matthiaspaul (talk) 15:45, 3 November 2020 (UTC)
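The free-text rule idea above could look roughly like this. It is a Python sketch under stated assumptions, not the module's Lua: the table `HINT_RULES` and the function `hint_for` are hypothetical names, the Lua pattern class %w is approximated with [A-Za-z0-9], and the Lua-style $1 placeholder is written as \1.

```python
import re

# Hypothetical rule table: pattern on the unknown parameter name -> hint text.
# A capture group in the pattern can be reused in the hint as \1.
HINT_RULES = [
    (re.compile(r"^DUPLICATE_([A-Za-z0-9]+)$"), r"merge into |\1="),
    (re.compile(r"^ignore-isbn-error$"), "use |isbn=((...))"),
    (re.compile(r"^editors$"), "split into |editorn="),
]


def hint_for(param):
    """Return the parenthesised hint for an unknown parameter, or None."""
    for pattern, template in HINT_RULES:
        match = pattern.match(param)
        if match:
            return "(" + match.expand(template) + ")"
    return None
```

As with %w, the first rule does not match hyphenated names, so something like DUPLICATE_archive-url would get no hint in this sketch.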
I guess, the |DUPLICATE_blue= scenario is very rare as hardly anyone would repeat a non-existing parameter |blue= more than once. It might occur in the case of previously supported parameters, but then the template would typically throw a message suggesting the new parameter name once someone would try to reintroduce |blue=. So, the user would be led to the correct solution at least by iteration (as in the |editors= example at present).
I found about a dozen uses in mainspace and some 150 in total (including some where the |DUPLICATE_zzzz= parameter was empty), probably because they are actively worked on by some editors. AManWithNoPlan probably has a better overview how often this parameter is being added by the bot.
--Matthiaspaul (talk) 15:45, 3 November 2020 (UTC)

Edition and pages extra text as errors

Per a discussion elsewhere, in the sandbox I have separated Category:CS1 maint: extra text into two separate categories, as well as promoted the two categories to errors from maintenance. The two categories are per parameter: one for |edition= and one for |p/pp/page/pages=.

This change is demonstrated at test_extra_text test on Module talk:Citation/CS1/testcases/errors. I did not implement sensitivity to the exact parameter name in the pages test since that's still a bit beyond me. I have no strong opinion on someone else doing so.

Secondly, I see "volume" text in |work= in the wild often (and equivalents, esp. in the titles of encyclopedias and books). An example might be |title=Title, Volume X: Volume Name, which I would envision as better being |title=Title|volume=X: Volume Name. I would like to entertain an "extra text" test for that pattern and an associated maintenance category, and invite discussion accordingly. --Izno (talk) 03:39, 2 November 2020 (UTC)

As there are so many possible variants, I don't see a narrower pattern than just searching for the string "Volume" or "Vol." in a title. In most cases it will be preceded by a separator and located near the end of a title, but I can also think of cases where that would not hold true. We'd have to live with the false positives.
Similar to the volume thing, I sometimes see variously formatted "Part" info in the title as well. If the |volume= parameter isn't used, this could be abused to move the part info into there, but what we'd actually need for this is a separate parameter |part= (see also Module_talk:Citation/CS1/Feature_requests#Part/Help_talk:Citation_Style_1/Archive_58#Books_with_volumes_and_parts, there even is a COinS tag for this, &rft.part=, although, as odd as it is, this appears to be defined only for periodicals, not books).
Applying to both volumes and parts, an Arabic or Roman number at the end of a title might also give a clue (but could also be a version number and valid part of the title).
--Matthiaspaul (talk) 14:59, 3 November 2020 (UTC)
Per Help_talk:Citation_Style_1/Archive_49#Edit_request_for_Template:Cite_book the template now also detects the British abbreviation "edn" in |edition= as extra text:
Cite book comparison
WT {{cite book |title=Title |date=2020 |author=Author |edition=1st}}
Live Author (2020). Title (1st ed.).
Sandbox Author (2020). Title (1st ed.).
Cite book comparison
WT {{cite book |title=Title |date=2020 |author=Author |edition=1st ed.}}
Live Author (2020). Title (1st ed. ed.).CS1 maint: extra text (link)
Sandbox Author (2020). Title (1st ed. ed.). |edition= has extra text (help)
Cite book comparison
WT {{cite book |title=Title |date=2020 |author=Author |edition=1st edn}}
Live Author (2020). Title (1st edn ed.).
Sandbox Author (2020). Title (1st edn ed.). |edition= has extra text (help)
--Matthiaspaul (talk) 20:25, 7 November 2020 (UTC)
The extra text test for |page=/|pages= and |quote-page=/|quote-pages= now also checks for pattern "pg(s)(.)" etc. in addition to "p(p)(.)" etc.:
Cite book comparison
WT {{cite book |title=Title |page=p. 35}}
Live Title. p. p. 35.CS1 maint: extra text (link)
Sandbox Title. p. p. 35. |page(s)= has extra text (help)
Cite book comparison
WT {{cite book |title=Title |page=pp. 35}}
Live Title. p. pp. 35.CS1 maint: extra text (link)
Sandbox Title. p. pp. 35. |page(s)= has extra text (help)
Cite book comparison
WT {{cite book |title=Title |page=pgs 35}}
Live Title. p. pgs 35.
Sandbox Title. p. pgs 35. |page(s)= has extra text (help)
Cite book comparison
WT {{cite book |title=Title |page=pgs. 35}}
Live Title. p. pgs. 35.
Sandbox Title. p. pgs. 35. |page(s)= has extra text (help)
Cite book comparison
WT {{cite book |title=Title |page=p123}}
Live Title. p. p123.CS1 maint: extra text (link)
Sandbox Title. p. p123. |page(s)= has extra text (help)
Cite book comparison
WT {{cite book |title=Title |page=P123}}
Live Title. p. P123.
Sandbox Title. p. P123.
--Matthiaspaul (talk) 01:17, 17 November 2020 (UTC)
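As a rough illustration of the page checks listed above, here is a Python sketch (not the module's Lua). The regular expression is an assumption inferred only from the six examples shown in this thread, and `page_has_extra_text` is a hypothetical name; the real patterns may differ.

```python
import re

# Hypothetical pattern, inferred only from the examples in this thread:
#  - "p", "pp", "pg" or "pgs" (either case), optional dot, whitespace, digit
#  - lowercase "p"/"pp" directly followed by a digit ("p123"); "P123" is
#    left alone because it may be a genuine page label
_PAGE_EXTRA_TEXT = re.compile(r"^(?:[Pp]{1,2}|[Pp]gs?)\.?\s+\d|^p{1,2}\.?\d")


def page_has_extra_text(value: str) -> bool:
    """True if the page value appears to start with a redundant p./pp./pgs prefix."""
    return _PAGE_EXTRA_TEXT.search(value.strip()) is not None
```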
Only remotely related to this "extra text detection" topic but I don't want to open a new thread for this minor bit: I changed the "et al." extra text detection code to also detect "et alii" and "et aliae" in addition to "et alia" and the abbreviated variants.
Cite book comparison
WT {{cite book |date=2020 |title=Title |author=Author, et alia}}
Live Author; et al. (2020). Title. Explicit use of et al. in: |author= (help)
Sandbox Author; et al. (2020). Title. Explicit use of et al. in: |author= (help)
Cite book comparison
WT {{cite book |date=2020 |title=Title |author=Author, et alii}}
Live Author, et alii (2020). Title.
Sandbox Author; et al. (2020). Title. Explicit use of et al. in: |author= (help)
Cite book comparison
WT {{cite book |date=2020 |title=Title |author=Author, et aliae}}
Live Author, et aliae (2020). Title.
Sandbox Author; et al. (2020). Title. Explicit use of et al. in: |author= (help)
Cite book comparison
WT {{cite book |title=Title |author1=Author |date=2020 |author2=et alia}}
Live Author; et al. (2020). Title. Explicit use of et al. in: |author2= (help)
Sandbox Author; et al. (2020). Title. Explicit use of et al. in: |author2= (help)
Cite book comparison
WT {{cite book |title=Title |author1=Author |date=2020 |author2=et alii}}
Live Author; et alii (2020). Title.
Sandbox Author; et al. (2020). Title. Explicit use of et al. in: |author2= (help)
Cite book comparison
WT {{cite book |title=Title |author1=Author |date=2020 |author2=et aliae}}
Live Author; et aliae (2020). Title.
Sandbox Author; et al. (2020). Title. Explicit use of et al. in: |author2= (help)
--Matthiaspaul (talk) 03:26, 17 November 2020 (UTC)
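The widened "et al." detection can be sketched like this, again in Python and purely as an assumption about how such a check might be written. The pattern and the name `strip_et_al` are hypothetical; the sketch recognises a trailing "et al", "et al.", "et alia", "et alii" or "et aliae", optionally wrapped in parentheses, and removes it together with any leading comma or semicolon.

```python
import re

# Hypothetical pattern: optional separator, optional opening parentheses,
# the word "et", then "al" with an optional ".", "ia", "ii" or "iae" ending,
# optional closing parentheses, all anchored at the end of the value.
_ET_AL = re.compile(
    r"[;,]?\s*\(*\s*\bet\s+al(?:iae|ia|ii|\.)?\s*\)*\s*$",
    re.IGNORECASE,
)


def strip_et_al(name: str):
    """Return (name without a trailing et al. variant, whether one was found)."""
    cleaned, count = _ET_AL.subn("", name)
    return cleaned.strip(), count > 0
```

The word boundary before "et" keeps a name such as "Peet Al" from being treated as an et al. variant.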
The sandboxed version now no longer leaves bracket-artifacts when it removes a double-bracketed pattern of et al.:
Cite book comparison
WT {{cite book |title=Title |author1=Author1 |date=2020 |author2=((et al.))}}
Live Author1; (); et al. (2020). Title. Explicit use of et al. in: |author2= (help)CS1 maint: numeric names: authors list (link)
Sandbox Author1; et al. (2020). Title. Explicit use of et al. in: |author2= (help)
--Matthiaspaul (talk) 14:12, 21 November 2020 (UTC)

CS1 maint: others

We presently capture citations that have no authorship information, besides |others=, in Category:CS1 maint: others (with some 20k pages). Due to prominence in the documentation of the templates {{cite AV media}} and {{cite AV media notes}}, these templates often have |others= exclusively, which makes it hard to find other cases where this is an issue.

I am considering separating these out into a separate category (something like Category:CS1 maint: others in cite AV media (notes)) so that someone interested in working through slightly-less painful categories can do so.

Has anyone seen another of the core CS1 template set cause such inclusion in this maintenance category? Does anyone have an issue with that path? --Izno (talk) 05:05, 2 November 2020 (UTC)

Alternatively, is there something we can do about those templates? Provide still-more named parameters?... --Izno (talk) 05:08, 2 November 2020 (UTC)

This search can be helpful. We might restore |artist= as a template-specific parameter for {{cite av media notes}}. Instead of keeping it separate, the content of |artist= might be concatenated as a prefix to |title= so this:
{{cite av media notes |title=Dark Side of the Moon |artist=Pink Floyd}}
might render:
Pink Floyd: Dark Side of the Moon (Media notes).
with the metadata as:
&rft.btitle=Pink+Floyd%3A+Dark+Side+of+the+Moon
There are probably better rendering / metadata choices.
The {{cite av media}}, {{cite av media notes}}, {{cite episode}}, {{cite serial}} templates all deserve reworking. These are the templates that are the primary users of |people=, an alias of |authors= so none of the names listed in that parameter make it into the citation's metadata. All kinds of extraneous text is added to that parameter, mostly roles (director, producer, actor, voice-over, narrator, etc) none of which belongs in the metadata. Now that cs1|2 supports template-specific parameters, we could introduce specific role parameters for these templates so that the names are annotated in the rendering, and the names without annotation are included in the metadata. In the meantime, |people=, can be constrained to these templates only, and once the template specific parameters are available, deprecated and withdrawn.
To avoid the torches and pitchforks militias from those wikiprojects that use these templates, whichever those projects are should be consulted before we act on this.
Trappist the monk (talk) 15:37, 2 November 2020 (UTC)
Sounds good to me in general. --Matthiaspaul (talk) 12:40, 3 November 2020 (UTC)
It is a good idea to reinstate |artist=. However, this may better be a free-form parameter since artist names may be idiosyncratic, and of course we have cases of compilation works, collaborations etc.
I would think the role parameters should follow industry practice, i.e. render as they do in "credits" sections of artistic works. I suppose distinct roles should be limited to the main creators/contributors. Minor credits could be bundled in |others=. 98.0.246.242 (talk) 22:09, 3 November 2020 (UTC)

Others

Moved from Template talk:Citation#Others. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:12, 10 November 2020 (UTC)

Has anyone analysed what are the commonest types of role added as |others=? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:53, 8 November 2020 (UTC)

Not that I know of. Such analysis will be difficult because tools like ve have misused (and may still be misusing) |others= for author names and for editor names (without role being specified). That is the problem with free-form parameters; editors and tools can put just about anything there. There are approximately 52k-ish uses of |others= [search results]
Trappist the monk (talk) 11:47, 8 November 2020 (UTC)
So should we add more non-free-form parameters, like |illustrator=? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:58, 8 November 2020 (UTC)
Probably better asked at WT:CS1 which is a bit more-watched.
Trappist the monk (talk) 14:19, 8 November 2020 (UTC)
The question seems somewhat (tangentially?) relevant to discussion in #CS1 maint: others. --Izno (talk) 19:06, 10 November 2020 (UTC)

I suggest author of foreword (P2679) is another likely candidate. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:06, 12 November 2020 (UTC)

Perhaps not a good candidate for |others=. cs1|2 book citations support forewords, afterwords, and other contributions to an author's book:
{{cite book |author=Author |title=Title |contributor=Contributor |contribution=Foreword}}
Contributor. Foreword. Title. By Author.
Trappist the monk (talk) 23:14, 12 November 2020 (UTC)
While there are use-cases for |contribution= with |contributorn= and it is good that the feature supports |contributor-first= and |contributor-last= as well as n-enumerated variants, I don't like the fact that only one |contribution= is allowed and that it is impossible to specify different types of contributions for different contributors (unless lumping them all together in |contribution=). What also looks odd most of the time is that the contributors are listed in front of the authors as this draws too much attention to them:
  • {{cite book |title=Title |date=2020 |author-first1=AF1 |author-last1=AL1 |editor-first1=EF1 |editor-last1=EL1 |translator-first1=TF1 |translator-last1=TL1 |contributor-first1=CF1 |contributor-last1=CL1 |contributor-first2=CF2 |contributor-last2=CL2 |contributor-first3=CF3 |contributor-last3=CL3 |contributor-first4=CF4 |contributor-last4=CL4 |contribution=Illustration/Foreword/Afterword |others=Others}}
CL1, CF1; CL2, CF2; CL3, CF3; CL4, CF4 (2020). "Illustration/Foreword/Afterword". Title. By AL1, AF1. EL1, EF1 (ed.). Translated by TL1, TF1. Others.
This is okay if the goal is to cite something from a foreword or afterword and draw particular attention to this specifically, but not if the goal is to cite a source in general and list the various contributors for completeness or because, f.e., the writer of a foreword was specifically "advertised" on the book cover. Right now, we'd have to use |others= for this, but this does not support enumerated and -first/-last parameter variants, and the article editor has to invent his/her own notation to list multiple contributors and their roles as in the following three examples:
  • {{cite book |title=Title |date=2020 |author-first1=AF1 |author-last1=AL1 |editor-first1=EF1 |editor-last1=EL1 |translator-first1=TF1 |translator-last1=TL1 |others=CL1, CF1 (Illustration). CL2, CF2; CL3, CF3 (Foreword). CL4, CF4 (Afterword). Others}}
AL1, AF1 (2020). EL1, EF1 (ed.). Title. Translated by TL1, TF1. CL1, CF1 (Illustration). CL2, CF2; CL3, CF3 (Foreword). CL4, CF4 (Afterword). Others.
  • {{cite book |title=Title |date=2020 |author-first1=AF1 |author-last1=AL1 |editor-first1=EF1 |editor-last1=EL1 |translator-first1=TF1 |translator-last1=TL1 |others=Illustration: CL1, CF1. Foreword: CL2, CF2; CL3, CF3. Afterword: CL4, CF4. Others}}
AL1, AF1 (2020). EL1, EF1 (ed.). Title. Translated by TL1, TF1. Illustration: CL1, CF1. Foreword: CL2, CF2; CL3, CF3. Afterword: CL4, CF4. Others.
  • {{cite book |title=Title |date=2020 |author-first1=AF1 |author-last1=AL1 |editor-first1=EF1 |editor-last1=EL1 |translator-first1=TF1 |translator-last1=TL1 |others=Illustrated by CL1, CF1. Foreword by CL2, CF2; CL3, CF3. Afterword by CL4, CF4. Others}}
AL1, AF1 (2020). EL1, EF1 (ed.). Title. Translated by TL1, TF1. Illustrated by CL1, CF1. Foreword by CL2, CF2; CL3, CF3. Afterword by CL4, CF4. Others.
Before we now introduce individual parameters for all possible roles, what I would like to see is a mix of both, |contributor= and |others=:
Multiple possible contributors with different contributions (with support for -first/-last and enumerated forms), but listed after the list of authors, editors and translators (and before |others=). This could be achieved by adding |contributor-role= (and enumerated forms). If the role would be specified, it would be listed alongside the corresponding contributor. In order to allow multiple contributors contributing to the same type of contribution, the role should occur either before all or after the last contributor of a specific group (as in the example renderings above). The markup for this could be like this:
  • {{cite book |title=Title |date=2020 |author-first1=AF1 |author-last1=AL1 |editor-first1=EF1 |editor-last1=EL1 |translator-first1=TF1 |translator-last1=TL1 |contributor-first1=CF1 |contributor-last1=CL1 |contributor-role1=Illustration |contributor-first2=CF2 |contributor-last2=CL2 |contributor-role2=Foreword |contributor-first3=CF3 |contributor-last3=CL3 |contributor-role3=Foreword |contributor-first4=CF4 |contributor-last4=CL4 |contributor-role4=Afterword |others=Others}}
As a further refinement we could make subsequent |contributor-role= parameters optional if they would specify the same role as that of the preceding contributor (|contributor-role3= here):
  • {{cite book |title=Title |date=2020 |author-first1=AF1 |author-last1=AL1 |editor-first1=EF1 |editor-last1=EL1 |translator-first1=TF1 |translator-last1=TL1 |contributor-first1=CF1 |contributor-last1=CL1 |contributor-role1=Illustration |contributor-first2=CF2 |contributor-last2=CL2 |contributor-role2=Foreword |contributor-first3=CF3 |contributor-last3=CL3 |contributor-first4=CF4 |contributor-last4=CL4 |contributor-role4=Afterword |others=Others}}
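To make the grouping rule concrete, here is a rough sketch in Python (not the module's Lua). It assumes contributors arrive as (last, first, role) tuples, that a missing role inherits the role of the preceding contributor, and that the "Role: names." rendering from the second |others= example above is wanted. The function name and data shape are hypothetical.

```python
from typing import List, Optional, Tuple

# (last name, first name, role or None when the role is omitted)
Contributor = Tuple[str, str, Optional[str]]


def render_contributors(contributors: List[Contributor]) -> str:
    """Group consecutive contributors sharing a role and render them."""
    groups: List[Tuple[Optional[str], List[str]]] = []
    for last, first, role in contributors:
        # An omitted role inherits the role of the preceding contributor.
        if role is None and groups:
            role = groups[-1][0]
        name = f"{last}, {first}"
        if groups and groups[-1][0] == role:
            groups[-1][1].append(name)
        else:
            groups.append((role, [name]))

    parts = []
    for role, names in groups:
        joined = "; ".join(names)
        # Contributors without any role are listed without a label.
        parts.append(f"{role}: {joined}." if role else f"{joined}.")
    return " ".join(parts)


example = [
    ("CL1", "CF1", "Illustration"),
    ("CL2", "CF2", "Foreword"),
    ("CL3", "CF3", None),  # role omitted, inherits "Foreword"
    ("CL4", "CF4", "Afterword"),
]
print(render_contributors(example))
# Illustration: CL1, CF1. Foreword: CL2, CF2; CL3, CF3. Afterword: CL4, CF4.
```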
How to distinguish between the two forms? Either by the existence of |contribution=, by the existence of a |contributor-role= parameter, by introducing |others-first/-last/-role= instead of |contributor-first/-last/-role= or some mix of it.
--Matthiaspaul (talk) 20:11, 18 November 2020 (UTC)
I don't like the fact that only one |contribution= is allowed... From that, can I take it that you don't like the fact that a single cs1|2 template allows only one |chapter= or one |section= or one |entry= or one |article=?
The |contribution= and |contributor= pair are intended to cite the contributor's contribution to the work written by |author= as, for example, Anna Quindlen's introduction to Jane Austen's Pride and Prejudice, here where Quindlen is the writer who is being cited, not Austen, so it is correct that Quindlen is listed ahead of Austen in the citation. So, yes, [this] is okay if the goal is to cite something from a foreword or afterword and draw particular attention to this specifically because that is the defined purpose.
If an editor is not citing the writer of a foreword ... specifically "advertised" on the book cover, there is no need to clutter the citation with that extraneous detail; we don't need to distract or confuse the reader.
We should certainly not introduce individual parameters for all possible roles. If any such parameters are added they should only be added after careful consideration and when it can be shown that the new parameter is needed.
Trappist the monk (talk) 13:50, 19 November 2020 (UTC)
I never proposed to introduce individual parameters for all possible roles, quite the opposite, I proposed to have a more general set of parameters that can be customized to suit all possible roles and use cases, so that we don't have to discuss this subject again and again. After all, whenever we added another set of parameters for a specific role, someone came around the corner asking for the next one. There is obviously a need to list some contributors, but the current system does not address all use cases (except for through a free-text parameter |others=, which, however, is unsatisfactory for most of the same reasons for why we are fading out |editors= and |authors= in the long term).
While there have been several requests in the past to add this and I too have come into situations where it would have been great to handle more than one chapter in a single citation without having to lump them together in one parameter, I don't propose this. However, contributions are a completely different case, because there are often multiple contributions and of different types.
The Pride and Prejudice example you gave is a perfect example for the current use of |contribution= and |contributor=. I described this use case as well in my reply above. But it does not cover the more common use case where the afterword, foreword, illustrations, etc., are not by itself the subject to be cited, but they are nevertheless part of the contributions to a work and thus may be listed in a citation. (This is also why this ([1]) won't have the desired effect.) In this case, the contributions would be clutter when displayed before the main contributors. They should rather be listed following the main contributors like authors, editors and translators - basically they should be at the position where we show |others=. I could have worded my proposal to introduce |other-firstn=/|other-lastn=/|other-linkn=/|other-maskn= plus |other-rolen= (and fade out |others= in the long term). However, if we can combine this with the parameters for contributors we could just use the existing |contributor-firstn=/|contributor-lastn=/|contributor-linkn=/|contributor-maskn= for this as well and just add |contributor-rolen=.
--Matthiaspaul (talk) 20:17, 21 November 2020 (UTC)
Before we now introduce individual parameters for all possible roles, what I would like to see is a mix of both, |contributor= and |others=: ... reads, to me, like this mix of both is merely a prelude to the [introduction of] individual parameters for all possible roles which is something that we should not do.
I am not convinced that we need anything more than a carefully curated, select few, role-type parameters. We do not need something that will allow editors to name every last person who was even remotely connected to the cited work. We do not need to be film-credit-like and include the craft-services' third journeyman soup stirrer; leave that to the publisher.
I can imagine certain additional roles being added to replace |people= and |credits= which are predominantly used in {{cite AV media}}, {{cite episode}}, and {{cite serial}}. These new role parameters would be constrained to these templates.
But it does not cover the more common use case where the afterword, foreword, illustrations, etc., are not by itself the subject to be cited, but they are nevertheless part of the contributions to a work and thus may be listed in a citation. You're right, it doesn't and it shouldn't. When an afterword, foreword, introduction, preface, etc is not the subject to be cited, such contributions, noteworthy though they may be, are superfluous to the purpose of the citation which is to identify for the reader the subject to be cited. Including mention of afterwords, forewords, introductions, prefaces when they are not the subject to be cited merely obfuscates the subject to be cited within the citation and so does not benefit the reader. cs1|2 is not a repository for all possible bibliographic data associated with a source. If you want that, go write a template series to do that. It may be that in bibliographic lists of an author's works, for example, such a bibliographic information template might be desirable. Citations need only the bibliographic detail that is sufficient to identify the portion of the source that is the subject to be cited.
Trappist the monk (talk) 18:49, 22 November 2020 (UTC)

My experience with "others" is that it is usually used incorrectly, for instance for authors after the first one. —David Eppstein (talk) 23:23, 12 November 2020 (UTC)

Even though the documentation has problems, in this case it correctly leads the horse to the water. 71.247.146.98 (talk) 12:56, 13 November 2020 (UTC)

Redirection

Tangent Why is that talk page un-redirected? --Izno (talk) 13:19, 10 November 2020 (UTC)

Don't know. Probably should be don't you think?
Trappist the monk (talk) 15:05, 10 November 2020 (UTC)
As far as I understood, {{Citation}} is for CS2, not CS1. If so, redirecting here ("Help talk:Citation Style 1") would probably be wrong. I'm all for merging CS1 and CS2, but for as long as this hasn't happened, CS2 followers probably need a place to hold out as well. However, crosslinking would be appropriate, so that discussions won't be missed (as it apparently happens often).
--Matthiaspaul (talk) 16:29, 10 November 2020 (UTC)
The CS1 module handles CS2 and questions regarding it are 99% applicable to both. Help talk:CS2 also redirects here. --Izno (talk) 18:44, 10 November 2020 (UTC)
Almost, Help talk:Citation Style 2. Perhaps, we should redirect Template talk:Citation there?
--Matthiaspaul (talk) 22:08, 10 November 2020 (UTC)
No. Here is best. Help talk:Citation Style 2 has 29 watchers. Template talk:Citation has 201 watchers. This page has 384 watchers. No doubt, many of those watchers are the same.
Trappist the monk (talk) 22:16, 10 November 2020 (UTC)
Merge the pages, rename & redirect. Only after the appropriate discussion. What the module does is irrelevant to how humans discuss and categorize things. If editors want to have separate pages for discussion because it makes sense to them, then that is how it should be. 208.251.187.170 (talk) 12:55, 11 November 2020 (UTC)

Fixed evaluation of accept-this-as-is syntax in parameters supporting item lists

Template parameters supporting item lists such as |pages=, |pp=, |issue=, |number= (and now also |quote-pages=) supported the accept-this-as-is syntax to suppress the conversion of hyphens to dashes globally as well as for individual list items. However, a bug prevented the code from properly evaluating item lists, where the first and the last list items were using this syntax. Such combinations were erroneously interpreted as if the global accept-this-as-is markup was used, resulting in invalid list items (fifth and last example). This has been fixed now:

Cite journal comparison
WT {{cite journal |title=Title |author=Author |pages=1-3,5-7 |journal=Journal}}
Live Author. "Title". Journal: 1–3, 5–7.
Sandbox Author. "Title". Journal: 1–3, 5–7.
Cite journal comparison
WT {{cite journal |title=Title |author=Author |pages=1,201-1,234 |journal=Journal}}
Live Author. "Title". Journal: 1, 201–1, 234.
Sandbox Author. "Title". Journal: 1, 201–1, 234.
Cite journal comparison
WT {{cite journal |title=Title |author=Author |pages=((1,201–1,234)) |journal=Journal}}
Live Author. "Title". Journal: 1,201–1,234.
Sandbox Author. "Title". Journal: 1,201–1,234.
Cite journal comparison
WT {{cite journal |title=Title |author=Author |pages=((1-3,5-7)) |journal=Journal}}
Live Author. "Title". Journal: 1-3,5-7.
Sandbox Author. "Title". Journal: 1-3,5-7.
Cite journal comparison
WT {{cite journal |title=Title |author=Author |pages=((1-3)),((5-7)) |journal=Journal}}
Live Author. "Title". Journal: 1-3)),((5-7.
Sandbox Author. "Title". Journal: 1-3, 5-7.
Cite journal comparison
WT {{cite journal |title=Title |author=Author |pages=((1-3)),5-7 |journal=Journal}}
Live Author. "Title". Journal: 1-3, 5–7.
Sandbox Author. "Title". Journal: 1-3, 5–7.
Cite journal comparison
WT {{cite journal |title=Title |author=Author |pages=((((1-3)),((5-7)))) |journal=Journal}}
Live Author. "Title". Journal: ((1-3)),((5-7)).
Sandbox Author. "Title". Journal: ((1-3)),((5-7)).
Cite journal comparison
WT {{cite journal |title=Title |author=Author |pages=((1-3)),((5-7)),9-10 |journal=Journal}}
Live Author. "Title". Journal: 1-3, 5-7, 9–10.
Sandbox Author. "Title". Journal: 1-3, 5-7, 9–10.
Cite journal comparison
WT {{cite journal |title=Title |author=Author |pages=((1-3)),5-7,((9-10)) |journal=Journal}}
Live Author. "Title". Journal: 1-3)),5-7,((9-10.
Sandbox Author. "Title". Journal: 1-3, 5–7, 9-10.
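The intended behaviour in the Sandbox rows above can be approximated with a short Python sketch. This is not the module's Lua code; the function names are made up, and the rule used to tell the global markup apart from a list whose first and last items are both wrapped (checking that the (( and )) pairs inside the outer wrapper are balanced) is an assumption that merely reproduces the nine examples.

```python
import re

EN_DASH = "\u2013"


def _is_globally_wrapped(value: str) -> bool:
    """True if the whole value is framed by one outer ((...)) pair."""
    if len(value) < 4 or not (value.startswith("((") and value.endswith("))")):
        return False
    depth = 0
    # Inside the outer pair, every '))' must close an earlier '(('.
    for token in re.findall(r"\(\(|\)\)", value[2:-2]):
        depth += 1 if token == "((" else -1
        if depth < 0:
            return False  # e.g. ((1-3)),((5-7)) is a list, not global markup
    return depth == 0


def format_page_list(value: str) -> str:
    """Convert hyphens to en dashes except where ((...)) markup forbids it."""
    value = value.strip()
    if _is_globally_wrapped(value):
        return value[2:-2]  # accept the whole value as is
    items = []
    # A wrapped item may contain commas; otherwise split on commas.
    for item in re.findall(r"\(\(.*?\)\)|[^,]+", value):
        item = item.strip()
        if not item:
            continue
        if item.startswith("((") and item.endswith("))"):
            items.append(item[2:-2])  # accept this item as is
        else:
            items.append(item.replace("-", EN_DASH))
    return ", ".join(items)


print(format_page_list("((1-3)),5-7,((9-10))"))
```

Only commas are treated as list separators here, which is a simplification.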

--Matthiaspaul (talk) 02:19, 4 November 2020 (UTC)

The parameter evaluation for |volume= internally uses parts of the same code for list item evaluation, hyphen-to-dash conversion, and accept-this-as-is markup recognition as used for |issue=, |pages=, etc. above. However, a bug in the somewhat-heuristic code deciding if a volume value should be presented in boldface or not prevented this from being executed if the given argument was longer than 4 characters. This has now been fixed as well.
As before, the volume is shown in boldface only if it is a single number consisting of either Arabic or Roman digits only or if it is not longer than 4 characters in total, that is, ranges are displayed in boldface only if they are very short, and list items framed with the accept-this-as-is markup are never shown in boldface. However, given the many requests in the past asking to not display volumes in boldface at all, this can be seen as a feature as well to optionally suppress boldface also for short volume values: ((1)), ((X)), ((1-2)), ((1–2)).
Cite journal comparison
WT {{cite journal |title=Title |volume=2 |author=Author |journal=Journal}}
Live Author. "Title". Journal. 2.
Sandbox Author. "Title". Journal. 2.
Cite journal comparison
WT {{cite journal |title=Title |volume=((2)) |author=Author |journal=Journal}}
Live Author. "Title". Journal. ((2)).
Sandbox Author. "Title". Journal. 2.
Cite journal comparison
WT {{cite journal |title=Title |volume=X |author=Author |journal=Journal}}
Live Author. "Title". Journal. X.
Sandbox Author. "Title". Journal. X.
Cite journal comparison
WT {{cite journal |title=Title |volume=((X)) |author=Author |journal=Journal}}
Live Author. "Title". Journal. ((X)).
Sandbox Author. "Title". Journal. X.
Cite journal comparison
WT {{cite journal |title=Title |volume=1-2 |author=Author |journal=Journal}}
Live Author. "Title". Journal. 1–2.
Sandbox Author. "Title". Journal. 1–2.
Cite journal comparison
WT {{cite journal |title=Title |volume=((1-2)) |author=Author |journal=Journal}}
Live Author. "Title". Journal. ((1-2)).
Sandbox Author. "Title". Journal. 1-2.
Cite journal comparison
WT {{cite journal |title=Title |volume=1–2 |author=Author |journal=Journal}}
Live Author. "Title". Journal. 1–2.
Sandbox Author. "Title". Journal. 1–2.
Cite journal comparison
WT {{cite journal |title=Title |volume=((1–2)) |author=Author |journal=Journal}}
Live Author. "Title". Journal. ((1–2)).
Sandbox Author. "Title". Journal. 1–2.
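The boldface rule described above can be written down as a small predicate. This is a Python sketch of the stated heuristic, not the module's Lua code. It assumes that Roman digits means the uppercase letters I, V, X, L, C, D, M, that the length is measured after any hyphen-to-dash conversion, and it only looks at markup framing the whole value.

```python
import re


def volume_is_bold(value: str) -> bool:
    """Sketch of the heuristic: should this volume render in boldface?"""
    value = value.strip()
    if not value:
        return False
    # Values framed with the accept-this-as-is markup are never bold.
    if value.startswith("((") and value.endswith("))"):
        return False
    # A single number in Arabic or (assumed uppercase) Roman digits is bold.
    if re.fullmatch(r"[0-9]+", value) or re.fullmatch(r"[IVXLCDM]+", value):
        return True
    # Anything else is bold only when it is at most 4 characters long.
    return len(value) <= 4


for sample in ["2", "((2))", "X", "1-2", "1-20000"]:
    print(sample, volume_is_bold(sample))
```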
--Matthiaspaul (talk) 20:40, 4 November 2020 (UTC)
If this is a way to circumvent/subvert the module styling, please find another solution or revert yourself. --Izno (talk) 21:01, 4 November 2020 (UTC)
This would be pointless as the volume evaluation code has always been based on heuristics trying to cover the most common cases in the most desirable way for most users, but it never ruled out potentially invalid entries. The fixed code is an improvement on this, but it still does not rule out all corner-cases, also to keep the changes minimal and the code small.
If the above mentioned behaviour (which was not some deliberately coded feature) would be actually undesired it might be possible to add extra code to explicitly test for this condition and disallow it, but I think it is easier to just not enter them this way (as before). And to rule out these combinations, that code would have to be added to the original code as well, so nothing would be gained by reverting.
However, I mentioned this possibility because we have had many requests in the past to streamline the display of volumes (that is, to not bold them at all), so some users might even find this useful (if documented accordingly). The existing heuristics were the result of trying to find a compromise so that some short and special types of volumes would be displayed in boldface whereas others would not. This works exactly like before.
--Matthiaspaul (talk) 22:40, 4 November 2020 (UTC)
An aside: I doubt that the "existing heuristics" was the result of any compromise. If I remember correctly, some years back, somebody suggested that long volume labels be unbolded because of reasons (probably purely esthetic). The initial "discussion" was barely 3 comments long, IIRC. And that was it, |volume= was reclassified into the bipolar bin. As you state, many people have asked for a resolution either way (all bold font or all regular). It must be somebody's pet cause, because nothing has transpired. Other than that, if your edits cause no harm and correct a bug (personally I was not aware of it) then I don't see why they shouldn't stand. 98.0.246.242 (talk) 03:43, 5 November 2020 (UTC)
FWIW, here are some links to former discussions regarding the bolding/non-bolding of the volume label:
--Matthiaspaul (talk) 21:07, 16 November 2020 (UTC)

Italics

I want to italicize the newspapers in Dietrich Adam but it comes up with an error. Please allow the option to do it manually, I hate it when things are controlled by a template.† Encyclopædius 13:12, 5 November 2020 (UTC)

Use {{cite news}} and |newspaper=
{{cite news |url=http://www.spiegel.de/kultur/tv/dietrich-adam-ist-tot-friederich-stahl-in-sturm-der-liebe-a-548adb45-64fe-49ce-8e71-58a7cce9c3a9 |title=Schauspieler Dietrich Adam ist tot |newspaper=Der Spiegel |date=4 November 2020|access-date=5 November 2020 |language=de}}
"Schauspieler Dietrich Adam ist tot". Der Spiegel (in German). 4 November 2020. Retrieved 5 November 2020.
Did the error message help text not answer this question?
Trappist the monk (talk) 13:19, 5 November 2020 (UTC)
Wikipedia doesn't actually force anyone to use citation templates. The only requirement is that the style you use looks identical to the one in the rest of the article. Glades12 (talk) 13:48, 6 November 2020 (UTC)

Request for the "nbk" (NCBI bookshelf) attribute for "cite book"

Please add the "nbk" attribute to the "cite book" template to specify the NCBI NBK number. You already have the "pmc" and "pmid" attributes, but "nbk" is different. It refers to the NCBI bookshelf site, which has a different URL format from PubMed Central. The URL to the bookshelf looks like http://www.ncbi.nlm.nih.gov/books/NBK557634/ (where 557634 is the NCBI NBK number). My idea is that when you specify "nbk" in "cite book", the direct URL to the book at the NCBI site will be generated. Currently, NCBI bookshelf books cannot be accessed directly from Wikipedia or other Wikimedia sites that allow the "cite book" template. Maxim Masiutin (talk) 19:42, 6 November 2020 (UTC)
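As a rough illustration of the request, assuming the URL pattern quoted above, a hypothetical helper could turn an NBK number into a link. No |nbk= parameter exists in the templates at this point; the function name and the tolerance for an "NBK" prefix are assumptions.

```python
import re


def nbk_url(nbk: str) -> str:
    """Build a bookshelf URL from an NBK number (hypothetical helper)."""
    number = nbk.strip()
    # Tolerate values entered with the 'NBK' prefix already attached.
    if number.upper().startswith("NBK"):
        number = number[3:]
    if not re.fullmatch(r"[0-9]+", number):
        raise ValueError(f"not an NBK number: {nbk!r}")
    return f"http://www.ncbi.nlm.nih.gov/books/NBK{number}/"


print(nbk_url("557634"))
# http://www.ncbi.nlm.nih.gov/books/NBK557634/
```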

Weird category text

What's going on with Category:CS1 errors: dates? A bunch of sectioned text just appeared today, that don't have to do with dates. Does it have to do with the {{#lst}} stuff? I don't understand how those work. kennethaw88talk 22:14, 6 November 2020 (UTC)

Thanks for reporting this. A couple of hours ago I swapped some sections at Help:CS1 errors to reestablish the alphabetical order of entries, however, I must have overlooked something. As Izno reverted me, the effect should already have been gone by now. To be sorted out.
--Matthiaspaul (talk) 23:03, 6 November 2020 (UTC)
Fixed. --Matthiaspaul (talk) 11:04, 7 November 2020 (UTC)

Triple curly

From Women in the Byzantine Empire:

{{cite book| author = | chapter = | chapter-url = | format = | url = | title = [[The Oxford Dictionary of Byzantium]] | orig-year = | agency = ed. by Dr. [[Alexander Kazhdan]] | edition = |location= N. Y. |date = 1991 |publisher= |volume= {{{том|}}} | pages = {{{страницы|}}}| series = | isbn = 0-19-504652-8| ref = {{harvid|Kazhdan|1991}}}}

Produces:

Are triple curly-brackets {{{том|}}} and {{{страницы|}}} error or feature? -- GreenC 16:09, 7 November 2020 (UTC)

The template variables are in the first version of that article. cs1|2 does not see them because they are empty strings by the time the template is passed to Module:Citation/CS1.
Trappist the monk (talk) 16:30, 7 November 2020 (UTC)
(edit conflict) It's an error caused by copying and pasting the template from the Russian Wikipedia when the article was created. I found only one other instance of this problem in article space, so it looks like it is not a big problem. – Jonesey95 (talk) 16:35, 7 November 2020 (UTC)
This is good news as finding the template's terminus }} when there are triple curly brackets embedded raised some edge case complications, now they can just be logged and removed. -- GreenC 00:40, 8 November 2020 (UTC)

Epic citations

Occasionally come across citations that might be described as "epic". From Parallel (operator):

<ref name="Cajori_1928">{{cite book |author-first=Florian |author-last=Cajori |author-link=Florian Cajori |title=A History of Mathematical Notations – Notations in Elementary Mathematics |chapter=§ 184, § 359, § 368 |volume=1 |orig-date=September 1928 |publisher=[[Open court publishing company]] |location=Chicago, US |date=1993 |edition=two volumes in one unaltered reprint |pages=[http://archive.org/details/historyofmathema00cajo_0/page/193 193, 402–403, 411–412] |isbn=0-486-67766-4 |lccn=93-29211 |url=http://archive.org/details/historyofmathema00cajo_0/page/193 |access-date=2019-07-22 |quote-pages=402–403, 411–412 |quote=§359. […] ∥ for parallel occurs in [[William Oughtred|Oughtred]]'s ''Opuscula mathematica hactenus inedita'' (1677) [p. 197], a posthumous work (§ 184) […] §368. Signs for parallel lines. […] when [[Robert Recorde|Recorde]]'s sign of equality won its way upon [[the Continent]], vertical lines came to be used for parallelism. We find ∥ for "parallel" in [[John Kersey the elder|Kersey]],{{citeref|A|ref=FC-A}} [[John Caswell|Caswell]], [[William Jones (mathematician)|Jones]],{{citeref|B|ref=FC-B}} Wilson,{{citeref|C|ref=FC-C}} [[William Emerson (mathematician)|Emerson]],{{citeref|D|ref=FC-D}} Kambly,{{citeref|E|ref=FC-E}} and the writers of the last fifty years who have been already quoted in connection with other pictographs. Before about 1875 it does not occur as often […] Hall and Stevens{{citeref|F|ref=FC-F}} use "par{{citeref|F|ref=FC-F}} or ∥" for parallel […] {{anchor|FC-A}}[A] [[John Kersey the elder|John Kersey]], ''{{citeref|Kersey (the elder)|1673|Algebra|style=plain}}'' (London, 1673), Book IV, p. 177. {{anchor|FC-B}}[B] [[William Jones (mathematician)|W. Jones]], ''Synopsis palmarioum matheseos'' (London, 1706). {{anchor|FC-C}}[C] John Wilson, ''Trigonometry'' (Edinburgh, 1714), characters explained. {{anchor|FC-D}}[D] [[William Emerson (mathematician)|W. Emerson]], ''Elements of Geometry'' (London, 1763), p. 4. 
{{anchor|FC-E}}[E] {{ill|Ludwig Kambly{{!}}L.<!-- Ludwig --> Kambly|de|Ludwig Kambly}}, ''Die Elementar-Mathematik'', Part 2: ''Planimetrie'', 43. edition (Breslau, 1876), p. 8. […] {{anchor|FC-F}}[F] H. S.<!-- Henry Sinclair --> Hall and F. H.<!-- Frederick Haller --> Stevens, ''Euclid's Elements'', Parts I and II (London, 1889), p. 10. […]}} [http://monoskop.org/images/2/21/Cajori_Florian_A_History_of_Mathematical_Notations_2_Vols.pdf]</ref>

Might we have a page to document epic/creative usage of a single CS1|2 citation? -- GreenC 14:49, 8 November 2020 (UTC)

Already exists, though likely, very few of us know of it: Module talk:Citation/CS1/Rogues gallery.
Trappist the monk (talk) 15:05, 8 November 2020 (UTC)
It seems the main problem here is the misused |quote=. Personally I would only use that parameter to quote items relevant to the publication itself (from the verso, index, toc etc.). I would use footnotes for any quoted content. 65.204.10.231 (talk) 15:33, 8 November 2020 (UTC)
There is one even longer in Exponentiation. I think it will be the specimen for the museum gallery. -- GreenC 02:56, 11 November 2020 (UTC)
Now on display (last entry). -- GreenC 03:05, 11 November 2020 (UTC)
Epic enough to have its own page in article space. 208.251.187.170 (talk) 13:08, 11 November 2020 (UTC)

Improving COinS metadata output

Investigating the COinS metadata output I have spotted some areas for possible improvement on various levels. Since most of them are small and/or affect corner-cases only they aren't worth individual threads polluting the TOC, so I will combine them into this thread.

There will be more, but so far there have been only two changes, both related to the metadata generated for identifiers which have no predefined &rft.<id-name> or &rft_id=info.<id-name> tags associated with them within COinS. For such identifiers, the template uses the &rft_id=<id-link> tag to provide URLs to the external resource. The code assembling such URLs uses prefix and suffix definitions from a table defining the various properties for the identifiers. While the suffix was added to the visible URLs, there was a bug omitting to add the suffix to the identifier URLs for COinS as well. This has been fixed. However, this is an internal change only and has no impact on the actually generated metadata because none of the identifiers defined so far actually used a suffix.

On the receiver side, users of the identifier data passed through via URLs may want to retranslate it back into a human-readable form "<id-name> <id-number>". While it is sometimes possible to derive the identifier type from the URL, this is not always the case. For example, DOI and bioRxiv as well as JFM and Zbl identifiers both resolve to the same URLs, respectively:

  • DOI <id-number> → "&rft_id=//doi.org/<id-number>" → ?
  • bioRxiv <id-number> → "&rft_id=//doi.org/<id-number>" → ?
  • JFM <id-number> → "&rft_id=//zbmath.org/?format=complete&q=an:<id-number>" → ?
  • Zbl <id-number> → "&rft_id=//zbmath.org/?format=complete&q=an:<id-number>" → ?

This is not a problem in the DOI case, because a predefined info:doi tag exists and thus is used by the metadata generator instead of creating a URL for it.

  • DOI <id-number> → "&rft_id=info:doi/<id-number>" → DOI <id-number>

However, to make the URLs more usable on the receiver side, the generator now appends a URI #fragment to the URLs indicating the name of the identifier. This is transparent to browsers (should this metadata be copied and pasted into the address bar of a browser), but is readable by humans and scripts, which can thereby pick up the original name and translate the URL back into the "<id-name> <id-number>" form for storage in their database. Examples:

  • bioRxiv <id-number> → "&rft_id=//doi.org/<id-number>#id-name=bioRxiv" → bioRxiv <id-number>
  • JFM <id-number> → "&rft_id=//zbmath.org/?format=complete&q=an:<id-number>#id-name=JFM" → JFM <id-number>
  • Zbl <id-number> → "&rft_id=//zbmath.org/?format=complete&q=an:<id-number>#id-name=Zbl" → Zbl <id-number>
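A receiver-side round trip might look like the following Python sketch. The prefix table copies the URL shapes from the examples above; the function names are hypothetical, and percent-encoding of the COinS value is left out for brevity.

```python
from typing import Optional, Tuple

# URL prefixes as they appear in the examples above (illustrative subset).
ID_URL_PREFIXES = {
    "bioRxiv": "//doi.org/",
    "JFM": "//zbmath.org/?format=complete&q=an:",
    "Zbl": "//zbmath.org/?format=complete&q=an:",
}


def make_rft_id(id_name: str, id_number: str) -> str:
    """Generator side: build the URL and append the identifying fragment."""
    return f"{ID_URL_PREFIXES[id_name]}{id_number}#id-name={id_name}"


def read_rft_id(url: str) -> Optional[Tuple[str, str]]:
    """Receiver side: recover (id-name, id-number) from the fragment."""
    base, _, fragment = url.partition("#")
    key, _, id_name = fragment.partition("=")
    if key != "id-name" or id_name not in ID_URL_PREFIXES:
        return None  # no usable fragment; the identifier type stays unknown
    prefix = ID_URL_PREFIXES[id_name]
    if not base.startswith(prefix):
        return None
    return id_name, base[len(prefix):]


url = make_rft_id("Zbl", "0001.00001")
print(url)
print(read_rft_id(url))
```

Without the fragment, the JFM and Zbl URLs are indistinguishable, so read_rft_id returns None for them rather than guessing.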

There are some interesting concepts for further encoding information in URI fragments to describe a resource or make it automatically actionable on the client's side. If we found a low-footprint scheme formally describing the URL as a link to information related to a specific entity of a named identifier, this could be further refined.

--Matthiaspaul (talk) 17:36, 10 November 2020 (UTC) (updated 22:45, 10 November 2020 (UTC), updated 14:26, 16 November 2020 (UTC))

I believe one or another of your changes has caused the error in test_Zbl in Module talk:Citation/CS1/errors. --Izno (talk) 19:53, 10 November 2020 (UTC)
Thanks, according to Module_talk:Citation/CS1/testcases/errors this should be fixed now (but fixing this I spotted another issue in the existing code still to be fixed). --Matthiaspaul (talk) 23:57, 10 November 2020 (UTC)

URL in identifier[edit]

Bunce, Mrs. Oliver Bell (1 September 1897). "The Turkish Compassionate Fund". The Decorator and Furnisher. doi:10.2307/25585322. JSTOR http://www.jstor.org/stable/25585322.

|JSTOR= should emit an error. --Izno (talk) 18:49, 10 November 2020 (UTC)

|jstor= is one of three external identifiers that don't get some sort of check (the others are |osti= and |rfc=). |jstor= can hold a variety of identifiers:
And then there is stuff like this that doesn't work:
Because there is such a diversity of |jstor= identifiers, we may not be able to validate them.
I think that |osti= and |rfc= are simple numeric identifiers. Likely we have not bothered to check these because there are relatively few uses of these identifiers. |rfc= seems to be max number between 8000 and 9000. |osti= seems to be max number between 22000000 and 23000000. So these two could be given simple limit checks like we do for |pmc=.
Trappist the monk (talk) 23:53, 10 November 2020 (UTC)
Sounds about right for RFC. Not familiar with OSTI.
As for JSTOR, here's some ideas: looks like it has a URL, or has spaces, as errors. We should already have URL detection from title checking, which would have caught at least two pages. (Not sure about schemeless URLs?) --Izno (talk) 01:48, 11 November 2020 (UTC)
Cite book comparison
WT {{cite book |title=Title |rfc=1}}
Live Title. RFC 1.
Sandbox Title. RFC 1.
Cite book comparison
WT {{cite book |title=Title |rfc=10000}}
Live Title. RFC 10000.
Sandbox Title. RFC 10000 Check |rfc= value (help).
Cite book comparison
WT {{cite book |osti-access=free |title=Title |osti=1}}
Live Title. OSTI 1.
Sandbox Title. OSTI 1 Check |osti= value (help).
Cite book comparison
WT {{cite book |title=Title |osti=23000001}}
Live Title. OSTI 23000001.
Sandbox Title. OSTI 23000001 Check |osti= value (help).
Trappist the monk (talk) 00:14, 15 November 2020 (UTC)
Has anyone seen OSTIs lower than 1018? Otherwise we could raise the lower limit from 1 to 1018.
--Matthiaspaul (talk) 23:08, 15 November 2020 (UTC)
As so far I could not find lower OSTI numbers to be supported by the OSTI site and only found considerably higher numbers in WP, I now changed the lower bound to 1018 to catch at least some "stray digit" errors:
Cite book comparison
WT {{cite book |title=Title |osti=0}}
Live Title. OSTI 0.
Sandbox Title. OSTI 0 Check |osti= value (help).
Cite book comparison
WT {{cite book |title=Title |osti=1017}}
Live Title. OSTI 1017.
Sandbox Title. OSTI 1017 Check |osti= value (help).
Cite book comparison
WT {{cite book |title=Title |osti=1018}}
Live Title. OSTI 1018.
Sandbox Title. OSTI 1018.
Cite book comparison
WT {{cite book |title=Title |rfc=0}}
Live Title. RFC 0.
Sandbox Title. RFC 0 Check |rfc= value (help).
Please report if you find a lower number somewhere.
--Matthiaspaul (talk) 23:59, 16 November 2020 (UTC)
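For readers following the test cases above, the limit checks can be sketched in Python roughly as follows. This is an illustration only, not the sandbox's Lua code: the table name and function name are hypothetical, and the upper bounds (9000 for RFC, 23000000 for OSTI) are assumptions picked from the ranges mentioned in this thread; the OSTI lower bound of 1018 is the one discussed above.

```python
import re

# Assumed bounds for illustration; the real values live in the module's configuration.
ID_LIMITS = {
    "rfc": (1, 9000),
    "osti": (1018, 23000000),
}


def numeric_id_is_valid(name, value):
    """Return True if value is all digits and lies within the bounds for name."""
    if not re.fullmatch(r"[0-9]+", value):
        return False  # stray letters, spaces, URLs, etc.
    low, high = ID_LIMITS[name]
    return low <= int(value) <= high


for name, value in [("rfc", "1"), ("rfc", "10000"), ("osti", "1017"), ("osti", "1018")]:
    print(name, value, numeric_id_is_valid(name, value))
```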
Both URL scheme and space detection could be useful, although I couldn't find any JSTORs starting with "http:", etc. (probably fixed by you already?). I did find about 20 citations with invalid JSTORs starting with "www.jstor.org", though. So an identifier value starting with the domain name from the URL prefix in /Configuration could be a good pattern in general as well. However, given that the other identifiers already have more sophisticated validation checks, it would only make sense to add it to JSTOR, and it still wouldn't catch someone just entering garbage...
--Matthiaspaul (talk) 16:10, 16 November 2020 (UTC)
Yeah, but at best it's a maintenance category or a properties category while we review to see what looks like trash. If we were to do something like that, we'd want to exclude obvious ones like DOI-like identifiers, as a first case. --Izno (talk) 16:31, 16 November 2020 (UTC)
A test for stray spaces and "http(s)://" at the start of the identifier string has been added to the JSTOR code.
Cite book comparison
WT {{cite book |title=Title |jstor=141294}}
Live Title. JSTOR 141294.
Sandbox Title. JSTOR 141294.
Cite book comparison
WT {{cite book |title=Title |jstor=141 294}}
Live Title. JSTOR 294 141 294.
Sandbox Title. JSTOR 141 294 Check |jstor= value (help).
Cite book comparison
WT {{cite book |title=Title |jstor=141dfdfdf29 4}}
Live Title. JSTOR 4 141dfdfdf29 4.
Sandbox Title. JSTOR 141dfdfdf29 4 Check |jstor= value (help).
Cite book comparison
WT {{cite book |title=Title |jstor=http://141294}}
Live Title. JSTOR http://141294.
Sandbox Title. JSTOR http://141294 Check |jstor= value (help).
Cite book comparison
WT {{cite book |title=Title |jstor=http://141294}}
Live Title. JSTOR http://141294.
Sandbox Title. JSTOR http://141294 Check |jstor= value (help).
However, there is still an older bug invalidating strings with spaces (also present in the live code).
--Matthiaspaul (talk) 16:50, 19 November 2020 (UTC)
Should be fixed now by encoding the id as well.
--Matthiaspaul (talk) 20:22, 19 November 2020 (UTC)
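The two JSTOR checks and the encoding fix mentioned above can be sketched in Python like this. It is a hedged illustration, not the module's code: the function names are hypothetical, and the "//www.jstor.org/stable/" prefix is an assumption based on the URL form quoted at the top of this section.

```python
import re
from urllib.parse import quote


def jstor_id_has_error(value):
    """Flag values that contain whitespace or start with an http(s):// scheme."""
    has_space = re.search(r"\s", value) is not None
    has_scheme = re.match(r"https?://", value, re.IGNORECASE) is not None
    return has_space or has_scheme


def jstor_link(value):
    """Build the link target, percent-encoding the id so spaces cannot split the URL."""
    # Assumed prefix; the real one is defined in the module's configuration.
    return "//www.jstor.org/stable/" + quote(value, safe="")


for value in ["141294", "141 294", "http://141294"]:
    print(repr(value), jstor_id_has_error(value), jstor_link(value))
```

Encoding the id before building the link is what keeps a value such as "141 294" from being split into a URL plus a stray label, which appears to be the effect visible in the live output above.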

Add an iaident parameter[edit]

CS1 templates are very complex and ever changing, and writing a bot to enhance certain references, such as book references, to make them more easily accessible to readers can have unintended side-effects, consequences that may actually make things worse. I propose adding two new parameters to the CS1 templates. The first one is iaident. When this is populated, the module can figure out where to put the link to archive.org. If the citation has no URL, the link can go where any URL would normally go; if it already has one, the module can perhaps append the link to the citation in some way, like "View at archive.org" or something like that. The URL would be http://archive.org/details/<iaident>. The second parameter would be iaoffset. In certain cases where pages don't link properly, iaoffset would be used to direct the server to the correct page/location of the media being viewed. This is the raw location. When it is used, the URL simply becomes http://archive.org/details/<iaident>/page/n<iaoffset>.

These two additions will have no impact on existing citations and will allow a more harmonious addition of readable page previews to citations without stepping on anyone's toes, or accidentally breaking something in an existing reference.—CYBERPOWER (Chat) 13:28, 16 November 2020 (UTC)
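To make the proposed URL construction concrete, here is a minimal Python sketch of the two patterns given in the proposal. The function name is hypothetical, and the parameter names simply mirror the proposed iaident and iaoffset; nothing here reflects an existing implementation.

```python
def ia_details_url(iaident, iaoffset=None):
    """Build the archive.org link from the two proposed parameters.

    Without iaoffset the link points at the item itself; with it, the link
    points at the raw scan location via the /page/n<iaoffset> suffix.
    """
    url = f"http://archive.org/details/{iaident}"
    if iaoffset is not None:
        url += f"/page/n{iaoffset}"
    return url


print(ia_details_url("sixmonthsatwhit02carpgoog"))
print(ia_details_url("sixmonthsatwhit02carpgoog", 10))
```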

We already have provision for archive links - why do we need special provision for the Internet Archive? They don't need any further advertising here. Nigel Ish (talk) 14:07, 16 November 2020 (UTC)
Nigel Ish, what I proposed is not an archive link, it's a link to a book scan at Internet Archive for readers to preview in an attempt to improve verifiability. The addition of these links is already approved, so the claim they are advertising is false. Internet Archive has nothing to gain from "advertising" their service. They are not making any revenue off of it. For example, you have a Cite Book reference with no link to be able to view the book. That's what this will serve. It only serves to make it easier for readers and editors to verify a claim on Wikipedia. I don't see how this does anything but help Wikipedia's core principles. —CYBERPOWER (Chat) 14:43, 16 November 2020 (UTC)
I am not sure I understand. As noted above, there's an archive url parameter already, for works that can be found in an archive. And |via= can inform the reader that the version of the work they are reading is published in an archive. If the work is only found in an online archive, then what is cited is the archive, likely via {{cite web}}. The particulars of the citation will make this obvious. I don't know what this has to do with bots "enhancing references" or how complexity can be reduced by adding even more specialized parameters. 65.204.10.231 (talk) 14:13, 16 November 2020 (UTC)
To explain more clearly, archive URL is for archives of websites. What I'm proposing is not an archive of a web page. It's a media URL of a book, magazine, whatever, that is stored at Internet Archive. As it currently stands, these URLs are placed in the url section, but doing that may have other consequences such as clashing with title-link, or something else I, or another botop, may be unaware of. The proposal is to just put this info in its own parameter so the template can deal with it appropriately. —CYBERPOWER (Chat) 14:47, 16 November 2020 (UTC)
Archive URLs point to any item archived online, be it webpage, book, video etc. As mentioned previously, when one cites a scanned item at Internet Archive, one is actually citing the archive. The source (in this case a website) is the Internet Archive. The scanned item (they are all digitized by scanning or other means) is an entry (webpage location) in that website. There is no need for an identifier, and I still don't understand how bots enter into this. If you feel something like that is needed, you can always make a wrapper for {{cite web}} as a single-source/special purpose template for Internet Archive. There are several examples. 50.74.165.202 (talk) 16:44, 16 November 2020 (UTC)

There are over 600,000 citations that link scanned books. Examples. It does seem kind of silly we don't use the ID system for this, it is one of the most frequently linked things on enwiki. There are 3.7 million {{cite book}} templates and if all these were in cite books (most are) that is 16%. -- GreenC

Most identifier parameters do not contain "id" or "identifier" in their name, so if this is introduced please just call it "ia" or "internetarchive". Note that we already have OpenLibrary identifiers that can be used to link a large part of IA books (but not other content).
I have no opinion on whether using an identifier is preferable to using the URL, but I support the stated goal (to facilitate linking books). Maybe it can simply be achieved by some Lua transformations on the URL? Nemo 16:24, 16 November 2020 (UTC)
Which reminds me that we should put |ol= into the metadata to make it easier for third-parties to correlate the data. (The technical reason for why we don't include it already is because different OL identifiers require different prefixes and this doesn't fit very well into the current implementation.)
--Matthiaspaul (talk) 16:47, 16 November 2020 (UTC)
Nemo bis, No objections to the naming conventions. —CYBERPOWER (Chat) 17:01, 16 November 2020 (UTC)
(edit-conflict) So, what you both are asking for is basically an identifier for archive.org, so that it does not occupy the title link? I like this idea, and if this identifier would be included in the list of auto-linking targets, it would be as convenient to use as if it would occupy |url= by itself but only be considered by the template when |url= is not specified as well. This would free |url= for other uses. If this is what you propose, I would support it. Ideally, though, this parameter would not take a complete URL such as "http://archive.org/details/sixmonthsatwhit02carpgoog" as a value, but just an id (like "Identifier=sixmonthsatwhit02carpgoog"). How does this correspond with the "Identifier-ark=ark:/13960/t40s07c8h"? Is it possible to derive the former from the latter (ark)?
Is my assumption correct that these scanned documents do not need to be archived any more because they can be considered to be archived already, that is, these links will be permanent? This would be another argument for having a specific identifier parameter for them and leave |url= with its |archive-url= companion for links which actually need |archive-url= to prevent link-rot.
--Matthiaspaul (talk) 16:38, 16 November 2020 (UTC)
We are not in the business of developing identifiers, nor of extracting homebrewed ones from URL fragments. Nor is this a novel idea; similar ones have been discussed before. It hasn't happened for the reasons already spelled out here. This is more or less superfluous. Adds complexity. Brings nothing extra to discovery. Hasn't anyone noticed that editors can insert custom ids? In |id= an editor can insert the source's own identifying scheme, if any. 50.74.165.202 (talk) 17:01, 16 November 2020 (UTC)
Matthiaspaul, everything at Internet Archive is intended to be there permanently. There are some very rare exceptions to that rule, but what is saved to the Internet Archive will generally stay there forever. —CYBERPOWER (Chat) 17:14, 16 November 2020 (UTC)
I'm actually not aware of Identifier-ark. What does it do? —CYBERPOWER (Chat) 17:16, 16 November 2020 (UTC)
On the page (http://archive.org/details/sixmonthsatwhit02carpgoog) I linked above (nothing special, just the first example I found writing this), the entry "Identifier" contains the value "sixmonthsatwhit02carpgoog", and the entry "Identifier-ark" the value "ark:/13960/t40s07c8h", respectively. I have seen those "ark" identifiers in other IA pages related to scanned books, that's why I am interested in how they are related. --Matthiaspaul (talk) 18:01, 16 November 2020 (UTC)
Matthiaspaul, okay, I just wanted to be sure, but they are completely unrelated. It is not possible to derive either value from the other. —CYBERPOWER (Chat) 13:05, 17 November 2020 (UTC)
I support the addition of a |ia= with the caveat that it should be documented to take the Internet Archive identifier (and, yes, these are unique identifiers assigned by IA; they just don't have a resolver that abstracts the identifier from the physical address (URL)) of the scan where the information it supports was found, rather than any old scan of some book that may or may not be the same work in the same edition in a copy sufficiently identical to the original to support WP:V. People will still use it sloppily of course, but if the definition is strict we at least pull the trend in the right direction over time. This also means we treat it as an identifier and not a convenience link (those can go in |url=). This means the derived URL should not be auto-promoted to the |url=. It also means the parameter should not be bot-populated unless other information in the template uniquely identifies the scan to which it refers. IA book scans are a great resource and we should take advantage of it to the fullest extent practical, but not uncritically and sloppily.
I don't see the case for the proposed |iaoffset= parameter, and at first blush it would seem to be conceptually in conflict with everything else in CS1. --Xover (talk) 18:57, 16 November 2020 (UTC)
Xover, iaoffset is needed in the event the page number itself is not providing a working link to the target page of the book. iaoffset will change the link to the raw location of the book you want to view, which will always work. It's hopefully not going to be needed often. Use cases are roman numerals or numberless pages being referenced. —CYBERPOWER (Chat) 13:07, 17 November 2020 (UTC)
I have seen digitized blobs of many journals/magazines/collections in one file. Would this |ia-offset= (provisional name) be useful to point to the start of the relevant work as well?
However, I'm not too fond of adding two parameters for this. Perhaps, in those cases where it is needed, it should be allowed to just append /page/n<iaoffset> to the identifier... '/' is obviously a character which can never occur in the identifier. Are there other "reserved" characters? What is the format of these identifiers (as RegEx or similar)?
--Matthiaspaul (talk) 13:44, 17 November 2020 (UTC)
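The single-parameter alternative floated above (appending /page/n<iaoffset> to the identifier) would require the template to split the value again. A minimal Python sketch of such a split follows; the function name is hypothetical, and it relies on the stated assumption that '/' never occurs inside an identifier.

```python
def split_ia_value(value):
    """Split "<ident>" or "<ident>/page/<locator>" into (ident, locator).

    locator is None when no page suffix is present. Assumes '/' cannot
    appear inside the identifier itself.
    """
    ident, sep, rest = value.partition("/")
    if not sep:
        return ident, None
    prefix = "page/"
    if rest.startswith(prefix) and len(rest) > len(prefix):
        return ident, rest[len(prefix):]
    raise ValueError(f"unexpected suffix after identifier: {rest!r}")


print(split_ia_value("sixmonthsatwhit02carpgoog"))
print(split_ia_value("sixmonthsatwhit02carpgoog/page/n10"))
```

Rejecting any other suffix keeps the value from silently turning into an arbitrary path, at the cost of making the parameter less opaque than other identifiers.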
Matthiaspaul, n<iaoffset> is a pointer to the raw page scan location of the work. For example, n5 would take you to the 5th image scan of the media, which would probably be the cover page, or book information and copyright. n10 may take you to a page in the book with the page number iii. Conversely, dropping the n will take you to the book's page 10. In most cases the n prefix doesn't need to be used, but there are cases where it is required so that the link goes straight to the desired page that has the information needed to verify the reference. —CYBERPOWER (Chat) 13:54, 17 November 2020 (UTC)
Is there a document describing the inner format (if there is any) of these identifiers for validation checks, or are they just strings of random length containing random characters without checksum or date information? Who composes these identifiers and according to which rules?
--Matthiaspaul (talk) 15:01, 17 November 2020 (UTC)
Matthiaspaul, nope. There is no hidden information in these strings. They're effectively almost random. —CYBERPOWER (Chat) 21:01, 17 November 2020 (UTC)
@Cyberpower678: I understand its intended functionality, but I still don't see the case for adding it. No other identifier supported in CS1 links directly to a specific page (caveat: there are some field-specific ones in there that I'm not that familiar with), but to the work as such or a specific copy of it, and that's quite good enough. Linking directly to a specific point in a source is at best a convenience, and in some contexts can even be a (very very minor) inconvenience. Matthiaspaul's example above (linking to a specific article within a magazine or a specific issue within a whole volume collection of a periodical) is the best use case for this, but even in those instances it falls into "convenience" territory and fails to justify the addition of a dedicated parameter IMO (and the same goes for the additional complexity of trying to encode it into the identifier; identifiers should generally be opaque). --Xover (talk) 14:34, 17 November 2020 (UTC)
It seems that we have heard this type of request before, particularly for a google books 'id'. If I remember correctly, those requests were rejected because the 'id' isn't a persistent id and in fact, isn't an id at all, but merely a token in the url query string. I also recall Semantic Scholar's wish for an identifier. They originally wanted us to use the forty character path element from their url:
http://www.semanticscholar.org/paper/041a49f7fdc8eef74ac2e52a768011ed0c29d0ce
Before we would let them have a cs1|2 identifier, we required them to create a simpler form, their corpus ID which they then map to whatever url they want:
http://api.semanticscholar.org/CorpusID:219352572
|s2cid=219352572
The |ia-identifier=sixmonthsatwhit02carpgoog seems a lot the same to me.
HathiTrust, uses the handle system to link to books and to specific places in that book. For example, their copy of Six Months at the White House with Abraham Lincoln is here:
http://hdl.handle.net/2027/uc1.$b301895
and to link to page 15 they give this as the handle:
http://hdl.handle.net/2027/uc1.$b301895?urlappend=%3Bseq=23
I could imagine an IA corpus ID (something with a check-digit would be good) so: |iacid=<corpus ID> for the book and if a particular scan is desired then perhaps something like |iacid=<corpus ID>.n<scan ID>. cs1|2 would then build a handle system url that internet archive can redirect to the appropriate location
Why isn't Internet Archive listed at Special:BookSources?
Trappist the monk (talk) 12:41, 18 November 2020 (UTC)
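For comparison, the HathiTrust handle form quoted above can be sketched in Python as follows. The function name is hypothetical; the URL pattern, including the "%3Bseq=" suffix, is copied from the two example URLs in the comment rather than from any HathiTrust documentation.

```python
def hathitrust_handle_url(handle, seq=None):
    """Build a handle-system URL, optionally pointing at a scan sequence number."""
    url = f"http://hdl.handle.net/{handle}"
    if seq is not None:
        # %3B is a percent-encoded ';', as in the quoted example.
        url += f"?urlappend=%3Bseq={seq}"
    return url


print(hathitrust_handle_url("2027/uc1.$b301895"))
print(hathitrust_handle_url("2027/uc1.$b301895", 23))
```

Note that in the quoted example page 15 corresponds to seq=23, so the sequence number is a scan position rather than a printed page number, much like the n-prefixed offset discussed for IA.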
All this is well and good, but also a moot point since any such id is not necessary. It adds nothing that cannot easily be done now, without it. Instead of wasting time in trinkets, I would direct everybody's energies into fixing the many design and logical flaws in the cs1/cs2 system. 65.204.10.231 (talk) 13:42, 18 November 2020 (UTC)
(edit-conflict) I have run into cases in a citation where I wanted to include a "genuine" URL to some document/site but also had a link to a digitized copy of the work at Google Books or Internet Archive, so I had to append some of those links after the citation as convenience links. I have also seen editors or bots/scripts "fighting" over those entries by replacing the URL in |url= by one of the Google- or IA-type ones. It would have been much better, if those extra resources could be listed among the identifiers, so that they don't occupy the place of |url= any more and the bots would have a dedicated place where to put them without disturbing anyone. If parameters like |ia= or |gbooks= (provisional names) would be included in the list of auto-linking identifiers, they could still show up as title links if none of the other links take precedence.
However, as Trappist correctly pointed out, it only makes sense for "identifiers" which are established and stable long-term and don't need an archived link to prevent link-rot (because they are already sort-of-links-to-archived-copies). Also, it would be great if they would be shorter and follow some logical system (or we'd have to devise some way to link to them without showing the value)...
As Cyberpower and GreenC both have good connections to IA, they likely know who to ask at IA to make this happen.
--Matthiaspaul (talk) 16:28, 18 November 2020 (UTC)
Matthiaspaul, identifiers don't change. Once assigned, they are permanent. —CYBERPOWER (Chat) 21:44, 18 November 2020 (UTC)
BTW. They already have property assignments in Wikidata:
So, if we'd have corresponding parameters for them they could be used by {{cite Q}} as well.
--Matthiaspaul (talk) 17:34, 18 November 2020 (UTC)
Trappist the monk, IA identifiers however are persistent and do map to a specific scan. I'm not sure what exactly you are asking here. They are not tokens. The addition of /page/<page> further points to a specific location of said scan. This will never change. Furthermore, the values of page, p, pp, pages can be used by the module to assist in said pointing unless overridden by the offset parameter, or by the specification of /page/<page> in the identifier param. —CYBERPOWER (Around) 16:02, 18 November 2020 (UTC)
This will never change. Maybe; maybe not. Whatever mechanism IA uses is proprietary to IA. It seems better to me to avoid proprietary systems and use a system supported by many users so the handle system seems to fit; cs1|2 already supports |hdl= so we don't have to craft something special for IA.
I'm not sure that I see the need for a separate identifier. The primary use of cs1|2 templates is (supposed to be) to identify the source that the en.wiki editor consulted to support our article. I have never really felt comfortable with bots adding, and especially replacing, urls that the bot surmises may link to the source the editor consulted. Unless these bots have learned how to mindread, the bot does not and cannot know with any certainty what source the editor consulted. If editors want to blue-link titles to sources available at IA, they can use |url= to link to the source that they consulted.
The only question I asked, and that you did not answer, was: Why isn't Internet Archive listed at Special:BookSources?
Trappist the monk (talk) 20:20, 18 November 2020 (UTC)
Trappist the monk, I can't answer that question. I'm not familiar with the functions of Special:BookSources. I don't understand your argument of proprietary. The strings are arbitrary, and unique to the book scan it's linked to. A bot does not need to mind read to ISBN match a book to something stored at Internet Archive. ISBNs are also unique, so there's no mindreading going on here. A unique identifier to a book, added by a human, is being matched to a unique identifier at IA. —CYBERPOWER (Chat) 21:41, 18 November 2020 (UTC)
In concept ISBNs are unique. In practice, they are not always unique. In past discussion on this page, Editors noted that ISBNs are not always unique because different editions may have different pagination, different covers, etc. But ISBNs are why I asked about Special:BookSources. If it is possible to search IA with an ISBN then IA should be listed at Special:BookSources; if google and amazon, why not IA? Get IA listed at Special:BookSources and there will be no need for a special identifier in cs1|2. A listing at Special:BookSources does not prevent editors from adding direct links with |url= to the facsimile at IA, and may increase the use of IA urls for books; better to link to IA than to google or amazon, isn't it? Google and amazon are right there at the top of the list at Special:BookSources; is it any wonder that editors looking for courtesy links use them?
Does citoid know about books at IA? If not, why not? I know that citoid knows about worldcat which has abominably poor metadata. If you can demonstrate that the metadata at IA are as good or better than the metadata at worldcat, I would think it a no-brainer for citoid to use IA, especially because IA has copies of the books it indexes whereas worldcat does not.
The strings are arbitrary... Arbitrary. That's certainly part of it for me. The strings are arbitrary and, for the example in this discussion, sixmonthsatwhit02carpgoog, seem to suggest that google is where I will land if I click on that 'identifier'. Arbitrary does not look systematic, it does not look professional. Editors at discussions here and elsewhere have complained that readers won't click on identifiers because they don't understand the meaning of the initialisms and so are intimidated. I think that our readers are smarter than that; especially readers who have gotten to the point of following an article far enough that the references matter.
I don't think that a proprietary system that uses arbitrary strings benefits en.wiki. I have a hard time believing it whenever anyone says [this] will never change. This is the internet; nothing on the internet is static. A non-proprietary system, supporting multiple users is, I think, a better long-term choice for en.wiki because the stable identifier abstracts to the actual url of the source. That url can change as source providers upgrade their technology and internal data handling without it impacting us.
Trappist the monk (talk) 00:23, 19 November 2020 (UTC)
A couple of points here…
I agree, and have previously suggested to both Cyberpower and Markjgraham, that they should first pursue options for making IA links easy for humans to add, specifically through Special:BookSources and Citoid. I am worried by their failure to pursue these options and read it as indication that they are only really interested in approaches that let them bulk-add links to IA via bot (cf. WP:VPP § Stop InternetArchiveBot from linking books and WP:BOTN § VPPOL discussion closed: linking by InternetArchiveBot). Bots are not a good match for this problem, and wishing screws were nails does not make the hammer any more suited.
That being said, the identifiers for works at IA have several of the important properties of identifiers (vs. addresses). They are unique, have a controlled syntax, are stable over time; and these properties are backed by a guarantee from a generally well respected organisation of sufficient demonstrated longevity for our purposes. The properties it lacks are abstraction (it maps directly to an address in a static way) and a facility for resolving the identifier to an address other than the resource's current canonical address. It is also a proprietary identifier, and one backed by only a single organisation. However, this is no worse than |jstor=, and in some ways better because unlike JSTOR's "Stable URL", IA does actually treat this as an identifier. It is picked by the uploader, often according to a suggested schema, but it is assigned and managed by IA; and, crucially, it shows up in various APIs on their side where e.g. JSTOR would have used the URL (i.e. they actually treat it as an identifier in practice). It would be better if IA registered a HDL or DOI for each scan, but I don't see this as a bright line. I don't think an identifier's visual appearance, or the presence of certain substrings, are fair objections. Identifiers should be opaque except any defined hierarchy (DOI prefixes and such), and if they are too long their display can be truncated (or people will choose not to add them).
Specific params for such identifiers are also easier for users to discover (and thus actually make use of) than generic ones, and make it easier to add multiple links where that is relevant. Having spent far far too many hours manually cleaning up article references I very much appreciate every additional identifier available, because even nominally stable identifiers like DOIs die in the timescales we care about. I don't know any services mirroring IA specifically (unlike JSTOR and Project MUSE that often both have copies of a given journal issue), but just as an illustration we have a lot of IA works uploaded at Commons. Being able to point both at the original at archive.org and the alternate copy at Commons will save somebody's behind a decade down the line when IA decides to annoy the publishers enough to get sued out of existence (or whatever).
Finally, there is not a 1:1 relationship between an ISBN and a specific scan of a specific copy of a specific edition of a specific work. Starting from an ISBN you can get to a search that lists lots of these, but you can't point at only one. That's (part of) why bot adding these links is a bad idea and Special:BookSources is the most appropriate avenue for making IA accessible at volume. But starting in the other end, you certainly can add the identifier of the specific scan you consulted when adding the reference. And sometimes the ability to specify a copy of a book (there are multiple advanced academic degrees made based on the copy-to-copy differences in the First Folio), and even the scan used of that copy (the same copy scanned by both Google and IA may have material differences in quality (hint: Google's scanner operators exhibit not a single fig given about quality)), is important.
Bottom line, for me, is that while this is not a no-brainer, I ultimately fall down on the side of wanting this parameter. I also wish IA would actually participate here, and discuss issues surrounding linking, discoverability, metadata (theirs is almost as bad as Worldcat's, just in different ways), but absent that I'll settle for ways we can more effectively make use of IA as a resource. --Xover (talk) 09:35, 19 November 2020 (UTC)
And then there is this 'identifier': northangerabbeyb00aust_1. Apparently, accuracy is not a criterion in creating these 'identifiers'. Some sort of numerical corpus ID (just take the next available number) would be much better than seeing an identifier naming Northanger Abbey in a citation for Pride and Prejudice: http://archive.org/details/northangerabbeyb00aust_1. That url was added by bot. It does illustrate the offset issue. The cited page is vii so the page link that the bot added did not work (since removed) but, had the bot written [http://archive.org/details/northangerabbeyb00aust_1/page/n9 vii] it would have worked: vii.
Trappist the monk (talk) 14:16, 19 November 2020 (UTC)
Correct. Pages can be referred to by the physical leaf number, or the printed page number. For example, for anything without a printed page number, such as anything before printed "Page 1", it uses the "/page/n10" syntax, e.g. the 10th page leaf from the start. If the printed page number can't be asserted due to scanning errors, etc., it uses the "n" leaf system. Determining (asserting) the printed page number from an OCR scan is not always possible, indeed technically challenging, so this is the default method to get to a page when page assertions are unavailable. -- GreenC 15:43, 19 November 2020 (UTC)

I wonder why this subject invites such elaborate discussion. All IA items are online. There is already a standardized, constantly utilized, familiar locator (the URL) to easily reach the referenced archive, as well as in-source locations such as specific pages (in the case of archived print media). Is there any reason for IA to have preferential treatment over other archives? Archives, just like any other source, are not automatically reliable. Afaik, IA's archiving protocols are opaque, and the resulting archives not vetted. Granted, the last time I looked at IA governance was several years ago, but I was surprised to find out that there were no official "Archivist" positions at the organization. That is like having libraries without trained librarians. Not that university archiving operations are much better. I have seen horrible scans of well known works in such institutions. In some cases, really bad version control, with a different archive of the same original showing up seemingly randomly, no doubt thanks to some mysterious algorithm. But do go ahead and try to make sense of all this if that is your thing. 98.0.246.251 (talk) 01:59, 20 November 2020 (UTC)

Discussion is good, for as long as it remains constructive and aims at seeking the best solution to address a problem as this one.
I too am somewhat sceptical of unmanned bot actions for tasks where editorial judgement might be necessary.
I nevertheless support the addition of this identifier because it is also useful for editors manually improving citations. There is often more than one link that could be added to |url= and it would be good to have a separate place for at least the most common and established providers of content to free the |url= parameter and its companion |archive-url= for better purposes in order to improve the quality and usefulness of citations and to fight link-rot. Both GB and IA identifiers have proven to be stable for many years (with minor exceptions), more stable than many URLs to other sites, but in the hypothetical case that they would suddenly change their link formats, change their identifiers or change their services in an unacceptable way, it would be trivially easy for us to centrally adjust or mute the corresponding template output, that is, it gives us more control.
Still, it would be great if IA could introduce some abstraction layer on top of their identifiers first, so that they become shorter and do not contain potentially misleading human-readable text fragments.
--Matthiaspaul (talk) 20:42, 21 November 2020 (UTC)
Well, my comment was centered on the opinion that there is no pressing problem to add anything. The idea that identifiers can be used as failovers for URLs, may not really hold water. For the simple reason that practically all ids are basically wrappers for, or reformatted abstractions of, URLs. One could argue that some ids may be using a different repository, or other (supposedly) authoritative service, or just simply a mirror that may stay up. But all of these can break too, and I do not know that we have a way to judge the future stability of the underlying infrastructure. I assume some, such as ISBNs (that resolve at web servers run by trade-affiliated entities) are more robust than others, simply because they are by now necessary for commerce. But even ISBN resolvers are known to have gone down. 98.0.246.242 (talk) 01:56, 22 November 2020 (UTC)
Obviously, we cannot predict the future. However, I don't know when they were originally introduced, but both IA and GB identifiers have proven to be static for more than a decade already, and from the descriptions on their web sites they both see them as permanent long-term identifiers for use in public interfaces, not as short-term or merely internal handles accidentally leaked to the outside world which could change or be renumbered the next time they set up their databases.
http://archive.org/services/docs/api/metadata-schema/index.html
http://blog.archive.org/2011/03/31/how-archive-org-items-are-structured/
http://developers.google.com/books/docs/v1/using#ids
So, it doesn't look as if they would intend to change them (to the better or worse) in the foreseeable future.
--Matthiaspaul (talk) 22:32, 23 November 2020 (UTC)
To reiterate, nobody will stop you if you wish to insert any "official" or semi-official identifier in |id=, regardless of whether such is well maintained or not. But there has to be a more compelling reason to formalize these into yet more parameters. Not every secondary identifier must be coded, documented and explained. This particular citation system is already overly complex and there is a good chance that the needs of the non-expert reader are not met. The litmus test: the most complex citation possible should be understood by the least knowledgeable reader possible. 107.14.54.1 (talk) 01:21, 24 November 2020 (UTC)
Matthiaspaul, that argument is exactly why they are unlikely to shorten the identifiers: the identifiers are static, and once they are created they never change. —CYBERPOWER (Happy Thanksgiving) 13:56, 26 November 2020 (UTC)
Okay, I see that point, continuity is important, but given that the format is (almost) free-text at present, they could change it to become something more systematic and shorter for all future IDs and keep the existing ones as legacy. They could also assign a second ID following the new naming scheme to all of the older entries, keep the old IDs working forever but list the new IDs first. One ID for two targets would be a problem, but two IDs pointing at the same target is not.
This would allow external parties to slowly move to the new scheme, but would not break any old reference links from printed sources (if they exist) or from external parties which are not actively maintained and will keep pointing to the old ID forever as well.
--Matthiaspaul (talk) 14:12, 26 November 2020 (UTC)

ISBN line breaks

Moved from Template talk:Citation § ISBN line breaks: {{u|Sdkb}}talk 20:05, 16 November 2020 (UTC)
Screenshot; look at ref 114

During the ongoing FA review for Biblical criticism, I noticed that some ISBNs in the citations with dashes (e.g. Bauckham, currently ref 114) break onto multiple lines. This makes them marginally harder to read, so I think it would be preferable if they were non-breaking. Would it be possible to place a {{no wrap}} around the input for |ISBN= and other parameters that might have the same issue? {{u|Sdkb}}talk 18:09, 16 November 2020 (UTC)

In my browser, ISBNs and the "ISBN" text are always nowrapped, no matter how I modify the window width. Perhaps you could create a demonstration page in your sandbox, or upload a screen shot. – Jonesey95 (talk) 18:22, 16 November 2020 (UTC)
@Jonesey95: Screenshot added. {{u|Sdkb}}talk 18:34, 16 November 2020 (UTC)
Reference info for Biblical criticism:
  • unnamed refs: 59
  • named refs: 135
  • self closed: 212
  • Refn templates: 7
  • cs1 refs: 198
  • cs1 templates: 205
  • rp templates: 283
  • use xxx dates: dmy
  • cs1|2 dmy dates: 6
  • cs1|2 last/first: 191
  • cs1|2 author: 2
List of cs1 templates:

  • Cite book (1)
  • cite book (172)
  • cite encyclopedia (2)
  • Cite journal (1)
  • cite journal (15)
  • cite news (1)
  • cite web (13)
As far as I know, there has only been one previous discussion about preventing the rendered isbn from wrapping (there was an earlier discussion where it was mentioned). The discussion did not gain sufficient support.
Why now, all of a sudden? There are a lot of FAs that use cs1|2 and that have |isbn= with hyphenated isbns; the category has 5,848 articles of which 4,774 have hyphenated isbns; see this search.
A better venue is Help talk:Citation Style 1 because Biblical criticism does not use {{citation}}.
Trappist the monk (talk) 18:59, 16 November 2020 (UTC)
Trappist the monk, I wasn't aware of that previous discussion; thanks for the link. The "why now" is just that I happened to notice it now while doing that review. And I'll move this to that venue.
While there's not uniformity in the prior discussion, it does look like there's enough support that consensus might develop with further discussion. What I notice is that there is a non-breaking space between the ISBN label and the number itself. Surely that would be a better breaking spot than any of the hyphens within the number? We should either change that to a breaking space, make the number non-breaking, or both, but definitely not neither. {{u|Sdkb}}talk 20:02, 16 November 2020 (UTC)
We also recently touched this in Help_talk:Citation_Style_1#Nbsp_in_|author,_|last,_and_equivalents_for_other_contributors
We currently frame ISBNs in <bdi>.
I would support to make the numbers for ISBN, SBN, ISSN, EISSN and ISMN identifiers as well as all dates (except for in the |orig-date= parameter) in suitable date formats non-wrapping. If this wouldn't grow the length of the non-wrapping string too long, this would ideally include the identifier names as well, but at the minimum we should keep the numbers from wrapping.
--Matthiaspaul (talk) 20:49, 16 November 2020 (UTC)
Following the example of many other messages containing short symbols/abbreviations (for example with volumes), to avoid odd-looking line breaks the sandboxed template now utilizes &nbsp; in the message fragments used to display " et&nbsp;al.", "&nbsp;ed." (for edition) and "§&nbsp;" and "§§&nbsp;" (sections).
--Matthiaspaul (talk) 13:59, 17 November 2020 (UTC)
Matthiaspaul, I'm somewhat at a loss of how to push this forward. Should we start a survey to make consensus clearer, or is there some technical hurdle, or do we just need to make an edit request? {{u|Sdkb}}talk 21:17, 24 November 2020 (UTC)
Nowrapping things is a crutch. The web interface will never be perfectly typeset, and in almost all cases you will cause someone's (usually on mobile) experience to suffer from nowrapping various content. I generally oppose it, and don't see particular reason here to do so, especially given the length of identifier strings (which anyway have a separate introducer that is of sufficient length to get the point, unlike with page(s)). --Izno (talk) 21:32, 24 November 2020 (UTC)
That's part of the reason why I suggested applying the no-wrapping only to a selected set of identifiers such as ISBN, ISSN, etc., not to identifiers with non-hyphenated values, and not to those with longer values such as DOIs. And also applying it only to their values, not to the combination of name plus value as a whole. These value strings appear to be short enough to make it unlikely that they would force the browser into some horizontal scrolling mode. They are also still short enough to be often transcribed manually (for which it is particularly important for the eyes that the value gets displayed on a single line). So, these are the identifiers for which I see the largest user benefit of applying no-wrapping.
Either way, I would think that, on mobile or embedded devices with very narrow viewports and possibly even without scrolling capabilities, a dedicated browser would simply ignore <span class="nowrap">...</span> before it starts to scroll or truncate. For non-dedicated browsers, couldn't this be solved on Timeless skin-level (CSS)?
--Matthiaspaul (talk) 16:27, 26 November 2020 (UTC)
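As a rough illustration of the selective approach suggested above (nowrap only on the values of a short list of hyphenated identifiers), here is a minimal Python sketch. The function name `render_identifier`, the set `NOWRAP_IDS`, and the use of a plain breaking space between label and value are assumptions made for the example; this is not how the cs1|2 module is actually written.

```python
import html

# Identifiers whose values are short and often hyphenated (assumed list,
# taken from the identifiers named in this discussion).
NOWRAP_IDS = {"isbn", "sbn", "issn", "eissn", "ismn"}


def render_identifier(name, value):
    """Render 'LABEL value', wrapping only selected values in a nowrap span."""
    label = name.upper()
    safe_value = html.escape(value)
    if name.lower() in NOWRAP_IDS:
        # Only the value is protected; the label may still break away from it.
        safe_value = f'<span class="nowrap">{safe_value}</span>'
    return f"{label} {safe_value}"


print(render_identifier("isbn", "978-0-306-40615-7"))
print(render_identifier("doi", "10.1000/182"))
```

Longer identifiers such as DOIs pass through unwrapped, so they can still break on narrow viewports.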

Cite OEIS generates invalid HTML

While updating Happy number, I tried to add "Cited in (an OEIS citation)", but noticed that every citation generates an id "CITEREFSloane" by default, which is incorrect HTML with more than one citation. When I tried to specify an explicit |ref= I got a cite error "Unrecognised parameter". I could not immediately see why that was, so I created the link by a bodge. This of course continued to annoy me, so I had another look this evening.

Apart from the constant id, there were two problems which are fixed in this (current) revision (testcases). The link after the final refs testcase jumps to the test citation for the live template and there are now no errors for the ref parameter displayed.

We also need to correct the default ref id. I propose a default id of

CITEREF<editor-last>_"<sequenceno>"

for which the user would add something like

{{sfn|Sloane "A12345"}} or {{harvtxt|Sloane "A12345"}}

to link to this, which seems both reasonably simple and clear. The quotes around the sequence number correspond to the quotes around the full entry title in the citation. You can see this in the (current) sandbox. In the testcases, the link after the next-to-last testcase for dates jumps to the test citation, but the live citation still has the incorrect id. Of course, I will update the documentation accordingly.
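The proposed default id can be sketched as follows. This is a hedged Python illustration of the string shape only; the helper names are hypothetical, and the sketch does not model how MediaWiki or the templates encode anchors.

```python
def oeis_citeref(editor_last, sequence_no):
    """Build the proposed anchor: CITEREF<editor-last>_"<sequenceno>"."""
    return f'CITEREF{editor_last}_"{sequence_no}"'


def sfn_wikitext(editor_last, sequence_no):
    """Wikitext an editor would write to link to that anchor."""
    return '{{sfn|%s "%s"}}' % (editor_last, sequence_no)


# Two citations by the same editor now get distinct ids.
print(oeis_citeref("Sloane", "A12345"))
print(oeis_citeref("Sloane", "A67890"))
print(sfn_wikitext("Sloane", "A12345"))
```

Because the sequence number is part of the id, several OEIS citations on one page no longer collide on "CITEREFSloane".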

There may be other cite wrappers with the same problem now that cite * generate ids by default. Parameter check lists also need themselves to be checked.

Just as I finished preparing this, I notice that the testcases no longer display the missing error messages for the |foo= and |date= parameters. I can't see any reason for this at present. They appear in preview mode.

Comments welcome, especially "yes, please do it" of course. --Mirokado (talk) 22:54, 20 November 2020 (UTC)

{{Cite OEIS}} is not a cs1|2 template. Problems with that template are best addressed at its talk page. If there is something wrong with the underlying {{cite web}}, then we want to know about it.
Trappist the monk (talk) 23:09, 20 November 2020 (UTC)
OK, copied most of this to Template talk:Cite OEIS#Generates invalid HTML for further comments.
"Other cite wrappers causing the same problem now that cite * generate ids by default" is certainly something relevant to this page, even if there is no really easy central solution. If someone is bored on a wet Saturday afternoon, here is something for them to look at. --Mirokado (talk) 00:24, 21 November 2020 (UTC)
Those other wrapper templates, like {{Cite OEIS}}, must adapt if they haven't already done so. This is really no different from wrapper templates needing to adapt when old forms of parameter names that the wrappers use are deprecated and support for them withdrawn. The issue that you are complaining about, automatic CITEREF anchor creation, changed nothing because |ref=harv was specified with this edit to {{Cite OEIS}}. That setting became superfluous when cs1|2 began creating automatic CITEREF anchors. With this edit, {{Cite OEIS}} lost the superfluous |ref=harv setting and gained the ability to set the citation's CITEREF anchor externally.
Trappist the monk (talk) 00:59, 21 November 2020 (UTC)

Undated sources

At present a source without a stated date uses the format date=n.d., and displays as
The newspaper. n.d. Retrieved 6 December 2015.
This is rather obscure to the reader. I would suggest either that date=n.d. be retained in the cite parameters, but displayed to the reader as "Undated", or that date=undated be allowed and displayed. (A display of "No date" for parameter n.d. would be OK.)

A parameter that tells editors that a reference is undated also saves an attempt to find and add a date, in the same way as the recommended author=<!--not stated--> does.

Example with date=n.d.:
"Pooley Bridge, Cumbria". Britain Express. n.d. Retrieved 6 December 2015.

Example with unsupported date=Undated:
"Pooley Bridge, Cumbria". Britain Express. Undated. Retrieved 6 December 2015. Check date values in: |date= (help)

Best wishes, Pol098 (talk) 13:35, 23 November 2020 (UTC)

"This is rather obscure to the reader." Really? Why do you believe that readers are incapable of understanding this rather common initialism? It is perfectly acceptable to omit |date= when the source is not dated. Similarly, it is perfectly acceptable to write |date=<!--no date--> for the benefit of editors if you think it appropriate.
Beyond incompetent readers, is there any substantive reason for cs1|2 to deviate from what is, apparently, accepted practice among the various external style guides?
Trappist the monk (talk) 13:53, 23 November 2020 (UTC)
"Beyond incompetent readers ..." Requiring readers to be "competent" (and not necessarily English speakers; English Wikipedia is used worldwide) is not a good idea. Dropping "n.d." into the middle of a reference isn't necessarily clear ("Date=n.d." would be clearer, though "Undated" is better). To answer the question as asked: there is no substantive reason beyond "incompetent readers"; but that is enough for what is a trivial change without consequences (unless I have missed something) which will help readability. Let's see what others say. Best wishes, Pol098 (talk) 14:58, 23 November 2020 (UTC)
Just adding "undated" to the set of allowed input values would in fact be trivial. However, thereby we could not only not achieve consistency in the output, but even decrease it, as the template would display whatever was given as parameter input.
What I envision is a bit more: To catch the allowed keywords as parameter input but display the same predefined text for all of them. I'm open in regard to if we would keep the "n.d." text and just add some tooltip to it (which has my preference at present) or to change it to "undated" or "no date" or whatever has consensus.
What would also be possible is to catch the various keywords on input, but only accept one of them as the new valid input (for this I would suggest |date=none for consistency with other parameters already using the none keyword) and issue "extra text" warnings for the other inputs (like "n.d.", "nd", etc.) so that existing citations could be updated accordingly. Still, the output would be the predefined text "n.d." plus tooltip, "no date" or whatever we decide.
This could also be implemented gradually so that there is enough time to adapt.
--Matthiaspaul (talk) 17:38, 26 November 2020 (UTC)
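The tokenizing idea in the comment above (accept several keywords, display one predefined text, and warn on the non-preferred spellings) could look roughly like this in Python. The keyword set, the choice of "none" as the canonical input, the display text "n.d." and the warning wording are all assumptions for illustration; nothing here reflects the actual module code.

```python
# Assumed set of inputs that all mean "the source carries no date".
NO_DATE_KEYWORDS = {"none", "n.d.", "nd", "undated", "no date"}
CANONICAL_KEYWORD = "none"   # assumed preferred input, as suggested above
DISPLAY_TEXT = "n.d."        # predefined output; could be changed centrally


def render_date(value):
    """Return (display_text, warning_or_None) for a |date= value."""
    key = value.strip().lower()
    if key not in NO_DATE_KEYWORDS:
        # Ordinary dates are passed through unchanged in this sketch.
        return value, None
    warning = None
    if key != CANONICAL_KEYWORD:
        warning = f"|date={value.strip()}: consider |date={CANONICAL_KEYWORD}"
    return DISPLAY_TEXT, warning


print(render_date("none"))
print(render_date("Undated"))
print(render_date("6 December 2015"))
```

Since every keyword maps to the same `DISPLAY_TEXT`, the output stays consistent no matter which spelling an editor used.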
(edit-conflict) Our target audience includes "incompetent readers". Our goal as an encyclopedia for everyone is to improve their education and competence. (Personally, I would not call someone "incompetent" just for not knowing what "n.d." or "3 (12): 7–8" means.)
While "n.d." is one accepted practice to indicate a "no date given" condition, it is only one of them. There are different styles for denoting this, from variations on the abbreviation (with or without space, in different cases and with varying punctuation) to spelling it out as "no date" or "undated" (in different cases and possibly bracketed). While most people who are not aware of the abbreviation should be able to guess that "n.d." means "no date" if given instead of a date, others might not ("not documented", "not displayed", "new data", "next date", "named date", "no dummy"?). Our general philosophy is to avoid abbreviations which might not be understood by everyone.
As I have stated in the past already, I'm all in favour of tokenizing such special cases (we already do this in some cases, f.e. with "et al." - although this one is special also in other ways). This has several other advantages as well:
  • Improved machine-readability
  • Consistency within articles and across the project in regard to how to indicate this condition
  • Control over the display output and metadata format should the recommended output format change over time (think of the discussions regarding how to display volumes, issues and pages) or if we would want to support other metadata standards in the future (beyond COinS) where this condition might be codified somehow. Even if we would not change the output format from "n.d.", it might be already helpful for readers if we'd display a tooltip with its expanded meaning. And in the metadata, it could be changed to "[n.d.]" to indicate a descriptive date rather than an actual date.
  • Easier localisation into other languages (for the same reason why we prefer |language=fr over |language=French). For example, in a German citation one would typically write "o. D." ("ohne Datum") rather than "n.d.", but "k. D." ("kein Datum") is seen as well. Likewise, there are abbreviations like "o. J." (without year), "o. O." (without location), "o. A." (without author) and "Anon." (for anonymous author(s)).
Regarding HTML comments, you wrote that author=<!--not stated--> would be the recommended form. It is possible that this has changed, but the last time I looked the recommended form was author=<!-- staff writer, no byline -->. Either way, this shows that HTML comments, as useful as they often are, are not a good method to indicate common states like this because they are more complicated to use for editors and therefore are not used consistently, thereby making it difficult to machine-read them. Special tokens such as |date=none, |author=none, |author=staff, |author=anon are much preferable to them.
--Matthiaspaul (talk) 17:14, 23 November 2020 (UTC)
Yeah, incompetent might be a bit strong, but en.wiki is one of two English language Wikipedias. For those who do not understand commonplace citation initialisms, abbreviations, and symbols used throughout the English language publishing world (and consequently in cs1|2), perhaps the other English language Wikipedia is a better choice. But, were it an issue, I would have thought that editors at simple.wiki would have tweaked (or asked us for assistance in tweaking) simple:Module:Citation/CS1/Configuration to accommodate their readers.
I have said in the past, and will likely say in the future, that cs1|2 is not APA, CMOS, Bluebook, or any other citation style. I am comfortable with cs1|2 not being any of those, but, I do not think that cs1|2 should be made to be so different from other citation styles that we abandon the commonly-used citation initialisms, abbreviations, and symbols that English-language readers have come to expect.
If it is to be believed that n.d. is rather obscure to the reader and must be fixed, it must follow that all of the other citation initialisms, abbreviations, and symbols used by cs1|2 are also rather obscure to the reader, mustn't it? If we believe that to be true, then we must discontinue use of all standard English-language citation initialisms, abbreviations, and symbols. We must replace: 'ed.' → editor, 'eds.' → editors, 'ed.' → edition, '§' → section, '§§' → sections, 'Vol.' → volume, 'no.' and 'No.' → issue or number, 'p.' → page, and 'pp.' → pages. And lest we forget it, 'et al.' → and others.
Trappist the monk (talk) 18:41, 23 November 2020 (UTC)
I agree with a lot of what you wrote above but not with the recommendation for the Wikipedia in Simple English - not knowing what "n.d." means does not necessarily mean that a user is a child, ancient, or illiterate, it does not even mean s/he is uneducated - as Pol mentioned above it could be as simple as that the user graduated from a university outside of the US or UK (possibly in pre-internet times), where other citation standards (were or) are more prevalent - they are similar, but different enough in the details that even a highly educated person might not be familiar with "n.d." at first. I would not want to point them to the Simple English WP, because they won't find what they are looking for over there, they even might feel offended. Of course, they will be able and willing to learn what "n.d." means.
I think the truth seldom lies with the extremes. The fact that users repeatedly "complained" about "n.d." does not necessarily mean that we have to abandon all abbreviations. Still, it should make us think about options for improving the situation for them.
Perhaps all that would be needed is to add some tooltip to "n.d." explaining its meaning? We could try and see if this is already enough to address the problem. (However, given that this would require a predefined output instead of just passing through the input it would already require to tokenize the "no date" case, but, I think, it would be worth it also for the other advantages.)
--Matthiaspaul (talk) 17:38, 26 November 2020 (UTC)
The last time this topic was raised appears to be Help talk:Citation Style 1/Archive 55 § The n.d. keyword for undated sources (includes links to two other discussions).
Trappist the monk (talk) 15:31, 23 November 2020 (UTC)
(edit-conflict) Given that we already use the keyword "none" in various other places, I would suggest to, at the minimum, support something like |date=none. However, if there are more similar conditions (as in the none/staff/anon example for authors above), more keywords could be introduced for them as well.
The keyword "none", indicating that this information is not given in the source, should be distinguished from the condition, that the information should not be displayed but would still be used in reference anchor generation and be provided in the metadata (for which I suggested the keyword "off" recently introduced for |title=), and the condition, that the information is simply unknown to the editor at present (but might be given in the source), which should not be indicated by a special token, but is often indicated to other editors by providing an empty |date= parameter (which, however, is sometimes removed by other editors "cleaning up").
I'm open in regard to the best output format, be it "n.d.", "no date", or something else. However, the good thing is that once we have introduced a tokenized input for this condition, we are free to centrally change the output any time later on should this become necessary.
--Matthiaspaul (talk) 17:14, 23 November 2020 (UTC)

Addition to generic title

Hello, I was wondering if articles with "Subscribe to read" in the reference title could be added to Category:CS1 errors: generic title. There are currently over 1,000 usages of these in titles. Thanks. Keith D (talk) 14:35, 23 November 2020 (UTC)

Appears to be associated with Financial Times:
Cite web comparison
Wikitext: {{cite web |url=http://www.ft.com/content/2d2a9afe-6829-11e5-97d0-1456a776a4f5 |website=Financial Times |title=Subscribe to read}}
Live: "Subscribe to read". Financial Times.
Sandbox: "Subscribe to read". Financial Times. Cite uses generic title (help)
Trappist the monk (talk) 15:46, 23 November 2020 (UTC)
Thanks for the change. Keith D (talk) 00:41, 25 November 2020 (UTC)
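For readers curious what a generic-title test amounts to, here is a minimal Python sketch under stated assumptions: the tuple `GENERIC_TITLES` and the function `has_generic_title` are hypothetical names, and the list contains only the phrase raised in this thread, not the real list used by the module.

```python
# Hypothetical list of placeholder titles; only the phrase discussed here.
GENERIC_TITLES = ("subscribe to read",)


def has_generic_title(title):
    """Return True if the title contains a known placeholder phrase."""
    normalized = title.strip().lower()
    return any(phrase in normalized for phrase in GENERIC_TITLES)


print(has_generic_title("Subscribe to read"))
print(has_generic_title("Hypsognathus, a Triassic reptile from New Jersey"))
```

A citation flagged this way would then be placed in the error category so that an editor can supply the real article title.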

DOI errors

This

should emit an error. The DOI format is 10.[4 or 5 digits]/foobar. Headbomb {t · c · p · b} 15:32, 24 November 2020 (UTC)

Cite journal comparison
Wikitext: {{cite journal |date=1946 |doi=10.http://hdl.handle.net/2246/390 |last1=Colbert |first2=Harris |last2=Edwin |title=Hypsognathus, a Triassic reptile from New Jersey |journal=Bulletin of the American Museum of Natural History}}
Live: Colbert; Edwin, Harris (1946). "Hypsognathus, a Triassic reptile from New Jersey". Bulletin of the American Museum of Natural History. doi:10.http://hdl.handle.net/2246/390.
Sandbox: Colbert; Edwin, Harris (1946). "Hypsognathus, a Triassic reptile from New Jersey". Bulletin of the American Museum of Natural History. doi:10.http://hdl.handle.net/2246/390 Check |doi= value (help).

Trappist the monk (talk) 15:47, 24 November 2020 (UTC)
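The format rule stated at the top of this section, 10.[4 or 5 digits]/foobar, can be expressed as a simple pattern check. The Python sketch below encodes the rule exactly as given in this thread; the names `DOI_PATTERN` and `doi_looks_valid` are hypothetical, and the real validation in the module may be stricter or looser (for instance, about how many digits a registrant code may have).

```python
import re

# "10." followed by 4 or 5 digits, a slash, then a non-empty suffix
# without whitespace (rule as stated in this thread).
DOI_PATTERN = re.compile(r"10\.\d{4,5}/\S+")


def doi_looks_valid(doi):
    """Return True if the whole string matches the stated DOI shape."""
    return DOI_PATTERN.fullmatch(doi) is not None


print(doi_looks_valid("10.1000/182"))
print(doi_looks_valid("10.http://hdl.handle.net/2246/390"))
```

The malformed value from the example citation fails because "10." is followed by "http" rather than digits.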

Podcasts published by newspaper

For {{Cite podcast}}, I'm trying to cite a podcast published by a newspaper. The documentation says to use |website= for the name of the podcast and |publisher= for the name of the publisher, but |publisher= won't let me italicize, and the name of a newspaper should always be italicized, even when it's acting as a publisher. What do I do here? {{u|Sdkb}}talk 20:34, 24 November 2020 (UTC)

Use the newspaper's publishing company instead. Alternatively, |via= is available, though I think I would prefer the former and not the latter solution. --Izno (talk) 20:42, 24 November 2020 (UTC)
It's a student newspaper, so it doesn't really have a publishing company. The {{Cite podcast}} template seems pretty underdeveloped, so I'd imagine there's probably a change we'll want to make at the template itself. {{u|Sdkb}}talk 21:07, 24 November 2020 (UTC)
I find it very hard to believe that anything made public (such as a student newspaper) does not have a publisher. Who or what makes it appear? It doesn't suddenly materialize. 69.94.58.75 (talk) 12:37, 25 November 2020 (UTC)
I mean, the newspaper itself has a staff who publish the podcast on the major podcast platforms. There's a printing company who prints it, and a student government that partially funds it, but putting either those in the |publisher= field and leaving out the name of the newspaper would be really weird. {{u|Sdkb}}talk 22:24, 27 November 2020 (UTC)
The trouble here may be the statement "name of a newspaper should always be italicized". That is correct in prose. But in most citation systems (including this present one), italics are not used on specific variables (in your example a newspaper) but on the parameter field. Therefore, |publisher= is never italicized. |website= always is, as the work or source. I would fill in accordingly and let the software decide where to apply emphasis. The newspaper may be published by the Student Union. But the podcast is published by the Newspaper. 98.0.246.242 (talk) 19:04, 28 November 2020 (UTC)
I crossed out the last part above because it is not clear to me whether this is a freestanding podcast, or part of the newspaper. If it is a feature accessible through the newspaper website, then I would use
  • {{cite web|title=Podcast Title|department=Podcast|url=http://www.podcastwebpage.com|website=Newspaper|publisher=Publisher}} which renders
  • "Podcast Title". Podcast. Newspaper. Publisher.
Note that the podcast webpage is used in |url= instead of the including website. I would use the podcast date for |date=, and the podcaster, if any as the author. 98.0.246.242 (talk) 22:04, 28 November 2020 (UTC)

Italics 2

I know that sources like PBS, NPR, CNN, ABC, NBC, BBC, etc. should not be italicized. We save that for newspapers and magazines. Where is the MoS guideline for when NOT to italicize those listed news sources? An editor seems to think it makes no difference, and they are changing the citations all over the place so that those agencies are italicized. When I'm in doubt, I just look at the article for the source and see how it's done there, because I know that other editors have followed the MoS. -- Valjean (talk) 01:20, 25 November 2020 (UTC)

Can you provide specific examples? 98.0.246.242 (talk) 02:28, 25 November 2020 (UTC)
This is just one example which changes the correct format to italicized format. That editor does this a lot. I have tried to discuss this with them to no avail, hence my need for better information. Where is the MOS guideline for this? -- Valjean (talk) 02:36, 25 November 2020 (UTC)
It seems that the italics appear as a result of swapping |publisher= for |work=. The "work" is what is generally considered as the source, and this (unlike "publisher") is emphasized, in most cases with italic type. I would recommend a compromise: use the website (e.g. www.npr.org) as the "work", and NPR as the "publisher" of such work. And let the software format them accordingly. 69.94.58.75 (talk) 12:30, 25 November 2020 (UTC)
This is related to: User talk:Citation bot § Unhelpful changes? You did not like the answers that you got there so are asking elsewhere? Your posting here seems to be the same, more-or-less, as this: Wikipedia talk:Manual of Style § Italics...help. I'm pretty sure that you should not ask the same question in multiple venues because doing that is considered to be disruptive.
How do you know that sources like PBS, NPR, CNN, ABC, NBC, BBC, etc. should not be italicized? In cs1|2, the name of the source (the publisher's work) is italicized. If the source is a website, the name of the website is italicized; if the source is a magazine, journal, newspaper, or other periodical, the name of the magazine, journal, newspaper, or other periodical, is italicized; if the source is a book, the name of the book is italicized; if the source is a corporate entity initialism or broadcaster call-sign, the initialism or call-sign is italicized. This applies to both physical and online sources. It does not matter if the initialism of a cited source is the same as the initialism of the business that produced it.
I disagree to some extent with what the IP editor wrote. At Help:Citation Style 1 § Work and publisher is this:

Do not append ".com" or the like if the site's actual title does not include it ... and omit "www."

and this:

The "publisher" parameter should not be included for widely-known mainstream news sources, for major academic journals, or where it would be the same or mostly the same as the work.

Trappist the monk (talk) 14:54, 25 November 2020 (UTC)
Yes, this is pretty confusing guidance (re: website name). It should be made clear that what is expected is the dba (doing business as) name, not the website title (the index or main page html title tag) or the domain/subdomain FQDN. However, I don't know if dbas are indexed. Page titles and domains are, and therefore should be easier/faster to find. 50.75.204.50 (talk) 19:48, 25 November 2020 (UTC)
Hmmmm.... So why not go to these articles (PBS, NPR, CNN, ABC, NBC, BBC) and try to italicize them? See what happens. Then take the resulting edit wars to ArbCom or some appropriate drama board. I'd like to see a final decision, because I keep getting conflicting answers. I'd really like to know, but I'm not sure where is the best place to ask. Some places don't answer. -- Valjean (talk) 16:45, 25 November 2020 (UTC)
So why not go to these articles (PBS, NPR, CNN, ABC, NBC, BBC) and try to italicize them? Why would I want to do that? The articles are about those entities as businesses; we do not cite the business, we cite the business' work (its programming, its articles, etc), and the work, in cs1|2, is italicized.
Trappist the monk (talk) 16:58, 25 November 2020 (UTC)
What's cs1|2? -- Valjean (talk) 17:19, 25 November 2020 (UTC)
Shorthand for "Citation Style 1 and 2". This help page is about them, i.e. the suite of templates such as cite web, cite news, etc. -- GreenC 17:23, 25 November 2020 (UTC)
Now I feel dumb. I have never used this page before. Thanks. -- Valjean (talk) 18:18, 25 November 2020 (UTC)
This has come up before – because the template only "allows" us to set websites in italics, and/or how every publishing organisation has been redefined, by cite template editors and partly through creepage at the MOS, as a "work". Plenty of examples were mentioned here in past discussion(s). The one that comes to mind is the music database AllMusic: as a result of the title being rendered italic in citations, some editors then italicise AllMusic in prose "for consistency". Which is ridiculous; and italicised BBC, NPR, PBS, etc, could well result from that also. It's not as if readers are left confused and dizzy by a roman (so-called) "work" in a citation, but that sort of rationale has been put forward here as a reason that each and every website must be italicised. Seems to me it's more a case of obsessiveness by editors who just think of cite templates in isolation (similar problem, eg, when editors focus solely on infoboxes from article to article, and not on how the infobox works with the article in question).
There was some discussion elsewhere, from memory, about coming up with a sort of "cite organisation" which would allow for non-italicised web sources. I think that would be a great idea. Until then, you default to writing out the relevant citations manually, avoiding the templates altogether. JG66 (talk) 17:32, 25 November 2020 (UTC)
JG66, this is an area where I confess to ignorance. I have been editing here since 2003, but have never gotten this fully explained. I don't fully understand the parameters in templates, but I know that <s>publisher= and website= produce italics, and work= does not</s> website= and work= produce italics, and publisher= does not, so I use the one which will produce the "right" result, but that may not be the right approach.
I have gotten my cues (for how I should italicize in references) by looking at our articles. If the article uses italics, I use them in references and text, and if not, I don't. That's why I don't italicize sources like these (PBS, NPR, CNN, ABC, NBC, BBC), and do italicize The New York Times, The Guardian, etc. Am I wrong and/or totally naive? Is it really more complicated than that? I really appreciate the help, advice, and AGF. -- Valjean (talk) 18:30, 25 November 2020 (UTC)
I know that publisher= and website= produce italics, and work= does not. Umm, not true... |publisher= is not rendered in italics but both |website= and |work= are (along with their aliases |newspaper=, |magazine=, |periodical=, and |journal=).
Trappist the monk (talk) 18:36, 25 November 2020 (UTC)
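For anyone who finds the rule easier to read as code, here is a toy model of the rendering described above. It is only a sketch in Python, not the actual Module:Citation/CS1 logic; the names `ITALIC_PARAMS` and `render_name` are made up for illustration, and the only facts it encodes are the ones stated in the comment above (|website=, |work= and their aliases render in italics, |publisher= does not).

```python
# Toy model only: names are hypothetical, not taken from Module:Citation/CS1.
# Parameters whose values cs1|2 renders in italics, per the comment above.
ITALIC_PARAMS = {"work", "website", "newspaper", "magazine", "periodical", "journal"}


def render_name(param: str, value: str) -> str:
    """Return the value wrapped in wiki italics ('' '') if the parameter is a work-type one."""
    if param in ITALIC_PARAMS:
        return "''" + value + "''"
    return value


print(render_name("work", "Fresh Air"))   # work-type parameter: italic markup added
print(render_name("publisher", "NPR"))    # publisher: left as plain text
```

So the choice between |work= and |publisher= is a statement about what the name is (the work or the business), and the italics follow from that.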
Oops! I remembered wrong. That should be "website= and work= produce italics, and publisher= does not." Fixed above. Here are some examples:
  1. website= Mayer, Jane (November 25, 2019). "The Inside Story of Christopher Steele's Trump Dossier". The New Yorker. Retrieved November 27, 2019.
  2. website= Borger, Julian (October 7, 2017). "The Trump–Russia dossier: why its findings grow more significant by the day". The Guardian. Retrieved December 28, 2017.
  3. work= Elfrink, Tim; Flynn, Meagan (February 27, 2019). "Michael Cohen to testify that Trump knew of WikiLeaks plot". The Washington Post. Retrieved February 27, 2019.
  4. publisher= (plus manual italicizing for Fresh Air) Gross, Terry; Simpson, Glenn; Fritsch, Peter (November 26, 2019). "Fusion GPS Founders On Russian Efforts To Sow Discord: 'They Have Succeeded'". NPR. Fresh Air
So what's the best way to do this? -- Valjean (talk) 18:54, 25 November 2020 (UTC)
The New Yorker is a magazine so {{cite magazine}} and |magazine=. Both The Guardian and The Washington Post are newspapers so {{cite news}} and |newspaper=. Fresh Air isn't a news program nor is it a journal or a magazine or a newspaper, so {{cite web}} (because you included a url) or because it's aired periodically (daily where I live) you might use {{cite periodical}} (a redirect to {{cite magazine}}) and |work=[[Fresh Air]]. You can include or omit |publisher=[[NPR]] as you choose. Hanging the program name after the cs1|2 template as you did means that the name is not included in the citation's metadata (for those who consume our citations using various machine tools).
Trappist the monk (talk) 19:14, 25 November 2020 (UTC)
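The template choices suggested above can be summarised in a small lookup. This is a hedged sketch in Python: the mapping reflects only the advice in this thread (magazine, newspaper, and a web-hosted programme), the `kind` labels and the `build_citation` helper are hypothetical, and the output is simplified wikitext that leaves out authors, dates and URLs.

```python
# Illustrative mapping based on the advice in this thread, not on the templates' code.
# kind -> (template name, parameter that carries the name of the work)
TEMPLATE_FOR_KIND = {
    "magazine": ("cite magazine", "magazine"),
    "newspaper": ("cite news", "newspaper"),
    "web": ("cite web", "work"),
}


def build_citation(kind, work, title, publisher=None):
    """Assemble a simplified cs1 template call for the given kind of source."""
    template, work_param = TEMPLATE_FOR_KIND[kind]
    parts = [template, f"title={title}", f"{work_param}={work}"]
    if publisher is not None:
        # Optional, as noted above: include or omit |publisher= as you choose.
        parts.append(f"publisher={publisher}")
    return "{{" + " |".join(parts) + "}}"


print(build_citation("newspaper", "The Guardian", "Example title"))
print(build_citation("web", "[[Fresh Air]]", "Example title", publisher="[[NPR]]"))
```

Keeping the programme name inside |work= rather than appending it after the template is what lets it reach the citation's metadata.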
Valjean, that is the situation as I see it also – "If the article uses italics, I use them in references and text ..." It's logical (unless one's a template obsessive, it seems) and, as I've said, it ensures editors don't go and work the other way by deciding to italicise AllMusic, NPR, etc, because the word's italicised in references.
There was a discussion here early in the year which might be relevant: Help talk:Citation Style 1/Archive 63#This passage in the documentation. I believe it was Tenebrae (there or elsewhere) who outlined the "cite organisation" option, and laid out other reasons why italicising each and every website was either wrong or potentially confusing. JG66 (talk) 13:29, 28 November 2020 (UTC)
Yes, in spite of all the well-meaning advice above, I'm still confused. It can't be right that all sources in references should be italicized, but that's what some editors are doing. We need clearer guidelines, with very specific, site by site, instructions, a literal list, just like we have a specific list at WP:RS/P. There shouldn't be any rubbery wiggle room in those instructions.
Either The New York Times is always italicized or it's never italicized. What is it?
The list should also specify the ideal template to use. -- Valjean (talk) 16:30, 28 November 2020 (UTC)
The list should also specify the ideal template to use – my answer would be always use {{Citation}} for all citations. If you like full stops/periods all over the place, then add |mode=cs1. I'm sure that many editors would strongly disagree, which is why the list should not specify the ideal template to use. Peter coxhead (talk) 17:15, 28 November 2020 (UTC)
Sources in citations are not italicized, they are emphasized using italics. This is not simply a typographical convention, it has semantic meaning, so a reader can immediately recognize what source it is they should be looking for. Formatted citations are terse, utilitarian statements employing a certain quasi-shorthand. Their style follows their syntax conventions. In that syntax, the source or work is paramount. But anyone is free to use a different citation format, or none (freehand). The objective is to give the reader a quick & easy way to verify what is claimed in text. The best-looking and best-articulated article means nothing if it cannot be verified. This does not require templates, formatted citations, or specific styles. As for the narrow case of template selection, I think one can only make recommendations. Different templates have slightly different format/syntax options or output, and there is a continuing effort to match them to the relevant sources. 98.0.246.242 (talk) 17:44, 28 November 2020 (UTC)

PMID numbers

Hello, at the Arecibo Observatory article there's a valid PMID of 33214727 despite the red ink saying 'Check |pmid= value'. I presume that the range of valid PMID numbers needs extending; please can someone fix this? TIA, Yadsalohcin (talk) 08:20, 25 November 2020 (UTC)

Thanks for reporting. The current limit is set to 33200000 per Help_talk:Citation_Style_1#PMID_limit.
I have increased it to 33500000 in the sandbox.
--Matthiaspaul (talk) 13:23, 25 November 2020 (UTC)
I'm guessing that 33500000 (>33214727!) should do it. Does it propagate automatically from the sandbox? Despite refreshing, there's still red ink at Arecibo Observatory. Yadsalohcin (talk) 15:01, 25 November 2020 (UTC)
The live template gets updated from the sandbox every couple of months. The next update will probably happen in January.
As there have been quite a number of changes already, I would also support an earlier update, but it can be carried out by admins only. Also, as we are in the middle of a process to fade out many old parameter variants (not the functionality) we have to make sure that some old parameters have been updated in mainspace before we can roll out the next update.
Alternatively, we could just update the limits in the live template.
In the linked thread we are discussing possible ways to make it easier to update the limits, so that keeping them tight does not cause inconvenience for editors. If you have ideas, your input is welcome there.
--Matthiaspaul (talk) 20:30, 25 November 2020 (UTC)
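To illustrate why the PMID in question was flagged and why raising the limit clears the error, here is a minimal sketch in Python. It assumes the check is simply "all digits, no leading zero, not above a configured upper limit"; the real test in the module may differ, and the names `pmid_is_plausible`, `LIVE_LIMIT` and `SANDBOX_LIMIT` are hypothetical. The three numbers are the ones quoted in this thread.

```python
import re

# Values quoted in this thread; the constant names are made up for the example.
LIVE_LIMIT = 33200000     # limit in the live template at the time of the report
SANDBOX_LIMIT = 33500000  # raised limit in the sandbox


def pmid_is_plausible(pmid: str, limit: int) -> bool:
    """Assumed check: digits only, no leading zero, and not above the limit."""
    if re.fullmatch(r"[1-9][0-9]*", pmid) is None:
        return False
    return int(pmid) <= limit


reported = "33214727"
print(pmid_is_plausible(reported, LIVE_LIMIT))     # above the live limit, so it is flagged
print(pmid_is_plausible(reported, SANDBOX_LIMIT))  # within the sandbox limit, so it passes
```

Under this assumption the red ink stays until the live limit itself changes, which matches what was observed at Arecibo Observatory after the sandbox edit.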

Meta proposal to globalize the CS1 templates

Someone has made a proposal to allow a more Wikimedia-wide usage of these CS templates. Putting a notice here in case folks are interested. Jo-Jo Eumerus (talk) 08:36, 25 November 2020 (UTC)

Cite_OED template needs an update

The {{Cite_OED}} template is in need of an update, and it would be great if someone could take a look. I've asked several times on the talk page at Template_talk:Cite_OED#Template_needs_updating, but that page probably doesn't get much exposure. Asking here following a recommendation at WP:VP/T. MichaelMaggs (talk) 10:18, 25 November 2020 (UTC)

Thanks User:Trappist the monk, that's much better. I wonder, though, whether it would be better not to have a default date. "September 2005" doesn't seem to appear on the site at all, and may give an incorrect impression that that's the date of the word entry. It isn't usual to tag a continually-updated web resource with the date that the resource first became available online. MichaelMaggs (talk) 14:15, 25 November 2020 (UTC)
Thanks again. MichaelMaggs (talk) 14:36, 25 November 2020 (UTC)

Should further reading sections have "retrieved by" dates?

 You are invited to join the discussion at Wikipedia talk:Further reading § Should further reading sections have "retrieved by" dates?. {{u|Sdkb}}talk 20:39, 25 November 2020 (UTC)

Nomination for deletion of Module:Citation/CS1/Arguments

Module:Citation/CS1/Arguments has been nominated for deletion. You are invited to comment on the discussion at the entry on the Templates for discussion page. * Pppery * it has begun... 00:26, 26 November 2020 (UTC)