Date internationalization (for Leonard)

Posted: Sat May 12, 2007 5:57 pm
by imslp
This is a discussion moved from the wiki, for Leonard Vertighel.

Hi there Leonard! Regarding date internationalization, I've already taken measures for this for the file entry submission dates, which use the #makedate: parser hook. However, the parser hook is currently extremely basic and only does English... I've wanted to make it handle other languages, but have no idea where to start. The parser hook function is as follows:

Code:

function renderIMSLPDate( &$parser, $year = FALSE, $month = FALSE, $date = FALSE )
{
	if( !$year or !$month or !$date ) return 'MKDATEERROR';
	else return date( 'F d\, Y', mktime( 0, 0, 0, (int) $month, (int) $date, (int) $year ) );
}
It shouldn't be technically hard to make it handle different languages... as long as you figure out how to do the translation :) You can find the language of the request with $wgLang->getCode(), where $wgLang is the global language object (add "global $wgLang;" to the function if you want to use it).

If you have any more questions, feel free to ask me :)

Posted: Sat May 12, 2007 6:40 pm
by Leonard Vertighel
I guess that using setlocale() may not be a good idea, so that a custom solution is required. Is that correct?

Posted: Sat May 12, 2007 7:04 pm
by imslp
Leonard Vertighel wrote:I guess that using setlocale() may not be a good idea, so that a custom solution is required. Is that correct?
Hmm... this is correct because as the Warning on the PHP.net page of setlocale() says, the change is per process and not per thread... and we can't have locale jumping around for no reason now can we ;)

Posted: Sat May 12, 2007 9:14 pm
by Leonard Vertighel
OK, the following bit of code should do what is required. Let me know if there are any issues with it. Date and month are now optional. The next step would be to change the "add composer" interface to use #makedate. For this, we would probably want to add a dropdown box for the month. In that case, the arrays should probably go somewhere else in the code, so that they can be accessed from different places.

Code:

function renderIMSLPDate( &$parser, $year = FALSE, $month = FALSE, $date = FALSE )
{
   global $wgLang;

   if( !$month && !$date ) return $year;

   // nominative case:
   $monthnames = array (
      'de' => array ('Januar', 'Februar', 'März', 'April', 'Mai', 'Juni', 'Juli', 'August', 'September', 'Oktober', 'November', 'Dezember'),
      'el' => array ('Ιανουάριος', 'Φεβρουάριος', 'Μάρτιος', 'Απρίλιος', 'Μάιος', 'Ιούνιος', 'Ιούλιος', 'Αύγουστος', 'Σεπτέμβριος', 'Οκτώβριος', 'Νοέμβριος', 'Δεκέμβριος'),
      'en' => array ('January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December'),
      'es' => array ('enero', 'febrero', 'marzo', 'abril', 'mayo', 'junio', 'julio', 'agosto', 'septiembre', 'octubre', 'noviembre', 'diciembre'),
      'fr' => array ('janvier', 'février', 'mars', 'avril', 'mai', 'juin', 'juillet', 'août', 'septembre', 'octobre', 'novembre', 'décembre'),
      'it' => array ('gennaio', 'febbraio', 'marzo', 'aprile', 'maggio', 'giugno', 'luglio', 'agosto', 'settembre', 'ottobre', 'novembre', 'dicembre'),
      'nl' => array ('januari', 'februari', 'maart', 'april', 'mei', 'juni', 'juli', 'augustus', 'september', 'oktober', 'november', 'december'),
      'pl' => array ('styczeń', 'luty', 'marzec', 'kwiecień', 'maj', 'czerwiec', 'lipiec', 'sierpień', 'wrzesień', 'październik', 'listopad', 'grudzień'),
      'tr' => array ('Ocak', 'Şubat', 'Mart', 'Nisan', 'Mayıs', 'Haziran', 'Temmuz', 'Ağustos', 'Eylül', 'Ekim', 'Kasım', 'Aralık')
   );
   // where declined form differs from the nominative:
   $monthnames_d = array (
      'el' => array ('Ιανουαρίου', 'Φεβρουαρίου', 'Μαρτίου', 'Απριλίου', 'Μαΐου', 'Ιουνίου', 'Ιουλίου', 'Αυγούστου', 'Σεπτεμβρίου', 'Οκτωβρίου', 'Νοεμβρίου', 'Δεκεμβρίου'),
      'pl' => array ('stycznia', 'lutego', 'marca', 'kwietnia', 'maja', 'czerwca', 'lipca', 'sierpnia', 'września', 'października', 'listopada', 'grudnia')
   );
   // %1$u = year, %2$s = month name, %3$u = date
   $format = array (
      'de' => '%3$u. %2$s %1$u',
      'el' => '%3$u %2$s %1$u',
      'en' => '%2$s %3$u, %1$u',
      'es' => '%3$u de %2$s de %1$u',
      'fr' => '%3$u %2$s %1$u',
      'it' => '%3$u %2$s %1$u',
      'nl' => '%3$u %2$s %1$u',
      'pl' => '%3$u %2$s %1$u',
      'tr' => '%3$u %2$s, %1$u'
   );
   // default to English
   $lang = 'en';
   if( array_key_exists($wgLang->getCode(), $monthnames) )
      $lang = $wgLang->getCode();
   if( !array_key_exists($month-1, $monthnames[$lang]) ) return 'MKDATEERROR';
   $monthname = $monthnames[$lang][$month-1];
   $monthname_d = $monthname;
   if( array_key_exists($lang, $monthnames_d) )
      $monthname_d = $monthnames_d[$lang][$month-1];

   if( !$date ) return "$monthname $year";
   return sprintf($format[$lang], $year, $monthname_d, $date);
}

Posted: Sun May 13, 2007 12:14 am
by imslp
Ok. Before you read the following, please promise me that you won't hate me for this ;)

I was in the process of merging your change into the code when it occurred to me that MW might already have internationalization for dates, since MW has the best internationalization of all the wikis (which is also part of why I chose MW instead of MoinMoin). After digging around the MW code for half an hour, I realized that it does indeed already have date internationalization... and so here's the new code:

Code:

function renderIMSLPDate( &$parser, $year = 0, $month = 0, $date = 0 )
{
	global $wgLang;
	
	if( !$year or $month < 1 or $month > 12 or $date < 1 or $date > 31 or
		!($ts = mktime( 0, 0, 0, (int) $month, (int) $date, (int) $year )) ) return 'MKDATEERROR';
	else return $wgLang->date( wfTimestamp( TS_MW, $ts ), FALSE, FALSE, FALSE );
}
Complete with range checks so that mktime() doesn't take forever to process an out-of-bounds date :)

I'm very sorry that all the work you put into your function was for nothing... but this does happen quite often when one just begins to hack at MW; it happened to me many, many times... so you're not alone ;)
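
One caveat on the range checks above: a 1 to 31 bound on the day still lets through combinations such as February 31, which mktime() silently rolls over into March. A minimal sketch of a stricter check using PHP's built-in checkdate() follows; the helper name isValidYmd is made up for illustration and is not part of the IMSLP code.

```php
<?php
// Hypothetical helper (not from the IMSLP code base): validates a
// year/month/day triple against the Gregorian calendar.
function isValidYmd( $year, $month, $date )
{
	// checkdate() takes its arguments in the order month, day, year.
	return checkdate( (int) $month, (int) $date, (int) $year );
}

// mktime() normalises an out-of-range day instead of failing:
// "February 31, 2007" ends up in March.
$rolled = date( 'n', mktime( 0, 0, 0, 2, 31, 2007 ) );

var_dump( isValidYmd( 2007, 2, 31 ) ); // bool(false)
var_dump( isValidYmd( 2007, 5, 12 ) ); // bool(true)
var_dump( $rolled );                   // string(1) "3"
```

Whether the extra strictness is wanted here is a judgment call; the sketch only shows what the simple bounds check does not catch.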

Posted: Sun May 13, 2007 7:24 am
by Leonard Vertighel
Never mind, I should have looked for it myself. Anyway, what about the proposed change to the "add composer" interface? I think that for the average user, the dates of the composers' lives are actually more interesting than the date on which a score was submitted to IMSLP. Moreover, I have already seen someone enter a date in Spanish (I think), which is still guessable for English-speaking users. But as soon as someone starts writing dates in Greek, Russian, Korean,... the other users will be lost. Conversely, for a Polish, Turkish, Finnish,... user who doesn't speak English, the English month names will be incomprehensible.

My function was designed to be applicable also in those cases where only the month or even only the year is known/given. I realize that this still doesn't cover all possible cases. Here are a few examples taken from actual IMSLP entries:

5. Dec. 1687 (baptized)
1540 (?)
ca. 1575
?
April or May 1562
1530/40
Coimbra, 11 June 1704

(The last one leads to the question whether we generally want to have the birth/death places, too.) I guess that "baptized" could occur rather frequently, so that it could become e.g. a checkbox in the "add composer" interface. It should then be made to display "getauft" in German, etc., either via the parser function or a template parameter. Not sure how to deal with the date ranges, though.

Posted: Sun May 13, 2007 8:36 am
by imslp
Well... it is going to be a pain to cover all cases; I would just concern myself with covering the cases where Y/M/D, Y/M, M, or Y is given... everything else we will have to see as we go. $wgLang->date() should be able to deal with most, except the formatting of Y/M (the translation is there, but not the formatting). It'd be nice if we could somehow extract the Y/M formatting from Y/M/D, but I'm not sure if that's possible...

I guess the composer page date issue is more with the submission form itself; I'll have to add fields for Y/M/D... which I will do soon. If you can give me a nice way to lay out the fields on the composer page that'd be good, since I'm not very good with HTML :)

About birth and death places... one reason I didn't include that on the composer page is because it is not directly relevant to the copyright status of a composer; and I would like to prevent IMSLP from trying to do what Wikipedia has already done well (i.e. biography); which is why I have the wikipedia link entry in the composer template :)

Posted: Sun May 13, 2007 9:20 am
by Leonard Vertighel
imslp wrote:Well... it is going to be a pain to cover all cases; I would just concern myself with covering the cases where Y/M/D, Y/M, M, or Y is given... everything else we will have to see as we go. $wgLang->date() should be able to deal with most, except the formatting of Y/M (the translation is there, but not the formatting). It'd be nice if we could somehow extract the Y/M formatting from Y/M/D, but I'm not sure if that's possible...
Agree with covering only Y/M/D, Y/M and Y (M doesn't make much sense, or does it? "Born: May"?). I'm afraid that extracting Y/M from Y/M/D is impossible, e.g. in Polish, "January" is "styczeń", but "January 1" is "1 stycznia" (genitive case, I believe; from the technical point of view, a different word). I'll have to think about it some more.
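
The Polish case can be made concrete with a small sketch. The lookup tables are cut down to two months from the lists posted earlier in this thread; the function name formatPolishDate and the year 2007 are only for illustration.

```php
<?php
// Hypothetical helper: picks the grammatical form of the month name
// depending on whether a day is given.
function formatPolishDate( $year, $month, $date = 0 )
{
	// Month names copied from the earlier post; only January and May shown.
	$nominative = array( 1 => 'styczeń', 5 => 'maj' );    // "month year"
	$genitive   = array( 1 => 'stycznia', 5 => 'maja' );  // "day month year"

	if( !$date ) return $nominative[$month] . ' ' . $year;
	return $date . ' ' . $genitive[$month] . ' ' . $year;
}

echo formatPolishDate( 2007, 1 ), "\n";    // styczeń 2007
echo formatPolishDate( 2007, 1, 1 ), "\n"; // 1 stycznia 2007
```

Dropping the day from "1 stycznia 2007" leaves "stycznia 2007" rather than "styczeń 2007", so the Y/M form cannot be obtained by trimming the Y/M/D output.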
imslp wrote:About birth and death places... one reason I didn't include that on the composer page is because it is not directly relevant to the copyright status of a composer; and I would like to prevent IMSLP from trying to do what Wikipedia has already done well (i.e. biography); which is why I have the wikipedia link entry in the composer template :)
Agree. Which leads me directly to the next question: Any idea how to present users with a link to the Wikipedia article in their own language? Obviously I do not expect users to put all the links when they add a composer (sheer madness), but maybe we could periodically run a bot to add the links for all the enabled languages (i.e. those that have at least a partial translation on IMSLP)? So the question is twofold: 1) how to implement: direct use of IFLANG in the composer page, a template, or a parser function? 2) how to manage: I think I could code a basic bot for the purpose (I've done it before: I had an officially flagged bot on the Italian Wikipedia for a while which worked fairly well). To extract the language links from Wikipedia, I already have some code that is almost ready for use.

Posted: Sun May 13, 2007 1:13 pm
by Leonard Vertighel
Need help... how do I assign the function to a hook (and to what hook) so I can test it?

Posted: Sun May 13, 2007 5:53 pm
by imslp
Here's the manual page, which explains it quite well, in addition to having a list of all the hooks :)

http://www.mediawiki.org/wiki/Manual:MediaWiki_hooks

Posted: Sun May 13, 2007 6:13 pm
by Leonard Vertighel
imslp wrote:Here's the manual page, which explains it quite well, in addition to having a list of all the hooks :)

http://www.mediawiki.org/wiki/Manual:MediaWiki_hooks
Yeees, I have seen that one before... now, maybe it is obvious, but I still don't understand which is the right hook for this function. Is it ParserBeforeStrip? Or ParserAfterStrip? Or something else? Sorry for being obtuse, but you shouldn't expect too much from a mathematician :D

Posted: Sun May 13, 2007 6:22 pm
by imslp
Hahaha... actually I forgot that parser hooks are not covered well on that page. Basically, you have to use a special function to register a parser function hook (parser tag hooks are different, and I won't go into them here since they're not what you want).

Basic Example:

Code:

//Hook onto the main extension function executor (will be executed on startup)
$wgExtensionFunctions[] = "RegisterHooks";
$wgHooks['LanguageGetMagic'][] = "RegisterLanguageGetMagic";

function RegisterHooks()
{
  global $wgParser;

  //Will register {{#functionname:}} parser function hook
  $wgParser->setFunctionHook( "functionname", "CallbackFunction" );
}

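function RegisterLanguageGetMagic( &$magicWords, $langCode )
{
  $magicWords['functionname'] = array( 0, 'functionname' );
  //Hook handlers should return true so that other hooks keep running
  return true;
}

function CallbackFunction( &$parser /* , ... etc ... (arguments passed in the wikitext) */ )
{
  //Do stuff
}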
In many cases where you don't know something about MediaWiki, you can look at existing extensions (I only found out how to do this by looking at the ParserFunctions extension). You can also talk directly to the MW devs on IRC (#mediawiki@freenode.net).

Posted: Sat May 19, 2007 6:51 pm
by Leonard Vertighel
imslp wrote:and so here's the new code:

Code:

function renderIMSLPDate( &$parser, $year = 0, $month = 0, $date = 0 )
{
	global $wgLang;
	
	if( !$year or $month < 1 or $month > 12 or $date < 1 or $date > 31 or
		!($ts = mktime( 0, 0, 0, (int) $month, (int) $date, (int) $year )) ) return 'MKDATEERROR';
	else return $wgLang->date( wfTimestamp( TS_MW, $ts ), FALSE, FALSE, FALSE );
}
Is that the actual code you are using? When I was trying it, I had to replace mktime with gmmktime, or the output would be one day off.

(For the rest, still working on it...)

Posted: Sat May 19, 2007 8:41 pm
by imslp
Actually, I ran into this exact problem after I posted the code, and have already changed it :) I didn't bother reposting because I didn't realize it made the dates one day off in some cases (I thought it was just a paranoid fix on my part)...

Though, I think mktime does work "correctly" (as in, outputs the correct results, not that it is theoretically correct, which it is not) on the server because of the time zone settings. However, gmmktime is much more "correct", even if mktime outputs the correct results in this case.

But yes, otherwise it is the same code. :)
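
The time zone effect described above can be sketched as follows, assuming the timestamp is later formatted as UTC (this is inferred from the symptom, not checked against the MediaWiki source). The time zone Europe/Rome is picked only for illustration.

```php
<?php
// Illustration only: pretend the server runs east of UTC.
date_default_timezone_set( 'Europe/Rome' );

// mktime() reads its arguments as local time: midnight in Rome on
// 12 May 2007 (UTC+2 in summer) is 22:00 UTC on 11 May.
$local = mktime( 0, 0, 0, 5, 12, 2007 );

// gmmktime() reads the same arguments as UTC.
$utc = gmmktime( 0, 0, 0, 5, 12, 2007 );

// Format both timestamps as UTC calendar dates.
$fromLocal = gmdate( 'Y-m-d', $local );
$fromUtc   = gmdate( 'Y-m-d', $utc );

echo $fromLocal, "\n"; // 2007-05-11
echo $fromUtc, "\n";   // 2007-05-12
```

With a server time zone at or west of UTC, the mktime() variant would still land on the intended calendar day, which would be consistent with it appearing to work on the live server.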

Posted: Sat May 19, 2007 9:24 pm
by Leonard Vertighel
And there I was wondering why I got the wrong output on my local server... took me a while to figure that out. Anyway, here is my newest attempt. This function should properly handle Y, YM and YMD. I assumed that the format for YM would be the same in all languages; otherwise we will have to modify the function somehow.

Code:

function renderIMSLPDate( &$parser, $year = 0, $month = 0, $date = 0 )
{
   global $wgLang;

   if( $year < 1 or $month < 0 or $month > 12 or $date < 0 or $date > 31 ) return 'MKDATEERROR';
   if( !$month ) return $year;
   if( !($ts = gmmktime( 0, 0, 0, (int) $month, (int) max($date,1), (int) $year )) ) return 'MKDATEERROR';
   $tsmw = wfTimestamp( TS_MW, $ts );
   if( !$date ) return $wgLang->sprintfDate( 'F Y', $tsmw );
   return $wgLang->date( $tsmw, FALSE, FALSE, FALSE );
}
A minor issue I noticed is that for some languages (e.g. Italian), the output of $wgLang->date() is with the abbreviated month name (no idea why this is so). It will not look particularly pretty on the composer pages, but if there is no easy solution, I think we can live with it.