#fte: imslpfile, Externalimslpfile, etc

Posted: Mon Apr 25, 2011 12:51 am
by gardano
In trying to parse the wiki code embedded in a page, I'm trying to understand and to generalize the various coding options that I see.

In http://imslp.org/wiki/Symphony_No.9,_Op ... udwig_van), the FILES section begins with the tag

{{#fte:imslpfile
However, Hindemith's works have this:

{{#fte:Externalimslpfile

Are there any other such imslp tags that I should be aware of?

As I'm really in learning mode for MediaWiki, I'm not sure if these tags are links to template files, or what....

Re: #fte: imslpfile, Externalimslpfile, etc

Posted: Mon Apr 25, 2011 2:03 am
by daphnis
The latter template is for files hosted on other servers, e.g. the US server (since Hindemith is not public domain in either Canada or the EU).

Re: #fte: imslpfile, Externalimslpfile, etc

Posted: Mon Apr 25, 2011 2:32 am
by pml
The Externalimslpfile template is now deprecated for uploads to IMSLP·US; there is a new fte template

#fte:server-us

specially for these.

The Hindemith works should gradually be transitioned to using the new template, which is less fiddly to use...

Cheers, Philip

Re: #fte: imslpfile, Externalimslpfile, etc

Posted: Mon Apr 25, 2011 2:43 am
by imslp
There are only three FTE template types for file entries:

#fte:imslpfile
#fte:imslpaudio, and
#fte:server-us

As PML said, the Externalimslpfile template is deprecated. Its variable names are very strange, so I would advise against trying to parse it.

I think there are two things we can do for the iPad app:
1. ignore fte:server-us (these files are public domain only in the US), or
2. mass-convert Externalimslpfile into server-us and implement server-us, with the conversion maybe done via a bot (the conversion is actually very simple, see below). Note that because the US server is not associated with the IMSLP site proper, US server files are not catalogued in the IMSLP index system. The download URL is also different.

For some background on what happened with server-us/externalimslpfile, and how to convert them (note that the bot must be logged in and send the correct login cookies):
viewtopic.php?f=7&t=4716&p=24425#p24455

Note that, while most of the variable names are the same, server-us has |File Path= and |File Size= instead of |File Name=, and the |Date Submitted= format is different.

Re: #fte: imslpfile, Externalimslpfile, etc

Posted: Mon Apr 25, 2011 2:06 pm
by gardano
imslp wrote:There are only three FTE template types for file entries:

#fte:imslpfile
#fte:imslpaudio, and
#fte:server-us
OK, so just to be clear, I should write 2 parsers (or have 2 sets of variables in one parser, whichever is easier): one for #fte:imslpfile and one for #fte:server-us? (That covers PDF files; audio files will come later.)

Re: #fte: imslpfile, Externalimslpfile, etc

Posted: Mon Apr 25, 2011 2:46 pm
by imslp
Well, most of the variables are the same between the two except for the ones I mentioned above. I'd just put them in one parser, but that's your call of course.
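As a rough illustration of the one-parser idea, here is a minimal Python sketch (Python is just my choice for the example). It assumes the variables have already been extracted into a dict, and it only encodes the difference described above: server-us uses File Path where imslpfile uses File Name. The function name `normalize`, the record keys, and the unnumbered variable names are all hypothetical.

```python
# Which variable identifies the file for each template type, per the posts above.
# imslpaudio and the deprecated Externalimslpfile are deliberately left out.
FILE_FIELD = {"imslpfile": "File Name", "server-us": "File Path"}


def normalize(template_type, fields):
    """Map already-extracted variables from either template onto one record shape.

    `fields` is a dict of variable name -> value; how it was extracted is
    outside the scope of this sketch.
    """
    if template_type not in FILE_FIELD:
        raise ValueError("unsupported template type: " + template_type)
    return {
        "template": template_type,
        # File Name for imslpfile, File Path for server-us.
        "file": fields.get(FILE_FIELD[template_type]),
        # server-us entries are hosted on the US server.
        "us_server": template_type == "server-us",
    }


print(normalize("imslpfile", {"File Name": "example.pdf"}))
print(normalize("server-us", {"File Path": "dir/example.pdf", "File Size": "10"}))
```

Everything downstream can then work with the `file` key without caring which template the entry came from.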

Re: #fte: imslpfile, Externalimslpfile, etc

Posted: Mon Apr 25, 2011 3:14 pm
by gardano
I'm noticing a number of instances of what I suppose to be incorrect formatting. Here's an example:

{{#fte:imslpfile
||File Name 1=PMLP45465-DAgnesi-sonata_per_cemballo-_I.pdf
Note the doubled pipe "||" before File Name 1. This only occurs for File Name 1, not File Name 2, etc...

Is this a new format, or is this wrong?

Re: #fte: imslpfile, Externalimslpfile, etc

Posted: Mon Apr 25, 2011 10:03 pm
by KGill
As far as I know, this was the result of some kind of glitch in the file uploader that occurred from late 2008 to early 2009. There are probably a few thousand pages (at most, I'd guess) that still have this extra pipe character. It is not new, and can be ignored wherever it appears (it doesn't affect the displaying of the template at all).
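For an importer that trips over the extra pipe, one possible workaround is to normalize the text before parsing. The Python sketch below assumes the stray pipe only ever shows up at the start of a line, as in the example quoted above; the function name `collapse_leading_pipes` is hypothetical.

```python
import re


def collapse_leading_pipes(wikitext):
    """Replace a run of two or more pipes at the start of a line with one pipe.

    Pipes in the middle of a line are left alone.
    """
    return re.sub(r"^\|{2,}", "|", wikitext, flags=re.MULTILINE)


glitched = (
    "{{#fte:imslpfile\n"
    "||File Name 1=PMLP45465-DAgnesi-sonata_per_cemballo-_I.pdf\n"
)
print(collapse_leading_pipes(glitched))
```

Running this on every page before extraction would avoid having to fix the affected entries by hand.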

Re: #fte: imslpfile, Externalimslpfile, etc

Posted: Tue Apr 26, 2011 11:44 am
by gardano
KGill wrote: It is not new, and can be ignored wherever it appears (it doesn't affect the displaying of the template at all).
It's good to know that it's not new. Unfortunately, it's screwing up my importer!

I will log those entries and fix them as the logfile reports them...

Re: #fte: imslpfile, Externalimslpfile, etc

Posted: Wed Apr 27, 2011 12:33 am
by imslp
The best regex for matching the entries is probably (with // as the delimiter):

/\|\s*File Name\s*=\s*([^|]*)/

After that, strip the whitespace from the end of the captured string. This is the most natural way of matching MediaWiki template variables with a single regex. However, it does not take into account nested templates and links, which sometimes occur and can include a pipe sign, breaking the above regex.
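For illustration, the regex approach might look like this in Python (my choice of language, not something prescribed here). The optional index in the pattern, which covers numbered variables such as File Name 1 from the example earlier in the thread, is an assumption added for this sketch. `file_names` and the `Other Field` variable in the sample are hypothetical.

```python
import re

# The pattern from the post, plus an optional numeric index ("File Name 1").
FILE_NAME_RE = re.compile(r"\|\s*File Name(?:\s+\d+)?\s*=\s*([^|]*)")


def file_names(wikitext):
    """Return every captured File Name value with trailing whitespace stripped."""
    return [m.group(1).rstrip() for m in FILE_NAME_RE.finditer(wikitext)]


sample = (
    "{{#fte:imslpfile\n"
    "|File Name 1=Example-1.pdf\n"
    "|File Name 2=Example-2.pdf \n"
    "|Other Field=x\n"
    "}}"
)
print(file_names(sample))  # ['Example-1.pdf', 'Example-2.pdf']
```

Note that a File Name variable that happens to be the last one in the template would also capture the closing braces, since the capture only stops at the next pipe.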

If you want to make a bulletproof MediaWiki template parser, I would suggest looking at the Ruby-based parser in the Ruby bot I sent you a while back. That parser is almost completely bulletproof.
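To show the general idea of handling nesting (this is not the Ruby parser mentioned above, just a minimal Python sketch of one possible technique), the template body can be split only on pipes that sit outside nested templates and links. It assumes `body` is the text between the template name and the final closing braces, and it does not handle triple braces or other MediaWiki corner cases. The function names and the `Note` variable are hypothetical.

```python
def split_top_level(body):
    """Split on '|' characters that are not inside {{...}} or [[...]]."""
    parts, current, depth, i = [], [], 0, 0
    while i < len(body):
        pair = body[i:i + 2]
        if pair in ("{{", "[["):
            depth += 1
            current.append(pair)
            i += 2
        elif pair in ("}}", "]]"):
            depth = max(depth - 1, 0)
            current.append(pair)
            i += 2
        elif body[i] == "|" and depth == 0:
            # A top-level pipe ends the current variable.
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(body[i])
            i += 1
    parts.append("".join(current))
    return parts


def parse_fields(body):
    """Turn 'name=value' parts into a dict; parts without '=' are skipped."""
    fields = {}
    for part in split_top_level(body):
        name, sep, value = part.partition("=")
        if sep and name.strip():
            fields[name.strip()] = value.strip()
    return fields


body = "\n||File Name 1=a.pdf\n|Note=[[Page|label]] {{tpl|x=1}}\n"
print(parse_fields(body))
```

A side effect of skipping parts without an equals sign is that the doubled pipe seen on some pages produces an empty part that is simply dropped.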

(Of course, these are all just suggestions.)

Re: #fte: imslpfile, Externalimslpfile, etc

Posted: Wed Apr 27, 2011 4:12 am
by gardano
Great, thanks!

I will look into the parser again, and see about porting it to Objective-C.