
Icking Archive

Posted: Fri Apr 27, 2007 1:32 pm
by Leonard Vertighel
How should we deal with files from the Icking Archive in general? For example in Category talk:Mattheson, Johann, Feldmahler claimed that we could keep the files (even though to my knowledge, no explicit authorization by the original editor was provided). The WIMA copyright policy seems to suggest the contrary.

<offtopic>Personally I believe that it doesn't make much sense to randomly import single scores from large and continuously growing archives like WIMA and CPDL. While I do think that centralization is a good thing, in my opinion we neither can nor should try to incorporate all music scores of the web into IMSLP. And while I do agree with the general policy of not linking to external scores, maybe we should consider generally linking to the "big ones", much like the CPDL example link at Beethoven. (It's a pity that the IMSLP-CPDL crosslinking project never took off. But maybe if we go ahead in IMSLP, they might finally give us green light in CPDL, too...) At the same time, we should then discourage uploading files from the linked archives into IMSLP. What do you think?</offtopic>

Posted: Fri Apr 27, 2007 3:24 pm
by emeraldimp
I think--and correct me if I'm wrong--that the notice at the bottom of the file, "Copying allowed for non-commercial use only", gives us authorization to distribute it, since we're not a commercial venture. In general, though, you're right; WIMA scores should not be considered "up for grabs", as it were.

I tend to agree on the issue of whether to upload from other sites or not. I think that having the same pieces but different editions (that is, scanned editions as opposed to typeset) is okay, though, since we can then have actual discussions on the pieces (since that's part of the mission statement ;-) and the policy is to delete work pages without actual scores), and also since diversity is useful in any event. I also worry, of course, about what happens if one of the other archives gets taken offline, but that's a relatively remote possibility at the moment.

I do want to proceed with the crosslinking. I already crosslink any work pages I create with Wikipedia articles (if available). I tried to find some on CPDL that I'd already done, but I wasn't successful.

Posted: Fri Apr 27, 2007 4:11 pm
by Leonard Vertighel
You are bringing up an interesting question which can be generalized: If somebody wishes to post an analysis of a piece that is not yet available in IMSLP, does he have to find and submit a free electronic score first?

Another interesting point is the question about sites being taken offline - which by the way also applies to IMSLP: what if Feldmahler (or Mr. Ornes of CPDL, etc.) one day went crazy and just closed shop? I mean, we know it won't happen, but do we (and the other big community-driven projects) have some kind of "backup" for this unlikely case?

Re: Icking Archive

Posted: Fri Apr 27, 2007 4:31 pm
by imslp
Leonard Vertighel wrote:How should we deal with files from the Icking Archive in general? For example in Category talk:Mattheson, Johann, Feldmahler claimed that we could keep the files (even though to my knowledge, no explicit authorization by the original editor was provided). The WIMA copyright policy seems to suggest the contrary.
Actually, I didn't notice you said it was from the WIMA... I was just going by the copyright terms on the score itself. It may be a good idea to ask permission... though I tend to agree with Emeraldimp (which is why I said what I did on the talk page in the first place), but asking never hurts I suppose :)
<offtopic>Personally I believe that it doesn't make much sense to randomly import single scores from large and continuously growing archives like WIMA and CPDL. While I do think that centralization is a good thing, in my opinion we neither can nor should try to incorporate all music scores of the web into IMSLP.
I would somewhat agree with this. The qualification is because I do not want to set a precedent for the exclusion of otherwise submittable files from IMSLP; many people have asked me to remove this or that file out of courtesy, and I have not complied with any of them, because as soon as we start doing so we are well on the slippery slope towards having pretty much nothing on IMSLP, just out of courtesy to the "copyright owners" (they actually don't technically own "copyright" in Canada). This is why the only reasons I would remove a file are 1. file is broken/duplicate, or 2. there is copyright infringement.

Another reason I won't explicitly say to avoid the submission of scores from other archives is that the IMSLP submission process is daunting enough as it is (with the copyright checks and all); it may be a good idea to not complicate it further by having people check through a list of sites they cannot submit stuff from... we want to attract contributors, not scare them away ;)

In any case, I would suggest not explicitly prohibiting the submission of scores from other archives, even though you may think differently about this (and I would agree). Of course, copyright infringement is an entirely different matter, and should be noted. I also do harbour somewhere in the remote reaches of my heart Emeraldimp's concern that one of those sites might go down at some point in time :P, so I'm not completely opposed to people submitting scores from those archives that they think are useful to IMSLP.
And while I do agree with the general policy of not linking to external scores, maybe we should consider generally linking to the "big ones", much like the CPDL example link at Beethoven. (It's a pity that the IMSLP-CPDL crosslinking project never took off. But maybe if we go ahead in IMSLP, they might finally give us green light in CPDL, too...) At the same time, we should then discourage uploading files from the linked archives into IMSLP. What do you think?</offtopic>
Well, the good thing is that there are very few CPDL scores even if we don't explicitly prohibit it, and I'd just suggest keeping it like this for the time being, and dealing with it when and if it actually happens. :) About crosslinking with CPDL, I'm myself undecided. I think currently the partnership is somewhat uneven, which is why CPDL hasn't responded as enthusiastically as they could. They have been online for 10 years, while we have been online for only 1.25; plus currently we still lag slightly behind them in terms of Alexa rank (though that should be reversed very soon). Perhaps when IMSLP gains more traction, CPDL would be more enthusiastic about linking with us, and we won't have to do all the one-sided work.

Posted: Fri Apr 27, 2007 4:42 pm
by imslp
Leonard Vertighel wrote:Another interesting point is the question about sites being taken offline - which by the way also applies to IMSLP: what if Feldmahler (or Mr. Ornes of CPDL, etc.) one day went crazy and just closed shop? I mean, we know it won't happen, but do we (and the other big community-driven projects) have some kind of "backup" for this unlikely case?
This is a good point; I was thinking of having public database dumps and file archives, but unlike Wikipedia, the bulk of IMSLP is based on files, so if I just gave them up for download, the number of IMSLP contributors would likely decrease (people have asked me for zip files of the IMSLP image archive, probably with the idea that they can just download it and never visit IMSLP again). There may also be privacy concerns with user accounts (though I suppose I can find out how Wikipedia is doing things and do it like them). Btw, the image archive itself is 10GB in size, so I would not be able to offer it through the server, and would probably have to do so through BitTorrent, which would make it rather hard to update. I'm also slightly concerned about other people using IMSLP for commercial purposes (it is not like there aren't already some sites that have tried to do this even by manually downloading IMSLP files, though most of them have given up because of the current size and expansion rate of the IMSLP archive).

So there are many things that need working out; but I won't be against some sort of 3rd party backup. One interesting idea (which was proposed earlier) was to release CDs/DVDs of the archive, much like the Wikipedia 1.0 project, but that would likely require a HUGE amount of work.

Nonetheless, I do think centralization at least decreases the chance of files being lost (since there are duplicates on the net anyway)... but yes, this is a real concern that is very hard to address at this point in time.

Posted: Fri Apr 27, 2007 4:46 pm
by imslp
Leonard Vertighel wrote:Another interesting point is the question about sites being taken offline - which by the way also applies to IMSLP: what if Feldmahler (or Mr. Ornes of CPDL, etc.) one day went crazy and just closed shop? I mean, we know it won't happen, but do we (and the other big community-driven projects) have some kind of "backup" for this unlikely case?
Also, not sure if this is any comfort, but I've rather committed myself to sticking with IMSLP as long as I can. If for some reason I cannot continue to host IMSLP, I would offer the entire IMSLP site dump and file dump for download (including code modifications to Mediawiki), so that other people can continue on :) Though, I've spent a *hell* of a lot of time on this project, and I'm not just going to throw it all away for trivial reasons.

Re: Icking Archive

Posted: Fri Apr 27, 2007 5:24 pm
by Leonard Vertighel
As I said, we know that you won't just shut shop like that (just be sure not to drink too much :D :D). Anyway, it's a topic I'm interested in, so keep us posted if there are any interesting news on this.
imslp wrote:[...]I think currently the partnership is somewhat uneven, which is why CPDL hasn't responded as enthusiastically as they could. They have been online for 10 years, while we have been online for only 1.25; plus currently we still lag slightly behind them in terms of Alexa rank (though that should be reversed very soon).
Plus, we have over 1900 links from Wikipedia (yeah, I've been busy :D) as opposed to less than 250 for CPDL. And who knows if we are still enthusiastic about crosslinking once the ranking has been reversed... Ahem, OK, I'll stop being mean right now.

Anyway, let's leave everything as is right now. Maybe we can bring the subject up again e.g. after IMSLP's 2nd birthday. I'm always trying to see this also from the point of view of the user (as opposed to contributor - though hopefully many users will be contributors as well). IMSLP is clearly not a web directory, but somehow linking together the major resources would probably be useful.

Re: Icking Archive

Posted: Fri Apr 27, 2007 5:45 pm
by imslp
Leonard Vertighel wrote:As I said, we know that you won't just shut shop like that (just be sure not to drink too much :D :D).
Well... you're in luck there 'cause I actually never drink ;) Just a funny habit of mine.
Anyway, it's a topic I'm interested in, so keep us posted if there are any interesting news on this.
Indeed, also feel free to throw out any ideas that you would have regarding this :) If you manage to (mostly) get around 1. other people using the backup for commercial purposes, and 2. bandwidth/up-to-date-ness issues, I would be willing to implement it :)

What I'm thinking is to let IMSLP grow some more, to the point where mirroring IMSLP becomes impractical or ineffective for commercial enterprises (sort of like how no one seriously tries to mirror Wikipedia for profit), and then release backups. I'm not sure how/when it will work yet.

Also, in case people are worried, there are always three complete copies of IMSLP (each no more than 24 hours old) on three different hard drives at any point in time, so if anyone is concerned about hardware failure, the worst that can happen is the loss of 24 hours' worth of work.
imslp wrote:[...]I think currently the partnership is somewhat uneven, which is why CPDL hasn't responded as enthusiastically as they could. They have been online for 10 years, while we have been online for only 1.25; plus currently we still lag slightly behind them in terms of Alexa rank (though that should be reversed very soon).
Plus, we have over 1900 links from Wikipedia (yeah, I've been busy :D) as opposed to less than 250 for CPDL. And who knows if we are still enthusiastic about crosslinking once the ranking has been reversed... Ahem, OK, I'll stop being mean right now.
Hehehe ;)
Anyway, let's leave everything as is right now. Maybe we can bring the subject up again e.g. after IMSLP's 2nd birthday. I'm always trying to see this also from the point of view of the user (as opposed to contributor - though hopefully many users will be contributors as well). IMSLP is clearly not a web directory, but somehow linking together the major resources would probably be useful.
Indeed... we'll see more clearly in a few months where everything is heading I suppose :)

Re: Icking Archive

Posted: Fri Apr 27, 2007 7:25 pm
by Leonard Vertighel
imslp wrote:Well... you're in luck there 'cause I actually never drink ;) Just a funny habit of mine.
Don't worry, we all have our funny habits... I'm not totally addicted to not drinking, but I've never been drunk in my life either ;)
What I'm thinking is to let IMSLP grow some more, to the point where mirroring IMSLP becomes impractical or ineffective for commercial enterprises (sort of like how no one seriously tries to mirror Wikipedia for profit)
Apart from the fact that a GFDL compliant IMSLP mirror (even with advertising on it) would be perfectly legitimate from a licence point of view, I believe that you can't really prevent that anyway. There are actually lots of complete or partial Wikipedia mirrors out there (searching for a specific phrase taken from any not too new Wikipedia article usually turns up a few dozen copies). Some carry advertising. Some seem to exist only to draw visitors to a specific website. Some are abusively remote loading the content from Wikipedia, rather than using the freely available database dumps. I'm not sure that not having publicly available backups would help any. This doesn't mean that you have to provide them in a readily offline-browseable format (much like a Wikipedia dump requires an HTTP and a MySQL server, among other things). Just stating my view of the issue, of course...

Re: Icking Archive

Posted: Fri Apr 27, 2007 8:24 pm
by emeraldimp
Leonard Vertighel wrote:
imslp wrote:Well... you're in luck there 'cause I actually never drink ;) Just a funny habit of mine.
Don't worry, we all have our funny habits... I'm not totally addicted to not drinking, but I've never been drunk in my life either ;)
Psshh, you all are missing out! (on the hangovers, the vomiting... yeah, I don't drink much either anymore...)
Leonard Vertighel wrote:
imslp wrote:What I'm thinking is to let IMSLP grow some more, to the point where mirroring IMSLP becomes impractical or ineffective for commercial enterprises (sort of like how no one seriously tries to mirror Wikipedia for profit)
Apart from the fact that a GFDL compliant IMSLP mirror (even with advertising on it) would be perfectly legitimate from a licence point of view, I believe that you can't really prevent that anyway. There are actually lots of complete or partial Wikipedia mirrors out there (searching for a specific phrase taken from any not too new Wikipedia article usually turns up a few dozen copies). Some carry advertising. Some seem to exist only to draw visitors to a specific website. Some are abusively remote loading the content from Wikipedia, rather than using the freely available database dumps. I'm not sure that not having publicly available backups would help any. This doesn't mean that you have to provide them in a readily offline-browseable format (much like a Wikipedia dump requires an HTTP and a MySQL server, among other things). Just stating my view of the issue, of course...
I tend to agree with Leonard... Look at about.com, for example. It seems as if they have a complete Wikipedia mirror. Personally, though, I'm less concerned with such mirrors (but then, I haven't invested any money in IMSLP) for a simple reason: any posts or improvements would happen here first, and we're free and the authoritative source (for IMSLP).

In regards to the site going offline: while we know you wouldn't just up and leave us, what if you were (goodness forbid) hit by a bus? It concerns me about the sites I manage, and none of them are on this scale. There aren't any solutions that I can think of that satisfy all of 1) ease of implementation, 2) up-to-date-ness, 3) low bandwidth usage and 4) avoiding commercial usage. My best idea is still DVDs sent to *insert favorite persons here*. One of the problems with that is still the potential copyright-infringing border-crossings. Maybe the planned indexing system can have some sort of copyright-status information and a copyright-status-reviewed-by-admin flag? (I really like dashes)

Lastly (this seems like a long post), the CPDL-linking. From my point of view, useful, relevant links are something of a rarity on the web. Any useful links, then, should be done, whether or not the pagerank is affected (unless the linked-to site contains a lot of spam links, of course). But I'm willing to wait and see where things go. Good job, btw, Leonard, on the links from Wikipedia; do you have easy documentation of that? If so, it might help the justification to include IMSLP in Wikipedia's interwiki map.

Posted: Fri Apr 27, 2007 8:33 pm
by Leonard Vertighel
You can use the AntiSpam search (I'm not saying that our links are spam :D) Note that the results are limited to about 250 per language (at the moment, you have to add about a hundred from en.wp to the grand total). But don't mention that most of those links were made by only one user :D (only en and es were pretty well linked before).

Re: Icking Archive

Posted: Sat Apr 28, 2007 1:46 am
by imslp
Hmm... I think I found a way out of this. I will offer the images directory for download, along with a database dump (SQL or XML) devoid of user account information. I will include a white paper with the specifications of all the essential parser plugins (which should be rather easy to reprogram if one wants), but will not provide the actual code because of potential security issues. I know security by obscurity is bad, and I've already done a lot to prevent possible security issues in the code, but in the absence of a good enough developer base (who will actually look at the code), obscurity probably is necessary. I may change my mind about this later on.

I can offer this as a torrent. But here's the other problem: since it is a torrent, there need to be some committed seeders, but the potential stumbling block is the copyright issue. The only workaround I can think of for non-Canadian backups is for them to order a Canadian shared hosting account and back it up there. Ironically, I do not know of any other Canadian IMSLP contributors besides myself (well, there's my family but I don't think that's what you meant). Plus, you would probably want some dedicated IMSLP contributors (who will actually take the trouble to set up IMSLP again after I die ;)) to download it, instead of just hit-and-run anonymous people.

About the commercial mirroring, I was against it because the last time another site tried to rip IMSLP off wholesale (about half a year ago) there were some IMSLP contributors who were quite agitated about it. But I suppose it is about as easy doing it from the dump as the site itself.

In any case, the only problem that remains currently is the copyright issue, and how many dedicated IMSLP contributors would actually be willing to download it...

Another way to set this up is to make the backups available to select people (maybe admins?), so that we can dispense with the torrent altogether, or make it a private torrent.

These are some ideas; if either of you want to back up IMSLP privately, I can grant you special permission to do this (since you are admins), if you are against the public torrent idea.
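
As a rough illustration of the "database dump devoid of user account information" idea in this post, here is a minimal Python sketch. It is not the procedure IMSLP uses. The table names in PRIVATE_TABLES and the helper strip_private_rows are assumptions made for the example, and it assumes a dump with one INSERT statement per line.

```python
import re

# Hypothetical list of tables holding account data; a real installation
# may use a table prefix or have additional private tables.
PRIVATE_TABLES = {"user", "user_groups", "watchlist"}

# Captures the table name of an INSERT statement, with or without backticks.
_INSERT_RE = re.compile(r"^INSERT INTO `?(\w+)`?", re.IGNORECASE)


def strip_private_rows(dump_lines, private_tables=PRIVATE_TABLES):
    """Yield dump lines, skipping INSERT statements for private tables.

    Assumes one INSERT statement per line. Table definitions are kept,
    so the schema stays intact while the account rows are dropped.
    """
    for line in dump_lines:
        match = _INSERT_RE.match(line)
        if match and match.group(1).lower() in private_tables:
            continue
        yield line


# Made-up sample lines standing in for a real dump file.
sample = [
    "CREATE TABLE `page` (...);",
    "INSERT INTO `page` VALUES (1,'Mattheson, Johann');",
    "INSERT INTO `user` VALUES (1,'someone','secret-hash');",
]
cleaned = list(strip_private_rows(sample))
```

Filtering at the dump level keeps the live database untouched. Anything published this way would still need a manual check that no other table carries e-mail addresses or password hashes.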

Posted: Sat Apr 28, 2007 2:26 am
by Carolus
As for the Icking Archive, I was under the impression that most things there are re-engravings and arrangements of older works using Finale, Sibelius and other programs. In general, the emphasis here at IMSLP is more on scans of older editions. So, apart from some works that are not readily available in older editions (like Mattheson), it would appear there isn't much of a need to duplicate Icking's titles over here.

Posted: Sat Apr 28, 2007 2:32 am
by imslp
Actually, I have a better idea! Currently IMSLP uses Netfirms Advantage hosting (http://www.netfirms.ca/web-hosting/web- ... advantage/) for the image mirror server (which currently takes 50% of the bandwidth). The cost is $155.40 CAD/yr (about $140 US or 100 Euro) currently. I don't think that is too large a sum of money (at least not compared to the dedicated main server, which costs $55/mo), so one way of doing this (especially since my term with them would end in three months anyway; though I do need the bandwidth it provides, so I would renew if no one else did) would be for one of you to purchase a package, and I can just upload the backup there, and if anything happens to me one of you would have control of the server. Plus, the setup can stay the same, with that server taking 50% of the image bandwidth (and using the imslp.ca domain). You can also try to host other sites there, though I warn you that they can really only host small-traffic sites (think about what IMSLP was like before I moved to the dedicated server).

This is just a suggestion; I completely understand if at this point in time you guys don't want to pay $155.40 CAD per year to handle part of IMSLP's expenses... :)
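
For anyone weighing the offer, the two quoted prices can be put on the same footing with a few lines of Python. This is only back-of-the-envelope arithmetic on the figures in the post. The currency of the $55/mo figure is not stated, so treating it as comparable to the CAD figure is an assumption.

```python
# Figures quoted in the post above.
MIRROR_CAD_PER_YEAR = 155.40  # shared hosting for the image mirror
DEDICATED_PER_MONTH = 55.00   # dedicated main server; currency assumed comparable

mirror_per_month = MIRROR_CAD_PER_YEAR / 12
dedicated_per_year = DEDICATED_PER_MONTH * 12
share = MIRROR_CAD_PER_YEAR / dedicated_per_year

print(f"mirror:    {mirror_per_month:.2f} per month")
print(f"dedicated: {dedicated_per_year:.2f} per year")
print(f"mirror costs {share:.1%} of the dedicated server's yearly bill")
```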

Posted: Sat Apr 28, 2007 2:34 am
by imslp
Carolus wrote:As for the Icking Archive, I was under the impression that most things there are re-engravings and arrangements of older works using Finale, Sibelius and other programs. In general, the emphasis here at IMSLP is more on scans of older editions. So, apart from some works that are not readily available in older editions (like Mattheson), it would appear there isn't much of a need to duplicate Icking's titles over here.
This is true :) But I think it may be a good idea to not explicitly forbid the submission of WIMA stuff just so that new contributors don't get scared about all the hoops they have to jump through just to submit stuff. Plus, someone may find some of the stuff there hard to find and useful for IMSLP (as you've mentioned).