Yay, my entry for the Discovery & DevCSI Developers Competition– Lodopac– was awarded a commendation for its use of the Cambridge University Library (CUL) dataset. During the judging I was asked for searches which were known to work well- the timeout issues I discussed under Limitations being not insignificant, especially with author or title searches. I submitted a version of the following brief general notes which I hope are helpful to anyone else who wants to play:
The British National Bibliography (BNB) server is generally more responsive than the Cambridge University Library one; title seems to work better than author. The following are hopefully useful examples useful:
- author=”fisher” seems to work (at least in BNB) so long as the number of hits to return is small (e.g. 5).
- title=”fisher” works well for both.
- ISBN=”0709039425″ should very quickly give you a single hit in both BNB and CUL.
I would really like to try and think of ways of improving free text regular expression search times for things like author and title in Sparql* although I doubt there is one that doesn’t rely on the configuration, processing power, or indexing of the server being searched.
* thinking aloud, some ideas might include: downloading a larger imprecise set for further local searching (e.g. for an author/title search downloading the title matches and searching the authors locally: although this would also be slow, it would get round the timeout at least); forcing a look-up in a controlled vocab first in order to get an exact string match (esp for authors, although even if this is possible, this forces a user to do more work, which isn’t the point); local indexing of the triple store (this is probably the best way but I’m not sure how to go about it, whether I really have the server capabilities to do it, and can be committed to the updating required).