Talk 11
From Dev8D
Eprints 3.2 API/DATA Challenge
Speaker Name: Chris Gutteridge/Patrick McSweeney
Speaker ID: User:patrickmcsweene
Start Time: Wednesday, 16:30
The EPrints API/DATA Challenge is to create something cool that promotes the new features of 3.2.0, especially its use of linked data. Four small prizes of Amazon vouchers will be handed out.
End Time: 16:45
Session: Expert Talks Wed PM
Bounties
Install EPrints
REWARD: a pack of Haribo and your name immortalised on the EPrints wiki for all to see, and potentially more for really cool implementations.
Install EPrints 3.2 and write the hello world plugin. Couldn't be easier!
Start by
- reading the EPrints installation guide
- then reading how to write plugins
Metadata extraction
REWARD: £50 of Amazon monies + a pack of Haribo, and potentially more for really cool implementations.
EPrints has piles of PDFs in it. One of the biggest challenges we face is automatically extracting metadata from the text of documents. Doing this well saves a lot of the pain of entering metadata by hand. We are looking for a tool which extracts the text from a PDF and does some clever parsing of that text to pull metadata out of the document's "References" section. The best entries will cleanly cover some of the major citation formats and be able to tell which citation format a document is using. Metadata should include:
- title
- firstname/initial familyname pairs
- publication
- year of publication
Additionally, you might try to extract:
- publication type
- pages it appears on
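As a rough starting point, the parsing step could be sketched as below. This is only an illustration, not part of EPrints: it assumes the reference text has already been extracted from the PDF, and it recognises a single author-year layout invented for this example. The pattern, the format name `author_year` and the function `parse_citation` are all hypothetical.

```python
import re

# Hypothetical patterns, one per citation layout. Only one layout is
# covered here, e.g.:
#   "Williams, D. and Smith, J. A. (2009) Some title. Journal Name, 12 (3). pp. 45-67."
PATTERNS = {
    "author_year": re.compile(
        r"^(?P<authors>.+?)\s+\((?P<year>\d{4})\)\s+"
        r"(?P<title>.+?)\.\s+"
        r"(?P<publication>[^,.]+)"
        r"(?:.*?pp\.\s*(?P<pages>\d+\s*-\s*\d+))?"
    ),
}


def parse_citation(line):
    """Return a dict of metadata for one reference line, or None if no
    known pattern matches. The matching pattern name doubles as a guess
    at the citation format."""
    text = line.strip()
    for format_name, pattern in PATTERNS.items():
        m = pattern.match(text)
        if m is None:
            continue
        authors = []
        # Assumes authors are joined by "and" or ";" and written "Family, Initials".
        for part in re.split(r"\s+and\s+|;\s*", m.group("authors")):
            family, _, given = part.partition(",")
            authors.append((given.strip(), family.strip()))
        return {
            "format": format_name,
            "authors": authors,
            "year": int(m.group("year")),
            "title": m.group("title").strip(),
            "publication": m.group("publication").strip(),
            "pages": m.group("pages"),  # None when no page range is present
        }
    return None
```

Supporting another citation format would mean adding another entry to `PATTERNS`. Titles containing abbreviations with full stops, or author lists separated only by commas, would need something smarter than this.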
Semantic EPrints
REWARD: £50 of Amazon monies + a pack of Haribo, and potentially more for really cool implementations.
EPrints 3.2 (currently rc2) makes tonnes of linked data in N3, N-Triples and RDF+XML. We want to connect eprints.ecs with any external linked data source! Find some interesting information or make a cool visualisation, or both. In fact, we will take whatever you're offering!
Start by
- Export a search or a record from eprints.ecs.
- Read a PHP tutorial about parsing linked data from EPrints: http://lemur.ecs.soton.ac.uk/~cjg/Graphite/eprints.php
NOTE: you don't need to know much about EPrints to do this. You certainly don't need to know any Perl.
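If PHP is not your thing, an N-Triples export can be read with very little code in most languages. The sketch below is a minimal, standard-library-only reader in Python. The sample data is made up, and the use of the Dublin Core `title` predicate is an assumption about what an export might contain, not a statement of what EPrints actually emits. The function names are hypothetical.

```python
import re

# Minimal N-Triples line pattern: subject, predicate, object, full stop.
# Objects may be IRIs, blank nodes, or literals with an optional
# datatype or language tag.
TRIPLE = re.compile(
    r'^\s*(<[^>]*>|_:\S+)\s+(<[^>]*>)\s+'
    r'(<[^>]*>|_:\S+|"(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>|@[A-Za-z0-9-]+)?)'
    r'\s*\.\s*$'
)

# Assumed predicate; check a real export to see which vocabulary is used.
DC_TITLE = "<http://purl.org/dc/terms/title>"

# Made-up sample standing in for an exported record.
SAMPLE = '''
<http://example.org/id/eprint/1> <http://purl.org/dc/terms/title> "A first paper" .
<http://example.org/id/eprint/1> <http://purl.org/dc/terms/creator> <http://example.org/id/person/1> .
<http://example.org/id/eprint/2> <http://purl.org/dc/terms/title> "A second paper"@en .
'''


def parse_ntriples(text):
    """Return a list of (subject, predicate, object) strings, skipping
    blank lines, comments and anything the pattern does not recognise."""
    triples = []
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        m = TRIPLE.match(line)
        if m:
            triples.append(m.groups())
    return triples


def literal_value(obj):
    """Text between the quotes of a literal (escape sequences are left as-is)."""
    return obj[1:obj.rindex('"')]


def titles_by_subject(triples):
    """Map each subject IRI (without angle brackets) to its title literal."""
    return {
        s.strip("<>"): literal_value(o)
        for s, p, o in triples
        if p == DC_TITLE and o.startswith('"')
    }
```

Once the triples are in a list like this, joining them against another linked data source is mostly a matter of matching IRIs. For anything beyond a quick experiment, a proper RDF library would be a safer choice than a regular expression.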
Disambiguating EPrints
REWARD: £50 of Amazon monies + a pack of Haribo, and potentially more for really cool implementations.
Write a cool visualisation tool which helps a repository manager decide whether two items in the repository are the same, or versions of each other. You might want to lay this out as a graph! If you are really flash, maybe you could do a similar thing with author names. David Williams wrote this paper and D. Williams wrote this other paper. Are David Williams and D. Williams the same person? What evidence do you have? Were the papers uploaded by the same person? Are they on the same subject? Have both authors published other papers in similar conferences/journals? Do both papers use the same LaTeX stylesheet?
Good entries will have clever mechanisms for working out whether things are the same, and/or a visualisation which makes it easy for a person to tell whether two people are the same.
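To make the author-name question concrete, here is a minimal sketch of one possible mechanism. It assumes names are written "Given Family", treats a single letter as an initial, and collects supporting evidence from two record dictionaries. The field names `uploader` and `subjects` are hypothetical and are not taken from the EPrints data model.

```python
def name_parts(name):
    """Split "Given [Middle] Family" into (given tokens, family name), lowercased."""
    tokens = name.replace(".", " ").split()
    if not tokens:
        return [], ""
    return [t.lower() for t in tokens[:-1]], tokens[-1].lower()


def could_be_same_person(a, b):
    """True if the two names are compatible: same family name, and each
    pair of given names either matches or one is an initial of the other."""
    given_a, family_a = name_parts(a)
    given_b, family_b = name_parts(b)
    if family_a != family_b:
        return False
    for x, y in zip(given_a, given_b):
        if len(x) == 1 or len(y) == 1:
            # An initial is compatible with any name starting with that letter.
            if x[0] != y[0]:
                return False
        elif x != y:
            return False
    return True


def shared_evidence(rec_a, rec_b):
    """List human-readable reasons two records may belong to the same author.
    Record fields here are illustrative assumptions."""
    reasons = []
    if rec_a.get("uploader") and rec_a.get("uploader") == rec_b.get("uploader"):
        reasons.append("same uploader")
    if set(rec_a.get("subjects", [])) & set(rec_b.get("subjects", [])):
        reasons.append("shared subject")
    return reasons
```

Compatible names are only a hint, so a tool would probably show the list from `shared_evidence` next to each candidate pair and leave the final decision to the repository manager.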
| Begin Time | 24 February 2010 16:30:00 |
| Day | Wed |
| Description | The EPrints API/DATA Challenge is to create something cool that promotes the new features of 3.2.0, especially its use of linked data. Four small prizes of Amazon vouchers will be handed out. |
| End Time | 24 February 2010 16:45:00 |
| Event Level | 5 |
| Location | Expert Zone |
| Speaker | Patrickmcsweene |
| Speaker ID | Patrickmcsweene |
| Speaker Name | Chris Gutteridge/Patrick McSweeney |
| Time | 16:30 |
| Title | Eprints 3.2 API/DATA Challenge |

