Mar 08 2011

Project AIR was concerned with the automatic, hands-off population of OAI-PMH compliant Open Research Archives with linked data taken from institutional web sites. A high aim indeed, which the project recognised and moderated. The tool crawls a web site to find the research outputs of academics local to the institution. Machine learning combined with a web crawler makes up the meat of the project. It is hard to extract linked data from a content management system where the content is at best prescribed and at worst free prose, and either way not compatible with any system linking to a repository. Another way to do this is to develop a culture around the submission of research outputs to the group responsible for the repository. Cultures are very fluid things, though: they fade in and out of existence in different parts of an organisation, so a tool like this helps even if it doesn’t complete the effort. Applying this tool to DORA will provide another way of adding outputs to the repository without manual input and repeated effort, which benefits all users.

