Earlier this year the DOAJ began offering a new schema for registered articles that significantly improves the value of OAI-PMH harvested article content. Prior to this addition the only schema available was Dublin Core, which is woefully inadequate as a metadata schema for describing article content. (Dublin Core, of course, was never designed to handle the complexity of describing an average article.) The new schema (graphically represented here; select the thumbnail to see a larger version) includes elements for ISSN/eISSN, volume/issue, start/end page numbers, and author affiliation. There is also a <fullTextUrl> element that links to the article content itself (not the splash page of the article on the publisher’s site).
Article content using this schema is harvestable through the DOAJ OAI-PMH provider site (for instance, using a ListRecords verb with a doajArticle metadata prefix against the PMH URL). This is, in fact, the same schema journal publishers use to submit article content to the DOAJ article database. With these pieces in place, it is now conceivable to harvest open access journal article content through the DOAJ and add it to a local journal article repository (such as the Electronic Journal Center in the case of OhioLINK).
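For anyone who wants to try this, here is a rough Python sketch of what such a harvest might look like. It is only an illustration under several assumptions: the base URL is a placeholder (substitute the real DOAJ PMH URL), the sample response and its namespace are made up for the example, and every element name other than fullTextUrl is a guess based on the description above rather than a copy of the actual schema. The ListRecords verb, the metadataPrefix and resumptionToken arguments, and the OAI-PMH namespace come from the OAI-PMH protocol itself.

```python
from urllib.parse import urlencode
from urllib.request import urlopen
import xml.etree.ElementTree as ET

# Placeholder only: replace with the actual DOAJ OAI-PMH URL.
BASE_URL = "https://example.org/oai"

# Namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

# Made-up response for illustration; the inner namespace and all element
# names except fullTextUrl are assumptions, not the real doajArticle schema.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>example:1</identifier></header>
      <metadata>
        <record xmlns="urn:example:doaj-article">
          <issn>1234-5678</issn>
          <volume>3</volume>
          <startPage>10</startPage>
          <endPage>19</endPage>
          <fullTextUrl>https://example.org/article1.pdf</fullTextUrl>
        </record>
      </metadata>
    </record>
    <resumptionToken>token-2</resumptionToken>
  </ListRecords>
</OAI-PMH>
""".strip()


def list_records_url(base_url, resumption_token=None):
    """Build a ListRecords request URL for the doajArticle metadata prefix."""
    if resumption_token:
        # In OAI-PMH a resumptionToken is sent without metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": "doajArticle"}
    return base_url + "?" + urlencode(params)


def local_name(tag):
    """Strip the '{namespace}' part from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def parse_list_records(xml_text):
    """Return (records, resumption_token) from one ListRecords response.

    Each record is a dict mapping leaf element names to lists of text values.
    """
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter(OAI_NS + "record"):
        metadata = record.find(OAI_NS + "metadata")
        if metadata is None:  # e.g. a deleted record has no metadata
            continue
        fields = {}
        for el in metadata.iter():
            # Collect only leaf elements that carry text.
            if len(el) == 0 and el.text and el.text.strip():
                fields.setdefault(local_name(el.tag), []).append(el.text.strip())
        records.append(fields)
    token_el = root.find(".//" + OAI_NS + "resumptionToken")
    token = token_el.text.strip() if token_el is not None and token_el.text else None
    return records, token


def http_get(url):
    """Fetch a URL and return the response body as bytes."""
    with urlopen(url) as resp:
        return resp.read()


def harvest(base_url, fetch=http_get):
    """Yield records page by page until no resumptionToken is returned."""
    token = None
    while True:
        records, token = parse_list_records(fetch(list_records_url(base_url, token)))
        yield from records
        if not token:
            break
```

Calling parse_list_records(SAMPLE_RESPONSE) gives one record whose fullTextUrl entry points at the article itself, plus the token needed to request the next page. Matching on local element names keeps the sketch from depending on the exact namespace of the article schema, which I have not verified here.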
Thanks go out to the DOAJ folks for making this available!





5 Comments
Yep, kudos to DOAJ.
I saw this a week or two ago, and while I did not take advantage of their article-specific metadata schema, I did use the Dublin Core metadata schema to harvest about 54,000 of the articles and save them into a MyLibrary instance. I then used an indexer called KinoSearch to make them searchable. Finally, I created a rudimentary searchable/browsable interface to the whole thing. See:
Ah, the possibilities are almost endless!
–
Eric Lease Morgan
University Libraries of Notre Dame
Neat, Eric! Thanks for posting the demo link. Another great idea for the feed…
Is the relationship between two articles in a journal defined by the start and end pages of each article?
I suppose it would be; I haven’t stopped to think about it much. The order of elements matters in XML, so it could be an accurate representation of the way a journal issue is put together. The database in which the citation data is stored would need to preserve that order, of course.
This website is essential for me and makes a helpful guideline.
3 Trackbacks
Journal articles, metadata formats and woes…
In a post on his Digital Library Technology Jester weblog, Peter Murray of OhioLINK points to an XML format developed by the Directory of Open Access Journals (DOAJ) for representing descriptions of journal articles. First, I think I’d qualify Peter’…