Disruptive Library Technology Jester

We're Disrupted, We're Librarians, and We're Not Going to Take It Anymore

Thursday Threads: RDF, Digital Document Tampering, and Amazon’s Mechanical Turk

Posted on October 21, 2010 by Peter Murray
This entry was posted in Thursday Threads and tagged Amazon, Amazon Mechanical Turk, description, Federal Library Depository Program, government documents, Jenn Riley, MARC, metadata, ProPublica, RDF, semantic web by Peter Murray.

This is definitely becoming a habit…welcome to the fourth edition of DLTJ's Thursday Threads. If you find these interesting and useful, you might want to add the Thursday Threads RSS Feed to your feed reader or subscribe to e-mail delivery using the form to the left. If you would like a more raw and immediate version of these types of stories, watch my FriendFeed stream (or subscribe to its feed in your feed reader). Comments, as always, are welcome.

Defining Linked Data By Analogy

RDF is the grammar for a language of data. URIs are the words of that language. As in natural language, these words (i.e., the URIs) belong to grammatical categories. RDF properties (such as “isReferencedBy”) function a bit like verbs, RDF classes like nouns.

As in natural languages, where utterances are meaningful only if they follow a sentence grammar, RDF statements follow a simple and consistent three-part grammar of subject, predicate, and object. Analogously to paragraphs, RDF statements are aggregated into RDF graphs.

This is a posting from Thomas Baker on the W3C Library Linked Data exploratory group mailing list. It compares RDF to natural languages using analogies of grammar, words, sentences, and paragraphs. I think this is a useful way to think about RDF and linked data, although as an initial introduction to the topic, you might want to start with the presentation below.
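To make the three-part grammar in the quotation above concrete, here is a minimal sketch in plain Python. It is only an illustration of the analogy, not a real RDF toolkit: the `Triple` type, the `statements_about` helper, and the `http://example.org/` URIs are all made up for this example. The two property URIs come from the Dublin Core terms vocabulary.

```python
from collections import namedtuple

# One RDF statement: the "sentence" with its subject, predicate, and object.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

EX = "http://example.org/"          # hypothetical namespace for illustration
DCT = "http://purl.org/dc/terms/"   # Dublin Core terms namespace

# A graph is an aggregation of statements: the "paragraph" of the analogy.
graph = {
    Triple(EX + "article/1", DCT + "isReferencedBy", EX + "book/42"),
    Triple(EX + "article/1", DCT + "title", "An Example Article"),
    Triple(EX + "book/42", DCT + "title", "An Example Book"),
}

def statements_about(graph, subject):
    """Return every statement in the graph whose subject is the given URI."""
    return sorted(t for t in graph if t.subject == subject)

for statement in statements_about(graph, EX + "article/1"):
    print(statement.predicate, "->", statement.object)
```

The URIs play the role of words, the properties behave like verbs, and the set of triples is the graph. A real application would more likely use an RDF library than hand-built tuples.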

RDF For Librarians presentation recording

The RDF model underlying Semantic Web technologies is frequently described as the future of structured metadata. Its adoption in libraries has been slow, however. This is due in no small part to fundamental differences in the modeling approach that RDF takes, representing a “bottom up” architecture where a description is distributed and can be made up of any features deemed necessary, whereas the record-centric approach taken by libraries tends to be more “top down” relying on prespecified feature sets that all should strive to make the best use of. This presentation will delve deeply into the differences between these two approaches to explore why the RDF approach has proven difficult for libraries, look at some RDF-based initiatives that are happening in libraries and how they are allowing different uses of this metadata than was previously possible, and pose some questions about how libraries might best.

Jenn Riley gave this hour-long presentation to the Indiana University Digital Library Brown Bag earlier this month. The URL to the slides synchronized to the audio recording is http://breeze.iu.edu/p48776227/. The presentation slides and the handout from the session are available as well. I highly recommend spending an hour with this presentation to learn about how linked data compares and contrasts with MARC records. (via Diane Hillmann)
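The contrast the abstract draws between "top down" records and "bottom up" descriptions can be sketched roughly as follows. This is my own simplification under stated assumptions, not something taken from the presentation: the three-field schema, the `make_record` helper, the property names, and the example URIs are all hypothetical.

```python
# Top-down: a record with a prespecified set of fields (hypothetical schema).
RECORD_FIELDS = ("title", "creator", "date")

def make_record(**fields):
    """Build a record, rejecting any field the schema did not anticipate."""
    unknown = set(fields) - set(RECORD_FIELDS)
    if unknown:
        raise ValueError(f"not in the record schema: {sorted(unknown)}")
    return {name: fields.get(name) for name in RECORD_FIELDS}

# Bottom-up: independent sources each make statements about the same URI.
ITEM = "http://example.org/item/1"  # hypothetical identifier

library_statements = {
    (ITEM, "title", "An Example Title"),
    (ITEM, "creator", "An Example Author"),
}
reviewer_statements = {
    (ITEM, "reviewedIn", "http://example.org/journal/7"),
}

# The description is whatever statements exist; merging is a set union.
description = library_statements | reviewer_statements

record = make_record(title="An Example Title", creator="An Example Author")
```

In the record version, a feature nobody planned for (here, `reviewedIn`) is rejected and an unused field sits empty. In the statement version, a second source simply adds another statement about the same identifier.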

The Future of the Federal Depository Libraries

[ProPublica's Dafna] Linzer’s expose of government tampering with a court docket is an example of the problem on which the LOCKSS Program has been working for more than a decade, how to make the digital record resistant to tampering and other threats. The only reason this case was detected was because Linzer created and kept a copy of the information the government published, and this copy was not under their control. Maintaining copies under multiple independent administrations (i.e. not all under control of the original publisher) is a fundamental requirement for any scheme that can recover from tampering (and in practice from many other threats).

David Rosenthal summarizes a story about how a published document from the U.S. government was changed and why we need highly-distributed copies of government documents to detect and recover from tampering. There are big implications here for the future of government documents depository programs.
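The core idea, that a change is only detectable when someone outside the publisher's control holds a copy, can be illustrated with a small checksum comparison. To be clear, this is not the LOCKSS protocol, which is considerably more sophisticated. It is a minimal sketch assuming each holder can hand over the bytes of its copy, and the holder names and document contents are invented.

```python
import hashlib
from collections import Counter

def digest(content):
    """Return the SHA-256 hex digest of a document's bytes."""
    return hashlib.sha256(content).hexdigest()

def find_disagreeing_copies(copies):
    """Given {holder: content bytes}, list holders whose copy differs
    from the most common version among all holders."""
    digests = {holder: digest(content) for holder, content in copies.items()}
    majority_digest, _ = Counter(digests.values()).most_common(1)[0]
    return sorted(h for h, d in digests.items() if d != majority_digest)

# Hypothetical holders: the publisher plus two independent libraries.
copies = {
    "publisher": b"docket entry, revised text",
    "library_a": b"docket entry, original text",
    "library_b": b"docket entry, original text",
}

print(find_disagreeing_copies(copies))
```

With only the publisher's copy there is nothing to compare against, which is the point of the quotation. Note that a simple majority is a weak rule: it says nothing useful when the copies split evenly, and it fails outright if most copies sit under one administration.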

ProPublica’s Guide to Mechanical Turk

Amazon Mechanical Turk – or mTurk – is an online marketplace, set up by the online shopping site Amazon, where anyone can hire workers to complete short, simple tasks over the Internet. Amazon originally developed it as an in-house tool, and commercialized it in 2005. The mTurk workforce now numbers more than 100,000 workers in 200 countries, according to Amazon. At ProPublica, we use it for tasks like collecting, reformatting, and de-duplicating data. This is a guide to journalists looking to use Mechanical Turk in their data projects. It’s meant for users who are already familiar with mTurk and are looking for ways to improve their results.

Do you have repetitive digital conversion or analysis jobs that can be broken down into manageable-sized chunks? ProPublica published this guide on using Amazon’s Mechanical Turk service to outsource this activity.

(This post was updated on 21-Oct-2010.)

Copyright

This work by Peter Murray is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 United States.

From the Disruptive Library Technology Jester (http://dltj.org/), printed on Saturday the 25th of May 2013 at 10:44:12 AM UTC (+0000). The URL to this page is

[Creative Commons Logo] This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 United States License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/us/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.