Two threads this week: the first is an announcement from the major search engines of a way they have agreed on to discover machine-processable information in web pages. The search engines want this so they can do a better job of understanding the information on web pages, but it stomps on the linked data work that has been a hot topic in libraries recently. The second is a red-letter day in the history of the internet as major services tried out a new way for machines to connect. The test was successful, and its success means a big hurdle has been crossed as the internet grows up.
Welcome to the Disruptive Library Technology Jester. From here you can browse the musings and visions of a library technologist as he walks the fine line between the best of the library profession on one side and the best of technology on the other.
You can navigate through DLTJ in several ways. Your first stop might be the introductory material about this blog and the jester himself under the "about" heading to the left. Another way would be to pick a facet below to browse: "by category" for a rough categorization of postings, "by tags" for a finer granularity of topics, or "by date" for a chronological view. Third, use the search box in the left column for a keyword approach to content in DLTJ. And last, recent postings by the Jester can be found below the faceted list.
I hope you enjoy your visit. Please feel free to leave comments where you'd like or contact me directly.
Recent Posts
Open Repositories 2011 Report: Day 1 with Apache, Technology Trends, and Bolded Labels
Today was the first main conference day of the Open Repositories conference in Austin, Texas. There are 300 developers here from 20 countries and 30 states. I have lots of notes from the sessions, and I’ve tried to make sense of some of them below before I lose track of the entire context.
The meeting opened with a keynote by Jim Jagielski, president of the Apache Software Foundation. He gave a presentation on what it means to be an open source project, with a focus on how Apache creates a community of developers and users around its projects.
Open Repositories 2011 Report: DSpace on Spring and DuraSpace
This week I am attending the Open Repositories conference in Austin, Texas, and yesterday was the second preconference day (and the first day I was in Austin). Coming in as I did I only had time to attend two preconference sessions: one on the integration — or maybe “invasion” of the Spring Framework — into DSpace and one on the introduction of the DuraCloud service and code.
Does the Google/Bing/Yahoo Schema.org Markup Promote Invalid HTML?
[Update on 10-Jun-2011: The answer to the question of the title is "not really" -- see the update at the bottom of this post and the comments for more information.]
Yesterday Google, Microsoft Bing, and Yahoo! announced a project to promote machine-readable markup for structured data on web pages.
Many sites are generated from structured data, which is often stored in databases. When this data is formatted into HTML, it becomes very difficult to recover the original structured data. Many applications, especially search engines, can benefit greatly from direct access to this structured data. On-page markup enables search engines to understand the information on web pages and provide richer search results in order to make it easier for users to find relevant information on the web. Markup can also enable new tools and applications that make use of the structure.
The problem is, I think, that the markup they describe on their site generates invalid HTML. Did they really do this?

