Thursday Threads: Beyond MARC, Library-controlled DRM, Spam Study

This week's threads come without commentary. (It has been a long week: of four flights, only one actually happened without a delay, cancellation, or redirection.) The big items are an announcement from the Library of Congress about re-envisioning the way bibliographic information travels, the Douglas County (Colorado) Library’s experiment with taking ownership of ebooks and applying its own digital rights management, and a study on the ecosystem of spam.

Thursday Threads: Amazon Pressures Publishers, Academic Spam, Mechanical Turk Spam, Multispectral Imaging

With the close of the year approaching, this issue marks the 14th week of DLTJ Thursday Threads. This issue has a publisher’s view of Amazon’s strong-arm tactics in book pricing, research into the possibility that academic authors could game Google Scholar with spam, demonstrations of how Amazon’s Mechanical Turk drives down the cost of enlisting humans to overwhelm anti-spam systems, and a story of multispectral imaging adding information in the process of digital preservation.

As the new year approaches, I wish you the best professionally and personally.

Books After Amazon

Attempting to Run Comments without reCAPTCHA

I’m trying an experiment over the next couple of days or weeks. I’m turning off the reCAPTCHA requirement for blog commenters (the figure-out-these-words-and-type-them-in anti-spam scheme I turned on three and a half years ago). The only automated scheme in place now is Akismet. This change was made Friday night, and over the weekend a few spam comments got through to “approved” status while 550 landed in the “spam” queue. With reCAPTCHA in place, I would typically see only 10 or so comments a day make it through reCAPTCHA, and Akismet would catch those (none reached the approved comments). I could easily go through 10 or so comments a day looking for legitimate ones that were accidentally trapped (maybe one a month), but I’m not going through 200 or more a day. So, if you comment on DLTJ and don’t see it posted immediately, please do let me know and I’ll fetch it out of the spam queue.

On Being Fodder for Questionable Twitter Posts

Okay, I know this is starting to seem like an obsession, but I can’t figure out why someone(s) would be constructing tweets that consist of my blog post headlines and links back to my postings. I’m wondering how widespread this problem is, so I constructed a list of URLs to blog posts based on the Planet Code4Lib Atom feed and pointed them to the Ubervu service. Ubervu has a view into the Twitter firehose and constructs reports of Twitter mentions of URLs. For instance, I can see all of the odd headline tweets for my previous postings through this service. I can then easily scan through the list for other people who seem to be affected by this strange phenomenon.
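
For anyone who wants to try the first step, here is a minimal sketch of pulling post URLs out of an Atom feed. It is not the script I used; the function name `entry_links` and the sample feed (with its example.org URLs) are made up for illustration, and it assumes the feed has already been downloaded into a string.

```python
import xml.etree.ElementTree as ET

# Atom elements live in this XML namespace (defined by RFC 4287).
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}


def entry_links(atom_xml: str) -> list[str]:
    """Return one post URL per <entry> in an Atom feed document.

    Hypothetical helper: takes the first link whose rel is "alternate"
    (the default when rel is omitted) and skips entries without one.
    """
    root = ET.fromstring(atom_xml)
    links = []
    for entry in root.findall("atom:entry", ATOM_NS):
        for link in entry.findall("atom:link", ATOM_NS):
            rel = link.get("rel", "alternate")
            href = link.get("href")
            if rel == "alternate" and href:
                links.append(href)
                break  # one URL per entry is enough
    return links


# Made-up sample feed standing in for the aggregated planet feed.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Sample planet feed</title>
  <entry>
    <title>Post one</title>
    <link rel="alternate" href="https://example.org/post-one"/>
    <link rel="replies" href="https://example.org/post-one/comments"/>
  </entry>
  <entry>
    <title>Post two</title>
    <link href="https://example.org/post-two"/>
  </entry>
</feed>"""

if __name__ == "__main__":
    for url in entry_links(SAMPLE_FEED):
        print(url)
```

The sketch stops at building the URL list. Handing those URLs to a mention-tracking service is left out because how that is done depends on the service.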

Why I Need Twitter Distillation Tools

The following may not be news to those who regularly hang out in Twitter-land, but the extent of the problem recently became clear to me: there is a bunch of spam in Twitter. More specifically, there appear to be robots that do nothing but scan the web for keywords and create tweets with links back to the pages they find. There appear to be some people who value this service (judging by the number of followers these Twitter users have), but for me it just adds to the general clutter I find in Twitter.