
On Being Fodder for Questionable Twitter Posts

Okay, I know this is starting to seem like an obsession, but I can’t figure out why someone(s) would be constructing tweets that consist of my blog post headlines and links back to my postings. I’m wondering how widespread this problem is, so I constructed a list of URLs to blog posts based on the Planet Code4Lib Atom feed and pointed them to the Ubervu service. Ubervu has a view into the Twitter firehose and constructs reports of Twitter mentions of URLs. For instance, I can see all of the odd headline tweets for my previous postings through this service. I can then easily scan through the list for other people who seem to be affected by this strange phenomenon.
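For anyone who wants to try something similar, here is a minimal Python sketch of pulling post URLs out of an Atom feed. It is not the script I used: the sample feed, the example.org URLs, and the function name `entry_urls` are all made up for illustration, and a real run would read the planet feed instead of the inline string.

```python
import xml.etree.ElementTree as ET

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical stand-in for a planet-style Atom feed (not real data).
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Sample planet feed</title>
  <entry>
    <title>First post</title>
    <link rel="alternate" type="text/html" href="http://example.org/first-post/"/>
    <link rel="replies" type="text/html" href="http://example.org/first-post/#comments"/>
  </entry>
  <entry>
    <title>Second post</title>
    <link href="http://example.org/second-post/"/>
  </entry>
</feed>"""


def entry_urls(atom_xml):
    """Return the first 'alternate' link of each entry in an Atom document."""
    root = ET.fromstring(atom_xml)
    urls = []
    for entry in root.findall("atom:entry", ATOM_NS):
        for link in entry.findall("atom:link", ATOM_NS):
            # In Atom, a link with no rel attribute counts as rel="alternate".
            if link.get("rel", "alternate") == "alternate" and link.get("href"):
                urls.append(link.get("href"))
                break  # one URL per entry is enough
    return urls


if __name__ == "__main__":
    for url in entry_urls(SAMPLE_FEED):
        print(url)
```

Each URL in the resulting list can then be looked up one at a time in whatever mention-tracking service is at hand.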

Note! Eric Schnell has a great summary of these posts and related comments called “Is a Twitterfarm Pranking the Jester?” in his blog The Medium is the Message. Thank you, Eric.

Here are the results. In all cases except for one, the ‘twitterfeed’ service was used as the bridge that turns a feed of blog postings into individual tweets.

Interestingly, in one case (inkdroid.org/journal/2009/12/22/hacking-oreilly-rdfa/) ‘twitterfeed’ seems to be legitimately used by Eqentia for a Twitter account called ‘semanticnews’. The bio on the Twitter account says: “Tracking what’s new in the Semantic Web space. 2,500+ articles indexed via Eqentia’s semantic platform. Sign-up and experience Semantic-powered News”. Ubervu also shows that the ‘semanticnews’ tweet was the start of a Twitter thread of three other tweets on the same topic.

Analysis


Although others in the code4lib community seem to be affected by this, in this limited set none has seen anywhere near as much reposting as my blog entries. I still can’t fathom a purpose behind this other than trying to mask other activities with what seems like legitimate activity. It doesn’t feel right, so I’d like to take steps to counteract it.

I went poking in my server’s access logs searching for occurrences of ‘twitterfeed’ and came back with a surprise: where I expected to see ‘twitterfeed’ in the User-Agent string, I actually found it more often as a Google Analytics parameter on URL requests in these two forms:

  • ?utm_source=twitterfeed&utm_medium=twitter
  • ?utm_source=GAlert&utm_medium=twitterfeed&utm_campaign=CDT_RSS&utm_term=TechNews

At this point, I’m not sure what is introducing those parameters. I can’t find documentation for it in the Google Analytics help system, but I suspect it might be coming as a part of Feedburner. I’m pretty much a newbie when it comes to Google Analytics, so if anyone has any insights, I’d appreciate it.
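For picking those parameters out of logged request URLs, a small Python sketch along these lines may help. The `utm_*` names are Google Analytics campaign-tagging parameters, which are normally appended to a link by whatever tool publishes it rather than by Google Analytics itself. The function names and the `/article/example/` path below are hypothetical, chosen only for illustration.

```python
from urllib.parse import parse_qs, urlsplit


def campaign_tags(url):
    """Return the utm_* query parameters of a URL as a plain dict."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs maps each key to a list of values; keep the first one.
    return {key: values[0] for key, values in query.items() if key.startswith("utm_")}


def mentions_twitterfeed(url):
    """True if any utm_* value is 'twitterfeed' (case-insensitive)."""
    return any(value.lower() == "twitterfeed" for value in campaign_tags(url).values())


if __name__ == "__main__":
    # The two query-string forms seen in the logs, on a made-up path.
    examples = [
        "http://dltj.org/article/example/?utm_source=twitterfeed&utm_medium=twitter",
        "http://dltj.org/article/example/?utm_source=GAlert&utm_medium=twitterfeed"
        "&utm_campaign=CDT_RSS&utm_term=TechNews",
    ]
    for url in examples:
        print(mentions_twitterfeed(url), campaign_tags(url))
```

Note that ‘twitterfeed’ shows up as the source in the first form and as the medium in the second, which is why the check looks at every `utm_*` value rather than just `utm_source`.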

There are two cases where ‘twitterfeed’ is being used as part of a user agent string (more specifically, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.3) Gecko/20070309 Firefox/2.0.0.3 twitterfeed"). I’m going to set up a honeypot for twitterfeed using mod_rewrite conditions on my server:

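## Redirect twitterfeed's feed requests to the honeypot feed
## (skip the honeypot file itself so the redirect cannot loop)
RewriteCond %{HTTP_USER_AGENT} "twitterfeed"
RewriteCond %{REQUEST_URI} !^/atom-feed-for-twitterfeed\.xml$
RewriteRule feed.* /atom-feed-for-twitterfeed.xml [R=302,L]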
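To reason about which requests a rule like this would catch, the matching logic can be mimicked in a few lines of Python. This is only a rough model under stated assumptions, not how Apache evaluates rules internally: the function name `honeypot_redirect` and the `/feed/` path are hypothetical. The sketch also skips the honeypot file itself, an assumption added here because an unanchored `feed.*` pattern would otherwise match the honeypot's own filename and redirect forever.

```python
import re

HONEYPOT_PATH = "/atom-feed-for-twitterfeed.xml"


def honeypot_redirect(path, user_agent):
    """Return the redirect target for a request, or None to serve it normally."""
    # Condition 1: the User-Agent header must contain 'twitterfeed'.
    if "twitterfeed" not in user_agent:
        return None
    # Condition 2 (assumption): never redirect the honeypot file to itself.
    if path == HONEYPOT_PATH:
        return None
    # Rule pattern: an unanchored 'feed.*' matches anywhere in the path.
    if re.search(r"feed.*", path):
        return HONEYPOT_PATH
    return None


if __name__ == "__main__":
    ua = ("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.3) "
          "Gecko/20070309 Firefox/2.0.0.3 twitterfeed")
    for path in ("/feed/", HONEYPOT_PATH, "/article/questionable-twitter-posts/"):
        print(path, "->", honeypot_redirect(path, ua))
```

Ordinary browsers and feed readers fall through the first check, so they keep getting the real feed.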

The “atom-feed-for-twitterfeed.xml” file consists of:

<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<feed xmlns="http://www.w3.org/2005/Atom">
 
	<title>DLTJ Twitter Honeypot</title>
	<link rel="alternate" type="text/html" href="http://dltj.org"/>
	<id>http://dltj.org</id>
	<updated>2009-12-30T17:47:10+00:00</updated>
	<generator uri="http://dltj.org/about/">An annoyed jester</generator>
 
	<entry>
		<title>Twitter and Twitterfeed honeypot</title>
		<link rel="alternate" type="text/html"
			href="http://dltj.org/article/questionable-twitter-posts/"/>
		<id>http://dltj.org/article/questionable-twitter-posts/</id>
		<updated>2009-12-30T17:28:07+00:00</updated>
		<content type="html">&lt;p&gt;This is a honeypot to try to catch Twitterfeed when it injects postings into Twitter.  For more information on why I'm trying this, see &lt;a href="http://dltj.org/article/questionable-twitter-posts/"&gt;this blog post on &lt;acronym title="Disruptive Library Technology Jester"&gt;&lt;i&gt;DLTJ&lt;/i&gt;&lt;/acronym&gt;&lt;/a&gt;.&lt;/p&gt;</content>
		<author>
			<name>Murray, Peter</name>
			<uri>http://dltj.org/about</uri>
		</author>
	</entry>
</feed>

Yeah — I know I’m breaking the rules by giving different content for the same URI. But remember, this is just a honeypot.

With this, I’m going to see if my honeypot entry shows up in one of these Twitterfeed-injected posts. Am I showing signs of being obsessed with this? Yep, no doubt. But I really want to know how and where my content is being used. This itch definitely needs to be scratched.

(This post was updated on 06-Jan-2010.)



From the Disruptive Library Technology Jester (http://dltj.org/), printed on Thursday the 2nd of September 2010 at 4:40:51 PM UTC (+0000). The URL to this page is http://dltj.org/article/questionable-twitter-posts/

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 United States License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/us/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.