Changes to “Emanuel African Methodist Episcopal Church” Wikipedia Page, Visualized


This article was imported from this blog's previous content management system (WordPress), and may have errors in formatting and functionality. If you find these errors are a significant barrier to understanding the article, please let me know.

Edited after initial publication to add: My thoughts are with the people in and around Charleston, South Carolina, this evening. What is making it out of the media fog to me tonight is your compassion for each other. Please be well as you absorb, internalize, and recover from this shocking display of inhumanity. 

This afternoon, Ed Summers tweeted:

I wondered what 152 edits -- from stub page to fleshed out -- looked like, so I created this animation of the Emanuel African Methodist Episcopal Church Wikipedia page as it was created:

Animation of edits to the "Emanuel African Methodist Episcopal Church" Wikipedia page (see full-sized version [86MB GIF])

The page is very long, so scaling it down to fit on a normal browser screen loses the detail of what is happening. Still, it is interesting to watch the reference section grow, the infoboxes be added, and the text grow in chunks and occasionally shrink as edits are made. It is also interesting, of course, to watch the length of the page grow as more research is done and more edits are made. I echo Ed's kudos to the Wikipedia editors who worked quickly to create a great source of information about the history of this church at a time when many people would be looking for it.

Process

There doesn't seem to be a way to programmatically get a list of URLs (or even OIDs) from Wikipedia for one of its pages, so I captured the HTML source of the church's history page and ran a rather hairy regular expression on the unordered list of history entries, replacing:

^.*<a href="/article/emanuel-african-methodist-episcopal-church-wikipedia-page-visualized/" title="Emanuel African Methodist Episcopal Church" class="mw-changeslist-date">(\d\d):(\d\d), (\d\d) June 2015</a>&lrm; <span class="history-user"><a href="/article/emanuel-african-methodist-episcopal-church-wikipedia-page-visualized/">]*?>(.*?)</a>.*(<span class="comment">(.*)</span>)?.*</span>

with

https://en.wikipedia.org\t201506T\thttps://en.wikipedia.org\t\t\r

Then I used the webkit2png program to grab the full page captures of each version of the wiki page:

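# Read the tab-separated fields of each history entry and capture that
# revision; webkit2png -F writes the full-size image to raw/<timestamp>-full.png.
while IFS=$'\t' read -r url timestamp editorurl editor; do
  webkit2png -W 1400 -H 3800 -F -o "raw/$timestamp" "$url"
  sleep 5  # pause between requests
done < page-history.txt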
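
Each capture takes several seconds, so an interrupted run is costly to repeat from the start. As a minimal sketch, assuming webkit2png's -F option writes files named NAME-full.png (as the raw/$timestamp-full.png path in the next step suggests), a hypothetical helper called plan_captures could list only the captures that are still missing. It prints the commands rather than running them, and it is written for plain sh, so it builds the tab character with printf.

```shell
# Sketch only: plan_captures is a hypothetical helper, not part of webkit2png.
# Usage: plan_captures HISTORY_FILE OUTPUT_DIR
# Assumes tab-separated lines that begin with: url, timestamp, editorurl, editor.

TAB=$(printf '\t')

plan_captures() {
  while IFS="$TAB" read -r url timestamp editorurl editor; do
    # Skip revisions whose full-size capture already exists.
    if [ -e "$2/$timestamp-full.png" ]; then
      continue
    fi
    # Print the command that would be run for this revision.
    printf 'webkit2png -W 1400 -H 3800 -F -o %s/%s %s\n' "$2" "$timestamp" "$url"
  done < "$1"
}
```

Calling plan_captures page-history.txt raw would then show what remains to be captured. To capture for real, the printf line could be replaced with the webkit2png call and the sleep from the loop above.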
 

With the full page captures in place, I resized each one and annotated the top of it with the timestamp and the wiki editor's name using ImageMagick's convert:

# Shrink each capture to 25%, splice a 20-pixel semi-transparent black band
# onto the top, and write the timestamp and editor name into that band.
while IFS=$'\t' read -r url timestamp editorurl editor; do
  convert "raw/$timestamp-full.png" -resize 25% -background '#0008' -splice 0x20 \
    -pointsize 15 -fill white -annotate +10+16 "$timestamp  $editor" \
    "labeled/$timestamp-labeled.png"
done < page-history.txt
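
Before assembling the animation, it may be worth checking that every capture in raw/ produced a labeled frame, since a failed convert would quietly drop that revision from the result. The following is a minimal sketch; missing_labeled is a hypothetical helper name, and the file naming follows the two loops above.

```shell
# Sketch only: missing_labeled is a hypothetical helper.
# Usage: missing_labeled RAW_DIR LABELED_DIR
# Prints the timestamp of every RAW_DIR/<timestamp>-full.png that has no
# matching LABELED_DIR/<timestamp>-labeled.png.

missing_labeled() {
  for f in "$1"/*-full.png; do
    # If the glob matched nothing, the pattern itself is returned; skip it.
    [ -e "$f" ] || continue
    name=$(basename "$f" -full.png)
    if [ ! -e "$2/$name-labeled.png" ]; then
      printf '%s\n' "$name"
    fi
  done
}
```

Running missing_labeled raw labeled prints nothing when the two directories are in step.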
 

Finally, I also used ImageMagick to create the animated GIF:

convert -delay 50 -loop 0 labeled/*.png animation.gif
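
ImageMagick's -delay value is measured in ticks of 1/100 second by default, so -delay 50 shows each frame for half a second. If all 152 edits mentioned above became frames (an assumption; the post does not give the frame count), the animation would run for about 76 seconds. The arithmetic can be sketched with a hypothetical helper, animation_seconds, that counts the PNG files in a directory:

```shell
# Sketch only: animation_seconds is a hypothetical helper.
# Usage: animation_seconds FRAME_DIR DELAY_TICKS
# Prints the running time in seconds, assuming one frame per PNG in
# FRAME_DIR and a delay given in 1/100-second ticks.

animation_seconds() {
  count=0
  for f in "$1"/*.png; do
    # An unmatched glob is returned as-is; only count files that exist.
    if [ -e "$f" ]; then
      count=$((count + 1))
    fi
  done
  total=$((count * $2))
  # Format hundredths of a second as seconds with two decimals.
  printf '%d.%02d\n' $((total / 100)) $((total % 100))
}
```

For the command above, the call would be animation_seconds labeled 50.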