Changes to "Emanuel African Methodist Episcopal Church" Wikipedia Page, Visualized
Edited after initial publication to add: My thoughts are with the people in and around Charleston, South Carolina, this evening. What is making it out of the media fog to me tonight is your compassion for each other. Please be well as you absorb, internalize, and recover from this shocking display of inhumanity.
This afternoon, Ed Summers tweeted:
36 hours, 152 edits, 44 editors and now Emanuel AME Church has a solid Wikipedia page https://t.co/pDLaAzbJLE thanks @xor creating it
— Ed Summers (@edsu) June 19, 2015
I wondered what 152 edits -- from stub page to fleshed out -- looked like, so I created this animation of the Emanuel African Methodist Episcopal Church Wikipedia page as it was created:
Animation of edits to the "Emanuel African Methodist Episcopal Church" Wikipedia page (see full-sized version [86MB GIF])
The page is very long, so scaling it down to fit on a normal browser screen loses the detail of what is happening. Still, it is interesting to watch the reference section grow, the infoboxes get added, and the text grow in chunks and occasionally shrink as edits are made. And, of course, the page as a whole grows longer as more research is done and more edits are made. I echo Ed's kudos to the Wikipedia editors who worked quickly to create a great source of information about the history of this church at a time when many people would be looking for it.
Process
There doesn't seem to be a way to programmatically get a list of URLs (or even OIDs) from Wikipedia for one of its pages, so I captured the HTML source of the church's history page and ran a rather hairy regular expression on the unordered list of history entries, replacing:
^.*<a href="/article/emanuel-african-methodist-episcopal-church-wikipedia-page-visualized/" title="Emanuel African Methodist Episcopal Church" class="mw-changeslist-date">(\d\d):(\d\d), (\d\d) June 2015</a>‎ <span class="history-user"><a href="/article/emanuel-african-methodist-episcopal-church-wikipedia-page-visualized/">]*?>(.*?)</a>.*(<span class="comment">(.*)</span>)?.*</span>
with
https://en.wikipedia.org\t201506T\thttps://en.wikipedia.org\t\t\r
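The loops that follow read each line of the resulting file as four tab-separated fields: url, timestamp, editorurl, and editor. As a minimal sketch of how one such line splits apart, assuming that layout, the snippet below uses made-up placeholder values and a hypothetical `parse_history_line` helper; neither comes from the actual history page.

```bash
#!/usr/bin/env bash
# Hypothetical helper: split one tab-separated history line into its four
# fields and print them one per line, labeled.
parse_history_line() {
  local url timestamp editorurl editor
  # IFS is set to a tab for this read only, so spaces inside the editor
  # name are preserved.
  IFS=$'\t' read -r url timestamp editorurl editor <<< "$1"
  printf 'url=%s\ntimestamp=%s\neditorurl=%s\neditor=%s\n' \
    "$url" "$timestamp" "$editorurl" "$editor"
}

# Made-up sample line (placeholder values only, not real revision data).
sample=$'https://example.org/rev/1\t20150618T0102\thttps://example.org/user/a\tExample Editor'
parse_history_line "$sample"
```

Because the editor name is the last field, it can contain spaces without breaking the split.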
Then I used the webkit2png program to grab the full page captures of each version of the wiki page:
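```bash
# Capture a full-page screenshot of every revision listed in page-history.txt,
# pausing 5 seconds between requests. Each line holds four tab-separated
# fields: url, timestamp, editorurl, editor.
cat page-history.txt | \
while read -r line; do \
  IFS=$'\t' read -r url timestamp editorurl editor <<< "$line"; \
  webkit2png -W 1400 -H 3800 -F -o "raw/$timestamp" "$url"; \
  sleep 5; \
done
```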
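With this many revisions, a single capture can fail quietly, so it may be worth confirming that every line in page-history.txt produced a file before moving on. The sketch below assumes the `raw/<timestamp>-full.png` naming that the next step reads from; `missing_captures` is a hypothetical helper, not part of webkit2png.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the timestamp of every history entry that has
# no matching full-page capture in the given directory.
missing_captures() {
  local history_file=$1 capture_dir=$2
  local url timestamp editorurl editor
  # The "|| [ -n "$url" ]" clause also handles a final line that lacks a
  # trailing newline.
  while IFS=$'\t' read -r url timestamp editorurl editor || [ -n "$url" ]; do
    [ -n "$timestamp" ] || continue   # skip blank or malformed lines
    [ -f "${capture_dir}/${timestamp}-full.png" ] || printf '%s\n' "$timestamp"
  done < "$history_file"
}

# Usage: list any revisions that still need to be captured.
if [ -f page-history.txt ]; then
  missing_captures page-history.txt raw
fi
```

An empty result means every revision has a capture; any timestamps printed can be re-run individually.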
With the full page captures in place, I resized and annotated the top of each with the timestamp and the wiki editor's name using [ImageMagick `convert`](http://www.imagemagick.org/script/convert.php):
```bash
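# Shrink each capture to 25%, add a 20-pixel translucent banner at the top,
# and write the timestamp and editor name into that banner.
cat page-history.txt | \
while read -r line; do \
  IFS=$'\t' read -r url timestamp editorurl editor <<< "$line"; \
  convert "raw/$timestamp-full.png" -resize 25% -background '#0008' -splice 0x20 \
    -pointsize 15 -fill white -annotate +10+16 "$timestamp $editor" \
    "labeled/$timestamp-labeled.png"; \
done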
```
Finally, I also used ImageMagick to create the animated GIF:
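```bash
# -delay is in hundredths of a second (50 = half a second per frame);
# -loop 0 repeats the animation indefinitely.
convert -delay 50 -loop 0 labeled/*.png animation.gif
```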