What Does it Mean to Have Unlimited Storage in the Cloud?

We’ve seen big announcements recently about unlimited cloud storage offerings for a flat monthly or annual fee. Dropbox offers it for subscribers to its Business plan. Similarly, Google has unlimited storage for Google Apps for Business customers. In both cases, though, you have to be part of a business group of some sort. Then Microsoft announced unlimited storage for all Office 365 subscribers (Home, School, and soon Business) as a bundled offering of OneDrive with the Office suite of products. Now comes word today from Amazon of unlimited storage for consumers, with no need to be part of a business grouping or to have bundled software come with it.

Today a colleague asked why all of this cloud storage couldn’t be used as file storage for the Islandora hosting service that is offered by LYRASIS. On the surface, it would seem to be a perfect backup strategy — particularly if you subscribed to multiple of these services and ran audits between them to make sure that they were truly in sync. Alas, the terms of service prevent you from doing something like that. Here is an excerpt from Amazon:

1.2 Using Your Files with the Service. You may use the Service only to store, retrieve, manage, and access Your Files for personal, non-commercial purposes using the features and functionality we make available. You may not use the Service to store, transfer or distribute content of or on behalf of third parties, to operate your own file storage application or service, to operate a photography business or other commercial service, or to resell any part of the Service. You are solely responsible for Your Files and for complying with all applicable copyright and other laws, including import and export control laws and regulations, and with the terms of any licenses or agreements to which you are bound. You must ensure that Your Files are free from any malware, viruses, Trojan horses, spyware, worms, or other malicious or harmful code.

Amazon Cloud Drive Terms of Use, Last updated March 25, 2015

It did get me wondering, though. Decades ago the technology community created RAID storage: Redundant Array of Inexpensive Disks. The concept is that if you copy your data across many different disks, you can survive the failure of one of those disks and rebuild the information from the remaining drives. We also have virtual storage systems like iRODS and distributed file systems like Google File System and Apache Hadoop Distributed File System. I wonder what it would take to layer these concepts together to have a cloud-independent, cloud-redundant storage array for personal backups. Sort of like a poor-man’s RAID over Dropbox/Amazon/Microsoft/Google. Something that would take care of the file verifications, the rebuilding from redundant copies, and the caching of content between services. Even if we couldn’t use it for our library services, it would be a darn good way to ensure the survivability of our cloud-stored files against the failure of a storage provider’s business model.
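As a minimal sketch of what the verification and rebuilding piece might look like, here is a mirror-style (RAID 1) audit in Python. Everything in it is an assumption for illustration: the providers are plain in-memory dictionaries standing in for the real services, the function names are made up, and a real tool would need each provider's actual API plus caching and rate limiting.

```python
import hashlib
from collections import Counter


def checksum(data: bytes) -> str:
    """Return the SHA-256 hex digest used to compare copies."""
    return hashlib.sha256(data).hexdigest()


def audit_and_repair(providers: dict[str, dict[str, bytes]]) -> list[str]:
    """Compare every file across all providers and repair the outliers.

    `providers` maps a (hypothetical) provider name to its {path: content}
    store. A copy is trusted only when a strict majority of providers hold
    identical bytes. Returns a log of what was done.
    """
    log = []
    all_paths = sorted({path for store in providers.values() for path in store})
    for path in all_paths:
        # One digest per provider; a missing file is recorded as None.
        digests = {
            name: checksum(store[path]) if path in store else None
            for name, store in providers.items()
        }
        votes = Counter(d for d in digests.values() if d is not None)
        winner, count = votes.most_common(1)[0]
        if count * 2 <= len(providers):
            log.append(f"{path}: no majority, manual review needed")
            continue
        good = next(store[path] for name, store in providers.items()
                    if digests[name] == winner)
        for name, store in providers.items():
            if digests[name] != winner:
                store[path] = good  # rebuild from a redundant copy
                log.append(f"{path}: restored on {name}")
    return log


clouds = {
    "dropbox": {"thesis.pdf": b"final draft"},
    "amazon": {"thesis.pdf": b"final draft"},
    "onedrive": {"thesis.pdf": b"corrupted!"},
}
print(audit_and_repair(clouds))  # ['thesis.pdf: restored on onedrive']
```

Majority voting needs at least three providers to settle a disagreement; with only two copies the audit can detect a mismatch but cannot tell which side is right, so it flags the file instead of guessing.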

Thursday Threads: Windows XP end-of-life, Maturing open source models, Trashcans that track you


Three groups of stories in this long-in-coming DLTJ Thursday Threads. First, we look at the pent-up risks of running Windows XP systems given that support for that operating system is scheduled to end in April 2014. Second, a pair of articles that look at the ups and downs of open source software governance as it relates to the Apache Foundation. And lastly, look out for that garbage can — it may be watching your every move.

DLTJ Thursday Threads has been on a long hiatus since its last issue was published in spring 2012. With so much happening in the world of technology today — both in general and related to libraries — I’ve felt this growing need to revive it. I hope this is the first of a new long streak of weekly article summaries.

Feel free to send this to others you think might be interested in the topics. If you find these threads interesting and useful, you might want to add the Thursday Threads RSS Feed to your feed reader or subscribe to e-mail delivery using the form to the right. If you would like a more raw and immediate version of these types of stories, watch my Pinboard bookmarks (or subscribe to its feed in your feed reader). Items posted to Pinboard are also sent out as tweets; you can follow me on Twitter. Comments and tips, as always, are welcome.

Still Running Windows XP? Its Days Are Numbered

After April 8, 2014, Microsoft has said it will retire Windows XP and stop serving security updates. The only exceptions: Companies and other organizations, such as government agencies, that pay exorbitant fees for custom support, which provides critical security updates for an operating system that’s officially been declared dead.

Because Microsoft will stop patching XP, hackers will hold zero-days they uncover between now and April, then sell them to criminals or loose them themselves on unprotected PCs after the deadline.

“When someone discovers a very reliable, remotely executable XP vulnerability, and publishes it today, Microsoft will patch it in a few weeks,” said [SANS security trainer Jason] Fossen. “But if they sit on a vulnerability, the price for it could very well double.”

Remember what you were doing in 2001? That was the year that Microsoft’s Windows XP operating system was released. Long vaunted as among the most stable and supported Windows operating systems, it will see its final support retired on April 8, 2014. Do you have systems using Windows XP? It might not be on your desktop, but it could be in your library’s self-check machines or building management systems. It is certainly in airport arrival/departure display systems and is a big concern in the medical community. This article speculates that when Microsoft stops creating security patches for Windows XP, those systems will be ripe for takeover by botnets and for other nefarious purposes. If you have any Windows XP systems in any way connected to any network, now is the time to think about how you will transition away from them or protect them.

The Maturing of Open Source Models

But tensions within the ASF and grumbling throughout the open source community have called into question whether the Apache Way is well suited to sponsoring the development of open source projects in today’s software world. Changing attitudes toward open source licensing, conflicts with the GPL, concerns about technical innovation under the Way, fallout from the foundation’s handling of specific projects in recent years — the ASF may soon find itself passed over by the kinds of projects that have helped make it such a central fixture in open source, thanks in some measure to the way the new wave of bootstrapped, decentralized projects on GitHub don’t require a foundationlike atmosphere to keep them vibrant or relevant.

Much of the time the Apache system works. You have interested people who start a project, get some code working, then propose it to Apache. One of these meritorious members shepherds it into the organization and helps build a community of developers. The “committers” on the project do their own stunts — the bulk of the marketing and evangelizing.

In the 15 years we’ve lived with the term “open source”[1] we’ve seen the rise and fall of many open source projects and community platforms, and one of the stalwart constants has been the influence of the Apache Software Foundation (ASF). Starting with the HTTP server project (the most used web server software on the internet) and now encompassing over 100 top-level projects, the influence of the Apache Way on the internet as we know it today is undeniable. So it should be no surprise that the ASF has had its own series of growing pains. These articles explore some of those pains and offer insights on governance for projects big and small.

Passed This Way Before? The Trash Can and the Store Shelf Know

Renew … installed 100 recycling bins with digital screens around London before the 2012 Olympics. Advertisers can buy space on the internet-connected bins, and the city gets 5% of the airtime to display public information. More recently, though, Renew outfitted a dozen of the bins with gadgets that track smartphones.

Image from Renew’s marketing materials, via qz.com

The idea is to bring internet tracking cookies to the real world. The bins record a unique identification number, known as a MAC address, for any nearby phones and other devices that have Wi-Fi turned on. That allows Renew to identify if the person walking by is the same one from yesterday, even her specific route down the street and how fast she is walking.

[Apple’s] iBeacons is a Bluetooth-based micro-locations system (think very accurate GPS that can be used indoors). But instead of being used by people to determine their own locations, it’s used by retailers, museums and businesses of all kinds to find out exactly where people are, so they can automatically serve up highly relevant interactions to customers’ phones.

Apple has not publicly revealed technical details about iBeacons, but it did tell developers what the technology is for and generally how it works. According to Apple, iBeacons is used for the following:

This one is just spooky. Many of the electronic devices we carry are constantly searching for things to connect with via WiFi and Bluetooth. As they do so, they broadcast their unique device identifiers. This raises locational privacy concerns. With tracking devices becoming smaller and cheaper (described as three for $99 in the second article), it is conceivable that every door knob and street corner may have one in a few years’ time. (Unlock your hotel room door by walking up to it with your smartphone? Probably possible in the not too distant future.) Do we want to bring this technology into libraries?
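To make the concern concrete, here is a rough sketch of how little it takes to turn broadcast identifiers into a record of movement. It is purely illustrative: the `Sighting` record, the sensor positions, and the function name are assumptions of mine, not a description of how Renew's bins or Apple's iBeacons actually work.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Sighting:
    mac: str           # device identifier overheard by a sensor
    sensor_id: str     # which (hypothetical) sensor heard it
    position_m: float  # sensor position along the street, in meters
    time_s: float      # seconds since the start of the log


def walking_speeds(sightings: list[Sighting]) -> dict[str, float]:
    """Return average speed in m/s for each device heard at two distinct times."""
    by_device = defaultdict(list)
    for s in sightings:
        by_device[s.mac].append(s)

    speeds = {}
    for mac, seen in by_device.items():
        seen.sort(key=lambda s: s.time_s)
        first, last = seen[0], seen[-1]
        elapsed = last.time_s - first.time_s
        if elapsed > 0:
            # Distance between the first and last sensor, over the time taken.
            speeds[mac] = abs(last.position_m - first.position_m) / elapsed
    return speeds


log = [
    Sighting("aa:bb:cc:00:00:01", "bin-1", 0.0, 0.0),
    Sighting("aa:bb:cc:00:00:02", "bin-1", 0.0, 5.0),
    Sighting("aa:bb:cc:00:00:01", "bin-2", 70.0, 50.0),
]
print(walking_speeds(log))
```

Nothing here needs the device owner's cooperation or a name; a stable identifier and two timestamps are enough to reconstruct a route and a pace.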


  1. The History of the Open Source Initiative page on opensource.org says that “the ‘open source’ label was created at a strategy session held on February 3rd, 1998 in Palo Alto, California, shortly after the announcement of the release of the Netscape source code.”

Thursday Threads: Google’s Social Strategy, Big Data, Patriot Act outside U.S., Frightening Copyright Revisited


It might have been the week of the annual American Library Association meeting, with all the news, announcements, and programming that came from it, as well as the start of the dog days of summer, but interesting news at the intersection of technology and libraries did not take a pause. Google made a big splash this week with tantalizing tidbits about its new social media project; it is at a look-but-don’t-touch stage, but the look is enticing. Then there were two articles about really big data: what is produced in the high energy physics supercollider at CERN and what we produce as a society. And to go along with that data we produce as a society is another warning that much of it isn’t safe from the prying eyes of the USA PATRIOT Act. Finally, we revisit the Georgia State University copyright case with a comment on the potential chilling impacts on free speech.

Feel free to send this to others you think might be interested in the topics. If you find these threads interesting and useful, you might want to add the Thursday Threads RSS Feed to your feed reader or subscribe to e-mail delivery using the form to the right. If you would like a more raw and immediate version of these types of stories, watch my FriendFeed stream (or subscribe to its feed in your feed reader). Comments and tips, as always, are welcome.

Google Unveils its Social Media Project

Among the most basic of human needs is the need to connect with others. With a smile, a laugh, a whisper or a cheer, we connect with others every single day.

Today, the connections between people increasingly happen online. Yet the subtlety and substance of real-world interactions are lost in the rigidness of our online tools.

In this basic, human way, online sharing is awkward. Even broken. And we aim to fix it.

We’d like to bring the nuance and richness of real-life sharing to software. We want to make Google better by including you, your relationships, and your interests. And so begins the Google+ project.

The new Google+ service is temporarily out of capacity at the limited trial launch.

This week Google unveiled its latest plan for entering the social networking space. Called “Google+”, it is less a product and more of a series of services that will tie together existing Google products with new social binding tools. At the heart of the binding tools seems to be “Circles,” the ability to create different social networks for the various kinds of social interactions one has in real life. This sort of social segmentation is possible with Facebook “groups”, but the introductory video and the online help make the point that Circles is baked into the Google+ social networking structure. There are other tools in the announcement, too, like a video “hangout” space, “sparks” for surfacing threads of conversations, and ways for groups to “huddle” in a chat session.

Google+ is in very limited public roll-out at the moment. Some are speculating that this is a marketing strategy to build buzz around the project, as Google did with limited invites to Gmail and Google Voice. I wonder, based on the “We’ve temporarily exceeded our capacity. Please try again soon” message on the signup page, whether they are having difficulties scaling up the service. In any case, they are taking measured and deliberate steps in rolling this out. If you want to learn more, there are about seven minutes of videos on the Google+ Project Overview page. Beyond that is an excellent 6,300-word article by Steven Levy on Wired.com; Steven has had inside access to the development of the project for months and there are a lot of insights in the article that I’m not seeing published elsewhere.

The Size of Big Data

Experiments at CERN are generating an entire petabyte of data every second as particles fired around the Large Hadron Collider (LHC) at velocities approaching the speed of light are smashed together. However, Francois Briard, control infrastructure section leader, beam department, explained that CERN doesn’t capture and save all of this data, instead using filters to save only the results of the collisions that are of interest to scientist at the facility….

This still means CERN is storing 25PB of data every year – the same as 1,000 years’ worth of DVD quality video – which can then be analysed and interrogated by scientists looking for clues to the structure and make-up of the universe.

CERN experiments generating one petabyte of data every second, by Dan Worth, IT News from V3.co.uk

In 2011 alone, 1.8 zettabytes (or 1.8 trillion gigabytes) of data will be created, the equivalent to every U.S. citizen writing 3 tweets per minute for 26,976 years. And over the next decade, the number of servers managing the world’s data stores will grow by ten times. Interestingly, the amount of data people create by writing email messages, taking photos, and downloading music and movies is minuscule compared to the amount of data being created about them, the EMC-sponsored study found.

The IDC study predicts that overall data will grow by 50 times by 2020, driven in large part by more embedded systems such as sensors in clothing, medical devices and structures like buildings and bridges.

These two reality checks came by way of ACM TechNews. Just in case you think you are dealing with some big hunks of data, know that data in the library world is pretty minuscule by comparison. There are some, though, who are having to deal with this sort of “big data,” particularly with regard to the new rules from the National Science Foundation.
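For a sense of scale, the figures quoted above can be sanity-checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes decimal (SI) units and, for the CERN comparison, that the detectors produce one petabyte per second around the clock, which they do not, so the real fraction kept would be somewhat higher.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60   # 31,536,000
GB, PB, ZB = 10**9, 10**15, 10**21      # decimal (SI) byte units

# IDC/EMC figure: 1.8 zettabytes expressed in gigabytes.
gigabytes_2011 = 1.8 * ZB / GB

# CERN: 25 PB stored per year, said to equal 1,000 years of DVD-quality video.
# Work out the video bitrate that comparison implies.
video_bytes_per_second = 25 * PB / 1000 / SECONDS_PER_YEAR
video_megabits_per_second = video_bytes_per_second * 8 / 10**6

# Fraction of raw data kept, ASSUMING 1 PB/s generated all year long.
fraction_kept = 25 * PB / (1 * PB * SECONDS_PER_YEAR)

print(f"{gigabytes_2011:.2e} GB created in 2011")                     # 1.80e+12
print(f"{video_megabits_per_second:.1f} Mbit/s implied video bitrate")  # 6.3
print(f"{fraction_kept:.1e} of the raw data is stored")               # 7.9e-07
```

The numbers hang together: 1.8 zettabytes is indeed 1.8 trillion gigabytes, and the implied bitrate of roughly 6 Mbit/s is in the range DVD video uses, so the “1,000 years” comparison is plausible.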

Microsoft admits Patriot Act can access EU-based cloud data

At the Office 365 launch, Gordon Frazer, managing director of Microsoft UK, gave the first admission that cloud data — regardless of where it is in the world — is not protected against the USA PATRIOT Act… After a year of researching the Patriot Act’s breadth and ability to access data held within protected EU boundaries, Microsoft finally and openly admitted it…
Frazer explained that, as Microsoft is a U.S.-headquartered company, it has to comply with local laws (the United States, as well as any other location where one of its subsidiary companies is based).

This was a bit unexpected. If you are a U.S.-based entity and thought your data was safe from disclosure under a U.S. National Security Letter because you were using a hosting service outside of the U.S., you may want to check with your lawyers again.

Closing the book on academic freedom

The scope of the proposed injunction in the [Georgia State University] litigation goes far beyond existing case law, as it limits all speech, by all actors, in any way associated with GSU. As such, it is not a limit on a particular instance of suspected infringement, but a limit on all potential speech going forward. Prior injunctions have been limited in scope and have stopped the publication of existing works; the proposed injunction chills all future expression coming out of GSU, and leaves no space for the comment, criticism, and dialogue that lies at the center of constitutionally protected speech. In order to open up a new business model, the plaintiffs ask the court to shake the foundations of the balance between incentive and expression; and the price of doing so is simply too high.

Closing the book on academic freedom, by Bobby Glushko on Paul Courant’s blog

Bobby Glushko, J.D., is the Associate Librarian in the Copyright Office of the University of Michigan Library. Following up the frightening scenario in a DLTJ Thursday Thread earlier this month, Mr. Glushko looks at the potential impact on First Amendment free speech if the litigation in the Georgia State University case goes in favor of the plaintiffs. It is a whole new level of frightening.

LC’s Adoption of Silverlight — Good Deal for Microsoft, Bad Deal for the Rest of Us

Earlier this year, Microsoft announced that it was giving $3 million in “funding, software, technological expertise, training and support services” to the Library of Congress to build on-site and online exhibits of LC historical collections. Others have commented on this. From a Jester’s point of view, I’ve got problems with this on two fronts: Microsoft using LC in a cheap marketing ploy and LC’s use of a new technology that impedes access for no good technical reason.

Microsoft Giving Away Developer Software to Students

Stu Hicks, one of OhioLINK’s systems engineers, told the OhioLINK staff last night about a new program at Microsoft called DreamSpark. Through this program, post-secondary students around the world who are attending accredited schools or universities can download some of Microsoft’s big developer and designer tools free of charge. At the time and place this post is being written, the list of software is:

  • Visual Studio 2008 Professional Edition
  • Windows Server 2003 Standard Edition
  • SQL Server 2005 Developer Edition
  • Expression Studio
  • XNA Game Studio
  • Visual Studio 2005 Professional Edition
  • Visual C# 2005 Express Edition
  • Visual C++ 2005 Express Edition
  • Visual Basic 2005 Express Edition
  • SQL Server 2005 Express Edition
  • Visual Web Developer 2005 Express Edition
  • Visual J# 2005 Express Edition
  • Virtual PC 2007

Eligibility is determined by either a Shibboleth or a Windows CardSpace identity provider on the student’s campus. One must link a Windows Live ID account with that campus identity provider and renew that eligibility about once every 12 months. They are using Shibboleth for what it was designed for; it is actually nice to see Microsoft recognize that only a true/false response from the campus is required to determine eligibility and that no personally-identifying attributes are passed from the campus to the Microsoft server to make this happen. There are FAQs for students and for higher education administrators.
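The data-minimizing exchange described here can be sketched in a few lines. To be clear, this is not the Shibboleth or DreamSpark API; the class, function names, and the `is_student` key are hypothetical, and the 365-day window is my reading of “about once every 12 months.”

```python
from dataclasses import dataclass
from datetime import date, timedelta

RENEWAL_PERIOD = timedelta(days=365)  # assumed from "about once every 12 months"


@dataclass
class CampusRecord:
    """What the campus knows about a person; none of it leaves the campus."""
    name: str
    student_id: str
    enrolled: bool


def campus_assertion(record: CampusRecord) -> dict:
    """Hypothetical identity-provider response: a yes/no answer and nothing else."""
    return {"is_student": record.enrolled}


def is_eligible(assertion: dict, verified_on: date, today: date) -> bool:
    """Vendor-side check: a positive answer that has not yet expired."""
    return assertion["is_student"] and today - verified_on < RENEWAL_PERIOD


answer = campus_assertion(CampusRecord("Pat Example", "0001", True))
print(answer)
print(is_eligible(answer, verified_on=date(2008, 2, 1), today=date(2008, 6, 1)))
```

The point of the design is visible in `campus_assertion`: the name and student ID are available on the campus side but are never placed in the response, so the vendor learns only that some enrolled student linked the account.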

The blog post announcing the program has a video interview with Bill Gates, but unfortunately one needs Microsoft’s Flash alternative called Silverlight to watch it.