Monthly Archives: August 2012

Amazon Glacier update

It looks like after 24 hours, my data are all there:

And my bill will be $0.15 this month.

Overall, this was a smooth process. Now, I think we’ll start freezing monthly versions of my Time Machine backups in Glacier. There is one small catch that I managed to miss in my original post: Glacier charges for a minimum of three months of storage, and bills a prorated amount if you delete your archives earlier. That really just means I’ll be keeping the three most recent versions of my monthly backups, and deleting anything older than three months.
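The retention rule above is simple enough to sketch in Python. This is only an illustration, not something I actually ran against Glacier: the function name, the archive labels, and the choice to approximate "three months" as 90 days are all my own assumptions.

```python
from datetime import date, timedelta

# Assumption: "three months" is approximated as 90 days.
MIN_STORAGE_DAYS = 90

def archives_to_delete(uploaded, today, min_days=MIN_STORAGE_DAYS):
    """Return the names of archives uploaded at least min_days before today.

    uploaded maps a (hypothetical) archive name to its upload date.
    """
    cutoff = today - timedelta(days=min_days)
    return sorted(name for name, day in uploaded.items() if day <= cutoff)

# Hypothetical monthly backups, each uploaded on the first of the month.
backups = {
    "2012-05": date(2012, 5, 1),
    "2012-06": date(2012, 6, 1),
    "2012-07": date(2012, 7, 1),
    "2012-08": date(2012, 8, 1),
}
print(archives_to_delete(backups, today=date(2012, 9, 1)))
```

Anything the function returns has already sat in Glacier past the minimum storage period, so deleting it should not trigger the early deletion charge.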

Cold storage: freezing my backups in Amazon Glacier

A couple of days ago, Amazon sent me an email about a new AWS service called “Glacier”. What a boring name, huh? It’s not called something sexy like “FlexStore” or “FireVault”, and that’s by design. The idea behind Glacier is long-term, low-power (and therefore lower-cost) offsite storage. Facebook recently announced that they are moving to a similar solution for backups. You see, spinning-platter hard drives take a constant amount of energy to keep running. If you write data to an HDD and then pull the plug, you cut out the cost of operating the drive until you need to retrieve your data again. For long-term backups which may never be accessed, this is an ideal solution. Unfortunately, Glacier exists only as an API for the moment. Peter Binkley wrote an excellent account of sending some data to Glacier and then retrieving it. I think I’ll do the same, using a Java application called glacierFreezer.

The first step for me is to create an access key pair (an access key ID and a secret access key) through AWS IAM. I just created a user named “glacier” with access to all AWS functions except IAM. This should be fairly safe, assuming I don’t kick off all kinds of unwanted services and rack up a huge bill. glacierFreezer also needs an AWS SimpleDB domain to interact with. It seems capable of creating the domain for us, but I’ll create one anyway with boto.

$ export AWS_ACCESS_KEY_ID=Your_AWS_Access_Key_ID
$ export AWS_SECRET_ACCESS_KEY=Your_AWS_Secret_Access_Key

$ python2
Python 2.7.3 (default, Apr 24 2012, 00:00:54) 
[GCC 4.7.0 20120414 (prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import boto
>>> connection = boto.connect_sdb()
>>> domain = connection.create_domain('glacierPhotos')
>>> domains = connection.get_all_domains()
>>> domains
[Domain:glacierPhotos]
>>> quit()

Next, after downloading the glacierFreezer jar, I need to think about what to back up. This will be “fire insurance” for our digital valuables, so let’s pick something I would lose in a fire: wedding photos. In the Glacier management console, I create a “vault” named “photos”. Now, let’s test our new backup system with a script that will send my wedding photos to Glacier:

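#!/bin/sh
# Upload every regular file under the given directory to the "photos" vault.
dir=$1

# -type f skips directories; quoting keeps filenames with spaces intact.
find "$dir" -type f | while IFS= read -r file
do
    java -jar glacierFreezer.jar 'accessKey' 'secretAccessKey' glacierPhotos photos "$file"
done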

Run the script, and wait while 15 GB of JPEGs are uploaded to Glacier. At $0.01 per GB per month, storage should cost me $0.15 per month. The cost of retrieval is higher, and it seems difficult to predict exactly at this point. Since this is a sort of offsite data insurance plan, the cost of retrieval will be worth it.
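For the curious, the storage arithmetic can be written out as a tiny Python sketch. The rate constant is an assumption: $0.01 per GB per month is simply the rate implied by 15 GB costing $0.15. The function name is made up for illustration, and retrieval costs are left out because I can't predict them.

```python
from decimal import Decimal

# Assumption: $0.01 per GB per month, the rate implied by 15 GB costing $0.15.
RATE_PER_GB_MONTH = Decimal("0.01")

def monthly_storage_cost(gigabytes, rate=RATE_PER_GB_MONTH):
    """Return the monthly storage cost in dollars for the given size in GB."""
    # Go through str() so float inputs do not drag binary rounding noise along.
    return Decimal(str(gigabytes)) * rate

print(monthly_storage_cost(15))  # prints 0.15
```

Decimal keeps the result in exact cents, which avoids the usual floating-point surprises when multiplying small dollar amounts.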

 

A penny for your thoughts?

A recent post at the PLoS ONE blog caught my attention. Ethan Perlstein writes about his experience soliciting comments on his recent publication. His success rate? 4.6%. Out of 166 professors working in his field, 49 of whom he had previous contact with.

The most negative response was this remark from a full professor with whom I had no prior contact: “Most of us just let our published work speak for itself.”

Also,

The most recent statistics indicate that 90% of PLOSONE articles have zero comments, and that an article with 14 comments (like mine) is in the top 0.001%.

The conclusion is that converting online views into actionable items (comments or, in most other online scenarios, sales) is exceedingly difficult. We should be incentivizing others in our research fields to interact with each other. Since egos run wild in science, maybe a “reputation” system would help (see Biostars as an excellent example). Or, you could just disable comments.