Thoughts on Google App Engine

Google App Engine is a new tool that lets web app developers run their code on Google’s massively scalable infrastructure. It’s in private beta at the moment, with free accounts handed out to the first 10,000 developers who signed up. I was too late to get one, so I had to settle for playing around with the SDK and reading through some of the documentation yesterday. I’ve also done some research into Google’s infrastructure in the past, specifically MapReduce, the Google File System, and Bigtable. The App Engine SDK doesn’t appear to allow any direct access to the first two, but its Datastore API is pretty clearly a wrapper around Bigtable.

Bigtable is not a full-fledged RDBMS; it turns out that certain features of relational databases (joins, for instance) are performance killers when you’re working at this scale. Nevertheless, it appears to do a great job of enabling developers who are familiar with relational databases to store up to petabytes of data in a distributed system. Part of me wonders if this Datastore API will become a de facto standard for petabyte-scale data storage in the future.
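
To make the no-joins point a bit more concrete, here’s a toy sketch in plain Python. It is not the Datastore API and it is not how Bigtable works internally; ordinary dictionaries stand in for a key-value store, and every name in it (authors, posts, denormalized_posts, and the two helper functions) is made up for illustration. The idea is just that a join costs an extra lookup per row, while a denormalized layout answers the same question from a single record.

```python
# Toy illustration only: plain dicts standing in for a key-value store.
# Nothing here is the App Engine Datastore API; all names are hypothetical.

# Relational-style layout: two "tables" linked by author_id.
authors = {
    1: {"name": "alice"},
    2: {"name": "bob"},
}
posts = {
    "post-1": {"author_id": 1, "title": "Hello"},
    "post-2": {"author_id": 2, "title": "Scaling"},
}


def titles_with_author_joined(posts, authors):
    """Emulate a join: one extra lookup in `authors` for every post."""
    return [
        (post["title"], authors[post["author_id"]]["name"])
        for post in posts.values()
    ]


# Denormalized layout: the author's name is copied into each post record,
# so reading a post never requires touching a second "table".
denormalized_posts = {
    key: dict(post, author_name=authors[post["author_id"]]["name"])
    for key, post in posts.items()
}


def titles_with_author_denormalized(posts):
    """Same answer as the join, but each record is self-contained."""
    return [(post["title"], post["author_name"]) for post in posts.values()]
```

Both helpers return the same list of (title, author name) pairs. The price of the second layout is duplicated data: if an author’s name changes, every one of their posts has to be rewritten.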

In short, Google App Engine looks like a slick solution for web application developers who want to scale up fast. It’s far less flexible than Amazon Web Services, though: App Engine is definitely not a grid computing solution. Google says so right in the introduction, and the significant restrictions it places on developer code back that up. In contrast, AWS’s loosely coupled combination of EC2, S3, and SimpleDB allows for a wider variety of applications, including ones whose requirements are quite different from those of a traditional web app (say, HEP computing).

The other interesting thing about the announcement is the significant endorsement of Python and Django for web app development. In the private beta, all App Engine applications must be written in pure Python. Google plans to support other programming languages in the future, but it seems to me that in these situations the first supported language is always the best-supported one. I’m always happy to see Python rising in popularity (switching my thesis analysis code from C++, the default language of HEP, to Python last year was a huge productivity boon), but I could see why PHP or Rails developers would be upset with this turn of events.

2 Comments »

  1. gordonwatts said,

    April 10, 2008 @ 4:26 am

    Exactly right. I was thinking that. What I wonder, however, is whether one could do the final data analysis step there. It is all about data mining, after all.

  2. adamkocoloski said,

    April 10, 2008 @ 9:23 pm

    Good point. I imagine flat ntuples can be dropped into Bigtable without any significant modifications — it’d be interesting to see how a few GQL queries compare with something like PROOF.

    (P.S. Sorry about the comment moderation, I didn’t even realize it was turned on till today!)
