Google App Engine Python scripting

Command line GAE

I am going to rely on the setup described in one of my previous posts. Not everything is needed – just the basic setup:

  • Google App Engine SDK for Python,
  • an application based on google-app-engine-django,
  • the csvup sample from the above post.

Say you have some model in your app. I will use csvup.models.Person for this example. A few times I have needed to write a simple command-line Python application that relies on things provided by GAE. This lets you do some of the things you can do in your views, without having to start the dev server manually and point your browser at it, or resort to hacks like curl, just to trigger the view action.

This is very simple with google-app-engine-django. All the code needed is actually provided in the main.py file in the root of the distribution, and it boils down to this:

from appengine_django import InstallAppengineHelperForDjango
InstallAppengineHelperForDjango()

from csvup.models import Person

for p in Person.all():
  print p

This will initialize the helper and display all the Person instances currently in your local development datastore. Neat.

LoadSdk

If you, for some reason, need to access GAE SDK code, you cannot do so before you call InstallAppengineHelperForDjango(). That can be a problem, as we will soon see. There’s a handy function for that, too:

from appengine_django import LoadSdk
LoadSdk()

This will initialize the SDK itself (e.g. set up the paths to it based on the .google_appengine symlink you created), so you can import google.appengine packages.

Using your own datastore

Why is LoadSdk important? Because you can do this:

from appengine_django import LoadSdk
LoadSdk()

from google.appengine.tools import dev_appserver_main as dae
da = dae.DEFAULT_ARGS
da[dae.ARG_DATASTORE_PATH] = '/tmp/my-datastore-path.datastore'

from appengine_django import InstallAppengineHelperForDjango
InstallAppengineHelperForDjango()

from csvup.models import Person
from datetime import date

Person(name='John', birthdate=date.today(), countriesVisitedCount=3, everMarried=True).save()
for p in Person.all():
  print p
print 'done'

Digging inside the GAE helper, you can see that it was written from the manage.py perspective, which is quite reasonable and expected. This, however, restricts its use, as manage.py doesn’t have all the options that the “original” dev_appserver.py provides (have a peek here). There are some bugs, like this one, that reflect this limitation.

For some of these, it uses the default arguments specified in GAE SDK’s google/appengine/tools/dev_appserver_main.py. If you take a look, you will find out there is a dict called DEFAULT_ARGS, containing some settings that you might consider changing to suit your needs. This is exactly what the above code does – it changes the datastore path. This can be useful if your script does some destructive operations on the datastore which warrant a separate datastore file just for it.
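If several scripts each need their own throwaway datastore, the path can be derived instead of hard-coded. Here is a minimal sketch, assuming a hypothetical helper named scratch_datastore_path (it is not part of the SDK or of google-app-engine-django). It uses only the standard library and runs on a current Python 3 interpreter:

```python
import os
import tempfile


def scratch_datastore_path(script_name):
    """Return a per-script datastore file path inside the system temp directory.

    Hypothetical helper for illustration only.
    """
    # Keep only characters that are safe in a file name.
    safe = "".join(c if c.isalnum() or c in "-_" else "_" for c in script_name)
    return os.path.join(tempfile.gettempdir(), safe + ".datastore")


path = scratch_datastore_path("cleanup-persons")
print(path)
```

In the script above, the result would simply replace the hard-coded string assigned to da[dae.ARG_DATASTORE_PATH].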


Google App Engine Django CSV upload, part 2

In the previous part I talked about transforming CSV into GAE models. Now it’s time to upload that to your GAE application. Let’s see how that can be done.

Basic CSV input view & template

Let’s make the following view in csvup/views.py:

from django.http import HttpResponse, Http404

def default(request):
  return HttpResponse("it works")

Add a URL pattern in csvup/urls.py to route requests to the above view:

from django.conf.urls.defaults import *

urlpatterns = patterns('csvup.views',
  (r'^$', 'default'),
)

Fire up your development server with python manage.py runserver and go to http://localhost:8000/csvup/ (assuming, as in the URLs used later on, that the app is mounted under csvup/). You should see our “it works” text.

Instead of this, let’s consider a text box where you could easily copy and paste (or even type, for a small number of entries) some CSV data to be uploaded. To do this, we’ll use some templating. Put this into the csvup/templates/csvup/default.htm template:

<html>
<head>
<title>GAE custom CSV upload</title>
</head>
<body>
<h1>GAE custom CSV upload</h1>
</body>
</html>

and change the view to:

from django.shortcuts import render_to_response

def default(request):
  return render_to_response('csvup/default.htm')

Refresh your browser – you should see our new template rendered.

Django provides Forms for easy HTML form manipulation. We’ll use this to get the CSV the user enters. Here’s the form – put it in csvup/models.py:

from django.forms import Form, CharField, Textarea

class CsvupForm(Form):
  csv = CharField(required=True, widget=Textarea(attrs=dict(cols=80,rows=15)))

We need to add the following to the body of our template, too:

<!-- ... -->
<body>
<h1>GAE custom CSV upload</h1>
<form method="post" action="{% url csvup.views.default %}">
  {{ form.as_p }}
  <input type="submit"/>
</form>
</body>
<!-- ... -->

Change your view to pass the new form to the template:

from django.shortcuts import render_to_response
from csvup.models import Person, CsvupForm

def default(request):
  return render_to_response('csvup/default.htm', dict(form=CsvupForm())) 

Refresh your browser; you should now see the heading followed by a large text area and a submit button.

This lets us copy and paste the CSV into the box, then import it by submitting the form.

Importing data

The Csvup class that we made is model-agnostic. We will make our view model-agnostic in the same way. Let’s change the URL mappings like this:

urlpatterns = patterns('csvup.views',
  (r'^(?P<mklass>[^/]+)', 'default'),
)

and our view like this:

def default(request, mklass):
  if request.method == 'POST':
    rowsCnt = 0
    return render_to_response('csvup/uploaded.htm', dict(rowsCnt=rowsCnt, mklass=mklass)) 
  return render_to_response('csvup/default.htm', dict(form=CsvupForm(), mklass=mklass)) 

Add the following to csvup/templates/csvup/uploaded.htm:

<html>
<head>
<title>GAE custom CSV upload - finished</title>
</head>
<body>
You've just uploaded to the following model klass: {{ mklass }}<br/>
Go <a href="{% url csvup.views.default mklass %}">back</a> to upload some more.
</body>
</html>

and change the default.htm template like this to satisfy the url pattern change we made:

<!-- ... -->
<form method="post" action="{% url csvup.views.default mklass %}">
<!-- ... -->

We can now go to http://localhost:8000/csvup/csvup.models.Person to upload CSV to the Person model. Go try it out to confirm it’s working. You, of course, won’t get any uploads yet, but the base is there. Except for one more step…

Python Class.forName

A bit Java-centric, wouldn’t you say? Basically, the above link gives us mklass = "csvup.models.Person". What’s wrong with that? It’s a string, and we need a Python class. With a bit of help from hasen j on Stack Overflow, we get this:

def getClass(kls):
  parts = kls.split('.')
  module = ".".join(parts[:-1])
  m = __import__(module)
  for comp in parts[1:]: 
    m = getattr(m, comp)
  return m

getClass("csvup.models.Person") will actually give us the csvup.models.Person class. We are all set to glue the pieces together.
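For comparison, the same lookup can be written with importlib. This is only a sketch under my own naming (get_class is a hypothetical variant, not the function used in the views below), and it is demonstrated on a standard-library class so that it runs without the csvup app:

```python
import importlib


def get_class(dotted_name):
    """Resolve a dotted 'package.module.ClassName' string to the object it names."""
    module_name, _, attr_name = dotted_name.rpartition(".")
    if not module_name:
        # A bare name such as "Person" carries no module to import.
        raise ValueError("expected 'module.ClassName', got %r" % dotted_name)
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


ordered_dict_cls = get_class("collections.OrderedDict")
print(ordered_dict_cls)
```

Since the dotted name arrives straight from the URL, it may be worth checking it against a short list of allowed models before resolving it.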

Actual upload

Here’s a view that assembles all the above:

from google.appengine.ext import db

# ...

def default(request, mklass):
  form = CsvupForm()
  if request.method == 'POST':
    form = CsvupForm(request.POST)
    if form.is_valid(): 
      csv = form.cleaned_data['csv']
      pyMklass = getClass(mklass)
      csvup = Csvup(csv, pyMklass)
      models = csvup.makeModels()
      db.put(models)
      rowsCnt = len(models)
      return render_to_response('csvup/uploaded.htm', dict(rowsCnt=rowsCnt, mklass=mklass)) 
  return render_to_response('csvup/default.htm', dict(form=form, mklass=mklass)) 

What it says is: if it’s a POST and the form is valid, parse the CSV into models, save them, then tell the user how many rows were uploaded.

Try it out – here’s the sample from the first part:

name,birthdate,countriesVisitedCount,everMarried
Joe,1990/10/01,3,n
Maria,1970/06/05,10,y

Paste it into the box, submit, and you should get a nice message saying that everything has been uploaded.
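To get a feel for what happens to that sample between the form and db.put, here is a rough approximation that runs without Django or GAE. It is only a sketch: the real Csvup class comes from part one and is not reproduced here, so parse_people and its plain-dict output are hypothetical stand-ins, and the y/n and date formats are inferred from the sample rows. It targets a current Python 3 interpreter:

```python
import csv
import io
from datetime import datetime

SAMPLE = """name,birthdate,countriesVisitedCount,everMarried
Joe,1990/10/01,3,n
Maria,1970/06/05,10,y
"""


def parse_people(text):
    """Turn CSV text into a list of dicts with typed values.

    Hypothetical stand-in for what Csvup.makeModels does with real models.
    """
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        rows.append({
            "name": raw["name"],
            # Dates in the sample use the YYYY/MM/DD layout.
            "birthdate": datetime.strptime(raw["birthdate"], "%Y/%m/%d").date(),
            "countriesVisitedCount": int(raw["countriesVisitedCount"]),
            # Assumption: "y" means True, anything else means False.
            "everMarried": raw["everMarried"].strip().lower() == "y",
        })
    return rows


people = parse_people(SAMPLE)
for person in people:
    print(person)
```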

Listing the Persons

Now that everything is uploaded, we can see absolutely nothing different about the app’s output because – there’s no output… Let’s make a quick view to show that the uploads really worked:

def list(request, mklass):
  pyMklass = getClass(mklass)
  items = pyMklass.all().fetch(1000)
  return render_to_response('csvup/list.htm', dict(mklass=mklass, items=items)) 

We also need one more URL pattern:

urlpatterns = patterns('csvup.views',
  (r'^list/(?P<mklass>[^/]+)', 'list'),
  (r'^(?P<mklass>[^/]+)', 'default'),
)

and a new template:

<html>
<head>
<title>GAE model lister</title>
</head>
<body>
<h1>All {{ mklass }} ({{ items|length }})</h1>
{% for i in items %}
  {{ i }}<br/>
{% endfor %}
</body>
</html>

Point your browser to http://localhost:8000/csvup/list/csvup.models.Person and admire your newly uploaded Persons. This is also generic, so you can use it to list any model, even in your separate application, unrelated to CSV upload.

Using the usual file upload

Instead of using a text box to enter the CSV, you can use the usual file upload. That’s easy with Django. First, we make a form:

from django.forms import Form, CharField, Textarea, FileField

# ...

class FileupForm(Form):
  csvfile = FileField(required=True)

Then we add the following view:

from csvup.models import Person, CsvupForm, FileupForm

# ...

def fileup(request, mklass):
  form = FileupForm()
  if request.method == 'POST':
    form = FileupForm(request.POST, request.FILES)
    if form.is_valid(): 
      csv = request.FILES['csvfile'].read()
      pyMklass = getClass(mklass)
      csvup = Csvup(csv, pyMklass)
      models = csvup.makeModels()
      db.put(models)
      rowsCnt = len(models)
      return render_to_response('csvup/uploaded.htm', dict(rowsCnt=rowsCnt, mklass=mklass)) 
  return render_to_response('csvup/fileup.htm', dict(form=form, mklass=mklass)) 

and we add the url pattern for it:

from django.conf.urls.defaults import *

urlpatterns = patterns('csvup.views',
  (r'^fileup/(?P<mklass>[^/]+)', 'fileup'),
  (r'^list/(?P<mklass>[^/]+)', 'list'),
  (r'^(?P<mklass>[^/]+)', 'default'),
)
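The order of these patterns matters: Django tries them top to bottom and uses the first match, and the last pattern has no trailing anchor, so it matches almost anything. The effect can be checked with plain re, outside Django. The resolve function below is a hypothetical miniature of a URL resolver, written only for this illustration:

```python
import re

# The three regexes from urls.py, in the order Django would try them.
PATTERNS = [
    (re.compile(r'^fileup/(?P<mklass>[^/]+)'), 'fileup'),
    (re.compile(r'^list/(?P<mklass>[^/]+)'), 'list'),
    (re.compile(r'^(?P<mklass>[^/]+)'), 'default'),
]


def resolve(path, patterns=PATTERNS):
    """Return (view_name, mklass) for the first pattern that matches, else None."""
    for regex, view_name in patterns:
        match = regex.search(path)
        if match:
            return view_name, match.group('mklass')
    return None


print(resolve('list/csvup.models.Person'))
print(resolve('csvup.models.Person'))
# With the order reversed, the catch-all pattern swallows the "list" prefix
# and the view would receive mklass == 'list'.
print(resolve('list/csvup.models.Person', PATTERNS[::-1]))
```

So the catch-all 'default' pattern has to stay last, or the more specific views are never reached.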

In the template, nothing much changes except the enctype of the form, which is a must:

<html>
<head>
<title>GAE custom CSV upload - fileup</title>
</head>
<body>
<h1>GAE custom CSV upload - fileup</h1>
<form method="post" action="{% url csvup.views.fileup mklass %}" enctype="multipart/form-data">
  {{ form.as_p }}
  <input type="submit"/>
</form>
</body>
</html>

As you can see, the changes are minimal. Try it out now: http://localhost:8000/csvup/fileup/csvup.models.Person.

Happy CSV uploads!

That’s about the core of CSV uploads to GAE. Some considerations:

  • Obviously, you need to think about security – you don’t want bulk upload functionality available to everybody. Typically, you restrict CSV uploads to admin users or at least logged-in users,
  • Size needs to be considered. As of now, GAE allows up to 10MB files to be uploaded. Plenty for many things, but might be restrictive if your appetites are bigger,
  • Error checking was not considered in these examples,
  • It’s not pretty, so if you are a designer type go ahead and make it beautiful.
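As a starting point for the size and error-checking items, here is a small pre-flight check that could run before the CSV is handed to Csvup. It is a sketch, not a complete validator: check_csv and MAX_UPLOAD_BYTES are hypothetical names of mine, the limit simply mirrors the 10MB figure above, and the code targets a current Python 3 interpreter:

```python
import csv
import io

MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # mirrors the 10MB figure mentioned above


def check_csv(text, required_columns, max_bytes=MAX_UPLOAD_BYTES):
    """Return a list of human-readable problems; an empty list means no problem was found."""
    problems = []
    if len(text.encode("utf-8")) > max_bytes:
        problems.append("upload is larger than %d bytes" % max_bytes)
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if header is None:
        return problems + ["no header row"]
    missing = [c for c in required_columns if c not in header]
    if missing:
        problems.append("missing columns: " + ", ".join(missing))
    # Data rows start on line 2; blank lines are skipped.
    for line_no, row in enumerate(reader, start=2):
        if row and len(row) != len(header):
            problems.append("line %d has %d fields, expected %d"
                            % (line_no, len(row), len(header)))
    return problems


print(check_csv("name,birthdate\nJoe,1990/10/01\n", ["name", "birthdate"]))
```

In a view, a non-empty result could be shown next to the form instead of calling db.put.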

Hope this helps.