Sunday, October 9, 2016

Compute Engine 101: Auto-updating instance

We use Google Compute Engine to run one-off tasks that need powerful machines (32 cores) and that should shut down automatically once the task completes. Previously I started such a machine manually, connected to it, ran the task, and waited for it to finish so I could stop the instance afterwards. It took a while until this annoyed me enough to automate all the things, but here's how it works now.

Once you've followed these instructions, you'll also be able to view your instance's logs in the Google Cloud Logs Viewer, which is a huge advantage! You could say we're creating a more flexible version of Heroku (for one-off tasks) here. :)

Compute Engine

Source Repository

Mirror your existing git repository using Google Cloud Source Repositories so you can easily access it from within Compute Engine without fiddling around with SSH keys. The code in this example assumes the repository is called "worker", but you can call it whatever you want and adapt the code accordingly.

Instance Setup

Create your instance as you would normally do, but be careful when configuring "Identity and API access". Default access does not include access to Google Cloud Source Repositories, so you have two options:
  1. Choose "Allow full access to all Cloud APIs". This is risky because it, well, gives the instance access to all APIs and therefore lets an attacker wreak havoc if they gain access to that instance.
  2. Create a new service account and give it the "Compute Instance Admin", "Logs Writer" and "Source Repository Reader" roles.
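
If you prefer the command line over the console, option 2 can be sketched roughly like this. The project ID and account name are made up, and the three role IDs are my assumption of what the console names map to, so check them against your project before applying. The script only prints the commands unless you run it with APPLY=1.

```shell
#!/bin/sh
set -eu

PROJECT_ID="${PROJECT_ID:-my-project}"   # hypothetical project ID
SA_NAME="${SA_NAME:-worker-runner}"      # hypothetical service account name
SA_EMAIL="${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"

# Print every command; execute it only when APPLY=1 is set.
run() {
  echo "+ $*"
  if [ "${APPLY:-0}" = "1" ]; then "$@"; fi
}

create_worker_account() {
  run gcloud iam service-accounts create "$SA_NAME" \
    --project "$PROJECT_ID" --display-name "One-off worker"

  # Role IDs assumed to correspond to the three console names in option 2.
  for role in roles/compute.instanceAdmin.v1 roles/logging.logWriter roles/source.reader; do
    run gcloud projects add-iam-policy-binding "$PROJECT_ID" \
      --member "serviceAccount:${SA_EMAIL}" --role "$role"
  done
}

create_worker_account
```

Printing first is a deliberate choice: IAM bindings are easy to get wrong, and reviewing four lines of output is cheaper than cleaning up a project policy.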

After creating the instance, connect to it and install your dependencies. Here's my script installing Java, Maven and Google Cloud Logging:
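
A minimal sketch of such an install script, not the original one. It assumes a Debian or Ubuntu image (hence apt-get and the default-jdk package) and assumes that "Google Cloud Logging" means Google's Logging agent, installed with the repo script from the agent's documentation. As written it only prints the commands; set APPLY=1 to execute them.

```shell
#!/bin/sh
set -eu

# Print every command; execute it only when APPLY=1 is set.
run() {
  echo "+ $*"
  if [ "${APPLY:-0}" = "1" ]; then "$@"; fi
}

install_dependencies() {
  run sudo apt-get update
  # Package names assume a Debian/Ubuntu image; git is needed to fetch the code later.
  run sudo apt-get install -y default-jdk maven git
  # Logging agent installer; URL and flag are assumed from Google's agent docs.
  run curl -sSO https://dl.google.com/cloudagents/add-logging-agent-repo.sh
  run sudo bash add-logging-agent-repo.sh --also-install
}

install_dependencies
```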

Afterwards, shut down your instance and configure a startup script. Here's mine, which fetches the latest code from git and executes the program via Maven:
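
A minimal sketch of such a startup script, not the original one. The checkout location /opt/worker is made up, the main class is assumed to be configured for the exec plugin in pom.xml, and powering off from inside the guest is my assumption for how the instance stops itself after the task. The block writes the script to a local file so it can be attached to the instance afterwards.

```shell
#!/bin/sh
# Write the startup script to startup.sh in the current directory.
cat > startup.sh <<'EOF'
#!/bin/sh
# Runs as root on every boot of the instance.
set -eu
# Set HOME explicitly in case the startup environment does not define it.
export HOME="${HOME:-/root}"
REPO_DIR="${REPO_DIR:-/opt/worker}"   # hypothetical checkout location

# Power off when the script exits, on success or failure, so a crashed
# run does not leave a 32-core machine running.
trap 'shutdown -h now' EXIT

# First boot: clone the mirrored "worker" repository. Later boots: update it.
if [ -d "$REPO_DIR/.git" ]; then
  git -C "$REPO_DIR" pull --ff-only
else
  gcloud source repos clone worker "$REPO_DIR"
fi

# Build and run the task; assumes exec:java has a main class set in pom.xml.
cd "$REPO_DIR"
mvn -q compile exec:java
EOF
```

You can then attach the file with `gcloud compute instances add-metadata INSTANCE --zone ZONE --metadata-from-file startup-script=startup.sh`, where INSTANCE and ZONE are placeholders for your own values. One caveat of the exit trap: a failing run powers the machine off right away, so debugging happens in the logs rather than over SSH.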

That's it! Your program now runs at its latest version every time you start the instance.

App Engine

Implement a servlet on App Engine that starts the Compute Engine instance using google-cloud-java:
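
The servlet itself is Java and goes through google-cloud-java. For trying the whole setup by hand before the servlet exists, the same start operation can be sketched with the gcloud CLI. This is a stand-in, not the servlet; the instance name and zone are made up. It prints the command unless APPLY=1 is set.

```shell
#!/bin/sh
set -eu

INSTANCE="${INSTANCE:-worker}"     # hypothetical instance name
ZONE="${ZONE:-us-central1-a}"      # hypothetical zone

# Print every command; execute it only when APPLY=1 is set.
run() {
  echo "+ $*"
  if [ "${APPLY:-0}" = "1" ]; then "$@"; fi
}

# Start the stopped instance; its startup script then runs the task.
start_worker() {
  run gcloud compute instances start "$INSTANCE" --zone "$ZONE"
}

start_worker
```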

You could, for example, trigger that servlet via a cron job every day if need be. Definitely make sure to properly authenticate requests to this servlet, as you don't want strangers or hackers starting your instance without your knowledge.