Quick and Dirty Serverless Integer Programming

We all know that Python has risen above its humble beginnings such that it now powers billion dollar companies. Let’s not forget Python’s roots, though! It’s still an excellent language for running quick and dirty scripts that automate some task. While this works fine for automating my own tasks because I know how to navigate the command line, it’s a bit much to ask a layperson to somehow install python and dependencies, open Terminal on a Mac (god help you if they have a Windows computer), type a random string of characters, and hit enter. Ideally, you would give the layperson a button, they hit it, and they get their result.

I recently deployed a solution which allowed me to do just this. Even better - it was all free! In this post, I’ll talk about how I used Google Sheets as my input form, datasheets to convert Google Sheets to pandas, Zappa for deploying a serverless Flask app, and PuLP for solving a quick integer programming problem to make a simple and free ad hoc optimization service.

Note: all the code for this service is located on my github

FML¶

Every project should start as a problem, and mine was no different. My wife competes in fantasy movie league. This is like fantasy football for movie geeks. The rules are simple:

You are a fantasy movie theater owner. You must decide which movies to play on your 8 screens. Each movie costs a different amount to screen, and the goal is to generate the most box office revenue over the weekend given your available budget. Talking with her, I realized that, if one can do a good job predicting box office revenue for the weekend (the hard part of the problem), then deciding how many screens to play each movie becomes a simple integer programming allocation problem.

Requirements¶

Now that we have the problem, what are the requirements?

A method for inputting a bunch of data:

  1. Movie name

  2. Expected revenue

  3. Cost to screen

  4. Ability to run the allocation problem from a browser.

  5. A view of the solution

What’s the easiest input form that data scientists hate?

Excel

What’s worse than Excel?

Google Sheets

Datasheets¶

Thankfully, Squarespace created datasheets. This is a nice library that makes interactions between pandas and Google Sheets impressively painless. The library is worth it for the detailed OAuth page alone (I once spent 2 weeks struggling with Google OAuth pain and really wish this page had existed at that time). What’s particularly nice about the OAuth page is that it walks through setting up a service account which does not require the end-user to go through the typical OAuth dance of browser redirects to and from the Google login page. This is nce because these redirects can get messed up when moving from local development to production systems in the cloud (or at least they always get messed up when I try to do it!).

Anywho, the first step was to setup my Google Sheets credentials and download the client_secrets.json and service_key.json files. With these handy, I can now access my Google Sheets spreadsheet using datasheets. The spreadsheet is named FML, and the inputs tab looks like

We can pull this into a pandas DataFrame by setting some datasheets environment variables to point to our credentials and then creating a Client