Search Results for: virtualenv

Python and AWS Lambda – A match made in heaven

In recent months, I’ve begun moving some of my analytics functions to the cloud. Specifically, I’ve been moving them many of my python scripts and API’s to AWS’ Lambda platform using the Zappa framework.  In this post, I’ll share some basic information about Python and AWS Lambda…hopefully it will get everyone out there thinking about new ways to use platforms like Lambda.

Before we dive into an example of what I’m moving to Lambda, let’s spend some time talking about Lambda. When I first heard about, I was a confused…but once I ‘got’ it, I saw the value. Here’s the description of Lambda from AWS’ website:

AWS Lambda lets you run code without provisioning or managing servers. You pay only for the compute time you consume – there is no charge when your code is not running. With Lambda, you can run code for virtually any type of application or backend service – all with zero administration. Just upload your code and Lambda takes care of everything required to run and scale your code with high availability. You can set up your code to automatically trigger from other AWS services or call it directly from any web or mobile app.

Once I realized how easy it is to move code to lambda to use whenever/wherever I needed it, I jumped at the opportunity.  But…it took a while to get a good workflow in place to simplify deploying to lambda. I stumbled across Zappa and couldn’t be happier…it makes deploying to lambda simple (very simple).

OK.  So. Why would you want to move your code to Lambda?

Lots of reasons. Here’s a few:

  • Rather than host your own server to handle some API endpoints — move to Lambda
  • Rather than build out a complex development environment to support your complex system, move some of that complexity to Lambda and make a call to an API endpoint.
  • If you travel and want to downsize your travel laptop but still need to access your python data analytics stack move the stack to Lambda.
  • If you have a script that you run very irregularly and don’t want to pay $5 a month at Digital Ocean — move it to Lambda.

There are many other more sophisticated reasons of course, but these’ll do for now.

Let’s get started looking at python and AWS Lambda.  You’ll need an AWS account for this.

First – I’m going to talk a bit about building an API endpoint using Flask. You don’t have to use flask, but its an easy framework to use and you can quickly build an API endpoint with it with very little fuss.  With this example, I’m going to use Lambda to host an API endpoint that uses the Newspaper library to scrape a website, pull down the text and return that text to my local script.

Writing your first Flask + Lambda API

To get started, install Flask,Flask-Restful and Zappa.  You’ll want to do this in a fresh environment using virtualenv (see my previous posts about virtualenv and vagrant) because we’ll be moving this up to Lambda using Zappa.

pip install flask flask_restful zappa

Our flask driven API is going to be extremely simple and exist in less than 20 lines of code:

from flask import Flask
from newspaper import Article
from flask_restful import Resource, Api
app = Flask(__name__)
api = Api(app)

class hello(Resource):
    def get(self):
       return "Hello World"
api.add_resource(hello, '/hello')
if __name__ == '__main__':
    app.run(debug=True, host='0.0.0.0', port=5001)

Note: The ‘host = 0.0.0.0’ and ‘port=50001’ are extranous and are how I use Flask with vagrant. If you keep this in and run it locally, you’d need to visit “`http://0.0.0.0:5001“` to view your app.

The last thing you need to do is build your “`requirements.txt“` file for Zappa to use when building your application files to send to Lambda. For a quick/dirty requirements file, I used the following:

zappa
newspaper
flask
flask_restful

Now…let’s get this up to lambda.  With zappa, its as easy as a couple of command line instructions.

First, run the init command from the command line in your virtualenv:

zappa init

You should see something similar to this:

zappa init screenshot

You’ll be asked a few questions, you can hit ‘enter’ to take the defaults or enter your own. For this eample, I used ‘dev’ for the environment name (you can set up multiple environments for dev, staging, production, etc) and made a S3 bucket for use with this application.

Zappa should realize you are working with Flask app and automatically set things up for you. It will ask you what the name of your Flask app’s main function is (in this case it is api.app). Lastly, Zappa will ask if you want to deploy to all AWS regions…I chose not to for this example. Once complete, you’ll have a zappa_settings.json file in your directory that will look something like the following:

{
    "dev": {
        "app_function": "api.app", 
        "profile_name": "default", 
        "s3_bucket": "DEV_BUCKET_NAME" #I removed the S3 bucket name for security purposes
    }
}

I’ve found that I need to add more information to this json file before I can successfully deploy. For some reason, Zappa doesn’t add the “region” to the settings file. I also like to add the “runtime” as well. Edit your json file to read (feel free to use whatever region you want):

{
    "dev": {
        "app_function": "api.app", 
        "profile_name": "default", 
        "s3_bucket": "DEV_BUCKET_NAME",
        "runtime": "python2.7",
        "aws_region": "us-east-1"
    }
}

Now…you are ready to deploy. You can do that with the following command:

zappa deploy dev

Zappa will set up all the necessary configurations and systems on AWS AND zip up your libraries and code and push it to Lambda.   I’ve not found another framework as easy to use as Zappa when it comes to deploying…if you know of one feel free to leave a comment.

After a minute or two, you should see a “Deployment Complete: …” message that includes the endpoint for your new API. In this case, Zappa built the following endpoint for me:

https://4wq2muonbb.execute-api.us-east-1.amazonaws.com/dev

If you make some changes to your code and need to update Lambda, Zappa makes it easy to do that with the following command:

zappa update dev

Additionally, if you want to add a ‘production’ lambda environment, all you need to do is add that new environment to your settings json file and deploy it. For this example, our settings file would change to:

{
    "dev": {
        "app_function": "api.app", 
        "profile_name": "default", 
        "s3_bucket": "DEV_BUCKET_NAME",
        "runtime": "python2.7",
        "aws_region": "us-east-1"
    }.
    "prod": {
        "app_function": "api.app", 
        "profile_name": "default", 
        "s3_bucket": "PROD_BUCKET_NAME",
        "runtime": "python2.7",
        "aws_region": "us-east-1"
    }
}

Next, do a “`deploy prod“` and your production environment is ready to go at a new endpoint.

zappa deploy prod

Interfacing with the API

Our code is pushed to Lambda and ready to start accepting requests.  In this example’s case, all we are doing is returning “hello world” but you can see the power in this for other functionality.  To check out the results, just open a browser and enter your Zappa Deployment URL and append /hello to the end of it like this:

https://4wq2muonbb.execute-api.us-east-1.amazonaws.com/dev/hello

You should see the standard “Hello World” response in your browser window.

You can find the code for the lambda api.py function here.

Note: At some point, I’ll pull this endpoint down…but will leave it up for a bit for users to play around with.


 

If you want to learn more about Lambda, there are two fairly good books on the topic – check them out (Amazon links):


 

Jupyter with Vagrant

I’ve written about using vagrant for 99.9% of my python work on here before (see here and here for examples).   In addition to vagrant, I use jupyter notebooks on 99.9% of the work that I do, so I figured I’d spend a little time describing how I use jupyter with vagrant.

First off, you’ll need to have vagrant set up and running (descriptions for linux, MacOS, Windows).   Once you have vagrant installed, we need to make a few changes to the VagrantFile to allow port forwarding from the vagrant virtual machine to the browser on your computer. If you followed the Vagrant on Windows post, you’ll have already set up the configuration that you need for vagrant to forward the necessary port for jupyter.   For those that haven’t read that post, below are the tweaks you need to make.

My default VagrantFile is shown in figure 1 below.

VagrantFile Example
Figure 1: VagrantFile Example

You’ll only need to change 1 line to get port forwarding working.   You’ll need to change the line that reads:

# config.vm.network "forwarded_port", guest: 80, host: 8080

to the following:

This line will forward port 8888 on the guest to port 8888 on the host. If you aren’t using the default port of 8888 for jupyter, you’ll need to change ‘8888’ to the port you wish to use.

Now that the VagrantFile is ready to go, do a quick ‘vagrant up’ and ‘vagrant ssh’ to start your vagrant VM and log into it. Next, set up any virtual environments that you want / need (I use virtualenv to set up a virtual environment for every project).  You can skip this step if you wish, but it is recommended.

If you set up a virtual environment, go ahead and source into it so that you are using a clean environment and then run the command below to install jupyter. If you didn’t go then you can just run the below to install jupyter.

pip install jupyter

You are all set.  Jupyter should be installed and ready to go. To run it so it is accessible from your browser, just run the following command:

jupyter notebook --ip=0.0.0.0

This command tells jupyter to listen on any IP address.

In your browser,  you should be able to visit your new fangled jupyter (via vagrant) instance by visiting the following url:

http://0.0.0.0:8888/tree

Now you’re ready to go with jupyter with vagrant.


Note: If you are wanting / needing to learn Jupyter, I highly recommend Learning IPython for Interactive Computing and Data Visualization (amazon affiliate link). I recommend it to all my clients who are just getting started with jupyter and ipython.

 


 

Vagrant on Windows

There are many different ways to install python and work with python on Windows. You can install Canopy or Anaconda to have an entire python ecosystem self-contained or you can install python directly onto your machine and configure all the bits and bytes yourself. My current recommendation is to use Vagrant on Windows combined with Virtualbox to virtualize your development environment.

While I use a mac or the majority of my development, I do find myself using Windows 10 more and more, and may be moving to a Windows machine in the future for my day-to-day laptop.  I have and do use Canopy and/or Anaconda but I’ve recently moved the majority of my python development on Windows into Linux (Ubuntu) virtual machines using Vagrant and Virtualbox. You can use other products like VMWare’s virtual machine platform, but Virtualbox is free and does a good enough job for day-to-day development.1

One Caveat: if you’re doing processor / memory intensive development with python, this may not be the best option for you. That said, it can work for those types of development efforts if you configure your virtual machine with enough RAM and processors.

To get started, you’ll need to download and install Vagrant and Virtualbox for your machine.   I am using Vagrant 1.90 and Virtualbox 5.1.10 at the time of this post.

Feel free to ‘run’ either of the programs, but there’s no need to enter either program just yet.    To really use the Vagrant and the linux virtual machine, you’ll need to download a *nix emulator to allow you to do the things you need to with vagrant.

I use Git’s “bash” to interface with my virtual machines and Vagrant.  You could use putty or CygWin or any other emulator, but I’ve found Git’s bash to be the easiest and simplest to install and use.  Jump over and download git for your machine and install it. At the time of writing, I’m using Git 2.11.0.

While installing Git, I recommend leaving everything checked on the ‘select’ components window if you don’t currently have any git applications installed. If you want to use other git applications, you can uncheck the “associate .git* configuration files…” option.  There is one ‘gotcha’ when installing git that you should be aware of.

On the “adjusting your path” section (see figure 1), you’ll need to think about how you want to use git on the command line.

git command line path
Figure 1: Adjusting your path

I selected the third option when installing git. I do not use git from the windows command line though…I use a git GUI along with git from the command line within my virtual environment.

Another screen to consider is the “Configuring the terminal emulator…” screen (figure 2).  I selected and use the MinTTY option because it gives me a much more *nix feel. This is personal preference. If you are going to be doing a lot of interactive python work in the console, you might want to select the 2nd option to use the windows default console window.

Configuring your terminal emulator
Figure 2: Configuring your terminal emulator

During the remainder of the installation, I left the rest of the options at the defaults.

Now that git (and bash) is installed, you can launch Git Bash to start working with Vagrant. You should see a window similar to Figure 3.

Git Bash
Figure 3: Git Bash

From this point, you can do your ‘vagrant init’, ‘vagrant up’ and ‘vagrant ssh’ to initialize, create and ssh into your vagrant machine.

Setting up Vagrant on Windows

For those of you that haven’t used Vagrant in the past, here’s how I set it up and use it. I generally use vagrant in this way to run jupyter, so I’ll walk you through setting things up for jupyter, pandas, etc.

First, set up a directory for your project. At the Bash command line, change into the directory you want to work from and type “mkdir vagrant_project” (or whatever name you want to use). Now, initialize your vagrant project by typing:

vagrant init

This will create a Vagrantfile in the directory you’re in. This will allow you to set the configuration of your virtual machine and Vagrant. The Vagrantfile should look something like this:

VagrantFile Example
Figure 4: VagrantFile Example

Before we go any further, open up your Vagrantfile and change the following line:

config.vm.box = "base"

change “base” to “ubuntu/xenial64” to run Ubuntu 16.04. The line should now read:

config.vm.box = "ubuntu/xenial64"

If you want to run other flavors of linux or other OS’s, you can find others at https://atlas.hashicorp.com/search.

Since I’m setting this VM up to work with jupyter, I also want to configure port forwarding in the Vagrantfile. Look for the line that reads:

# config.vm.network "forwarded_port", guest: 80, host: 8080

and add a line directly below that line to read:

config.vm.network "forwarded_port", guest: 8888, host: 8888

This addition creates a forwarded port on your system from port 8888 on your host (your windows machine) to port 8888 on your guest (the virtual machine). This will allow you to access your jupyter notebooks from your Windows browser.

At this point, you could also configure lots of other vagrant options, but these are the bare minimums that you need to get started.

At the Bash command line, you can now type “vagrant up” to build your virtual machine. Since this is the first time you’ve run the command on this directory, it will go out and download the ubuntu/xenial64 ‘box’ and then build the virtual machine with the defaults.  You might see a Windows alert asking to ‘approve’ vagrant to make some changes…go ahead and allow that.

Once the ‘vagrant up’ command is complete, you should see something similar to Figure 5 below.

Output of Vagrant Up
Figure 5: Output of ‘vagrant up’

Now, you can ‘vagrant ssh’ to get into the virtual machine.  You should then see something similar to Figure 6. Now your running vagrant on windows!

Vagrant SSH - Vagrant on Windows
Figure 6: Output of ‘vagrant ssh’

One of the really cool things that vagrant does by default is set up shared folders. This allows you to do your development work in your favorite IDE or editor and have the changes show up automatically in your vagrant virtual machine.

At the Bash command line, type:

cd /vagrant

You should see a directory listing that has your Vagrantfile and a log file. If you visit your project directory using windows explorer, you should see the same two files. Shared folders for the win! I know its just a small thing, but it makes things easier for initial setup.

You now have vagrant on windows!

Configure the Python Environment

Time to set up your python environment.

First, install pip.

sudo apt install python-pip

Even though you’ve set up a virtual machine for development, it is still a good idea to use virtualenv to separate multiple projects requirements.  Install install virtualenv  with the following command:

sudo apt install virtualenv

In your project directory, set up your virtual environment by typing:

virtualenv env

Note: You may run unto an error while running this command. It will be something like like the message below:

OSError: [Errno 71] Protocol error

If this happens, delete the ‘env’ folder and then add ‘–always-copy’ to the command and re-run it. See here for more details.

#### run this only if you had an error in the step above
virtualenv env --always-copy

Activate your virtualenv by typing:

source env/bin/activate

We’re ready to install pandas and jupyter using the command below. This will install both modules as well as their dependencies.

pip install pandas jupyter

Now you’re ready to run jupyter.

jupyter notebook --ip=0.0.0.0

In the above command, we start jupyter notebook with an extra config line of ‘–ip=0.0.0.0’. This tells jupyter to listen on any IP address. It may not always be necessary, but I find it cuts out a lot of issues when I’m running it in vagrant like this.

In your windows browser, visit ‘http://localhost:8888/tree’ and  – assuming everything went the way it should – you should see your jupyter notebook tree.

Jupyter via Vagrant VM
Figure 7: Jupyter via Vagrant VM

From here, you can create your notebooks and run them just like you would with any other platform.