Installing Python on OSX (and the necessary modules)

If you need help installing python on OSX, read on.

For the last three years, I’ve used a mac for all my development. I love the fact that everything ‘just works’ on the platform. That said, when you get into scientific computing and data analytics, especially with python, you can run into some issues.

Just like on linux, python is included with the operating system. Unlike on linux, this can cause problems for you in the long term, because Apple may upgrade or change the system python.

On OS X, I recommend that those of you starting out go with Anaconda or Enthought Canopy. As I said in “Installing python on Windows”, I prefer Canopy over Anaconda for scientific computing / data analytics, but either will work for you. Installing Canopy on the mac is very similar to installing it on Windows, so I’ll let that post be your guide for installing Canopy.

If you want to get into the nitty-gritty and install and configure python and the modules yourself, you can easily do so, but be prepared to spend some time on the command line.

Before we get started installing python on your Mac, we need to install homebrew, which is a package manager for OS X (it acts much like the ‘apt’ package manager on ubuntu / debian).

To install homebrew, open a terminal and paste the install command shown on the homebrew home page (brew.sh).

This command installs the homebrew ecosystem onto your machine and preps your machine to be ready to install various packages, including python.
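If you want to confirm that the install worked, one quick check is to see whether the ‘brew’ command is now on your PATH. Below is a minimal sketch in python (3.3 or later). The helper name ‘find_tool’ is something I made up for this example; it is not part of homebrew or python.

```python
import shutil


def find_tool(name):
    """Return the full path of a command-line tool, or None if it is not on the PATH."""
    return shutil.which(name)


if __name__ == "__main__":
    brew_path = find_tool("brew")
    if brew_path is None:
        print("brew was not found on the PATH")
    else:
        print("brew is installed at %s" % brew_path)
```

If the script reports that brew was not found, open a new terminal window and try again before re-running the installer, since the PATH is only refreshed in new shells.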

Installing Python on OSX

Step 1: Let’s get python installed via homebrew. In your terminal, type “brew install python”.

This will install a version of python onto your machine and set up your environment to use that version. This helps mitigate any issues you might have down the road if / when Apple makes changes to the system-provided python. Additionally, brew installs pip into the system to make it easy to get the necessary modules onto your machine.
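To see which python you are actually running after this step, you can ask the interpreter itself. The sketch below is only a rough heuristic: it assumes Apple’s bundled python lives under ‘/usr/bin’ or ‘/System’, while homebrew usually installs under ‘/usr/local’ or ‘/opt/homebrew’. The function names are mine, not part of any library.

```python
import sys

# Assumption: Apple's bundled python lives under one of these prefixes.
SYSTEM_PREFIXES = ("/usr/bin/", "/System/")


def looks_like_system_python(executable):
    """Guess whether an interpreter path belongs to the python shipped with OS X."""
    return executable.startswith(SYSTEM_PREFIXES)


def describe_interpreter():
    """Return the path and version of the interpreter running this code."""
    return {
        "executable": sys.executable,
        "version": "%d.%d.%d" % sys.version_info[:3],
    }


if __name__ == "__main__":
    info = describe_interpreter()
    print("Running python %s from %s" % (info["version"], info["executable"]))
    if looks_like_system_python(info["executable"] or ""):
        print("This looks like the system python, not the homebrew one")
```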

From this point on, we are generally going to follow exactly the same steps that I outline in Installing Python on Linux except we don’t need to install any additional tools.

Step 2: Not required, but highly recommended – install a virtual environment. I recommend virtualenv. Install it by typing “pip install virtualenv”.

When you are ready to get started on a new project, type “virtualenv env” to install python into a new virtual environment (the ‘env’ is the name of the environment). You only have to do this once per project. Note: You should use a folder per project to keep your virtual environments separated.

Whenever you want to work on a specific project, change into that folder and type “source env/bin/activate”. This will set up your environment with all of your installed python modules.

For the purpose of this walk-through let’s create a new directory, set up a new virtual environment and then install the necessary modules.

  • Create a folder in your home directory called ‘projects’.
  • Type “mkdir projects” to do this from the command line.
  • Change into that folder (“cd projects”) and then type “mkdir install_example” to create another folder inside the projects folder.
  • Change into the new folder (“cd install_example”).
  • Type “virtualenv env” to create your virtual environment.
  • Type “source env/bin/activate” to begin using this environment.
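If you are ever unsure whether an environment is active, python can tell you. This is a small sketch, not an official virtualenv API: it relies on older virtualenv releases setting ‘sys.real_prefix’, and on newer virtualenv releases (and python 3’s built-in venv) making ‘sys.base_prefix’ differ from ‘sys.prefix’. The function name is my own.

```python
import sys


def in_virtual_environment():
    """Return True if this interpreter appears to be running inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix on the interpreter.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv releases make base_prefix differ from prefix.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    if in_virtual_environment():
        print("Virtual environment active: %s" % sys.prefix)
    else:
        print("No virtual environment is active")
```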

Now that we have our environment ready to go, we need to install some of the modules that are most often used when doing data work inside python, such as pandas, numpy and scipy.

These modules can be installed with one pip command: type “pip install” followed by the module names, separated by spaces.

You’re ready to start working with python for data analysis on your mac. Just remember, for each virtualenv you create, you’ll need to reinstall these modules if you wish to use them.
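Since each new virtualenv starts out empty, it can be handy to check which modules are still missing from the one you are in. Here is a minimal sketch; ‘missing_modules’ is a name I picked for this example, and the module list at the bottom is only an illustration that you should swap for whatever you installed.

```python
import importlib


def missing_modules(names):
    """Return the names from `names` that cannot be imported in this environment."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Example module names only; swap in whatever you installed.
    wanted = ["pandas", "numpy", "scipy"]
    not_found = missing_modules(wanted)
    if not_found:
        print("Missing from this environment: " + ", ".join(not_found))
    else:
        print("All modules are importable")
```

Run it with the environment activated, otherwise it reports on whichever python your shell picks up by default.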

Check back here often for more information on using the above modules to actually DO something.

Installing Python on Linux (and the necessary modules)

Need help installing python on linux? I can hopefully help. To get started installing python on linux, there are a couple of options for you. The first option – which is most likely the easiest, with the fewest headaches – is to go download Anaconda or Enthought Canopy. Either of these routes will get python installed and configured in such a way that will allow you to step right in and just use it.

As I said in “Installing python on Windows”, I prefer Canopy over Anaconda for scientific computing / data analytics, but either will work for you. Installing Canopy on linux is very similar to installing it on Windows, so I’ll let that post be your guide for installing Canopy.

The second option for python is to install all the pieces yourself using the command line and/or the package manager provided by your linux distribution. I’m a fan of the command line and will provide that overview here.

I’m going to assume that you are on a recent flavor of ubuntu for this (I’m using 16.04.1). If you are on another distribution, contact me and I can give you the instructions for those distros.

Before we get started, you should know that every linux distribution that I know of has python already installed, and most have python 2.7 installed.  I prefer 2.7 for data analytics so we’ll stick with that during this installation process.

Installing python on Linux (ubuntu) from the command line

Step 1 – open a terminal window and type “python”.  Tada…you’re done! (not really). As I said above, python 2.7 is installed on most (all?) linux distributions. There’s more to getting ready to use python for scientific applications / data science than just having python though.

[Screenshot: Installing Python on linux - check]

Type “exit()” into the python interpreter if you haven’t already closed it.

[Screenshot: Installing Python on linux - exit]

Step 2 – install various development tools for python that may not be installed. These include the ‘build-essential’ tools for linux, python’s ‘pip’ tool (to make python module installations easier) and ‘python-dev’ (needed for python headers, etc). In your terminal, type “sudo apt-get install -y build-essential python-pip python-dev” (note the ‘-y’ tells the apt-get command to install the items without asking for confirmation).
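One way to confirm that the python headers made it onto your machine is to look for ‘Python.h’ in the interpreter’s include directory. The sketch below is my own quick check, not something apt or pip provides, and it only looks at the interpreter you run it with.

```python
import os
import sysconfig


def has_python_headers():
    """Return True if Python.h is present in this interpreter's include directory."""
    include_dir = sysconfig.get_paths()["include"]
    return os.path.isfile(os.path.join(include_dir, "Python.h"))


if __name__ == "__main__":
    if has_python_headers():
        print("Python headers found")
    else:
        print("Python headers not found; the dev package may be missing")
```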

Step 3: Install a ‘virtual environment’. This isn’t a requirement, but I strongly recommend it as it allows you to segregate the various types of installations and versions of your python modules. For example, let’s say you do some development in python using pandas version 0.19 on all your projects. In six months, pandas upgrades and deprecates something that causes your code to break. You downgrade to pandas 0.19 to keep your code working but then see that pandas 0.21 contains an absolute ‘must have’ for a new project. What do you do? Re-write all of your code to use 0.21 or stay with 0.19? With a virtual environment, you can do both.
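To make the pandas example concrete, here is a rough sketch of how a project could check that the environment it is running in has the version series it expects. It assumes python 3.8 or later (for ‘importlib.metadata’), and the two helper names are mine, not part of pandas or pip.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(version, pinned):
    """Return True if `version` equals `pinned` or is a point release of it."""
    return version == pinned or version.startswith(pinned + ".")


if __name__ == "__main__":
    found = installed_version("pandas")
    if found is None:
        print("pandas is not installed in this environment")
    elif matches_pin(found, "0.19"):
        print("pandas %s matches the 0.19 series" % found)
    else:
        print("pandas %s is outside the 0.19 series" % found)
```

Run inside the old project’s environment this would report a 0.19 release, and inside the new project’s environment a 0.21 release, with neither installation touching the other.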

I use and recommend ‘virtualenv’. There are other options out there (using docker, individual virtual machines, etc) but virtualenv is the simplest / quickest way to get things done.  With virtualenv installed, you can install specific versions of python modules for a project while using other versions of modules for other projects.

To install virtualenv, type “pip install virtualenv” (note that we are using ‘pip’ now rather than apt-get; prefix the command with ‘sudo’ if you get a permissions error).

Now, whenever you start a new project, type “virtualenv env” to install python into a new virtual environment (the ‘env’ is the name of the environment). You only have to do this once per project. Note: You should use a folder per project to keep your virtual environments separated.

Whenever you want to work on a specific project, change into that folder and type “source env/bin/activate”. This will set up your environment with all of your installed python modules.

For the purpose of this walk-through, perform the following steps:

  • Create a folder in your home directory called ‘projects’.
  • Type “mkdir projects” to do this from the command line.
  • Change into that folder (“cd projects”) and then type “mkdir install_example” to create another folder inside the projects folder.
  • Change into the new folder (“cd install_example”).
  • Type “virtualenv env” to create your virtual environment.
  • Type “source env/bin/activate” to begin using this environment.
  • Your prompt should now start with “(env)”, similar to the screenshot below.

[Screenshot: Installing Python on linux - virtualenv]
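If you find yourself repeating these steps for every project, they can be scripted. The sketch below is one possible way to do it, with a swap worth naming plainly: it uses python 3’s built-in ‘venv’ module rather than the virtualenv tool used in this post. The function name and its default arguments are my own choices for illustration.

```python
import os
import venv


def create_project(base_dir, name="install_example", env_name="env"):
    """Create <base_dir>/projects/<name> with a virtual environment inside it."""
    project_dir = os.path.join(base_dir, "projects", name)
    os.makedirs(project_dir, exist_ok=True)
    env_dir = os.path.join(project_dir, env_name)
    # with_pip=False keeps the sketch self-contained; pass with_pip=True if you
    # want pip installed into the new environment.
    venv.create(env_dir, with_pip=False)
    return env_dir


# Example call (creates ~/projects/install_example/env):
# create_project(os.path.expanduser("~"))
```

You would still activate the result from the terminal with “source env/bin/activate”, exactly as in the steps above.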

Now that we have our environment ready to go, we need to install some of the modules that are most often used when doing data work inside python, such as pandas, numpy and scipy.

These modules can be installed with one pip command: type “pip install” followed by the module names, separated by spaces.

You’re ready to start working with python for data analysis. Just remember, for each virtualenv you create, you’ll need to reinstall these modules if you wish to use them.

Check back here often for more information on using the above modules to actually DO something.

Installing python on Windows

If you’ve done any work with python on Windows, you may be cringing right now at the thought of trying to do any type of python development work on the platform.  Have no fear though…there is hope for python developers on Windows, especially if you are only going to be using python for data analysis, machine learning, etc and not doing any major web development work (with flask, django, etc). In this post, I describe the steps necessary for installing python on Windows.

There’s really only one method for using / installing python on Windows that is convenient and works for 99.9% of the people on Windows who are focused on scientific computing: downloading Enthought Canopy or Anaconda and installing it. For those of you getting started with data analytics, Canopy gets you started faster and makes it very easy to get modules like pandas, numpy, scipy, etc installed and configured (in most cases, these are already installed when Canopy is installed).

For those of you running on Mac or Linux, you can also install Canopy for your platforms. I personally don’t use Canopy on the Mac or Linux platform, but only because I prefer to manage things a bit differently on those platforms. There’s nothing wrong with using Canopy on Mac or Linux, I just prefer not to.

Installing Python on Windows using Canopy

For the purposes of this post, we are going to install Canopy (accurate as of November 2016).

  • Step 1 – visit the Enthought Canopy website and click the “Get Canopy” button.
  • Step 2 – select the “download” option for Canopy Express – FREE. This lets you get the platform without paying any additional money. If you are going to be using Canopy for heavy duty scientific work, I’d recommend buying one of their subscriptions since you get more modules, etc to work with. If you are a student or work in academia, you can ask for an academic license for free.
  • Click the “Download Canopy” button. A web form will pop up asking for information; you can ignore it, because your download has already started. Note: Canopy is available in 64-bit and 32-bit versions. I recommend the 64-bit version if you are on a modern computer / operating system.

[Screenshot: Installing python on Windows - Canopy Download]

  • Once your download completes, run the executable to begin the installation process. A wizard will be displayed…hit “next” through the wizard and install the software. Once installation is complete, the final screen (see below) will have a “finish” button and a “Launch Canopy when setup exits” checkbox. Leave the checkbox selected and click “finish” to complete the installation and launch Canopy.

[Screenshot: Installing python on Windows - Canopy Final Installation Screen]

  • The first time you run Canopy, you will be presented with an ‘environment’ window (see below). You can leave this at the default or select another location to store your environment information. I suggest leaving it at default to begin with. Click “Continue” to begin using Canopy.

[Screenshot: Installing python on Windows - Canopy Environment Window]

  • The first time you load Canopy, it will take some time to load the various modules into memory and set up your Canopy environment. Each time after this first start, the platform should load up fairly quickly.
  • Once Canopy completes loading your environment for the first time, you’ll be asked if you want to make Canopy your default Python environment. Select “Yes” and click “Start using Canopy”. If you select “no”, you will have to do some manual configuration to begin using Canopy.

[Screenshot: Installing python on Windows - Canopy Default Python]

  • When Canopy starts, you’ll see the window below.

[Screenshot: Installing python on Windows - Canopy Start Page]

  • You now have Canopy installed and ready to go. To start programming, click the ‘Editor’ button and Canopy will load up an editor so you can begin work. Below is a screenshot of the editor window.

[Screenshot: Installing python on Windows - Editor Window]

Check out the other posts on this website for more information on how to get started actually DOING something with python.