Installing Python on Linux (and the necessary modules)

Need help installing Python on Linux? I can hopefully help. To get started, there are a couple of options. The first option, which is most likely the easiest and causes the fewest headaches, is to download Anaconda or Enthought Canopy. Either route gets Python installed and configured so that you can step right in and just use it.

As I said in “Installing python on Windows”, I prefer Canopy over Anaconda for scientific computing / data analytics, but either will work for you. Installing Canopy on Linux is very similar to installing it on Windows, so I’ll let that post be your guide for installing Canopy.

The second option is to install all the pieces yourself using the command line and/or the package manager provided by your Linux distribution. I’m a fan of the command line and will provide that overview here.

I’m going to assume that you are on a recent flavor of ubuntu for this (I’m using 16.04.1). If you are on another distribution, contact me and I can give you the instructions for those distros.

Before we get started, you should know that every linux distribution that I know of has python already installed, and most have python 2.7 installed.  I prefer 2.7 for data analytics so we’ll stick with that during this installation process.

Installing python on Linux (ubuntu) from the command line

Step 1 – open a terminal window and type “python”.  Tada…you’re done! (not really). As I said above, python 2.7 is installed on most (all?) linux distributions. There’s more to getting ready to use python for scientific applications / data science than just having python though.

Installing Python on linux - check

Type “exit()” into the python interpreter if you haven’t already closed it.
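If you want to confirm which interpreter the “python” command actually launched, a small script along these lines can help. This is only a minimal sketch using the standard library; the helper name describe_interpreter is made up for illustration and is not part of any package.

```python
import platform
import sys


def describe_interpreter():
    """Return a short summary of the running Python interpreter."""
    # sys.version_info starts with (major, minor, micro)
    major, minor, micro = sys.version_info[:3]
    return {
        "version": "%d.%d.%d" % (major, minor, micro),
        "executable": sys.executable,  # path of the python binary in use
        "implementation": platform.python_implementation(),
    }


if __name__ == "__main__":
    info = describe_interpreter()
    for key in ("version", "executable", "implementation"):
        print("%s: %s" % (key, info[key]))
```

Save it as a file and run it with “python”, or paste the function into the interpreter and call it. The reported version tells you whether you are on the 2.7 series or something newer.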

Installing Python on linux - exit

Step 2 – install various development tools for python that may not be installed.  These include the ‘build-essential’ tools for linux, python’s ‘pip’ tool (to make python module installations easier) and ‘python-dev’ (needed for python headers, etc). In your terminal, type the following (note the ‘-y’ tells the apt-get command to install the items without asking for confirmation):

sudo apt-get install build-essential python-pip python-dev -y

Step 3 – install a ‘virtual environment’. This isn’t a requirement, but I strongly recommend it because it lets you keep the installations and versions of your Python modules separate for each project. For example, let’s say you do some development in Python using pandas version 0.19 on all your projects. In six months, pandas upgrades and deprecates something that causes your code to break. You downgrade to pandas 0.19 to keep your code working but then see that pandas 0.21 contains an absolute ‘must have’ for a new project. What do you do? Re-write all of your code to use 0.21 or stay with 0.19? With a virtual environment, you can do both.

I use and recommend ‘virtualenv’. There are other options out there (using docker, individual virtual machines, etc) but virtualenv is the simplest / quickest way to get things done.  With virtualenv installed, you can install specific versions of python modules for a project while using other versions of modules for other projects.

To install virtualenv type the following (note that we are using ‘pip’ now rather than apt-get):

sudo pip install virtualenv

Now, whenever you start a new project, type the following to install Python into a new virtual environment (‘env’ is the name of the environment). You only have to do this once per project. Note: You should use one folder per project to keep your virtual environments separated.

virtualenv env

Whenever you want to work on a specific project, change into that folder and type the following. This will set up your environment with all of your installed python modules:

source env/bin/activate

For the purpose of this walk-through, perform the following steps:

  • Create a folder in your home directory called ‘projects’. Type “mkdir projects” to do this from the command line.
  • Change into that folder (“cd projects”) and then type “mkdir install_example” to create another folder inside the projects folder.
  • Change into the new folder (“cd install_example”) and type “virtualenv env” to create your virtual environment.
  • Type “source env/bin/activate” to begin using this environment.
  • You should see something similar to the below.

Installing Python on linux - virtualenv
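Besides looking at the prompt, you can ask Python itself whether it is running from a virtual environment. The following is a minimal sketch based on the standard sys module; the function name in_virtual_environment is hypothetical, and which of the two checks applies depends on the virtualenv and Python versions you have installed.

```python
import sys


def in_virtual_environment():
    """Return True if this interpreter appears to belong to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix inside an environment.
    if hasattr(sys, "real_prefix"):
        return True
    # Python 3's venv and newer virtualenv releases set sys.base_prefix,
    # which differs from sys.prefix when the interpreter lives in an environment.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("prefix: %s" % sys.prefix)
    print("inside a virtual environment: %s" % in_virtual_environment())
```

Run from inside the activated ‘env’, the prefix should point at your project’s env folder rather than at the system-wide Python.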

Now that we have our environment ready to go, we need to install some of the modules that are most often used when doing data work inside Python. These modules are:

  • pandas
  • scipy
  • scikit-learn
  • statsmodels
  • sympy
  • matplotlib
  • jupyter

The above modules can be installed with one pip command.

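pip install pandas scipy scikit-learn statsmodels sympy matplotlib jupyter

Note that there is no ‘sudo’ here: with the virtual environment active, pip installs into ‘env’ rather than system-wide, and running it through sudo would typically bypass the environment.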

You’re ready to start working with python for data analysis. Just remember, for each virtualenv you create, you’ll need to reinstall these modules if you wish to use them.
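To double-check that the installation worked inside the active environment, you can try importing each module. This is a rough sketch rather than an official check: check_modules and the MODULES list are names invented for this example. It also assumes the usual import names, i.e. that scikit-learn is imported as sklearn and that installing jupyter brings in the jupyter_core package.

```python
import importlib

# Import names to test. scikit-learn is imported as "sklearn"; "jupyter_core"
# is assumed to be pulled in by the jupyter package.
MODULES = ["pandas", "scipy", "sklearn", "statsmodels", "sympy",
           "matplotlib", "jupyter_core"]


def check_modules(names):
    """Try to import each module; return {name: version string or None}."""
    results = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            results[name] = None  # not importable in this environment
        else:
            # Most packages expose __version__; fall back to "unknown".
            results[name] = getattr(module, "__version__", "unknown")
    return results


if __name__ == "__main__":
    for name, version in sorted(check_modules(MODULES).items()):
        print("%-14s %s" % (name, version if version else "MISSING"))
```

Anything reported as MISSING was not installed into the environment you currently have activated.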

Check back here often for more information on using the above modules to actually DO something.

Installing python on Windows

Note: Enthought Canopy is End-of-Life. Rather than re-write this piece, I’ll just point readers to the Enthought End-of-Life note for more information on how to move to the new supported version(s). When time permits, I’ll write up another post describing the installation process.

If you’ve done any work with python on Windows, you may be cringing right now at the thought of trying to do any type of python development work on the platform.  Have no fear though…there is hope for python developers on Windows, especially if you are only going to be using python for data analysis, machine learning, etc and not doing any major web development work (with flask, django, etc). In this post, I describe the steps necessary for installing python on Windows.

There’s really only one method for using / installing Python on Windows that is convenient and works for 99.9% of the people on Windows who are focused on scientific computing: downloading Enthought Canopy or Anaconda and installing it. For those of you getting started with data analytics, Canopy gets you started faster and makes it very easy to get modules like pandas, numpy, scipy, etc. installed and configured (in most cases, these are already installed when Canopy is installed).

For those of you running on Mac or Linux, you can also install Canopy for your platforms. I personally don’t use Canopy on the Mac or Linux platform, but only because I prefer to manage things a bit differently on those platforms. There’s nothing wrong with using Canopy on Mac or Linux, I just prefer not to.

Installing Python on Windows using Canopy

For the purposes of this post, we are going to install Canopy (accurate as of November 2016).

  • Step 1 – visit the Enthought Canopy website and click the “Get Canopy” button.
  • Step 2 – select the “download” option for Canopy Express – FREE. This lets you get the platform without paying any additional money. If you are going to be using Canopy for heavy duty scientific work, I’d recommend buying one of their subscriptions since you get more modules, etc to work with. If you are a student or work in academia, you can ask for an academic license for free.  Note – Direct link for downloading Canopy Express.
  • Click the “Download Canopy” button. A web form will pop up asking for information…you can ignore that. Your download has started. Note: Canopy is available in 64-bit and 32-bit versions. I recommend the 64-bit if you are on a modern computer / operating system.

 

Installing python on Windows - Canopy Download

  • Once your download completes, run the executable to begin the installation process. A wizard will be displayed…hit “next” through the wizard and install the software. Once installation is complete, the final screen (see below) will have a “finish” button and a “Launch Canopy when setup exits” checkbox. Leave the checkbox selected and click “finish” to complete the installation and launch Canopy.

Installing python on Windows - Canopy Final Installation Screen

  • The first time you run Canopy, you will be presented with an ‘environment’ window (see below). You can leave this at the default or select another location to store your environment information. I suggest leaving it at the default to begin with. Click “Continue” to begin using Canopy.

Installing python on Windows - Canopy Environment Window

  • The first time you load Canopy, it will take some time to load the various modules into memory and set up your Canopy environment. Each time after this first start, the platform should load up fairly quickly.
  • Once Canopy completes loading your environment for the first time, you’ll be asked if you want to make Canopy your default Python environment. Select “Yes” and click “Start using Canopy”. If you select “No”, you will have to do some manual configuration to begin using Canopy.

Installing python on Windows - Canopy Default Python

  • When Canopy starts, you’ll see the window below.

Installing python on Windows - Canopy Start Page

  • You now have Canopy installed and ready to go. To start programming, click the ‘Editor’ button and Canopy will load up an editor so you can begin work. Below is a screenshot of the editor window.

Installing python on Windows - Editor Window

Check out the other posts on this website for more information on how to get started actually DOING something with python.