Using Python

CSD3 provides central installations of both Python 2 and Python 3. The central installations provide many of the most common packages used for scientific computation and data analysis. Users can add further packages by using either virtual environments or conda environments.

Recent versions of the Python interpreter can be accessed as either python-3.6.2-gcc-5.4.0-me5fsee (compiled with gcc) or python-3.6.2-intel-17.0.4-lv2lxsb (compiled with the Intel compiler).

Using Virtual Environments

Virtualenv provides a method for installing new or upgraded Python packages as a user, without needing to ask support to make changes centrally. It is currently supported in the python/2.7.5 module.

Step-by-step guide

We have installed virtualenv into the centrally available Python modules, and it is also available for the native Python installed as part of Scientific Linux 7 on CSD3. If you are happy to use the latter (which is version 2.7.5), there is no need to load a Python module; otherwise, please load the desired Python module first.

A guide to using virtualenv can be found here: https://pypi.python.org/pypi/virtualenv.

In short, you can create a sandboxed version of Python in the directory of your choice, e.g. YOUR_PYTHON (if the directory does not exist, you must create it first), via:

virtualenv YOUR_PYTHON

Then activate this via:

source YOUR_PYTHON/bin/activate

and deactivate via:

deactivate

You can get it to inherit the central packages via:

virtualenv --system-site-packages YOUR_PYTHON

Once the virtualenv environment is activated, use the normal methods for downloading and installing python packages (e.g. pip) and the packages will be installed into your YOUR_PYTHON directory, where they will override the contents of the central python/2.7.5 installation. Invoke python as normal and the new components should be visible (as long as the virtualenv environment is activated).
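One way to confirm that the environment is active is to ask the interpreter itself. The sketch below is an illustration and not part of the CSD3 setup; the helper name in_virtualenv is hypothetical. It assumes that older virtualenv releases set sys.real_prefix, while newer releases (and the standard venv module) make sys.prefix differ from sys.base_prefix. It does not detect conda environments.

```python
from __future__ import print_function

import sys


def in_virtualenv():
    """Return True if this interpreter is running inside a virtual environment."""
    # Older virtualenv releases (common with Python 2) set sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # Newer virtualenv releases and venv expose sys.base_prefix instead;
    # fall back to sys.prefix when the attribute does not exist.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return base_prefix != sys.prefix


if __name__ == "__main__":
    print("prefix:", sys.prefix)
    print("inside a virtual environment:", in_virtualenv())
```

Run it with python after sourcing YOUR_PYTHON/bin/activate, and again after deactivate, to compare the two results.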

Additional notes

If your current directory is not the one that contains YOUR_PYTHON, you can use a full path instead of a relative path (e.g. /home/my-crsid/YOUR_PYTHON) to activate the virtual environment from any location in the filesystem.

The virtualenv command needs to be run only once, to create and initialize the sandbox. After that, you just activate and deactivate it as needed.
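The "create once, then reuse" pattern can also be scripted. The sketch below is a rough illustration that swaps in Python 3's standard venv module, which is a different tool from the virtualenv command described above; the helper name ensure_env is hypothetical. It creates the sandbox only when it does not already exist.

```python
import os
import tempfile
import venv


def ensure_env(env_dir, system_site_packages=False):
    """Create a venv at env_dir unless one already exists.

    Returns True if a new environment was created, False if it was reused.
    """
    env_dir = os.path.abspath(env_dir)
    # venv writes a pyvenv.cfg file at the top of every environment it creates.
    if os.path.isfile(os.path.join(env_dir, "pyvenv.cfg")):
        return False
    # with_pip=False keeps the sketch quick; pass True to bootstrap pip as well.
    venv.create(env_dir, system_site_packages=system_site_packages, with_pip=False)
    return True


if __name__ == "__main__":
    # Demonstrate in a throwaway directory rather than in your home directory.
    target = os.path.join(tempfile.mkdtemp(), "YOUR_PYTHON")
    print("created:", ensure_env(target))        # first call creates the sandbox
    print("created again:", ensure_env(target))  # second call reuses it
```

Passing system_site_packages=True plays the same role as the --system-site-packages option shown earlier.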

Using Anaconda Python

Users have the standard system Python available by default. To set up your environment to use the Anaconda distributions, use:

for python 2:

module load miniconda2-4.3.14-gcc-5.4.0-xjtq53h

or for python 3:

module load miniconda3-4.5.4-gcc-5.4.0-hivczbz

Note that due to a known incompatibility between the miniconda2 environment and the tcl modules system, loading the miniconda2-4.3.14-gcc-5.4.0-xjtq53h module will render further module commands inoperable. Logging out and back into a fresh environment is the best way to clear this problem.

You can verify the current version of Python with:

[user@login-e-17 ~]$ module load miniconda3-4.5.4-gcc-5.4.0-hivczbz
[user@login-e-17 ~]$ python3 --version
Python 3.6.5 :: Anaconda, Inc.
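The same check can be made from inside Python, which is useful in job scripts that need to confirm which interpreter they picked up. This is a minimal sketch using only the standard library; the function name interpreter_summary is hypothetical.

```python
from __future__ import print_function

import platform
import sys


def interpreter_summary():
    """Collect the version, implementation and path of the running interpreter."""
    return {
        "version": platform.python_version(),
        "implementation": platform.python_implementation(),
        "executable": sys.executable,
    }


if __name__ == "__main__":
    for key, value in sorted(interpreter_summary().items()):
        print(key + ":", value)
```

The executable entry shows the full path of the interpreter, so it indicates whether the system Python or the one from the loaded module is in use.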

Installing additional Python modules

If the central installation does not have a package or module that you require, you can install this yourself for your use by using conda environments.
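Before building an environment, it can be worth checking whether the package is already importable. The following is a small sketch for Python 3 using the standard importlib module; the helper name is_available is hypothetical, and the name passed in must be the import name (for example Bio for the biopython package).

```python
import importlib.util


def is_available(module_name):
    """Return True if module_name can be found by this interpreter."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # ImportError: a dotted name whose parent package is missing.
        # ValueError: a loaded module whose __spec__ is unset.
        return False


if __name__ == "__main__":
    for name in ("json", "Bio"):
        print(name, "available" if is_available(name) else "missing")
```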

A conda environment is a local copy of the central install that you can then modify with additional modules/packages, or even with different versions of existing packages.

Full documentation on using conda environments can be found online in the conda documentation, under "Managing conda environments".

Below we show a short example of creating a local Python environment and installing the biopython package.

[user@login-e-1 ~]$ module load miniconda3-4.5.4-gcc-5.4.0-hivczbz
[user@login-e-1 ~]$ conda create --prefix ./bioenv biopython
Collecting package metadata: done
Solving environment: done

## Package Plan ##

  environment location: /local/js947/bioenv

  added / updated specs:
    - biopython


The following packages will be downloaded:
[... list of packages ...]

The following NEW packages will be INSTALLED:
[... list of packages ...]

Proceed ([y]/n)? y

[... download and install the packages ...]


[user@login-e-1 ~]$ source activate ./bioenv
[user@login-e-1 ~]$ python
[... python version info ...]
>>> from Bio.Seq import Seq
>>> from Bio.Alphabet.IUPAC import unambiguous_dna
>>> new_seq = Seq('GATCAGAAG', unambiguous_dna)
>>> new_seq[0:2]
Seq('GA', IUPACUnambiguousDNA())
>>> new_seq.translate()
Seq('DQK', IUPACProtein())
>>>
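To confirm that a package is being loaded from the activated environment and not from the central installation, you can inspect where Python found it. This is a sketch under the assumption that packages installed into an environment live below sys.prefix; the helper name module_origin is hypothetical.

```python
from __future__ import print_function

import importlib
import os
import sys


def module_origin(module_name):
    """Import module_name and return (file_path, loaded_from_active_prefix)."""
    module = importlib.import_module(module_name)
    path = getattr(module, "__file__", None)
    if path is None:
        # Built-in modules such as sys have no file on disk.
        return (None, False)
    path = os.path.realpath(path)
    prefix = os.path.realpath(sys.prefix)
    return (path, path.startswith(prefix + os.sep))


if __name__ == "__main__":
    # Pass an import name on the command line, e.g. Bio; defaults to json.
    name = sys.argv[1] if len(sys.argv) > 1 else "json"
    print(module_origin(name))
```

With the bioenv environment active, calling module_origin("Bio") should report a path below the environment location if the package was installed there.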