Using Python¶
CSD3 provides central installations of both Python 2 and Python 3. The central installations include many of the most common packages used for scientific computation and data analysis. Users can add further packages by using either virtual environments or conda environments.
Recent versions of the Python interpreter can be accessed as python/2.7, python/3.5, python/3.6, python/3.7 or python/3.8. The default is currently python/3.6. These packages will be automatically upgraded to the latest point release.
Using Virtual Environments¶
Virtualenv provides a method for installing new or upgraded Python packages as a user without the need to ask support to make changes centrally. It is currently supported in all the Python modules.
Step-by-step guide¶
We have installed virtualenv into the centrally available Python modules, and it is also available for the native Python installed as part of Scientific Linux 7 on CSD3. If you are happy to use the latter (which is version 2.7.5), there is no need to load a Python module; otherwise, please load the desired Python module first.
A guide to using virtualenv can be found here: https://pypi.python.org/pypi/virtualenv.
In short, you can create a sandboxed version of Python in a directory of your choice, e.g. YOUR_PYTHON (if the directory does not exist, create it first), via:
virtualenv YOUR_PYTHON
Then activate this via:
source YOUR_PYTHON/bin/activate
and deactivate via:
deactivate
You can get it to inherit the central packages via:
virtualenv --system-site-packages YOUR_PYTHON
Once the virtualenv environment is activated, use the normal methods for downloading and installing Python packages (e.g. pip). The packages will be installed into the YOUR_PYTHON directory, where they will override the contents of the central Python installation. Invoke Python as normal and the new components should be visible (as long as the virtualenv environment is activated).
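To confirm from within Python that the activated environment is the one actually in use, a check along the following lines may help. This is a minimal sketch rather than anything provided on CSD3: the helper name in_virtualenv is hypothetical, and the check assumes either sys.base_prefix (Python 3.3 and later) or the sys.real_prefix attribute set by older virtualenv releases.

```python
import sys


def in_virtualenv():
    """Return True if this interpreter appears to run inside a virtual environment."""
    # Older virtualenv releases record the original interpreter in sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # Python 3.3+ and newer virtualenv releases use sys.base_prefix instead;
    # fall back to sys.prefix when the attribute is absent (e.g. plain Python 2).
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside a virtual environment:", in_virtualenv())
```

A conda environment is not a virtualenv, so this check is expected to report False there.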
Additional notes¶
If you are not in the directory that contains YOUR_PYTHON, you can use a full path instead of a relative path (e.g. /home/my-crsid/YOUR_PYTHON) to activate the virtual environment from any location in the filesystem.
The virtualenv command needs to be run only once, to create and initialize the sandbox. After that, you just need to activate and deactivate the environment as required.
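The same create-once pattern can be sketched from Python itself. Note that this example uses the standard library venv module (Python 3.3 and later), not the virtualenv tool described above, and the helper name create_env is hypothetical. It is an illustration under those assumptions, not the recommended CSD3 workflow.

```python
import tempfile
import venv
from pathlib import Path


def create_env(env_dir, inherit_central=False):
    """Create a virtual environment once and return its absolute path."""
    env_path = Path(env_dir).expanduser().resolve()
    # system_site_packages=True plays the role of --system-site-packages.
    # with_pip=False keeps this sketch quick; a real environment would
    # normally be created with pip available.
    venv.create(str(env_path), system_site_packages=inherit_central, with_pip=False)
    return env_path


if __name__ == "__main__":
    # A temporary directory stands in for a location such as your home directory.
    with tempfile.TemporaryDirectory() as scratch:
        env = create_env(Path(scratch) / "YOUR_PYTHON")
        # The absolute path to the activate script can be sourced from any directory.
        print(env / "bin" / "activate")
```

Returning the absolute path makes it easy to record the location once and reuse it wherever the environment is activated.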
Using Anaconda Python¶
To set up your environment to use the Anaconda distributions you should use:
for Python 2:
[user@login-e-17 ~]$ module load miniconda/2
or for Python 3:
[user@login-e-17 ~]$ module load miniconda/3
You can verify the current version of Python with:
[user@login-e-17 ~]$ module load miniconda/3
[user@login-e-17 ~]$ python --version
Python 3.7.4
You can also verify the current version of conda with:
[user@login-e-17 ~]$ conda --version
conda 4.7.12
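The same check can be made from a Python script, which may be useful when a workflow needs to fail early if the miniconda module has not been loaded. The following is a minimal sketch; the function name conda_version is hypothetical, and it assumes only that a conda executable is found on PATH once the module is loaded.

```python
import shutil
import subprocess


def conda_version():
    """Return the output of `conda --version`, or None if conda is not on PATH."""
    conda = shutil.which("conda")
    if conda is None:
        # Typically means the miniconda module has not been loaded in this shell.
        return None
    result = subprocess.run(
        [conda, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(conda_version() or "conda not found; try loading the miniconda module")
```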
If this is your first time using Anaconda Python, it is important to run the conda init command to prepare your shell environment correctly for the full suite of conda commands. This only needs to be done once.
[user@login-e-17 ~]$ module load miniconda/3
[user@login-e-17 ~]$ which conda
/usr/local/software/master/miniconda/3/bin/conda
[user@login-e-17 ~]$ conda init
[... list of modifications not made ...]
modified /home/user/.bashrc
==> For changes to take effect, close and re-open your current shell. <==
Using Anaconda Python in SLURM scripts¶
Without running conda init, the commands conda activate and conda deactivate will fail with the following error.
[user@login-e-15 ~]$ conda activate
CommandNotFoundError: Your shell has not been properly configured to use 'conda activate'.
To initialize your shell, run
$ conda init <SHELL_NAME>
Currently supported shells are:
- bash
- fish
- tcsh
- xonsh
- zsh
- powershell
See 'conda init --help' for more information and options.
IMPORTANT: You may need to close and restart your shell after running 'conda init'.
Since conda init injects some logic into your .bashrc file, that file must be sourced for the logic to take effect. This is not an issue for interactive shell sessions, as it happens automatically when you log in. However, this is not the case for SLURM jobs, so it is important to add source ~/.bashrc to your submission script before any conda activate or conda deactivate commands.
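Because an activation that fails inside a batch job can be easy to miss, the Python program run by the job can verify early on that the intended environment is active. The sketch below relies on the CONDA_DEFAULT_ENV variable, which conda activate exports. The function names are hypothetical, and the environment name biopython is used purely as an example.

```python
import os
import sys


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or None if none is active."""
    env = os.environ if environ is None else environ
    # conda activate exports CONDA_DEFAULT_ENV with the environment name.
    return env.get("CONDA_DEFAULT_ENV") or None


def require_conda_env(expected, environ=None):
    """Exit with a clear message if the expected environment is not active."""
    actual = active_conda_env(environ)
    if actual != expected:
        sys.exit("expected conda environment %r but found %r" % (expected, actual))


if __name__ == "__main__":
    print("active conda environment:", active_conda_env())
```

Calling require_conda_env("biopython") at the top of a job script's Python entry point would stop the job immediately instead of letting it run against the wrong interpreter.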
Installing additional Anaconda Python modules¶
If the central base installation does not have a package or module that you require, you can install it yourself by using conda environments.
A conda environment is a local copy of the central install that you can then modify by adding modules/packages or by using different versions of existing packages.
Full documentation on using conda environments can be found online at Managing conda environments.
Below we show a short example of creating a local Python environment and installing the biopython package.
[user@login-e-1 ~]$ module load miniconda/3
[user@login-e-1 ~]$ conda create -n biopython biopython
Collecting package metadata: done
Solving environment: done
## Package Plan ##
environment location: /home/user/.conda/envs/biopython
added / updated specs:
- biopython
The following packages will be downloaded:
[... list of packages ...]
The following NEW packages will be INSTALLED:
[... list of packages ...]
Proceed ([y]/n)? y
[... download and install the packages ...]
[user@login-e-1 ~]$ conda info --envs
# conda environments:
#
biopython /home/user/.conda/envs/biopython
base * /usr/local/software/archive/linux-scientific7-x86_64/gcc-9/miniconda3-4.7.12.1-rmuek6r3f6p3v6fdj7o2klyzta3qhslh
[user@login-e-1 ~]$ conda activate biopython
(biopython) [user@login-e-1 ~]$ python
Python 3.8.5 (default, Aug 5 2020, 08:36:46)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from Bio.Seq import Seq
>>> new_seq = Seq('GATCAGAAG')
>>> new_seq[0:2]
Seq('GA')
>>> new_seq.translate()
Seq('DQK')
>>>
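After installing a package into an environment, it can be helpful to confirm non-interactively that the active interpreter can see it. The following is a minimal sketch with a hypothetical helper, package_status. It takes the import name (Bio) and the distribution name (biopython) separately because the two differ, and it reports a version only where importlib.metadata is available (Python 3.8 and later).

```python
import importlib.util

try:
    from importlib import metadata  # available from Python 3.8
except ImportError:
    metadata = None


def package_status(import_name, dist_name):
    """Return (importable, version) for a top-level module and its distribution."""
    # find_spec returns None when a top-level module cannot be found.
    importable = importlib.util.find_spec(import_name) is not None
    version = None
    if metadata is not None:
        try:
            version = metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            version = None
    return importable, version


if __name__ == "__main__":
    print(package_status("Bio", "biopython"))
```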