TensorFlow

From the TensorFlow website, http://tensorflow.org:

TensorFlow™ is an open source software library for high performance numerical computation. Its flexible architecture allows easy deployment of computation across a variety of platforms (CPUs, GPUs, TPUs), and from desktops to clusters of servers to mobile and edge devices. Originally developed by researchers and engineers from the Google Brain team within Google’s AI organization, it comes with strong support for machine learning and deep learning and the flexible numerical computation core is used across many other scientific domains.

Building GPU TensorFlow on CSD3

To deploy TensorFlow with GPU support, execute the following:

# 1. Load appropriate modules and environment
ssh login-gpu.hpc.cam.ac.uk
module purge
module load rhel7/default-peta4
module load cuda/9.0 cudnn/7.3_cuda-9.0

# 2. Install TensorFlow (1.8 at the time of writing) in a Python virtualenv
virtualenv ./tensorflow-env
source ./tensorflow-env/bin/activate
pip install --upgrade pip
pip install --upgrade numpy scipy wheel
pip install cryptography
pip install tensorflow-gpu
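
Note that a bare pip install tensorflow-gpu pulls in whichever release is newest at the time, which may expect a newer CUDA/cuDNN than the cuda/9.0 and cuDNN 7 modules loaded above. If you want the wheel to match those modules, pinning the version is a safer choice, for example:

# (optional) pin the TensorFlow version to match the loaded CUDA/cuDNN modules
pip install 'tensorflow-gpu==1.8.*'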

# 3. Test your installation (run this in an interactive session
# obtained via sintr, or submit a job to a GPU compute node)
cat << 'EOF' > helloworld.py
#!/usr/bin/env python
import tensorflow as tf
hello = tf.constant('hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))
EOF
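
The heredoc above only writes helloworld.py; to actually run the test, execute it from within the activated virtualenv on a GPU node. The second command below is an optional sanity check, using TensorFlow's device_lib utility, that a GPU is visible to TensorFlow (its output will vary with the node you land on):

# run the test script, then list the devices TensorFlow can see
python ./helloworld.py
python -c "from tensorflow.python.client import device_lib; print(device_lib.list_local_devices())"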

Running TensorFlow

TensorFlow can be run with an sbatch script similar to the following:

#!/bin/bash
#SBATCH -A MYACCOUNT
#SBATCH -p pascal
#SBATCH -t 01:00:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=3
#SBATCH --gres=gpu:1

module purge
module load rhel7/default-gpu
module load cuda/9.0 cudnn/7.0_cuda-9.0

# 1. execute using the virtualenv installation:
. ./tensorflow-env/bin/activate
python ./helloworld.py

# 2. or, alternatively, execute using a Singularity image:
singularity exec --nv ./tensorflow-gpu.simg python ./helloworld.py
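
Assuming the script above has been saved as, say, tensorflow-job.sh (the filename is arbitrary), it can be submitted and monitored with the usual Slurm commands; with Slurm's default settings the output of the print statement ends up in slurm-<jobid>.out in the submission directory:

sbatch tensorflow-job.sh     # submit the job; Slurm prints the job id
squeue -u $USER              # check whether the job is pending or running
less slurm-<jobid>.out       # inspect the output once the job has finished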

Building TensorFlow from source on CSD3

The following instructions have been adapted from the TensorFlow documentation at https://www.tensorflow.org/install/install_sources.

  1. Repeat step 1 above and then clone the TensorFlow repository:

    git clone https://github.com/tensorflow/tensorflow
    
  2. The preceding git clone command creates a subdirectory named tensorflow. After cloning, you may optionally check out a specific branch (such as a release branch) by invoking the following commands:

    cd tensorflow
    git checkout Branch # where Branch is the desired branch
    
  3. Repeat step 2 above: activate a virtualenv and install the necessary dependencies using pip, then load Google's Bazel and install mock:

    module load bazel-0.13.0-gcc-5.4.0-6hnokt7
    virtualenv --system-site-packages ./tensorflow-env
    source ./tensorflow-env/bin/activate
    pip install --upgrade pip
    pip install --upgrade numpy scipy wheel cryptography
    pip install --upgrade mock
    
  4. cd to the top-level directory created and run the configure script. The following is an example of TensorFlow configured with MPI enabled:

    cd tensorflow
    ./configure
    
    Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python2.7
    Found possible Python library paths:
      /usr/local/lib/python2.7/dist-packages
      /usr/lib/python2.7/dist-packages
    Please input the desired Python library path to use.  Default is [/usr/lib/python2.7/dist-packages]
    
    Using python library path: /usr/local/lib/python2.7/dist-packages
    Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:
    Do you wish to use jemalloc as the malloc implementation? [Y/n] n
    no jemalloc support enabled
    Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] N
    No Google Cloud Platform support will be enabled for TensorFlow
    Do you wish to build TensorFlow with Hadoop File System support? [y/N] N
    No Hadoop File System support will be enabled for TensorFlow
    Do you wish to build TensorFlow with the XLA just-in-time compiler (experimental)? [y/N] N
    No XLA support will be enabled for TensorFlow
    Do you wish to build TensorFlow with VERBS support? [y/N] N
    No VERBS support will be enabled for TensorFlow
    Do you wish to build TensorFlow with OpenCL support? [y/N] N
    No OpenCL support will be enabled for TensorFlow
    Do you wish to build TensorFlow with CUDA support? [y/N] Y
    CUDA support will be enabled for TensorFlow
    Do you want to use clang as CUDA compiler? [y/N]
    nvcc will be used as CUDA compiler
    Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 9.0]: (empty)
    Please specify the location where CUDA 9.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
    /usr/local/Cluster-Apps/cuda/9.0/
    Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:
    /usr/local/software/spack/spack-0.11.2/opt/spack/linux-rhel7-x86_64/gcc-4.8.5/gcc-5.4.0-fis24ggupugiobii56fesif2y3qulpdr/bin/gcc
    Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7.0]: 7
    Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
    /usr/local/Cluster-Apps/cudnn/7.0_cuda-9.0/
    Please specify a list of comma-separated CUDA compute capabilities you want to build with.
    You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
    Please note that each additional compute capability significantly increases your build time and binary size.
    
    Do you wish to build TensorFlow with MPI support? [y/N] y
    MPI support will be enabled for TensorFlow
    Configuration finished
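
  5. The configure script only records the build settings; nothing has been compiled yet. As a sketch of the remaining steps, following the TensorFlow source-build guide linked above, the pip package can be built and installed into the active virtualenv as below (the /tmp/tensorflow_pkg output directory is an arbitrary choice, and a full GPU build can take several hours):

    bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
    bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
    pip install /tmp/tensorflow_pkg/tensorflow-*.whl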