From the AlphaFold website

This package provides an implementation of the inference pipeline of AlphaFold v2.0. This is a completely new model that was entered in CASP14 and published in Nature. For simplicity, we refer to this model as AlphaFold throughout the rest of this document.

We also provide an implementation of AlphaFold-Multimer. This represents a work in progress and AlphaFold-Multimer isn’t expected to be as stable as our monomer AlphaFold system. Read the guide for how to upgrade and update code.

AlphaFold data on CSD3

The 2.8TB dataset is stored in:


Note that you may need to ls the directory in order for it to be mounted. There are example sequences stored in the input directory.

The dataset was updated in November 2023, so existing scripts may fail unless they point to the new versions of the files. In particular, newer versions of UniRef30 can be found at /datasets/public/AlphaFold/data/uniref30/UniRef30_2023_02*.
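Before submitting a job it can help to confirm that the dataset is mounted and that the database files a script refers to actually exist. The following is a minimal sketch of such a check. The helper name check_db_prefix is hypothetical, and the default path is simply the UniRef30 prefix quoted above; adjust both to suit your scripts.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if at least one file starts with the
# given prefix. Listing the parent directory first triggers the automount.
check_db_prefix() {
    local prefix="$1"
    local dir
    dir="$(dirname "$prefix")"
    ls "$dir" > /dev/null 2>&1 || { echo "cannot list $dir" >&2; return 1; }
    # Expand the glob into the function's positional parameters; if nothing
    # matches, $1 stays as the literal pattern and the -e test fails.
    set -- "${prefix}"*
    [ -e "$1" ] || { echo "no files match ${prefix}*" >&2; return 1; }
    echo "found: ${prefix}*"
}

# Assumed location, taken from the UniRef30 path quoted above.
UNIREF30="${UNIREF30:-/datasets/public/AlphaFold/data/uniref30/UniRef30_2023_02}"
check_db_prefix "$UNIREF30" || echo "update the database paths in your script"
```

The same function can be pointed at any other database prefix used in a job script.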

Running AlphaFold2 on CSD3

There are various ways to run AlphaFold2 on CSD3. We encourage the use of:

To get up and running quickly on CSD3 it is possible to run the Singularity container provided as a module:

module load alphafold/2.3.2-singularity

See the Singularity section for more information. This approach performs poorly because it runs the slow CPU step and the GPU step in sequence, leaving 4 GPUs idle for most of the time. For better performance, see the ParaFold section.

If you would like us to support other implementations of AlphaFold2 or if anything here is unclear or incorrect please contact support.

Separating CPU and GPU steps using ParaFold and Conda

ParaFold (also known as ParallelFold) is a fork of AlphaFold2 that separates the CPU MSA step from the GPU prediction so that they can be run as a two-step process. This is preferable because, with DeepMind’s Singularity build, the GPU remains idle for most of the running time, as shown below.

To install, create a Conda environment using CSD3’s Conda module or download it yourself:

module load miniconda/3
conda create -n parafold python=3.8

If downloading it yourself, use Miniforge, as it ships with mamba, an optimised implementation of Conda.

Then follow the instructions, with usage information here. Note that this fork is a version of the original that has been optimised to run on CSD3.

First we run the CPU MSA step on an Icelake node. The -f flag means that we only run the featurisation step:

#SBATCH -p icelake
#SBATCH --exclusive
#SBATCH -t 04:00:00

# source conda environment
module load miniconda/3
conda activate parafold


./ \
-d $DATA \
-o output \
-p monomer_ptm \
-i input/mono_set1/GB98.fasta \
-m model_1 \
-f

The featurisation step writes features.pkl and an MSA directory to the output directory. To run a monomer prediction, execute the following command on a GPU node:

#SBATCH -p ampere
#SBATCH --gres=gpu:1
#SBATCH -t 02:00:00

# source conda environment
module load miniconda/3
conda activate parafold


./ \
-d $DATA \
-o output \
-m model_1,model_2,model_3,model_4,model_5 \
-p monomer_ptm \
-i input/mono_set1/GB98.fasta \
-t 1800-01-01
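Because the GPU step needs the features written by the CPU step, the two jobs can be chained with a Slurm dependency so that the GPU job starts only if featurisation succeeds. The following is a minimal sketch, assuming the two scripts above have been saved as featurise.sh and predict.sh (hypothetical file names); the function name submit_chain is also hypothetical.

```shell
#!/bin/bash
# Submit the CPU featurisation job, then the GPU prediction job, which is
# held until the first job finishes successfully (afterok).
# The script names passed in are hypothetical; use your own file names.
submit_chain() {
    local cpu_script="$1" gpu_script="$2" cpu_job
    # --parsable makes sbatch print just the job ID (optionally "ID;cluster").
    cpu_job="$(sbatch --parsable "$cpu_script")" || return 1
    cpu_job="${cpu_job%%;*}"
    sbatch --parsable --dependency="afterok:${cpu_job}" "$gpu_script"
}

# Usage (on a login node): submit_chain featurise.sh predict.sh
```

With this pattern the GPU node is only allocated once the features exist, so it is not held idle during the MSA search.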

Running AlphaFold2 using Singularity on CSD3

Load the Singularity module, which adds a run_alphafold script to the environment. The script sets some default paths to the dataset:

module load alphafold/2.3.2-singularity

Create a Slurm script with the following contents to predict the structure of the T1050 sequence (779 residues). The script assumes that an input subdirectory exists and contains the T1050.fasta file:

#SBATCH -p ampere
#SBATCH --gres=gpu:1
#SBATCH -t 04:00:00

# load appropriate modules
module load rhel8/default-amp
module load alphafold/2.3.2-singularity

run_alphafold \
--pdb70_database_path=/data/pdb70/pdb70 \
--bfd_database_path /data/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt \
--output_dir  $PWD/output/T1050 \
--fasta_paths $PWD/input/T1050.fasta \
--max_template_date=2020-05-14 \
--db_preset=full_dbs

Alternatively, execute the full Singularity command:

# point to location of AlphaFold data

singularity run --env \
  -B $DATA:/data \
  -B .:/etc \
  --pwd /app/alphafold \
  --nv ${SIMAGE} \
  --data_dir /data/ \
  --fasta_paths $PWD/input/T1050.fasta \
  --output_dir $PWD/output/T1050/ \
  --use_gpu_relax=True \
  --max_template_date=2020-05-14 \
  --uniref90_database_path=/data/uniref90/uniref90.fasta \
  --mgnify_database_path /data/mgnify/mgy_clusters_2022_05.fa \
  --template_mmcif_dir=/data/pdb_mmcif/mmcif_files \
  --obsolete_pdbs_path=/data/pdb_mmcif/obsolete.dat \
  --bfd_database_path /data/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt \
  --uniref30_database_path /data/uniref30/UniRef30_2021_03

Timings for running all 5 models on the T1050 sequence (779 residues) are reported below:

real    149m52.862s
user    1111m23.014s
sys     22m55.353s

    "features": 5646.51958489418,
    "process_features_model_1": 95.72981929779053,
    "predict_and_compile_model_1": 233.02064847946167,
    "predict_benchmark_model_1": 130.08757734298706,
    "relax_model_1": 334.7365086078644,
    "process_features_model_2": 4.438706398010254,
    "predict_and_compile_model_2": 184.557687997818,
    "predict_benchmark_model_2": 116.91508865356445,
    "relax_model_2": 307.3584554195404,
    "process_features_model_3": 3.6764779090881348,
    "predict_and_compile_model_3": 163.3666865825653,
    "predict_benchmark_model_3": 121.80361533164978,
    "relax_model_3": 420.58361291885376,
    "process_features_model_4": 4.023890972137451,
    "predict_and_compile_model_4": 169.06972408294678,
    "predict_benchmark_model_4": 121.70339488983154,
    "relax_model_4": 300.7459502220154,
    "process_features_model_5": 4.179120063781738,
    "predict_and_compile_model_5": 154.17626547813416,
    "predict_benchmark_model_5": 108.35132598876953,
    "relax_model_5": 329.9167058467865

The Singularity image was built from DeepMind’s Docker script and has been tested on the A100 nodes. MSA construction and model inference run on the same node type; it isn’t easy to decouple the two steps without using a separate implementation (see the ParaFold section above). Users can choose to run on CPU, but the inference step takes considerably longer than on a GPU. When running on a GPU, the CPU preprocessing (MSA) step can dominate the running time, depending on the sequence being predicted.
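The per-step timings can be grouped by stage to show how much of a run is CPU preprocessing. The sketch below assumes the timings are in a file of "key": seconds lines like the listing above (AlphaFold writes them to timings.json in the output directory); the function name summarise_timings is hypothetical.

```shell
#!/bin/bash
# Sum the timings per stage (features, process_features, predict_and_compile,
# predict_benchmark, relax) and print each stage's total in seconds.
summarise_timings() {
    awk -F'"' '
        NF >= 3 {
            key = $2
            sub(/_model_.*/, "", key)        # drop the per-model suffix
            val = $3
            gsub(/[^0-9.]/, "", val)         # keep only the number
            total[key] += val
        }
        END {
            for (k in total) printf "%s %.1f\n", k, total[k]
        }
    ' "$1" | sort
}

# Usage: summarise_timings output/T1050/timings.json
```

For the T1050 run above, the features stage alone (about 5647 s) accounts for roughly 63% of the 8993 s wall-clock time.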

Current Issues

We are aware that HHblits preprocessing is slow on CSD3 and are working to improve this. For the small database it is possible to pre-stage the data on the node-local SSD (with rsync), but this is not possible for the full database, as it exceeds the capacity of the local SSD.
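Pre-staging a small database could look like the sketch below. The source and destination are placeholders (the location of the node-local SSD is not given here), and the helper name prestage_db is hypothetical. It checks free space first, since a copy that does not fit would fail part-way through.

```shell
#!/bin/bash
# Copy a database directory to node-local storage if it fits.
# Both arguments are placeholders: pass the real database directory and the
# node-local SSD path for your job.
prestage_db() {
    local src="$1" dest="$2" need_kb free_kb
    mkdir -p "$dest" || return 1
    need_kb="$(du -sk "$src" | awk '{print $1}')"
    free_kb="$(df -Pk "$dest" | awk 'NR == 2 {print $4}')"
    if [ "$need_kb" -ge "$free_kb" ]; then
        echo "not enough space on $dest: need ${need_kb} kB, have ${free_kb} kB" >&2
        return 1
    fi
    if command -v rsync > /dev/null 2>&1; then
        # -a preserves permissions and timestamps; the trailing slash copies
        # the contents of src into dest rather than creating dest/src.
        rsync -a "$src"/ "$dest"/
    else
        # Fallback for systems without rsync.
        cp -a "$src"/. "$dest"/
    fi
}

# Usage inside a job script: prestage_db /path/to/small_db /path/to/local_ssd/db
```

The job would then pass the staged copy’s path to the database flags instead of the shared location.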


Building the Singularity image on CSD3:

git clone alphafold
cd ./alphafold
docker build -f docker/Dockerfile -t alphafold .
docker tag alphafold:latest ma595/alphafold:latest
docker push ma595/alphafold:latest
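The Docker steps above leave an image in a registry; the remaining step is to convert it into a Singularity image file. The following sketches that step, assuming the image is available on Docker Hub as ma595/alphafold:latest (the tag used above) and using alphafold.sif as a hypothetical output name. The helper sif_build_cmd is also hypothetical, and only prints the command so that it can be reviewed before running.

```shell
#!/bin/bash
# Print the command that converts a Docker Hub image into a Singularity image.
# Printing rather than running keeps the sketch safe to try anywhere; run the
# printed command on a machine where Singularity is installed.
sif_build_cmd() {
    local image="$1" sif="$2"
    printf 'singularity build %s docker://%s\n' "$sif" "$image"
}

sif_build_cmd ma595/alphafold:latest alphafold.sif
```

The resulting .sif file is what a singularity run command, such as the one in the Singularity section, would take as its image argument.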