Backups & Redundancy

Research File Store offers highly resilient storage and consistent backups, keeping up to a month of data changes.

RFS is built on the ZFS filesystem, which provides point-in-time-consistent snapshots of whole filesystems. We use these snapshots for all projects. RFS project snapshots are replicated to a secondary off-site location, so that two on-disk copies of all data exist.

The diagram below gives a high-level overview of the service architecture.

High-level overview of the RFS architecture.

Redundancy

Data is replicated asynchronously at hourly intervals to the off-site storage system located in our Soulsby Data Centre. All connections to RFS are routed through a proxy, which lets the service switch over to the off-site replica if our primary data centre suffers a major failure, with a Recovery Point Objective (RPO) of one hour. This means that in the event of a catastrophic failure of the West Cambridge storage system, at most one hour of data could be lost.

The ZFS pools underneath RFS are created from multiple RAID-Z2 disk groups spread across different storage enclosures, giving an industry-standard level of resiliency in the face of disk or enclosure-level failures.

Integrity

Research File Store has end-to-end data integrity built in, thanks to ZFS checksums, which protect against silent data corruption. In ZFS, all data blocks (not just metadata, as in some other filesystems) are checksummed and the checksums are stored in a Merkle tree, allowing the filesystem to validate itself on every read and guarantee data integrity for the entire pool.

Backups

Data on RFS are backed up using snapshots, a feature built into the ZFS filesystem that RFS is built on.

Key facts:

  • Read only. Data within snapshots cannot be modified, providing protection against CryptoLocker and its many variants.
  • Scheduled Backups. Snapshots are taken on a schedule.
  • Easy Restore. Snapshots can be accessed directly by users.
  • One Month Restore Window. Users can restore data going back a month.

Snapshot Policy

The snapshot retention periods are staggered, with hourly, daily and weekly snapshots going back a month. By default, all RFS projects have the same policy:

  • 24 hourly snapshots
  • 7 daily snapshots
  • 5 weekly snapshots
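
As a rough illustration of how the staggered tiers above add up to about a month, the sketch below maps the age of a change (in hours) to the finest snapshot tier that would still cover it. The function name `granularity_for_age` is hypothetical, and the tier boundaries assume evenly spaced snapshots, so treat the cut-off points as approximate rather than as the exact behaviour of the service.

```shell
#!/bin/sh
# Default policy taken from the list above.
HOURLY=24   # hourly snapshots kept
DAILY=7     # daily snapshots kept
WEEKLY=5    # weekly snapshots kept

# Print the finest tier expected to cover a change made $1 hours ago.
# Assumption: snapshots are evenly spaced, so boundaries are approximate.
granularity_for_age() {
    age_hours=$1
    if [ "$age_hours" -le "$HOURLY" ]; then
        echo hourly
    elif [ "$age_hours" -le $(( DAILY * 24 )) ]; then
        echo daily
    elif [ "$age_hours" -le $(( WEEKLY * 7 * 24 )) ]; then
        echo weekly
    else
        echo expired
    fi
}

granularity_for_age 3      # prints: hourly
granularity_for_age 72     # prints: daily
granularity_for_age 500    # prints: weekly
granularity_for_age 1000   # prints: expired
```

With the default policy, the weekly tier reaches back 5 × 7 = 35 days, which is where the one-month restore window comes from.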

How to Access Backups

Snapshots in RFS are self-service, meaning that users can access data from snapshots themselves, without needing to contact the Support team.

All snapshots are taken at the level of an individual RFS project.

Linux / *nix / macOS Clients

Snapshots on these operating systems can be accessed through a special hidden directory that exists at the root of the SMB share, for example:

# At the root of the share is a special 'hidden' directory called
# '.zfs'. Inside this directory is a directory 'snapshot' that
# contains directories for each individual snapshot of the share
[matt@machine]:rfs-mjr208-testproject/ $ ls -1
test-excel2.xlsx
test-excel.xlsx
testfile
testfile2
testfile3
test.txt
[matt@machine]:rfs-mjr208-testproject/ $ cd .zfs
[matt@machine]:.zfs/ $ ls -1
shares
snapshot
[matt@machine]:.zfs/ $ cd snapshot/
[matt@machine]:snapshot/ $ ls
hpr-2019-01-06-05-00-00-367  hpr-2019-02-02-04-00-00-744  hpr-2019-02-05-11-45-00-120  hpr-2019-02-05-17-45-00-152  hpr-2019-02-05-23-45-00-169  hpr-2019-02-06-04-45-00-203
hpr-2019-01-13-05-00-00-357  hpr-2019-02-03-04-00-00-742  hpr-2019-02-05-12-45-00-127  hpr-2019-02-05-18-45-00-151  hpr-2019-02-06-00-45-00-170  hpr-2019-02-06-05-45-00-205
hpr-2019-01-20-05-00-00-326  hpr-2019-02-03-05-00-00-733  hpr-2019-02-05-13-45-00-128  hpr-2019-02-05-19-45-00-154  hpr-2019-02-06-01-45-00-194  hpr-2019-02-06-06-45-00-209
hpr-2019-01-27-05-00-00-731  hpr-2019-02-04-04-00-00-742  hpr-2019-02-05-14-45-00-136  hpr-2019-02-05-20-45-00-159  hpr-2019-02-06-02-45-00-195  hpr-2019-02-06-07-45-00-211
hpr-2019-01-31-04-00-00-740  hpr-2019-02-05-04-00-00-743  hpr-2019-02-05-15-45-00-141  hpr-2019-02-05-21-45-00-162  hpr-2019-02-06-03-45-00-196  hpr-2019-02-06-08-45-00-214
hpr-2019-02-01-04-00-00-742  hpr-2019-02-05-10-45-00-107  hpr-2019-02-05-16-45-00-144  hpr-2019-02-05-22-45-00-169  hpr-2019-02-06-04-00-00-752  hpr-2019-02-06-09-45-00-223

# Each of these snapshots is a copy of the share at the time it was
# taken; the date of the snapshot is in the directory name.
#
# Here we enter a snapshot taken on 6th Jan 2019, at 05:00.
[matt@machine]:snapshot/ $ cd hpr-2019-01-06-05-00-00-367/
[matt@machine]:hpr-2019-01-06-05-00-00-367/ $ ls -1
project.dat
testfile
testfile2
testfile3
test.txt

# The 'project.dat' file listed above was deleted from the share
# sometime after this snapshot was taken.
#
# To restore it, simply copy it back to the share root (three levels
# up from here) with the usual *nix tools, such as 'cp' or 'rsync'.
[matt@machine]:hpr-2019-01-06-05-00-00-367/ $ cp project.dat ../../../
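
When it is not obvious which snapshot still holds a deleted file, the manual browsing above can be scripted. The following is a minimal sketch, not an official RFS tool: the function names `snapshots_containing` and `snapshot_time` are hypothetical, and the parsing assumes snapshot directories are named as in the listing above (`hpr-` followed by year, month, day, hour, minute, second and a trailing numeric field whose meaning is not documented here).

```shell
#!/bin/sh
# List every snapshot of a share that contains a given path.
#   $1: root of the mounted share (the directory holding '.zfs')
#   $2: path of the wanted file, relative to the share root
# Names are printed in glob (lexical) order, which is oldest first
# for the date-based naming scheme shown above.
snapshots_containing() {
    share_root=$1
    rel_path=$2
    for snap in "$share_root"/.zfs/snapshot/*/; do
        # -e is true for files and directories alike
        if [ -e "${snap}${rel_path}" ]; then
            basename "$snap"
        fi
    done
}

# Turn a snapshot directory name into a readable timestamp.
# Assumption: the name layout is hpr-YYYY-MM-DD-HH-MM-SS-NNN; the
# trailing numeric field is ignored.
snapshot_time() {
    printf '%s\n' "$1" | sed -E \
        's/^hpr-([0-9]{4})-([0-9]{2})-([0-9]{2})-([0-9]{2})-([0-9]{2})-([0-9]{2})-[0-9]+$/\1-\2-\3 \4:\5:\6/'
}

# Example (run from the root of the share):
#   snapshots_containing . project.dat | tail -n 1
#   snapshot_time hpr-2019-01-06-05-00-00-367   # prints: 2019-01-06 05:00:00
```

Piping the output through `tail -n 1` picks the most recent snapshot that still has the file, which is usually the copy you want to restore.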

Windows 10

Note

The following information should also apply to Windows 7, although this has not yet been tested.

Within the Windows File Explorer, RFS snapshots can be accessed through the ‘Previous Versions’ feature, which appears when you right-click on a directory.

For best results, open ‘Previous Versions’ at the root of the share, since this works even if the underlying files or directories have been renamed or moved.

Below we show the process of opening this menu and locating the copy of the share as it was at an earlier snapshot:

The right-click context menu for ‘Previous Versions’ on a Share

Opening the ‘Restore previous versions’ menu shows a list of all the snapshots as entries, with the ‘Date modified’ field showing when each snapshot was taken.

Example list of ‘Previous Versions’ on a Share

Double-clicking on any of these entries opens a new File Explorer window that lets you browse the share as it was when the snapshot was taken. For example, you can put the original window and the snapshot window side-by-side to make comparing and restoring files easier. Files can then simply be dragged and dropped from the snapshot back into the current share to restore them.

Example of two File Explorer windows side-by-side, with the snapshot copy on the left and the current share on the right. Files can be dragged and dropped from left to right to restore them.