Overview¶
The Data Management Module (DMM) is a component of an SRCP Platform that allows strict control and auditing of data uploads and downloads. Secure platforms with the DMM enabled have a Data Management Server (DMS) provisioned alongside the other components of the platform. The DMS is isolated from all other servers in the platform and exists solely to facilitate the movement of data.
To move data into a secure platform, an SRCP User first accesses the DMS using an SFTP client. A Platform Manager must then authorise and move the data. For an upload, a Platform Manager copies approved files and directories from the DMS into the project area. For a download, the process is reversed: a Platform Manager copies the approved data from the platform onto the DMS, where the SRCP User can retrieve it.
It is the responsibility of the SRCP User to notify a Platform Manager that data is awaiting approval, so that the Platform Manager can authorise and move it.
Description | Path | Permissions |
---|---|---|
Platform internal storage | /srv/data-manager/srcp-rfs, /srv/home/srcp-rfs, /srv/home/<USER> | platform-managers rwx, platform-users rwx, <USER> rwx |
Staging directory | /srv/data-manager/triage/<USER>, /srv/data-manager/triage/* | <USER> rwx, platform-manager rwx |
Accessing the Data Management Server (DMS)¶
The DMS is available over SFTP. Please contact your SRCP Platform Manager to obtain the correct server address, which usually has the form data-<PLATFORM_NAME>.srcp.hpc.cam.ac.uk.
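As an illustration of the address format, the following sketch builds the host name from a platform name and shows what a command-line connection might look like. The function name `dms_host`, the platform name `myplatform` and the user name `abc123` are hypothetical; the real address should always come from your Platform Manager.

```shell
# Hypothetical helper: build the usual DMS host name from a platform name.
# This is only the "usual" format; confirm the real address before connecting.
dms_host() {
  printf 'data-%s.srcp.hpc.cam.ac.uk\n' "$1"
}

# Example connection (placeholder user and platform names):
#   sftp "abc123@$(dms_host myplatform)"
```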
SSH Public Key Registration¶
Authorised SRCP Users must provide an SSH public key before they can access the DMS.
Please submit or replace your SSH key using the following form: https://www.hpc.cam.ac.uk/srcps-ssh-public-key-registration
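If you do not yet have a key pair, one way to create one is with OpenSSH's `ssh-keygen`. This is a minimal sketch, assuming an Ed25519 key is acceptable to the registration form (this documentation does not state which key types are accepted). The helper name `new_srcp_key` and the key file name are hypothetical.

```shell
# Hypothetical helper: create an Ed25519 key pair at the given path.
# Usage: new_srcp_key KEYFILE COMMENT [extra ssh-keygen options...]
new_srcp_key() {
  keyfile=$1
  comment=$2
  shift 2
  # ssh-keygen asks for a passphrase unless one is given in the extra options.
  ssh-keygen -q -t ed25519 -C "$comment" -f "$keyfile" "$@"
}

# Example (prompts for a passphrase interactively):
#   new_srcp_key ~/.ssh/id_ed25519_srcp "srcp-dms"
# Submit only the public half (~/.ssh/id_ed25519_srcp.pub);
# the private key should never leave your machine.
```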
Upload and Download Data¶
After logging in to the DMS, users are presented with a read-only home directory containing two writable directories. These are the user's own staging directories for uploading and downloading data.
Connected to data-<PLATFORM_NAME>.srcp.hpc.cam.ac.uk.
sftp> ls -al
drwxrwx--- 2 user data-manager 2 Apr 8 15:49 download
drwxrwx--- 2 user data-manager 3 Apr 8 15:51 upload
Platform Managers have read and write permissions on both staging directories because they approve data by moving it: either from the upload directory into the secure platform when data is being uploaded, or into the download directory when data is being downloaded.
Warning
Usage of SFTP clients is beyond the scope of this documentation. Please refer to the documentation of tools such as FileZilla, WinSCP, Cyberduck, rsync and sftp.
Upload¶
An SRCP User needs to connect to the DMS and write new data to their own upload directory over SFTP.
Connected to data-<PLATFORM_NAME>.srcp.hpc.cam.ac.uk.
sftp> put README.md upload
Uploading README.md to /upload/README.md
README.md
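For repeated or scripted uploads, the interactive session can be replaced by OpenSSH `sftp` batch mode. The sketch below generates one `put` command per file, targeting the upload staging directory. The helper name `make_upload_batch`, the user name `abc123` and the file names are hypothetical, and file names containing double quotes are not handled.

```shell
# Hypothetical helper: print one sftp "put" command per local file,
# each targeting the user's upload staging directory.
make_upload_batch() {
  for f in "$@"; do
    printf 'put "%s" upload/\n' "$f"
  done
}

# Example (placeholder user and host); "-b -" makes sftp read the batch from stdin:
#   make_upload_batch README.md results.csv | \
#     sftp -b - abc123@data-<PLATFORM_NAME>.srcp.hpc.cam.ac.uk
```

Batch mode cannot prompt for a password, which fits the key-based access described earlier. The same pattern should work for downloads by emitting `get download/<file>` lines instead.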
Download¶
A Platform Manager makes a file available to a specific SRCP User by copying it into that user’s download directory. The SRCP User can then retrieve it from the DMS.
Connected to data-<PLATFORM_NAME>.srcp.hpc.cam.ac.uk.
sftp> get download/README.md
Fetching /download/README.md to README.md
/download/README.md
Approval Through Data Movement¶
The Platform Manager is the role authorised to control the movement of data into and out of the SRCP platform. Approval is given by moving the data from a user’s staging directory to the platform’s internal storage. This ensures that the Platform Manager is aware of, and agrees to, what is being pulled from or pushed to the platform.
User-uploaded data lives in /srv/data-manager/triage.
A Platform Manager can copy/move data using the command line or Remote Desktop file manager.
[manager@srcp-login ~]$ cp /srv/data-manager/triage/user/upload/README.txt /srv/projects/my_project/
Once the move is complete, the SRCP User is able to access the data in the project area.
[user@srcp-login ~]$ ls ~/srcp-rfs/
README.txt
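The approval step performed by a Platform Manager can also be scripted. The following is a minimal sketch, not a prescribed procedure: it copies a file or directory from a staging directory into a destination directory and then checks that the copy matches the original. The function name `approve_upload` is hypothetical, and the paths in the usage comment are the example paths used above.

```shell
# Hypothetical helper: copy SRC (file or directory) into the directory DEST,
# then verify that the copy is identical to the original.
approve_upload() {
  src=$1
  dest=$2
  [ -e "$src" ] || { echo "no such path: $src" >&2; return 1; }
  mkdir -p "$dest" || return 1
  # -R copies directories recursively; plain files are copied as usual.
  cp -R "$src" "$dest"/ || return 1
  # diff exits non-zero if the copy differs from the original.
  diff -r "$src" "$dest/$(basename "$src")" >/dev/null
}

# Example, run as a Platform Manager:
#   approve_upload /srv/data-manager/triage/user/upload/README.txt /srv/projects/my_project
```

The function returns a non-zero status if the source is missing, the copy fails, or the verification finds a difference, so it can be chained with `&&` in a larger script.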