Overview

IT Research Computing provides its users with a variety of centralized data storage resources and tools for effective research data management, including:

  • Data Storage
  • Data Archive
  • Data Backup
  • External Data Sharing

File System   Quota                   Backup                    Comments
/DataWaha     20 TB/PI or 2 TB/user   Yes (daily incremental)   Permanent shared storage
/WahaDrive    2 TB/user               Yes (daily incremental)   Dropbox-style storage
/home         200 GB/user             Yes (daily incremental)   Common home directory

Data Backup

IT Research Computing provides centralized data backup for KAUST research data. All data under DataWaha, WahaDrive, and workstation home directories is backed up automatically every day. In addition, we provide backup to KSL NetApp storage, including the KSL home directory.

Data backup options

  • Copy important research data under DataWaha
  • Use Dropbox style storage WahaDrive
  • Use Atempo Lina for personal device backup, including desktops and laptops

Examples of data backup using DataWaha and WahaDrive

# Use 'rsync' to copy the data from any device

$ ssh <userid>@dm.kaust.edu.sa

$ cd /datawaha/<Your Folder>

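# Set up automation using crontab

$ crontab -e

# In the editor, add a line with the schedule fields followed by the command:
# <minute> <hour> <day> <month> <weekday> rsync -avz .....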
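
The crontab step above can be sketched as a small wrapper script plus a schedule line. This is a minimal sketch, not an official KAUST script: the script name, the local source directory, and the 02:00 schedule are assumptions. It also assumes key-based (passwordless) ssh to dm.kaust.edu.sa, because cron cannot type a password. By default the script only prints the rsync command, and it runs the command only when APPLY=1 is set.

```shell
#!/bin/sh
# backup_to_datawaha.sh: hypothetical wrapper, not an official KAUST script.
# Usage: backup_to_datawaha.sh <local-dir> <remote-target>
# Prints the rsync command by default; set APPLY=1 to actually run it.

backup() {
    src=$1    # local directory to back up (assumption)
    dest=$2   # remote target, e.g. user@dm.kaust.edu.sa:/datawaha/folder
    # -a keeps permissions and timestamps, -z compresses data in transit.
    # The trailing slash on the source copies its contents, not the directory itself.
    if [ "${APPLY:-0}" = 1 ]; then
        rsync -az "${src%/}/" "$dest/"
    else
        echo "rsync -az ${src%/}/ $dest/"
    fi
}

# Run only when both arguments are given, so the file can also be sourced.
if [ "$#" -ge 2 ]; then
    backup "$1" "$2"
fi
```

To schedule it, run crontab -e and add one line with five schedule fields followed by the command. The line below would run the script every day at 02:00. Replace userid and MyFolder with your own values and avoid the < > characters in a real entry, because the shell treats them as redirections.

0 2 * * * APPLY=1 $HOME/bin/backup_to_datawaha.sh $HOME/research-data userid@dm.kaust.edu.sa:/datawaha/MyFolder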

Upload data from Mac by connecting to your folder smb://datawaha.kaust.edu.sa/<Your Folder>

Upload data from Windows by mapping a new drive \\datawaha.kaust.edu.sa\<Your Folder>

Use Dropbox-style storage and upload the data from any device at https://wahadrive.kaust.edu.sa/ with your portal ID and password.

How to request data backup services

First check if your PI has a folder under /datawaha or if you can access https://wahadrive.kaust.edu.sa/. If not, contact us at ithelpdesk@kaust.edu.sa with the subject “Research Data Archive”.

Data Archive

IT Research Computing provides a centralized data archive solution to store KAUST research data indefinitely and preserve valuable research data for future scientific use. We also provide low-cost data storage to manage growing data demand.

Data Archive Options

  • Copy or move the data under /datawaha/<Your Folder>/archive
If the data already exists under /datawaha, log in to the Data Movers and move it into the archive folder:
$ ssh <userid>@dm.kaust.edu.sa
$ cd /datawaha/<Your Folder>/archive
$ mv /datawaha/<Your Folder>/<Dir To Archive> /datawaha/<Your Folder>/archive
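
The move above can be wrapped in a small guard so that an existing archive entry with the same name is never overwritten or merged by accident. This is a minimal sketch under stated assumptions: archive_dir is a hypothetical helper name, and the paths in the usage line are placeholders.

```shell
#!/bin/sh
# archive_dir: hypothetical helper that moves a directory into an archive folder.
# It refuses to run if the source is missing or the target name is already taken.
archive_dir() {
    src=$1        # directory to archive
    archive=$2    # archive folder, e.g. /datawaha/folder/archive
    name=$(basename "$src")
    if [ ! -d "$src" ]; then
        echo "error: $src is not a directory" >&2
        return 1
    fi
    if [ -e "$archive/$name" ]; then
        echo "error: $archive/$name already exists" >&2
        return 1
    fi
    # Create the archive folder if needed, then move the directory into it.
    mkdir -p "$archive" && mv "$src" "$archive/"
}
```

Usage on a data mover, with placeholder paths: archive_dir "/datawaha/MyFolder/run-2020" "/datawaha/MyFolder/archive"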

How To Request Data Archive Services

First check if your PI has a folder under /datawaha. If not, contact us at ithelpdesk@kaust.edu.sa with the subject “Research Data Backup”.

Data Storage

DataWaha

DataWaha is centralized data storage that serves as a permanent home for KAUST research data. It protects against data loss: when digital research data is not stored in a managed and secure repository, the chance of loss from equipment failure, corruption, or user error increases.

DataWaha is available to all KAUST researchers, faculty, and students. The default disk quota is 20 TB per PI or 2 TB per user. It is accessible from any device within the KAUST network, and through VPN from outside the KAUST network. See How To Access datawaha for more info.

Note: Use scp or rsync to transfer large amounts of data to and from datawaha. See File Transfer Tips and How To Access datawaha for more info.

How To Access datawaha

From Data Movers
$ ssh <userid>@dm.kaust.edu.sa
$ cd /datawaha/<Your Folder> 
From Remote Workstations
$ cd /datawaha/<Your Folder> 

From Linux workstations or login nodes (including Shaheen and/or IBEX login nodes)
$ mkdir -p ~/datawaha/<Your Folder>
$ sshfs -o cache=yes,kernel_cache,large_read,StrictHostKeyChecking=no \
    <userid>@dm.kaust.edu.sa:/datawaha/<Your Folder> ~/datawaha/<Your Folder>
Note: You must unmount the folder after you are done
$ fusermount -u ~/datawaha/<Your Folder>

From Mac smb://datawaha.kaust.edu.sa/<Your Folder>

From Windows \\datawaha.kaust.edu.sa\<Your Folder>
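
For the sshfs mount described above, the unmount step is easy to forget or to run twice. The helper below is a minimal sketch of one way to make it safe to repeat; safe_unmount is a hypothetical name, and the sketch assumes the mountpoint utility (part of util-linux on most Linux systems) is installed.

```shell
#!/bin/sh
# safe_unmount: hypothetical helper; unmounts a FUSE (sshfs) mount point
# only when something is actually mounted there.
safe_unmount() {
    dir=$1
    # mountpoint -q exits with status 0 only if dir is a mount point.
    if mountpoint -q "$dir" 2>/dev/null; then
        fusermount -u "$dir" && echo "unmounted $dir"
    else
        echo "$dir is not mounted"
    fi
}
```

Usage, with a placeholder folder name: safe_unmount "$HOME/datawaha/MyFolder"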

Request DataWaha Directory

Please first check whether your research group already has authorization to use /datawaha. If not, fill in the following FORM, open a ticket with ithelpdesk@kaust.edu.sa, and attach a scanned copy of your form to the ticket.

WahaDrive

WahaDrive is Dropbox-style storage for easy data sharing within KAUST. Users can share large amounts of data using the “share by link” option without moving the data. Similar to Dropbox, WahaDrive keeps file versions, and deleted files stay in the trash bin for 90 days. WahaDrive is accessible at https://wahadrive.kaust.edu.sa/ with your portal ID and password.

To request access, open a ticket with ithelpdesk@kaust.edu.sa with the subject “WahaDrive Access”.

Home (Noor home)

Common login directory across all Remote Workstations and data movers. Each user gets a 200 GB /home directory, which is set up automatically upon account creation.

How To Access Noor Home

From Data Movers 
$ ssh dm.kaust.edu.sa
From Remote Workstations just open a terminal

From Linux workstations or login nodes (including Shaheen and/or IBEX login nodes)
$ mkdir -p ~/noorhome
$ sshfs -o cache=yes,kernel_cache,large_read,StrictHostKeyChecking=no \
    <userid>@dm.kaust.edu.sa:/home/<userid> ~/noorhome
Note: You must unmount the Noor home after you are done
$ fusermount -u ~/noorhome

External Data Transfer

IT Research Computing provides a centralized external data transfer solution for high speed external data transfer and external data collaborations.

External Data Transfer Options

External Drive - External data transfer and data collaboration using Dropbox-style storage. Share data temporarily with an external user via a “public link”. The external user does not need an account: copy the link and send it to the user together with a temporary password.

Open a ticket with ithelpdesk@kaust.edu.sa, with subject “Data transfer or data sharing with external collaborators”.

Note: PI approval is required for all external data transfer access requests.

File Transfer Tips

scp

scp allows you to copy between two machines over a network. It uses ssh to transfer the data, requiring authentication in order to proceed with the transfer. It will even allow you to transfer files between two remote machines.

$ scp [options] [source] [destination]
  -C  enables file compression, which can sometimes improve transfer rates.
  -r  recursively copies entire directories.
$ scp -C -r /source/directory your_username@remotehost:

When copying to/from a remote server, don't forget to include the colon (:) after the hostname or IP address.

Examples

From remote machine to local machine (there to here):
$ scp your_username@remotehost:/dir/somefile.txt /SomeLocalDirectory/
From the local machine to the remote machine (here to there):
$ scp somefile.txt your_username@remotehost:/some-remote-directory/
Transfer data from a remote host to datawaha:
Log in to the data mover
$ ssh dm.kaust.edu.sa
$ scp your_username@remotehost:/SomeRemoteDirectory/somefile.txt /datawaha/<Your Folder>/

If your shell initialization files run commands that print output, scp can fail. Wrap such commands as follows so that they run only in interactive sessions.

The following code is Bourne-shell compatible:

TTY=`/usr/bin/tty`
if [ $? = 0 ]; then
   /usr/bin/echo "interactive stuff goes here"
fi

The following code is C-shell compatible:

( /usr/bin/tty ) > /dev/null
if ( $status == 0 ) then
   /usr/bin/echo "interactive stuff goes here"
endif
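
As an alternative to the tty test above, Bourne-compatible shells such as bash and sh also expose their option flags in the special parameter $-, which contains the letter i only in interactive shells. The fragment below is a minimal sketch of that check and is not part of the original guidance; is_interactive is a hypothetical helper name.

```shell
# Fragment for ~/.bashrc or ~/.profile (sketch).
# $- holds the current shell flags; it contains "i" only in interactive shells.
is_interactive() {
    case $- in
        *i*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Print only in interactive sessions, so scp, sftp, and rsync see no stray output.
if is_interactive; then
    echo "interactive stuff goes here"
fi
```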

For more information, type man scp at the command line.

rsync

rsync allows for synchronization between two sets of files across a network. It will only copy the data that is different between the two machines. It does not allow you to copy between two remote machines.

$ rsync [options] [source] [destination]
  -a  archive mode: copies recursively and preserves permissions, timestamps, and symbolic links.
  -v  verbose output.

Examples

From local machine to remote machine (here to there): 
$ rsync -av /localdir/files* remoteuser@remotehost:/remote/dir/
From local machine to datawaha 
$ rsync -av /localdir/files* user@dm.kaust.edu.sa:/datawaha/directory/

For more information, type man rsync at the command line.

Some rsync options can be very resource intensive. Please do not run more than a single instance of rsync simultaneously on the same node, and avoid using rsync on directories with thousands of files. Processes may be killed if they cause a node to become unresponsive.
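
Because a large rsync run can be expensive, it can help to preview a transfer first. The sketch below uses the rsync -n (--dry-run) option, which reports what would be copied without transferring anything. The name preview_sync and the paths in the usage line are illustrative assumptions.

```shell
#!/bin/sh
# preview_sync: hypothetical helper that lists what rsync would transfer.
# -n (--dry-run) makes rsync report changes without copying anything.
preview_sync() {
    src=$1     # local source directory
    dest=$2    # local or remote destination directory
    # Trailing slashes sync the contents of src into dest.
    rsync -avn "${src%/}/" "${dest%/}/"
}
```

Usage, with placeholder paths: preview_sync /localdir user@dm.kaust.edu.sa:/datawaha/directory. If the listing looks right, run the same rsync command without -n to perform the copy.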

sftp

sftp is an interactive file transfer program which, like scp and rsync, provides encrypted transfers between two machines over a network.

Open a session:
$ sftp userid@dm.kaust.edu.sa
  You will be asked for a password. Upon successful entry, you will see the sftp prompt:
sftp>
  To download/get a file from the remote machine and save it to your local machine:
sftp> get /remote/directory/file /local/directory/
  To upload/put a file from the local machine into the remote machine:
sftp> put /local/directory/file /remote/directory/
  To exit, type "exit". For more information, type "man sftp" at the command line.
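
The interactive session above can also be scripted with the sftp batch option (-b), which reads commands from a file. The sketch below only writes such a batch file; make_sftp_batch and the file names are hypothetical. Batch mode cannot prompt for a password, so this assumes key-based ssh authentication is already set up. File names containing spaces would need quoting inside the batch file, which this sketch does not handle.

```shell
#!/bin/sh
# make_sftp_batch: hypothetical helper that writes an sftp batch file.
# Usage: make_sftp_batch <batch-file> <remote-dir> <local-file>...
make_sftp_batch() {
    batch=$1
    remote_dir=$2
    shift 2
    # Change to the remote directory first, then upload each file.
    printf 'cd %s\n' "$remote_dir" > "$batch"
    for f in "$@"; do
        printf 'put %s\n' "$f" >> "$batch"
    done
    # End the session explicitly.
    printf 'bye\n' >> "$batch"
}
```

Usage, with placeholder names: make_sftp_batch upload.batch /datawaha/MyFolder results.csv notes.txt, followed by sftp -b upload.batch userid@dm.kaust.edu.sa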