Which Data to Archive
The archival service at CSCS offers 10 TB of storage space. Users who need more can contact us at help(at)cscs.ch.
However, like all shared services, it should not be abused, so that it remains efficient, stable, and reliable for all users.
As a general rule, users should use the archival service to store only important data that is meant to be preserved for a long time, such as historical series, observations, input to calculations and results.
Please note that home directories of production systems, as well as some other filesystems on some servers, already have backups!
Archiving such files would only create additional copies and waste space with no additional benefit.
It is advisable for all teams and groups to have a clear and common backup strategy to avoid multiple copies of the data.
Accessing Data
Data is usually accessed just as it would be on a normal file server: the SamFS software transparently handles all data migration to and from the tape cartridges.
As with HSM systems in general, opening a file on SamFS and reading its contents may involve a wait of up to several minutes in some cases. The wait depends on where the file data is stored and on the queue of files waiting to be brought online.
Small files (<128M) are usually accessible almost immediately, although a wait of 5-10 minutes can still be considered normal. For large files, a wait of 2 minutes is normal. These times refer to the wait until the first bytes become available; the time to transfer the rest of the file depends on its size.
It is highly recommended to pack collections of files, or directories containing many small files, into tar or zip files. This avoids spreading the data across different tapes and gives better performance when retrieving the complete data set.
Pack data into tar files of at least 1 GB.
Retrieving a single 10 GB tar file may be 20 times faster than retrieving 1000 files of 10 MB each from the archive!
There is no need to spend time compressing the data you put in the archive: the drives have hardware compression.
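The packing advice above can be sketched in Python as follows. This is only an illustration, not a CSCS-provided tool: the function names and the constant MIN_TAR_BYTES are hypothetical, and 1 GB is assumed to mean 10^9 bytes. The sketch writes a plain, uncompressed tar file, since the drives already compress in hardware, and lets you check the size of a directory before deciding whether it is worth packing on its own.

```python
import os
import tarfile

# Assumption: "1 GB" is taken as 10**9 bytes (hypothetical constant name).
MIN_TAR_BYTES = 10**9


def directory_size(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symbolic links so linked data is not counted
            if os.path.isfile(full) and not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def pack_directory(src_dir, tar_path):
    """Pack src_dir into an uncompressed tar file; return the tar size in bytes."""
    # Mode "w" writes a plain tar archive without any compression
    with tarfile.open(tar_path, "w") as tar:
        # Store paths relative to the directory name, not the full source path
        tar.add(src_dir, arcname=os.path.basename(os.path.normpath(src_dir)))
    return os.path.getsize(tar_path)
```

For example, if directory_size("run01") is well below MIN_TAR_BYTES, it may be better to pack several such directories into one tar file rather than archiving each one separately ("run01" is a made-up directory name).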
I/O Limits
Please note that you should NEVER move more than 1000 GB in a day, and you should always split the task into smaller serial steps.
Excessive volumes of reading/staging or writing can overload the system or the tape drives, causing extreme delays for other users and degrading normal service performance.
If you need to move a large amount of data, please carefully estimate the number and sizes of your files beforehand.
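One way to do such an estimate is to plan the transfer as a series of daily batches before moving anything. The Python sketch below is a minimal illustration under stated assumptions: plan_daily_batches and DAILY_LIMIT_BYTES are hypothetical names, 1000 GB is taken as 1000 x 10^9 bytes, and the file sizes are supplied by the caller. It only groups files in order; it does not perform any transfer.

```python
# Assumption: the 1000 GB daily limit is taken as 1000 * 10**9 bytes.
DAILY_LIMIT_BYTES = 1000 * 10**9


def plan_daily_batches(file_sizes, daily_limit=DAILY_LIMIT_BYTES):
    """Group (name, size_in_bytes) pairs, in order, into per-day batches.

    Each batch contains file names whose combined size does not exceed
    daily_limit. A single file larger than the limit raises ValueError.
    """
    batches = []
    current, current_bytes = [], 0
    for name, size in file_sizes:
        if size > daily_limit:
            raise ValueError("%s alone exceeds the daily limit" % name)
        # Start a new batch when adding this file would exceed the limit
        if current and current_bytes + size > daily_limit:
            batches.append(current)
            current, current_bytes = [], 0
        current.append(name)
        current_bytes += size
    if current:
        batches.append(current)
    return batches
```

Each returned batch would then be moved on a separate day, one file after another rather than in parallel, in line with the advice to use smaller serial steps.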


