General advice regarding storage
Home directory vs group shares vs scratch
Work-related data doesn't belong in your home directory. Once you leave D-PHYS, the data in your home directory will be abandoned and your group will not have access to it. Store such data on a group share instead; only personal settings and 'private' files go into your home directory (that's why it's rather small). For temporary data, there's a scratch folder on every workstation.
Regular housekeeping
Delete what is no longer needed. Document folder structure and file locations for your future self and others.
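As a starting point for housekeeping, the following minimal sketch lists files that have not been modified for a given number of days, largest first. The function name `stale_files` and its parameters are hypothetical and only serve this illustration; it reports candidates and deliberately deletes nothing.

```python
import time
from pathlib import Path


def stale_files(root, max_age_days, now=None):
    """Return (path, size_in_bytes) pairs for regular files under `root`
    whose modification time is older than `max_age_days`, largest first."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    found = []
    for path in Path(root).rglob("*"):
        # Skip directories and symlinks; only real files are reported.
        if path.is_file() and not path.is_symlink():
            st = path.stat()
            if st.st_mtime < cutoff:
                found.append((path, st.st_size))
    return sorted(found, key=lambda item: item[1], reverse=True)
```

Review the resulting list by hand before removing anything; a file that has not been modified recently may still be needed.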
Avoid too many files in the same directory
Don't store thousands of files in a single folder; listing its contents becomes very slow. Create a folder hierarchy to split the files up for faster access.
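One possible way to build such a hierarchy is to distribute files over subfolders named after a short hash prefix of the file name. The sketch below is only an illustration under that assumption; `shard_directory` and its `width` parameter are hypothetical names, and a hierarchy based on date, run number or sample name may suit your data better.

```python
import hashlib
import shutil
from pathlib import Path


def shard_directory(src: Path, dst: Path, width: int = 2) -> int:
    """Move every regular file in `src` into dst/<prefix>/, where <prefix>
    is the first `width` hex digits of the SHA-256 hash of the file name.
    Returns the number of files moved."""
    moved = 0
    # sorted() materialises the listing first, so moving files
    # does not interfere with the directory iteration.
    for path in sorted(src.iterdir()):
        if not path.is_file():
            continue
        digest = hashlib.sha256(path.name.encode("utf-8")).hexdigest()
        bucket = dst / digest[:width]
        bucket.mkdir(parents=True, exist_ok=True)
        shutil.move(str(path), str(bucket / path.name))
        moved += 1
    return moved
```

With `width=2` the files end up in at most 256 subfolders. Because the prefix is derived from the file name alone, the location of a file can be recomputed later without scanning the tree.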
Combine small files into single tar
Don't store your data in thousands of files of only a few kilobytes each. Even the smallest file always allocates at least one full block, and therefore wastes disk space. Working with many small files also carries a large overhead of system calls and disk seek operations. Combine such results into larger files using tar (with or without compression).
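The same can be done from a script. The following minimal sketch uses Python's standard `tarfile` module; the function names `pack_directory` and `list_archive` are hypothetical, and the archive is assumed to be written outside the folder being packed.

```python
import tarfile
from pathlib import Path


def pack_directory(src: Path, archive: Path, compress: bool = True) -> int:
    """Pack all regular files below `src` into a single tar archive.
    Returns the number of files added."""
    mode = "w:gz" if compress else "w"  # gzip compression is optional
    count = 0
    with tarfile.open(archive, mode) as tar:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                # Store paths relative to the parent of `src`, so the
                # archive unpacks into a single top-level folder.
                tar.add(path, arcname=path.relative_to(src.parent).as_posix())
                count += 1
    return count


def list_archive(archive: Path) -> list:
    """Return the member names stored in the archive."""
    with tarfile.open(archive, "r:*") as tar:
        return tar.getnames()
```

Check the archive, for example with `list_archive`, before deleting the original small files.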
Prefer binary formats to plain text
Consider optimized binary formats (e.g. HDF5) to store your data, but make sure they're open and well documented so you can still read your data in five years' time. They use less disk space and allow faster input/output than plain text files. Common formats have libraries for most programming languages to ease read/write operations.
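To illustrate the size difference only, the sketch below writes the same floating-point values once as plain text and once as raw 8-byte doubles, using nothing but the Python standard library. The helper names are hypothetical. Raw doubles are not a self-describing format; for real data, a documented format such as HDF5 with its own library is the better choice.

```python
from array import array
from pathlib import Path


def write_text(values, path: Path) -> int:
    """Write one value per line as text; return the file size in bytes."""
    with open(path, "w") as fh:
        for v in values:
            fh.write(f"{v!r}\n")
    return path.stat().st_size


def write_binary(values, path: Path) -> int:
    """Write the values as raw doubles (native byte order);
    return the file size in bytes."""
    with open(path, "wb") as fh:
        array("d", values).tofile(fh)
    return path.stat().st_size


def read_binary(path: Path) -> list:
    """Read back a file written by write_binary."""
    data = array("d")
    with open(path, "rb") as fh:
        data.fromfile(fh, path.stat().st_size // data.itemsize)
    return data.tolist()
```

The binary file needs exactly `array("d").itemsize` bytes per value and can be read back without parsing, whereas the text file stores every digit as a separate character.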
File servers are not meant for backups
Do not use group shares to store backups of your machines. We provide a wide range of backup solutions. If you're unsure which one of those is suitable for you, please get in touch.