BwUniCluster2.0/File System Migration Guide


This page describes the changes to the file systems between bwUniCluster 1 and bwUniCluster 2.0.

Lustre file systems

With bwUniCluster 2.0, new file systems for $HOME and workspaces have been purchased. KIT/SCC has copied the user data from the corresponding old file systems to these new file systems. The content of $WORK, however, has to be migrated manually by the users to workspaces during the next 4 weeks; for details see below.

Changes on $HOME and workspace directories

You should see the same data as before. If you do not, which is possible in rare cases because the data of expired users has been cleaned up, please contact the hotline or open a ticket. The virtual path to $HOME (/home/ORG/GROUP/USER) is still the same (except for University of Stuttgart users, see below), but the physical path (which can be displayed with /bin/pwd) has changed. If one of your scripts includes a physical path (which is not recommended), you should modify that script. If applications complain about the old physical path (which started with /pfs/data1/ or /pfs/data2/), you should modify or recompile the application.
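
To find scripts that still contain an old physical path, a search along the following lines may help. This is a minimal sketch: the function names show_home_paths and find_old_physical_paths are hypothetical helpers, not part of the cluster software, and the search only covers the two old prefixes named above.

```shell
# Print the virtual (logical) and the physical path of $HOME.
# 'pwd -L' keeps symbolic links, 'pwd -P' resolves them.
show_home_paths() {
    (cd "$HOME" && printf 'virtual:  %s\nphysical: %s\n' "$(pwd -L)" "$(pwd -P)")
}

# List all files below the given directory (default: current directory)
# that still mention the old physical prefixes /pfs/data1/ or /pfs/data2/.
find_old_physical_paths() {
    grep -rlE '/pfs/data[12]/' "${1:-.}" 2>/dev/null
}
```

For example, find_old_physical_paths "$HOME/scripts" would print the names of affected files in a (hypothetical) scripts directory, one per line.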

New user quota limits for $HOME

There are new user quota limits of 2 TiB and 10 million files on $HOME. Users who currently exceed these limits have to reduce their data during the next 4 weeks. You can show your current usage and limits with the following command:

$ lfs quota -uh $(whoami) $HOME

Note that in addition to the user limit there is a limit for your organization, which depends on its financial share. This limit existed before, too, but it is now enforced with so-called Lustre project quotas, whereas Lustre group quotas were used previously. You can show the current usage and limits of your organization with the following command:

$ lfs quota -ph $(grep $(echo $HOME | sed -e "s|/[^/]*/[^/]*$||") /pfs/data5/project_ids.txt | cut -f 1 -d\ ) $HOME
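
The one-liner above can be read more easily when it is split into its two steps. The sketch below is only an illustration: the function names org_dir_of and project_id_of are hypothetical, and the layout of /pfs/data5/project_ids.txt (project ID as the first space-separated field, followed by the organization directory) is an assumption inferred from the command itself.

```shell
# Step 1: strip the last two path components (GROUP/USER) from a home
# path, which leaves the organization directory, e.g. /home/ORG.
org_dir_of() {
    printf '%s\n' "$1" | sed -e 's|/[^/]*/[^/]*$||'
}

# Step 2: look up the line for that directory in a mapping file and
# print its first space-separated field (assumed to be the project ID).
project_id_of() {
    grep -- "$1" "$2" | cut -f 1 -d ' '
}

# On the cluster the two steps would be combined like this:
# lfs quota -ph "$(project_id_of "$(org_dir_of "$HOME")" /pfs/data5/project_ids.txt)" "$HOME"
```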

New user quota limits for the workspace file system

There are new user quota limits of 40 TiB and 30 million files on the workspace file system. Users who currently exceed these limits have to reduce their data during the next 4 weeks. The limits include released and expired workspaces, which are deleted internally after a few weeks. You can show your current usage and limits with the following command:

$ lfs quota -uh $(whoami) /pfs/work7

If really needed, users can request higher limits by opening a ticket or sending an email to the hotline. However, they have to describe clearly why higher limits are needed, how long the extended limits are required, and why other storage options cannot be used.

New stripe count settings for the workspace file system

On the workspace file system, the new Lustre feature Progressive File Layouts has been used to define the file striping parameters of directories. This means that the stripe count is adapted automatically as a file grows. In normal cases users no longer need to adjust file striping parameters for very large files or in order to reach better performance.

Hints for migration of $WORK data

Workspaces should be used instead of $WORK. Advantages of workspaces are a clear deletion policy and the ability to restore data from expired workspaces for a few weeks. The old $WORK will be mounted in read-only mode for 4 weeks. If data on $WORK is still needed, it has to be migrated by the users. Example for the migration to a workspace:

# create a new workspace
$ ws_allocate newwork 60
# define environment variable for old WORK
$ OLDWORK=$WORK
# define environment variable for workspace
$ WORK=$(ws_find newwork)
# copy the data with rsync (quotes protect paths containing spaces)
$ rsync -a -A "$OLDWORK/" "$WORK/"
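
Because the old $WORK is only available for a limited time, it may be worth checking that the copy is complete. The following is a rough sketch, not an official procedure: count_files and same_file_count are hypothetical helper names, and comparing file counts only detects missing files, not differing content. Note that walking a very large directory tree on Lustre can take a while.

```shell
# Count the regular files below a directory.
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# Succeed only if both directory trees contain the same number of files.
same_file_count() {
    [ "$(count_files "$1")" -eq "$(count_files "$2")" ]
}

# Possible use after the migration:
# same_file_count "$OLDWORK" "$WORK" && echo "file counts match"
# A dry run of rsync lists anything that would still be transferred:
# rsync -a -A -n -i "$OLDWORK/" "$WORK/"
```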

Of course, if only a part of your old $WORK is needed, you should copy only that part. As shown above, you can redefine the WORK environment variable. For example, such a definition could be added to old scripts, after which no further changes to those scripts should be required.
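
A definition of this kind could be placed near the top of an old script, for instance as sketched below. The function name use_workspace_as_work is hypothetical, and newwork is the workspace name from the example above. The sketch deliberately leaves WORK unchanged if the workspace cannot be found, so it does not rely on any particular error behaviour of ws_find.

```shell
# Point WORK at the named workspace if ws_find reports an existing
# directory; otherwise keep the current value of WORK.
use_workspace_as_work() {
    ws_dir=$(ws_find "$1" 2>/dev/null) || ws_dir=""
    if [ -n "$ws_dir" ] && [ -d "$ws_dir" ]; then
        WORK=$ws_dir
        export WORK
    fi
}

# Possible use at the top of an old script:
# use_workspace_as_work newwork
```

The rest of the script can then continue to refer to $WORK as before.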