Data Transfer/Rclone and BwUniCluster2.0: Difference between pages

m (Added unmount instruction + alternative to daemon flag)
 
<!-- Old text: check for permanent removal -->
<!--{| style="width: 100%; border-spacing: 5px;"
| style="text-align:center; color:#000;vertical-align:middle;font-size:75%;" |
[[File:BwUniCluster_2.0_Feb2020.jpg|center|border|550px|Close-up of bwUniCluster by Simon Raffeiner, Copyright: KIT (SCC)]]
|-
| style="text-align:center; color:#000;vertical-align:middle;" |<span style="font-size:80%">Close-up of bwUniCluster © KIT (Simon Raffeiner/SCC)</span>
|}


On 17.03.2020, the Scientific Computing Center (SCC) at Karlsruhe Institute of Technology (KIT) commissioned a new parallel computer system called "bwUniCluster 2.0+GFB-HPC" as a state service within the bwHPC framework. The bwUniCluster 2.0 replaces the predecessor system [https://www.scc.kit.edu/dienste/bwUniCluster.php bwUniCluster] and also includes the additional compute nodes which were procured as an extension to the bwUniCluster in November 2016.
[https://rclone.org/docs/ Rclone] is a command-line tool for managing files on remote systems (e.g. cloud storage systems). It can either synchronize files in one direction or mount a remote as a local file system with <code>rclone mount</code>. Data can also be transferred between two remote locations, in some cases without downloading it locally first. Transfers are multithreaded and operate on a per-file basis.
'''Caution:''' Rclone cannot be used with two-factor authentication (2FA).


The modern bwUniCluster 2.0 system consists of more than 840 SMP nodes with 64-bit Intel Xeon processors. It provides the universities of the state of Baden-Württemberg with general compute resources and can be used free of charge by the staff of all universities in Baden-Württemberg. Users who currently have access to bwUniCluster will automatically also have access to bwUniCluster 2.0. There is no need to apply for new entitlements or to re-register.
== Installation ==


Rclone is a Go program and comes as a single binary file.


-->
# Download the relevant binary.
# Extract the <code>rclone</code> executable, <code>rclone.exe</code> on Windows, from the archive.
# You can use the executables without further installation. For easy use, it is recommended to add the binary to your PATH environment variable. Information on how to do this can be found below.
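
On Linux or macOS, adding the directory that contains the binary to PATH could look like the following sketch. The directory <code>$HOME/bin</code> and the helper name <code>add_to_path</code> are assumptions for illustration; use whichever directory you extracted <code>rclone</code> into.

```shell
# Sketch: make the rclone binary reachable via PATH (Linux/macOS).
# RCLONE_DIR is a hypothetical location, not one prescribed by rclone.
RCLONE_DIR="${RCLONE_DIR:-$HOME/bin}"

add_to_path() {
    # Prepend the directory only if it is not already part of PATH
    case ":$PATH:" in
        *":$1:"*) ;;
        *) PATH="$1:$PATH" ;;
    esac
}

add_to_path "$RCLONE_DIR"
export PATH
```

To make the change permanent, the same lines can be placed in your shell start-up file (for example <code>~/.bashrc</code>).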


<!--
Detailed information regarding different operating systems can be found here:
###########################################
## bwUniCluster: Maintenance Section ##
###########################################
## Comment out full section if there is no upcoming maintenance


* Installation on [https://rclone.org/install/#windows Windows]
* Installation on [https://rclone.org/install/#macos macOS]
* Installation on [https://rclone.org/install/#script-installation Linux]


{| style=" background:#FEF4AB; width:100%;"
== Usage Rclone ==
| style="padding:8px; background:#FFE856; font-size:120%; font-weight:bold; text-align:left" | Next maintenance
|-
|
Due to regular maintenance work, the HPC system bwUniCluster 2.0 will not be available from


21.05.2024 at 08:30 until 24.05.2024 at 15:00
To use Rclone you have to define a config file. Afterwards you can connect by using the name of your configured connections.


Please see the [[BwUniCluster2.0/Maintenance/2024-05|maintenance]] page for more information about planned upgrades and other changes
=== Configure Remote ===
|}
-->
<!--
###########################################
## bwUniCluster: News section ##
###########################################
## Comment out full section if there is no news
-->
{| style=" background:#FEF4AB; width:100%;"
| style="padding:8px; background:#FFE856; font-size:120%; font-weight:bold; text-align:left" | Transition bwUniCluster 2.0 &rarr; bwUniCluster 3.0
|-
|
The HPC cluster '''bwUniCluster 3.0''' is the successor to bwUniCluster 2.0 and will go into '''operation on April 7, 2025'''.
<br>


The new bwUniCluster 3.0 is a hot water-cooled system that will be operated at KIT Campus North. This cooling approach is significantly more efficient and energy-saving than that of the previous system.
Before you can start using Rclone, you need to set up a remote. This means configuring a specific connection by providing authentication information, the network protocol you want to use, and a name for the configuration so that you can refer to it later.
<br>


'''Please note''': bwUniCluster 3.0 will be equipped with a '''new file system''' for HOME and workspaces. An automatic migration of your data to the new system will '''not''' take place. We will provide detailed instructions on how to migrate data. Bash scripts will also be available with which you can initiate the migration.
To configure a remote for a specific service, you need the following information:
<br>


The file systems of the old system will remain in operation for a period of 3 months after the new system goes live. This leaves enough time to copy any data to be migrated from the HOME directory and the workspaces to the new file systems.
* <code>&lt;remotehost&gt;</code>
<br>
* <code>&lt;username&gt;</code>
* <code>&lt;servicePassword&gt;</code>


|}
Furthermore, you have to decide on:


* <code>network protocol</code> (for example webDAV, smb, sftp)
* <code>remote-name</code> (for example you can use the name of the service you want to connect to)


<!--
There are three ways to set up a new remote, which are explained in the following sections.
###########################################
## Picture of bwUniCluster - right side ##
###########################################
-->
[[File:BwUniCluster_2.0_Feb2020_1024x423.jpg|right|frameless|thumb|alt=bwUniCluster2.0 |upright=1| bwUniCluster 2.0 ]]


<span style="font-size: 1.1em; text-decoration: underline;">'''Interactive Setup''' </span>


<!--
Execute:
###########################################
<pre>rclone config</pre>
## About bwUniCluster ##
This will guide you through an interactive setup process. You can find detailed instructions at the website:
###########################################
-->
The '''bwUniCluster 2.0''' is the joint high-performance computer system of Baden-Württemberg's Universities and Universities of Applied Sciences for '''general purpose and teaching''' and is located at the Scientific Computing Center (SCC) at Karlsruhe Institute of Technology (KIT). The bwUniCluster 2.0 complements the four bwForClusters and their dedicated scientific areas.


* [https://rclone.org/webdav/ Connect via webdav]
* [https://rclone.org/smb/ Connect via smb]
* [https://rclone.org/sftp/ Connect via sftp]


{| style="background:#FFCCCC; width:100%;"
<span style="font-size: 1.1em; text-decoration: underline;">'''Oneliner''' </span>
| '''The following issue is known:''' Due to the hardware configuration, there is a known problem with OpenMPI on the nodes in the "multiple_il" partition. When an MPI application starts, it emits the warning "No OpenFabrics connection schemes reported", which refers to the device "mlx5_2". This is an Ethernet port that is not supposed to be used by OpenMPI. The warning is purely informative; we are working on suppressing it.
|}


<!--
Define all parameters in one command. For example:
###########################################
<syntaxhighlight lang="bash">rclone config create test sftp host=<remotehost> user=<username> pass=<password> --obscure</syntaxhighlight>
## bwUniCluster: Maintenance Section ##
###########################################
## Comment out full section if there is no upcoming maintenance


<span style="font-size: 1.1em; text-decoration: underline;">'''Adjust Config File''' </span>


{| style=" background:#FEF4AB; width:100%;"
To see where the config file is located, run <code>rclone config file</code>.
| style="padding:8px; background:#FFE856; font-size:120%; font-weight:bold; text-align:left" | Next maintenance
|-
|
Due to regular maintenance work, the HPC system bwUniCluster 2.0 will not be available from


21.05.2024 at 08:30 until 24.05.2024 at 15:00
You can use the following snippet as a template for your connections.
<syntaxhighlight>[<remote-name>]
type = webdav
url = <hostURL>
vendor = other
user = <userID>


Please see the [[BwUniCluster2.0/Maintenance/2024-05|maintenance]] page for more information about planned upgrades and other changes
[<remote-name>]
|}
type = sftp
-->
host = <hostname>
<!--
user = <userID>
###########################################
key_use_agent = false
## bwUniCluster: News section ##
</syntaxhighlight>
###########################################
To add the password, please use
## Comment out full section if there is no news
<syntaxhighlight>rclone config update <remote-name> pass=<password> --obscure</syntaxhighlight>
-->

<!--
=== Use Remote ===
{| style=" background:#FEF4AB; width:100%;"

| style="padding:8px; background:#FFE856; font-size:120%; font-weight:bold; text-align:left" | News
The syntax to use Rclone is like this:
|-

|
<pre>rclone [options] subcommand <parameters> &lt;parameters...&gt;
* 2022-11-25: All new nodes from bwUniCluster 2.0 Stage 2 are now available.
</pre>
|}
List all directories/containers/buckets in the folder XX.
-->

<!--
<pre>rclone lsd <remote-name>:XX
###########################################
</pre>
## bwUniCluster: Training/Support section##
Copies /local/path to the remote path
###########################################

-->
<pre>rclone copy &lt;/local/path&gt; <remote-name>:&lt;remote/path&gt;
{| style=" background:#eeeefe; width:100%;"
</pre>
| style="padding:8px; background:#dedefe; font-size:120%; font-weight:bold; text-align:left" | Training & Support
Copies from the remote path to /local/path
|-

|
<pre>rclone copy <remote-name>:&lt;remote/path&gt; &lt;/local/path&gt;
* [[BwUniCluster2.0/First_Steps|Getting Started]]
</pre>
* [https://training.bwhpc.de E-Learning Courses]
Moves the contents of the source directory to the destination directory.
* [[BwUniCluster2.0/Support|Support]]

* [[BwUniCluster2.0/FAQ|FAQ]]
<pre>rclone move <remote-name>:&lt;source/path&gt; <remote-name>:&lt;destination/path&gt;
* Send [[:Category:Feedback|Feedback]] about Wiki pages
</pre>
|}
More subcommands can be found [https://rclone.org/docs/#subcommands here].
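
Before using the subcommands above in a script, it can help to check that the remote name is actually configured. The following is a minimal sketch, assuming the hypothetical helper name <code>has_remote</code> and the placeholder remote names <code>myremote</code> and <code>backup</code>. It relies on <code>rclone listremotes</code> printing one configured remote per line, each followed by a colon.

```shell
# has_remote NAME: reads the output of 'rclone listremotes' on stdin and
# succeeds if NAME is among the configured remotes (listed as "NAME:").
has_remote() {
    # -x: match the whole line, -F: treat the name as a fixed string
    grep -qxF -- "$1:"
}

# Demonstration with canned input instead of a real rclone call:
if printf 'myremote:\nbackup:\n' | has_remote myremote; then
    echo "remote found"
fi
```

With Rclone installed, the helper would be used as <code>rclone listremotes | has_remote myremote && rclone lsd myremote:</code>.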
<!--

###########################################
== Usage Rclone Mount ==
## bwUniCluster: User Documentation ##

###########################################
Before you can follow the instructions in this chapter, you need to have set up a [[Data_Transfer/Rclone#Usage_Rclone | remote]].
-->
Detailed information on how to use rclone mount can be found [https://rclone.org/commands/rclone_mount/ here].
{| style=" background:#deffee; width:100%;"

| style="padding:8px; background:#cef2e0; font-size:120%; font-weight:bold; text-align:left" | User Documentation
=== Windows ===
|-

|
To run rclone mount on Windows, you will need to [https://winfsp.dev/rel/ download and install WinFsp]. To mount on drive letter X or a nonexistent subdirectory, use:
* Access: [[Registration/bwUniCluster|Registration]], [[Registration/Deregistration|Deregistration]], [[BwUniCluster2.0/Jupyter|Using Jupyter]]

* [[BwUniCluster2.0/Login|Login]]
<pre>rclone mount &lt;remote-name&gt;:path/to/files X:
*
rclone mount &lt;remote-name&gt;:path/to/files C:\path\parent\mount</pre>
* [[BwUniCluster2.0/Hardware_and_Architecture|Hardware and Architecture]]
In contrast to Linux/Mac, there is no background mode.
** [[BwUniCluster2.0/Hardware_and_Architecture#File_Systems|File Systems and Workspaces]]

* [[BwUniCluster2.0/Software|Cluster Specific Software]]
=== MacOS & Linux ===
** [[BwUniCluster2.0/Containers|Using Containers]]

* [[BwUniCluster2.0/Slurm|Batch System]]
You can run mount in either foreground or background (aka daemon) mode. Mount runs in foreground mode by default. Use the <code>--daemon</code> flag to force background mode. If this doesn't work, you can put an <code>&</code> at the end of the command instead.
** [[BwUniCluster2.0/Batch_Queues|Queues and interactive Jobs]]

* [[BwUniCluster2.0/Maintenance|Operational Changes]]
Create an empty directory on your local machine and then execute
|}

<!--
<pre># to mount the root folder:
###########################################
rclone mount --vfs-cache-mode full &lt;remote-name&gt;: /path/to/empty/folder
## bwUniCluster: Acknowledgement ##
# to mount a subfolder:
###########################################
rclone mount --vfs-cache-mode full &lt;remote-name&gt;:folderX/folderY /path/to/empty/folder
-->
# to unmount (Linux; on macOS use: umount /path/to/mounted/folder):
{| style=" background:#e6e9eb; width:100%;"
fusermount -uz /path/to/mounted/folder </pre>
| style="padding:8px; background:#d1dadf; font-size:120%; font-weight:bold; text-align:left" | Cluster Funding

|-
== Best Practices ==
|

* Please [[BwUniCluster2.0/Acknowledgement|acknowledge]] bwUniCluster 2.0 in your publications.
Rclone has a lot of useful options.
|}

=== Performance ===

To be able to utilize a larger bandwidth, it is helpful to add the following options for increased performance:

<pre>
--transfers <int>
</pre>

Number of file transfers to run in parallel (default: 4). The best value depends on the local network, the read and write speeds of the file system, and the current load. For large transfers, it is advisable to test performance with different values beforehand.

* In our tests, we observed the best results between 8 and 32.
* For regular use cases, we recommend 16 as the default.
* Values above 64 are not recommended and degrade performance.
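
Testing different values beforehand could be scripted roughly as follows. This is only a sketch: the function name <code>bench_transfers</code>, the variables <code>SRC</code>, <code>DST</code> and <code>DRY</code>, and the paths are assumptions for illustration. By default the commands are only printed; set <code>DRY=0</code> to actually run and time them.

```shell
# Sketch: compare several --transfers values on a test directory.
# SRC and DST are placeholders; replace them with your own paths.
SRC="${SRC:-/local/path}"
DST="${DST:-myremote:remote/path}"
DRY="${DRY:-1}"   # 1 = only print the commands, 0 = run them

bench_transfers() {
    local n
    for n in "$@"; do
        # Each run gets its own destination; otherwise later runs would
        # skip files that an earlier run has already copied.
        if [ "$DRY" = "1" ]; then
            echo "rclone copy --transfers $n $SRC $DST/run-$n"
        else
            time rclone copy --transfers "$n" "$SRC" "$DST/run-$n"
        fi
    done
}

bench_transfers 8 16 32
```

The test copies under <code>run-8</code>, <code>run-16</code> and so on should be deleted from the remote afterwards.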

<pre>
--multi-thread-streams <int>
</pre>

Number of streams to use for multithreaded downloads (default: 4). This only matters for very large files, which are then split into chunk-sized pieces that are uploaded or downloaded in parallel.

The optimal value depends heavily on the local network and the hardware used. For regular use cases, we recommend keeping the default of 4.

=== Debugging and Statistics ===
To get updates on current progress, use:
<pre>
--stats <duration>
</pre>
Interval between printing stats, e.g. 500ms, 60s, 5m (0 to disable) (default 1m0s).

To get debug information, use:
<pre>
--log-level=DEBUG
--stats-log-level=DEBUG
</pre>
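
Put together, a copy command with progress output and debug logging might be composed as in the following sketch. The helper name <code>debug_copy_cmd</code>, the 30-second interval and the paths are illustrative assumptions. The function only prints the command so that it can be inspected before it is run.

```shell
# Sketch: compose a copy command that prints stats at a fixed interval
# and logs at debug level. Interval and paths are placeholders.
debug_copy_cmd() {
    local src="$1" dst="$2" interval="${3:-30s}"
    echo "rclone copy --stats $interval --log-level=DEBUG --stats-log-level=DEBUG $src $dst"
}

# Print the command for inspection; paste the output into a shell to run it.
debug_copy_cmd /local/path myremote:remote/path
```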

Revision as of 08:42, 31 March 2025

