SDS@hd/Access/WEBDAV

From bwHPC Wiki
Revision as of 13:07, 24 November 2023 by S Richling (talk | contribs)

It is possible to access the SDS@hd service from Windows, Mac and Linux using the WebDAV protocol.

This enables easy access to SDS@hd without additional registration of your own computer. It can also be useful if you are in a network in which SMB and NFS are not available, e.g., due to firewall restrictions, but you want a faster and more robust connection than SFTP.

Attention: WebDAV may not be suitable for permanent connections, because whether a connection stays reliably available depends heavily on the client used. For this reason, we recommend the Rclone client.

Prerequisites

Attention: To access data served by SDS@hd, you need a Service Password. See details at SDS@hd/Registration.

Easy access via web browser

For easy access, you can open SDS@hd in a web browser. Visit [1] and log in with your SDS@hd username and service password. There you get an overview of the data in your "Speichervorhaben" (storage project) and can download single files. To do more, such as moving data, uploading new files, or downloading complete folders, you need a suitable client as described in the next section.

Installing the WebDAV client Rclone

Rclone is a command line tool to manage files on cloud storage systems and can easily be used to access SDS@hd. Detailed instructions on how to download and install Rclone can be found here.

Quickstart

Rclone is a Go program and comes as a single binary file.

  • Download the relevant binary.
  • Extract the `rclone` executable, `rclone.exe` on Windows, from the archive.
  • You can start using the executable right away; no further installation is required. For convenience, we recommend adding the directory that contains the binary to your PATH environment variable. Information on how to do this can be found below.
  • Run `rclone config` to set up the SDS@hd connection. See rclone webdav config docs for more details.
  • Optionally configure automatic execution.
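
As a minimal sketch of the PATH step on Linux or macOS, assuming the `rclone` binary was extracted to `$HOME/bin` (a hypothetical location, adjust it to your system), the directory could be added for the current shell session like this:

```shell
# Assumption: rclone was extracted to this directory (hypothetical path).
RCLONE_DIR="$HOME/bin"

# Prepend the directory to PATH only if it is not already listed.
case ":$PATH:" in
  *":$RCLONE_DIR:"*) ;;
  *) PATH="$RCLONE_DIR:$PATH" ;;
esac
export PATH
```

To keep the setting across sessions, the same lines can be placed in your shell startup file, e.g. ~/.bashrc or ~/.zshrc.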

Detailed information regarding different operating systems can be found here:

Configuring the SDS@hd connection

To configure the SDS@hd WebDAV remote, you will need to use the following URL and have a valid username and password.

Config overview:

url = https://lsdf02-webdav.urz.uni-heidelberg.de
user = hd_xy123
pw = SERVICE_PASSWORD

To add the SDS@hd connection to rclone, simply run:

> rclone config

This will guide you through an interactive setup process. It is important to use the provided configuration values to get a working SDS@hd connection.

No remotes found, make a new one?
n) New remote
s) Set configuration password
q) Quit config
n/s/q> n
name> sds-hd

Type of storage to configure.
Choose a number from below, or type in your own value
[snip]
XX / WebDAV
   \ "webdav"
[snip]
Storage> webdav

URL of http host to connect to
E.g. https://example.com
Enter a value
url> https://lsdf02-webdav.urz.uni-heidelberg.de

Name of the WebDAV site/service/software you are using
Choose a number from below, or type in your own value
 1 / Fastmail Files
   \ (fastmail)
 2 / Nextcloud
   \ (nextcloud)
 3 / Owncloud
   \ (owncloud)
 4 / Sharepoint Online, authenticated by Microsoft account
   \ (sharepoint)
 5 / Sharepoint with NTLM authentication, usually self-hosted or on-premises
   \ (sharepoint-ntlm)
 6 / Other site/service or software
   \ (other)
vendor> other

User name
user> <insert sds@hd username, eg. hd_xy123>

Password.
y) Yes type in my own password
g) Generate random password
n) No leave this optional password blank
y/g/n> y
Enter the password:
password: <enter service pwd>
Confirm the password:
password: <enter service pwd>

Bearer token instead of user/pass (e.g. a Macaroon)
bearer_token>

Remote config
--------------------
[sds-hd]
type = webdav
url = https://lsdf02-webdav.urz.uni-heidelberg.de
vendor = other
user = hd_xy123
pass = *** ENCRYPTED ***
bearer_token =
--------------------
y) Yes this is OK
e) Edit this remote
d) Delete this remote
y/e/d> y

After this, you can exit the rclone config program.
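
For scripted setups, the remote can also be created without the interactive dialog. The following sketch uses `rclone config create` with the values from the config overview above. The helper name `create_sds_remote` is made up for illustration, and the password is deliberately left out so that it does not end up in the shell history.

```shell
# Hypothetical helper: create the SDS@hd WebDAV remote non-interactively.
#   $1 = remote name (e.g. sds-hd)
#   $2 = SDS@hd username (e.g. hd_xy123)
create_sds_remote() {
  rclone config create "$1" webdav \
    url=https://lsdf02-webdav.urz.uni-heidelberg.de \
    vendor=other \
    user="$2"
}

# Usage (not executed here):
#   create_sds_remote sds-hd hd_xy123
```

Afterwards, run `rclone config`, choose "Edit existing remote", and enter the service password there.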

Using Rclone client interactively

A detailed explanation on how to use Rclone can be found here!

In general, the syntax to use Rclone is like this:

Syntax: rclone [options] subcommand <parameters> <parameters...>

Source and destination paths are specified by the name you gave the storage system in the config file (e.g. sds-hd), followed by a colon and the subpath, e.g. "sds-hd:sd16a001" to look at Speichervorhaben "sd16a001" on SDS@hd.

A few examples for an easy start

List all directories/containers/buckets in the Speichervorhaben sd16a001.

rclone lsd sds-hd:sd16a001 

Copy /local/path to the remote path on SDS@hd.

rclone copy </local/path> sds-hd:<remote/path>

Copy from the remote path on SDS@hd to /local/path.

rclone copy sds-hd:<remote/path> </local/path>

Move the contents of the source directory to the destination directory.

rclone move sds-hd:<source/path> sds-hd:<destination/path>

More subcommands can be found here.
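
Before a larger transfer, it can help to preview what a command would do. The sketch below wraps `rclone copy` in a hypothetical helper, `copy_with_preview`, which first runs the copy with rclone's `--dry-run` flag and only starts the real transfer after confirmation. The flags `--dry-run` and `--progress` are standard rclone options that are not otherwise covered on this page, and the paths in the usage line are placeholders.

```shell
# Hypothetical helper: preview a copy, then ask before transferring.
#   $1 = source path, $2 = destination path
copy_with_preview() {
  src=$1
  dst=$2

  # Step 1: show what would be transferred; nothing is written.
  rclone copy --dry-run "$src" "$dst" || return 1

  # Step 2: ask for confirmation before the real transfer.
  printf 'Run the real copy? [y/N] '
  read -r answer
  if [ "$answer" = "y" ]; then
    rclone copy --progress "$src" "$dst"
  else
    echo "Aborted, nothing was copied."
  fi
}

# Usage (not executed here):
#   copy_with_preview /local/path sds-hd:sd16a001/data
```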

Using Rclone to create a local mount

Mount SDS@hd as a file system on a mount point. Detailed information on how to use rclone mount can be found here.

On Linux and macOS, you can run mount in either foreground or background (aka daemon) mode. Mount runs in foreground mode by default. Use the --daemon flag to force background mode. On Windows, you can run mount in the foreground only; the --daemon flag is ignored.

Using Rclone mount on Linux or macOS

On Linux or macOS, start the mount like this, where /path/to/local/mount is an empty existing directory:

rclone mount sds-hd:path/to/files /path/to/local/mount
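
A background mount and the matching unmount could look like the following sketch. The helper names and the default mount point `$HOME/sds-hd` are assumptions for illustration. The unmount commands follow the rclone mount documentation (`fusermount -u` on Linux, `umount` on macOS); some Linux systems may provide `fusermount3` instead.

```shell
# Assumption: local mount point (hypothetical default, override as needed).
MOUNT_POINT="${MOUNT_POINT:-$HOME/sds-hd}"

# Mount a Speichervorhaben in background (daemon) mode.
#   $1 = path on the remote, e.g. sd16a001
sds_mount() {
  mkdir -p "$MOUNT_POINT"
  rclone mount --daemon "sds-hd:$1" "$MOUNT_POINT"
}

# Unmount again; the command differs between macOS and Linux.
sds_unmount() {
  case "$(uname -s)" in
    Darwin) umount "$MOUNT_POINT" ;;
    *)      fusermount -u "$MOUNT_POINT" ;;
  esac
}

# Usage (not executed here):
#   sds_mount sd16a001
#   sds_unmount
```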

Using Rclone mount on Windows

To run rclone mount on Windows, you need to download and install WinFsp. More information can be found here. To mount on drive letter X: or on a nonexistent subdirectory, use:

rclone mount sds-hd:path/to/files X:
rclone mount sds-hd:path/to/files C:\path\parent\mount

Best practices

Rclone provides many useful options for WebDAV connections. The most relevant ones are described below.

Performance

To be able to utilize a larger bandwidth, it is helpful to add the following options for increased performance:

--transfers <int>

Number of file transfers to run in parallel (default: 4). The best value depends on the local network, the read and write speeds of the file system, and the current load. For large transfers, it is advised to test local performance with different values beforehand.

  • In our tests, we observed the best results between 8 and 32.
  • For regular use cases, we recommend 16 as the default.
  • Values above 64 are not recommended and degrade performance.

--multi-thread-streams <int>

Number of streams to use for multithreaded downloads (default: 4). This only matters for very large files, which are then uploaded or downloaded in chunk-sized parts over several streams in parallel.

The optimal value is highly specific to the local network and the hardware used. For regular use cases, we recommend 4 as the default.
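
To follow the advice of testing different values before a large transfer, a rough comparison could be scripted as below. This is only a sketch: the helper name `bench_transfers`, the tested values (8, 16, 32), and the paths are assumptions. Each run writes to its own destination directory, because rclone would otherwise skip files that an earlier run has already copied.

```shell
# Hypothetical helper: time the same copy with several --transfers values.
#   $1 = local test directory, $2 = remote base path for the test copies
bench_transfers() {
  src=$1
  dst=$2
  for n in 8 16 32; do
    start=$(date +%s)
    # Separate target per run so that no files are skipped as "already there".
    rclone copy --transfers "$n" --multi-thread-streams 4 "$src" "$dst/transfers-$n"
    end=$(date +%s)
    echo "transfers=$n: $((end - start)) s"
  done
}

# Usage (not executed here):
#   bench_transfers /local/testdata sds-hd:sd16a001/bench
```

Timing with whole seconds is coarse, so the test data should be large enough for each run to take at least a minute. Remember to delete the test copies on SDS@hd afterwards.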

Debugging and Statistics

To get updates on current progress, use:

--stats <duration>

Interval between printing stats, e.g. 500ms, 60s, 5m (0 to disable) (default 1m0s).

To get debug information, use:

--log-level=DEBUG 
--stats-log-level=DEBUG
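
Put together, a copy with progress statistics every 30 seconds and full debug output might look like this sketch. The helper name `sds_copy_debug` and the log file name are assumptions. `--log-file` is a standard rclone option not otherwise covered on this page; without it, rclone writes its log to standard error.

```shell
# Assumption: file that receives the debug log (hypothetical default).
LOG_FILE="${LOG_FILE:-rclone-debug.log}"

# Hypothetical helper: copy with periodic stats and debug logging.
#   $1 = source path, $2 = destination path
sds_copy_debug() {
  rclone copy \
    --stats 30s \
    --stats-log-level=DEBUG \
    --log-level=DEBUG \
    --log-file "$LOG_FILE" \
    "$1" "$2"
}

# Usage (not executed here):
#   sds_copy_debug /local/path sds-hd:sd16a001/data
```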