BinAC2/Project Data Organization: Difference between revisions

From bwHPC Wiki

Latest revision as of 09:16, 17 September 2025

Data Organization on BinAC 2 project directories

This guide explains how to organize and manage your data in the project directory at /pfs/10/project on BinAC 2. Each project has its own dedicated directory with flexible permission settings that let you control access to your files and collaborate effectively with team members.

Important: Use workspaces for actual computations. Data in workspaces is stored on much faster storage!

Project Directory Structure

Every project gets a dedicated directory located at:

/pfs/10/project/<project_id>/

For example, if your project ID is bw16f003, your project directory would be:

/pfs/10/project/bw16f003/

Understanding Your Project Directory

When you list your project directory, you'll see something like this:

$ ls -ld /pfs/10/project/bw16f003/
drwxrwx---. 5 root bw16f003 33280 Jul 25 14:13 /pfs/10/project/bw16f003/

Let's break down what this means:

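  • d: This is a directory (not a file)
  • rwxrwx---: The permissions (explained in detail below)
  • . (trailing dot): Shown by GNU ls when the entry has an SELinux security context; it is not part of the permission bits
  • 5: The number of hard links
  • root: The owner of the directory
  • bw16f003: The group that owns the directory (your project group)
  • 33280: The size of the directory entry itself in bytes (not the total size of its contents)
  • Jul 25 14:13: Last modification date and time
  • /pfs/10/project/bw16f003/: The full path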
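
The same fields can also be printed one at a time, which is handy in scripts. The following is a minimal sketch, assuming GNU coreutils stat (usual on Linux clusters); demo_dir is a throwaway stand-in created with mktemp, not a real project directory.

```shell
# Stand-in for a project directory (hypothetical; replace with your own path)
demo_dir=$(mktemp -d)
chmod 770 "$demo_dir"

# %A = type and permission string, %a = octal mode,
# %U = owner, %G = group, %s = size in bytes, %n = path
stat -c '%A %a %U %G %s %n' "$demo_dir"

# Capture single fields for later checks
perms=$(stat -c '%A' "$demo_dir")
mode=$(stat -c '%a' "$demo_dir")
```

Unlike ls, stat prints the permission string without a trailing marker, so perms is exactly ten characters long.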

Understanding Unix Permissions

Unix permissions control who can read, write, or execute files and directories. They are displayed as a 10-character string like drwxrwx---.

Permission Characters

  • r (read): Can view file contents or list directory contents
  • w (write): Can modify files or create/delete files in directories
  • x (execute): Can run files as programs or enter directories

Permission Groups

Permissions are shown for three groups of users:

  1. Owner (positions 2-4): The user who owns the file/directory
  2. Group (positions 5-7): Users who belong to the file's group
  3. Others (positions 8-10): Everyone else

Example Breakdown

For drwxrwx---:

  • d: Directory
  • rwx (owner): Owner can read, write, and execute
  • rwx (group): Group members can read, write, and execute
  • --- (others): Others have no permissions

Numeric Permission Values

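Permissions on files and directories are changed with the chmod tool. Most examples in this guide use its numeric (octal) notation: three digits, one each for owner, group, and others, where each digit is the sum of read (4), write (2), and execute (1). chmod also accepts symbolic modes such as g+w.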

Number | Binary | Permissions | Description
0      | 000    | ---         | No permissions
1      | 001    | --x         | Execute only
2      | 010    | -w-         | Write only
3      | 011    | -wx         | Write and execute
4      | 100    | r--         | Read only
5      | 101    | r-x         | Read and execute
6      | 110    | rw-         | Read and write
7      | 111    | rwx         | Read, write, and execute
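
As a worked example of the arithmetic in the table, the sketch below converts a nine-character permission string into its three octal digits. perm_to_octal is a hypothetical helper written for this illustration; it handles only plain r, w, and x bits (no setuid, setgid, or sticky bits).

```shell
perm_to_octal() {
    perms=$1
    # Expect exactly nine characters (three triplets)
    case $perms in
        ?????????) ;;
        *) echo "expected 9 characters, e.g. rwxr-x---" >&2; return 1 ;;
    esac
    result=""
    while [ -n "$perms" ]; do
        rest=${perms#???}          # everything after the first triplet
        triplet=${perms%"$rest"}   # the first three characters
        digit=0
        case $triplet in r??) digit=$((digit + 4)) ;; esac
        case $triplet in ?w?) digit=$((digit + 2)) ;; esac
        case $triplet in ??x) digit=$((digit + 1)) ;; esac
        result="$result$digit"
        perms=$rest
    done
    printf '%s\n' "$result"
}

perm_to_octal rwxr-x---   # prints 750
perm_to_octal rw-r--r--   # prints 644
```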

Common Permission Patterns

Pattern | Use Case                           | Description
700     | Private files/directories          | Only the owner can access
750     | Shared read-only directories       | Owner has full access; group can read and enter, but not write
770     | Shared read-write directories      | Owner and group have full access
644     | Publicly readable files            | Owner can edit; group and others can read
755     | Public directories and executables | Owner has full access; group and others can read and execute/enter

Data Organization Strategies

Private Directories

Create directories that only you can access:

# Create a private directory
mkdir /pfs/10/project/<project_id>/$USER

# Set permissions so only you can access it
chmod 700 /pfs/10/project/<project_id>/$USER

Permission breakdown for 700:

  • Owner: rwx (7 = 4+2+1) - full access
  • Group: --- (0) - no access
  • Others: --- (0) - no access
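
Between mkdir and chmod, the new directory briefly carries whatever permissions your umask allows. If that matters, mkdir -m sets the mode at creation time. A minimal sketch; base is a temporary stand-in for /pfs/10/project/<project_id>, and private_dir is an illustrative name:

```shell
# Stand-in for /pfs/10/project/<project_id> (hypothetical)
base=$(mktemp -d)

# Create the directory with mode 700 in a single step
mkdir -m 700 "$base/private_dir"

# Verify: the listing should start with drwx------
ls -ld "$base/private_dir"
private_mode=$(stat -c '%a' "$base/private_dir")
```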

Shared Project Directories

Create directories that all project members can access:

# Create a shared directory
mkdir /pfs/10/project/<project_id>/<your directory name>

# Set permissions for group access
chmod 770 /pfs/10/project/<project_id>/<your directory name>

Permission breakdown for 770:

  • Owner: rwx (7) - full access
  • Group: rwx (7) - full access for project members
  • Others: --- (0) - no access
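
Note that chmod 770 on the directory governs who may create and delete entries in it. The permissions of files created inside it afterwards are determined by each user's umask. With a umask of 022 (a common default), new files are not group-writable. The following is a minimal sketch of creating group-writable files, run in a subshell so the changed umask does not leak into the rest of your session; shared is a temporary stand-in directory and the file names are illustrative.

```shell
# Stand-in for a shared project directory (hypothetical)
shared=$(mktemp -d)
chmod 770 "$shared"

(
    # 007 removes all permissions for others, but none for owner and group
    umask 007
    touch "$shared/results.txt"
    mkdir "$shared/plots"
)

file_mode=$(stat -c '%a' "$shared/results.txt")
dir_mode=$(stat -c '%a' "$shared/plots")
echo "file: $file_mode, directory: $dir_mode"   # file: 660, directory: 770
```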

Read-Only Shared Directories

Create directories where project members can read but not modify:

# Create a read-only shared directory
mkdir /pfs/10/project/<project_id>/<your directory name>

# Set read-only permissions for group
chmod 750 /pfs/10/project/<project_id>/<your directory name>

Permission breakdown for 750:

  • Owner: rwx (7) - full access
  • Group: r-x (5 = 4+1) - can read and enter, but not write
  • Others: --- (0) - no access
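
The chmod 750 above changes only the directory itself. If the directory already contains data, the symbolic form with a capital X can be applied recursively: X adds execute only to directories (and to files that are already executable), so regular data files end up as 640 rather than 750. A minimal sketch on a temporary stand-in tree with illustrative names:

```shell
# Build a small stand-in tree (hypothetical names)
shared_ro=$(mktemp -d)
mkdir "$shared_ro/reference"
touch "$shared_ro/reference/genome.fa"
chmod 644 "$shared_ro/reference/genome.fa"

# Owner: read/write (+ enter directories); group: read (+ enter directories); others: nothing
chmod -R u=rwX,g=rX,o= "$shared_ro"

top_mode=$(stat -c '%a' "$shared_ro")
sub_mode=$(stat -c '%a' "$shared_ro/reference")
data_mode=$(stat -c '%a' "$shared_ro/reference/genome.fa")
echo "$top_mode $sub_mode $data_mode"   # 750 750 640
```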

Advanced Access Control with ACLs

For more fine-grained control beyond basic Unix permissions, you can use Access Control Lists (ACLs). ACLs allow you to set permissions for specific users or groups, in addition to the owner, group, and others classes.

Checking Current ACLs

# Check if a directory has ACLs
getfacl /pfs/10/project/<project_id>/<your directory name>

Setting User-Specific Permissions

# Give user 'alice' read and execute permissions
setfacl -m u:alice:rx /pfs/10/project/<project_id>/<your directory name>

# Give user 'bob' full permissions
setfacl -m u:bob:rwx /pfs/10/project/<project_id>/<your directory name>
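
A detail that often causes confusion: once a directory carries ACL entries, ls -l appends a + to the permission string, and the group bits shown there represent the ACL mask, i.e. the upper limit for all named-user and group entries. Running chmod on the group bits afterwards therefore changes the mask and can silently reduce what alice or bob may do. The sketch below demonstrates this on a temporary directory, using your own numeric user ID in place of alice. It assumes the acl tools are installed and that the filesystem holding the temporary directory supports ACLs, and reports "unsupported" otherwise.

```shell
acl_dir=$(mktemp -d)
chmod 750 "$acl_dir"

if command -v setfacl >/dev/null 2>&1 &&
   setfacl -m "u:$(id -u):rwx" "$acl_dir" 2>/dev/null; then
    # Adding a named-user entry recalculates the mask so that it covers the entry
    mask_after_setfacl=$(getfacl --omit-header "$acl_dir" 2>/dev/null | grep '^mask')

    # chmod on the group bits now edits the mask, not the owning group's entry
    chmod g-w "$acl_dir"
    mask_after_chmod=$(getfacl --omit-header "$acl_dir" 2>/dev/null | grep '^mask')

    # Entries limited by the mask are flagged with "#effective:"
    getfacl --omit-header "$acl_dir" 2>/dev/null
else
    mask_after_setfacl=unsupported
    mask_after_chmod=unsupported
fi
echo "$mask_after_setfacl -> $mask_after_chmod"
```

To widen the mask again without touching individual entries, setfacl -m m::rwx can be used on the directory.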

Removing ACLs

# Remove ACL for specific user
setfacl -x u:alice /pfs/10/project/<project_id>/<your directory name>

# Remove all ACLs and revert to basic permissions
setfacl -b /pfs/10/project/<project_id>/<your directory name>

Default ACLs for New Files

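Set default ACLs that new files and subdirectories created in a directory will inherit. Default ACLs do not change the permissions of existing files or of the directory itself:

# Set default ACL for new files and subdirectories in directory
# (capital X becomes x here because the target is a directory, so alice can
# enter new subdirectories; new regular files are still created without
# execute permission)
setfacl -d -m u:alice:rwX /pfs/10/project/<project_id>/<your directory name>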

Practical Examples

Example 1: Personal Workspace with Shared Results

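# Create your personal workspace; 710 lets project members pass through it (x)
# without being able to list its contents (no r). With 700, nobody else could
# reach the results directory below, whatever its own permissions are.
mkdir /pfs/10/project/<project_id>/user_alice
chmod 710 /pfs/10/project/<project_id>/user_alice

# Create a results directory that project members can read
mkdir /pfs/10/project/<project_id>/user_alice/results
chmod 750 /pfs/10/project/<project_id>/user_alice/results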

Example 2: Collaborative Analysis Directory

# Create collaborative space
mkdir /pfs/10/project/<project_id>/collaborative_analysis
chmod 770 /pfs/10/project/<project_id>/collaborative_analysis

# Create subdirectories for different types of work
mkdir /pfs/10/project/<project_id>/collaborative_analysis/{scripts,data,results}
chmod 770 /pfs/10/project/<project_id>/collaborative_analysis/*

Example 3: Mixed Access with ACLs

# Create a directory with complex permissions
mkdir /pfs/10/project/<project_id>/complex_project
chmod 750 /pfs/10/project/<project_id>/complex_project

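# Give specific users and groups different levels of access
# ('external_collaborators' is a placeholder group name)
setfacl -m u:alice:rwx /pfs/10/project/<project_id>/complex_project
setfacl -m u:bob:rx /pfs/10/project/<project_id>/complex_project
# Directories need x in addition to r: with r alone the group could list
# file names but not open anything inside
setfacl -m g:external_collaborators:rx /pfs/10/project/<project_id>/complex_project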

Best Practices

Plan Your Directory Structure

Before creating directories, plan your organization:

  • Personal directories: For work-in-progress and private files
  • Shared directories: For collaboration and shared resources

Use Descriptive Names

Choose clear, descriptive directory names.

Good:

  • protein_folding_analysis
  • shared_reference_genomes
  • alice_optimization_scripts

Avoid:

  • stuff
  • temp
  • dir1

Set Permissions Appropriately

  • Start with restrictive permissions and open up as needed
  • Use 750 for directories you want to share read-only
  • Use 770 for full collaboration
  • Use ACLs for complex permission requirements

Regular Backups

Important: The cluster does not provide automatic backups for project directories. You are responsible for backing up your important data.

Why Backups Matter:

  • Hardware failures can result in permanent data loss
  • Human errors (accidental deletion, wrong commands) happen
  • Failed jobs or system issues can corrupt files
  • No recovery is possible once data is lost from the cluster
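
One simple way to take such a backup is a compressed tar archive plus a checksum, copied afterwards to storage outside the cluster. The following is a minimal sketch under stated assumptions: project_dir and backup_dir are temporary stand-ins (in practice the source would be a directory under /pfs/10/project/<project_id>/ and the archive would be transferred to a system of your choice), and the file and archive names are purely illustrative.

```shell
# Stand-ins for the data to back up and the backup location (hypothetical)
project_dir=$(mktemp -d)
backup_dir=$(mktemp -d)
mkdir "$project_dir/results"
echo "example data" > "$project_dir/results/run1.txt"

archive="$backup_dir/results_$(date +%Y-%m-%d).tar.gz"

# -C changes into the parent directory so the archive stores relative paths
tar -czf "$archive" -C "$project_dir" results

# Record a checksum so the copy can be verified after transfer
( cd "$backup_dir" && sha256sum "$(basename "$archive")" > "$(basename "$archive").sha256" )

# Verify: list the archive contents and check the checksum
tar -tzf "$archive"
( cd "$backup_dir" && sha256sum -c "$(basename "$archive").sha256" )
```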

Document Your Organization

Create a README file in your project directory explaining the structure:

# Create documentation
cat > /pfs/10/project/<project_id>/README.md << 'EOF'
# Project <project_id> Directory Structure

## Overview
This project focuses on protein folding simulation and analysis.

## Directory Structure
- `shared_data/`: Reference datasets (read-only for group)
- `collaborative_analysis/`: Shared analysis scripts and results
- `user_alice/`: Alice's personal workspace
- `user_bob/`: Bob's personal workspace
- `archive/`: Completed analysis and backups

## Permissions
- Group members have read access to shared_data/
- All group members can contribute to collaborative_analysis/
- Personal directories are private to each user
EOF

Troubleshooting

Permission Denied Errors

If you get "Permission denied" errors:

  1. Check the permissions:
ls -la /pfs/10/project/<project_id>/problematic_directory
  2. Check if you're in the correct group:
groups
  3. Check ACLs if basic permissions look correct:
getfacl /pfs/10/project/<project_id>/problematic_directory
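
A frequent cause of "Permission denied" is a parent directory without execute (x) permission for you: to reach a file you need x on every directory along its path, regardless of the permissions on the file itself. The sketch below lists the permissions of each path component from the target up to /. show_path_permissions is a hypothetical helper name; where util-linux is installed, namei -l <path> gives a similar overview.

```shell
show_path_permissions() {
    path=$1
    while :; do
        ls -ld -- "$path"
        # Stop at the filesystem root (or at "." for relative paths)
        case $path in
            /|.) break ;;
        esac
        path=$(dirname -- "$path")
    done
}

# Demonstration on a temporary stand-in directory
check_dir=$(mktemp -d)
show_path_permissions "$check_dir"
```

Look for the first line, reading from the bottom up, on which neither your user, your groups, nor an ACL entry grants x.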

Cannot Create Files in Directory

This usually means the directory lacks write permissions for your user/group:

# Check directory permissions
ls -ld /pfs/10/project/<project_id>/target_directory

# Fix if you own the directory
chmod g+w /pfs/10/project/<project_id>/target_directory

Accidentally Locked Yourself Out

If you accidentally removed your own permissions:

# As the owner, you can always restore your own permissions;
# u+rwx changes only the owner bits and leaves group and others untouched
chmod u+rwx /pfs/10/project/<project_id>/locked_directory