iRODS at TACC
Last update: November 28, 2017
Introduction
iRODS is a data grid/data management tool. iRODS allows you to store data in a unified namespace using multiple storage resources, to replicate data so that copies exist on multiple systems, and to store checksums and arbitrary metadata with a file. The TACC iRODS configuration supports accessing iRODS through either the native iRODS tools such as the UNIX "i-Commands" or via the HTTPS and WebDAV protocols.
Each storage system accessed through iRODS is referred to as a "resource". In TACC's current Corral configuration, there are two disk-based resources accessible through iRODS, one for each of the Corral filesystems. The unreplicated GPFS filesystem is referred to within iRODS as corral-tacc
and the replicated filesystem is referred to as corral-repl
. By default all data will be stored in the corral-repl
file system.
Table 1. iRODS Resource Names and Corresponding Filesystems
Resource Name | Storage System |
---|---|
corral-repl |
Replicated GPFS file system (corral-repl |
corral-tacc |
TACC-only GPFS file system (corral-tacc |
Access to the iRODS system on Corral is available only to allocated users who have requested it. If you wish to utilize iRODS for accessing Corral, please indicate so in your Corral allocation request. Users with existing allocations who wish to make use of iRODS should submit a user support request. You may also email data@tacc.utexas.edu to discuss your data needs and we will be happy to make a recommendation on the best tools for managing your data.
Use of the Corral resource may be subject to allocation constraints.
iRODS Setup
Command-line Usage
The IRODS command-line utilities, collectively referred to as "i-commands", use a JSON configuration file to store iRODS initialization information under a user's home directory. A template version of that file is shown below in Example 1 below. To set up your computing environment, copy and paste this text into a file called ~/.irods/irods_environment.json
, then edit the file, replacing each instance of "USERNAME" with your TACC username. You may also change the irods_default_resource
line to use a different default resource if you so choose. Once you have created and saved this file, you can issue the iinit
command to start your iRODS session, after which you can store and retrieve data normally using the i-commands as documented below. If you will be accessing iRODS only through the WebDAV or other Web-based mechanisms you do not need to create this configuration file.
Example 1. iRODS Configuration Template
{
"irods_host": "irods.corral.tacc.utexas.edu",
"irods_port": 1247,
"irods_authentication_scheme": "PAM",
"irods_ssl_verify_server": "none",
"irods_log_level": "DEBUG",
"irods_default_resource": "corral-repl",
"irods_home": "/corralZ/home/USERNAME",
"irods_cwd": "/corralZ/home/USERNAME",
"irods_user_name": "USERNAME",
"irods_zone_name": "corralZ",
"irods_client_server_negotiation": "request_server_negotiation",
"irods_client_server_policy": "CS_NEG_REQUIRE",
"irods_encryption_key_size": 32,
"irods_encryption_salt_size": 8,
"irods_encryption_num_hash_rounds": 16,
"irods_encryption_algorithm": "AES-256-CBC",
"irods_default_hash_scheme": "SHA256",
"irods_match_hash_policy": "compatible",
"irods_maximum_size_for_single_buffer_in_megabytes": 32,
"irods_default_number_of_transfer_threads": 4,
"irods_transfer_buffer_size_for_parallel_transfer_in_megabytes": 4
}
GUI Usage
iRODS can be accessed via a Desktop GUI through the native support in the cross-platform Cyberduck file transfer client. Cyberduck is free, works on all major desktop platforms, and supports drag-and-drop operations from your desktop using the native iRODS protocol for high-performance network transfers.
https://web.corral.tacc.utexas.edu/irods-webdav/<i>PATH</i>
where "PATH
" is replaced with the path to your data in iRODS e.g. /corralZ/home/username
.
You can download Cyberduck for free from https://cyberduck.io. To connect to iRODS using the native iRODS plug-in for Cyberduck, download this Cyberduck profile to your desktop and open it in Cyberduck, either by double-clicking it or by dragging and dropping it onto the Cyberduck icon. Fill in your TACC username, and enter your password when you first connect to the server, and you will see a directory listing. By default you will be placed in your home directory, but you can change the default starting location to be a project directory if you have a shared project folder you wish to use instead. Once you have initiated a connection to the iRODS server via Cyberduck, you can drag files or folders into the Cyberduck window to upload them, and drag them from the window onto the desktop to download them. Cyberduck will perform recursive uploads and downloads, i.e. dragging a folder into the Cyberduck window will upload both the folder and any files and folders contained within that folder.
Basic i-Commands
Once you have configured the ~/.irods/irods_environment.json
file, you can use i-commands to access and manipulate data in the system. The i-commands nomenclature mimcs that of UNIX but with an "i" prepended to the command name e.g. ils
, imkdir
, icd
. Generally, i-commands are functionally equivalent to their UNIX counterparts. Complete i-commands documentation can be found on the iRODS site. The following table summarizes some of the most common i-commands.
Table 2. Common i-Commands
i-Command | Syntax | Description |
---|---|---|
iinit |
iinit yourpassword |
initialize and start an iRODS session |
ils |
ils file or directory |
like UNIX ls , list files or directories |
ilsresc |
ilsresc |
list all iRODS resources |
iput |
iput -v file /corralZ/home/user/dir |
store file/s into the system |
iget |
iget file /corralZ/home/user/dir |
retrieve file/s from the system |
imkdir |
imkdir new_directory |
like UNIX mkdir , create a new directory (collection) |
icp |
icp source destination |
copy a file or directory, many options available |
irsync
Use the irsync
command to synchronize a local directory with iRODS, similar to the Unix rsync command. It can be used to make an exact copy of a directory hierarchy on a local disk within iRODS, or retrieve an exact copy of a directory hierarchy already stored in iRODS. It may also be used to create an exact copy of a file or directory within iRODS. iRODS paths are identified with an i: prefix in the irsync
command.
For example, if you have created a directory within iRODS called /corralZ/home/joeuser/myproject
, and you wish to retrieve an exact copy of that directory on Stampede, run the command:
login1$ irsync -r i:/corralZ/home/joeuser/myproject /path/to/joeusers/workdir
After editing the files on Stampede, you can then synchronize the data back into iRODS using the command:
login1$ irsync -r /path/to/joeusers/workdir i:/corralZ/home/joeuser/myproject
If you are storing or retrieving data to Ranch with the -R ranch-main
option, you should also use the -s
switch - this will use the size rather than the checksum of the file to determine whether synchronization is necessary, thereby avoiding the need to retrieve all the files from tape to compute checksums. This will greatly improve the performance of synchronization with Ranch.
irm
Use the irm
command to remove files. By default, the irm
command only moves files to a temporary "trash" folder, and these files can be restored at a later date or completely removed with the irmtrash
command. You can use the -f
switch to force deletion of the file immediately, but files removed with the force switch can not be recovered.
irm
options include:
-f
force data removal-v
for verbose-r
for recursive-h
for help
ichmod
All users can see all other users' collections and files but cannot access, (read, write or own), where they do not have permissions. The ichmod
command, like the UNIX chmod command, allows a user to give file access permission to other users or groups. Note that iRODS only supports access control lists (ACLs), rather than the Unix-style user/group/other permissions structure. More information about ACLs is available in other TACC tutorials.
-
Read Permission
login1$ ichmod read testuser testfile.txt login1$ ils -A testfile.txt
The
ils -A
command shows the Access Control List (ACL) for the filetestfile.txt
. Here,testuser
has been given read permission ontestfile
. -
Ownership Permission
Giving another user ownership permission will enable them to change ACL for the file or folder. For example, to give
testuser
ownership permissions meanstestuser
can then extend read/write/owner permissions to other users.login1$ ichmod own testuser testfile.txt
-
Removing Permissions
You can assign
null
to remove permissions from the ACL for a file or folder:login1$ ichmod null testuser testfile.txt
-
Inherit Permission
The "inherit" permission is a special type of permission you can assign to a directory, which causes all new files and folders created within that directory to have the same permissions as the parent. For example, you can grant read permissions to a specific user, add the inherit permission, then add a number of files to that directory, which will also have the read permissions assigned to the enclosing folder. This makes it easier to share files within a group or to copy complicated permissions structures to additional files.
login1$ ichmod inherit testuser myfolder
ichmod
command options include:
-v
for verbose-r
for recursive-R
for Resource
Installing the iRODS client
You may also use the iRODS i-Commands to view and manipulate your data, as well as storing and retrieving data, from your own desktop or laptop running a Redhat or Ubuntu-based version of Linux, by downloading and installing the iRODS client tools. iRODS binary package downloads and installation instructions are at https://irods.org/download/.
References
See also: