How to automate striping setup on Lustre using iRODS for data ingest management

If iRODS[1] is used to manage petabytes of data on a high-performance filesystem like Lustre[2], you may want to set some specific parameters (like striping) before uploading a data set. This can usually be done with command-line tools. However, when you expect high I/O throughput, calling os.system from the Python rule engine for every newly created directory is probably not the best approach, since every call spawns a shell and a new process. In this post you'll see how to call the Lustre API directly from the iRODS Python rule engine.

Prepare a .so library providing the function you need

In my case the goal was to set striping of a directory on a specific Lustre pool, which can be achieved from the command line as:

lfs setstripe --pool=POOL_NAME /path/to/directory/on/lustre
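For comparison, the shell-out approach that the rest of this post tries to avoid could be sketched roughly as below. The helper names (build_setstripe_cmd, set_stripe_via_cli) are made up for illustration; only the lfs setstripe --pool=... invocation comes from the command above.

```python
import subprocess


def build_setstripe_cmd(pool_name, path):
    """Return the argv list for: lfs setstripe --pool=POOL_NAME PATH"""
    return ["lfs", "setstripe", "--pool=" + pool_name, path]


def set_stripe_via_cli(pool_name, path):
    """Run lfs once for a single directory and return its exit status.

    Every call starts a new process, which is the overhead we want to
    avoid when many directories are created during ingest.
    """
    return subprocess.call(build_setstripe_cmd(pool_name, path))
```

This works, but it costs one process per directory, which is why the next step goes through the Lustre C API instead.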

Checking the Lustre API documentation[3], you will very likely find the llapi_file_open function, which allows setting various striping parameters, but not the pool name. Thanks to #opensource, the alternative is to look into the lfs source code, which leads to the llapi_file_open_param function. After a quick look at liblustreapi.c[4] we find a convenient wrapper around it that does exactly what we want: llapi_file_open_pool. We could call it directly from the iRODS Python rule engine, but let's wrap it in our own C library, so that we can add error handling and construct the pool name at a lower level. This also keeps Python exceptions out of the rule engine: our function simply returns 0 on success and a negative error code on failure:

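#include <fcntl.h>   /* O_DIRECTORY */
#include <stdio.h>   /* snprintf */
#include <unistd.h>  /* close */
#include <lustre/lustreapi.h>
// To build: gcc -fPIC -shared /path/to/setStripe.c -o /etc/irods/setStripe.so -llustreapi
int setStripe(char *path, int offset)
{
    int rc;
    char pool[100];
    /* Build the pool name from its number, e.g. 3 -> "pool03". */
    snprintf(pool, sizeof(pool), "pool%02d", offset);
    /* mode 0, stripe size 0 (default), stripe offset -1, stripe count -1. */
    rc = llapi_file_open_pool(path, O_DIRECTORY, 0, 0, -1, -1, LOV_PATTERN_RAID0, pool);
    if (rc < 0)
        return rc;  /* negative error code from liblustreapi */
    close(rc);      /* on success rc is an open file descriptor; do not leak it */
    return 0;
}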

Now we can compile it, putting the resulting .so into the default iRODS configuration directory. Note that -llustreapi goes after the source file, because some linkers resolve libraries in command-line order:

gcc -fPIC -shared /path/to/setStripe.c -o /etc/irods/setStripe.so -llustreapi
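Before wiring the library into iRODS, it may be worth checking that it loads and really exports the function. A minimal sketch using ctypes follows; the helper name library_exports is my own, not part of iRODS or Lustre.

```python
import ctypes


def library_exports(so_path, symbol):
    """Return True if so_path can be loaded and exports symbol, else False."""
    try:
        lib = ctypes.CDLL(so_path)
    except OSError:
        # The file is missing, is not a shared object, or has unresolved dependencies.
        return False
    # ctypes raises AttributeError for unknown symbols, so hasattr() is enough.
    return hasattr(lib, symbol)
```

On the iRODS server, calling library_exports("/etc/irods/setStripe.so", "setStripe") should return True once the build above has succeeded.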

Use your library in core.py

Setting striping on directory creation is now as simple as calling the above function from our core.py:

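from ctypes import CDLL

# Load the wrapper library once, when core.py is imported.
lustreFunc = CDLL("/etc/irods/setStripe.so")
[…]
def Pyrule_resource_mkdir_post(rule_args, callback, rei):
    dataObj = splitKV(rule_args[0])
    physPath = str(dataObj['physical_path'])
    # 'pool' is the integer pool number; it has to be set by your own logic.
    # Under Python 3, pass physPath.encode() so that C receives a char* string.
    rc = lustreFunc.setStripe(physPath, pool)
    if rc:
        callback.writeLine("serverLog", "setting pool failed with: '" + str(rc) + "'")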
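If you want the Python side to be as defensive as the C side, the call can be wrapped so that it never raises inside a rule. The sketch below is only one way to do it: declare_signature, safe_set_stripe and the PYTHON_SIDE_ERROR sentinel are hypothetical names, and the only thing assumed about the library is the int setStripe(char *path, int offset) prototype shown earlier.

```python
from ctypes import c_char_p, c_int

# Hypothetical sentinel for failures that happen before reaching the C code.
PYTHON_SIDE_ERROR = -9999


def declare_signature(lib):
    """Tell ctypes the C prototype: int setStripe(char *path, int offset)."""
    lib.setStripe.argtypes = [c_char_p, c_int]
    lib.setStripe.restype = c_int
    return lib


def safe_set_stripe(lib, phys_path, pool):
    """Call lib.setStripe and return its code; never raise into the rule engine."""
    try:
        if not isinstance(phys_path, bytes):
            # Python 3 str must be encoded before it is passed as char*.
            phys_path = phys_path.encode("utf-8")
        return lib.setStripe(phys_path, int(pool))
    except Exception:
        return PYTHON_SIDE_ERROR
```

In core.py this would be used as lustreFunc = declare_signature(CDLL("/etc/irods/setStripe.so")) followed by rc = safe_set_stripe(lustreFunc, physPath, pool), keeping the same "if rc:" logging as above.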


That’s all 🙂 We can easily implement any logic deciding which pool will be used in the Python rule engine.
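As an illustration of such logic, here is a small sketch that picks the pool number from the physical path. The prefix table, the default value and the function names are invented for this example; only the "pool%02d" naming mirrors what setStripe builds in C.

```python
# Hypothetical mapping from directory prefix to pool number.
POOL_BY_PREFIX = {
    "/lustre/projA": 1,
    "/lustre/projB": 2,
}
DEFAULT_POOL = 0  # hypothetical fallback pool number


def choose_pool(phys_path):
    """Return the pool number of the longest matching prefix, or the default."""
    best = None
    for prefix in POOL_BY_PREFIX:
        base = prefix.rstrip("/")
        # Match whole path components only, so /lustre/projAB does not hit projA.
        if phys_path == base or phys_path.startswith(base + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    if best is None:
        return DEFAULT_POOL
    return POOL_BY_PREFIX[best]


def pool_name(number):
    """Mirror the C side: pool number 3 becomes 'pool03'."""
    return "pool%02d" % number
```

The number returned by choose_pool(physPath) is what would be passed as the second argument of setStripe; pool_name is only there to show which pool name the C code will end up using.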


[1] https://irods.org/
[2] https://www.lustre.org/
[3] https://doc.lustre.org
[4] https://github.com/lustre
