
Setting up lustre_rsync for Lustre filesystem replication

First, a changelog user must be defined:

Log in to the MDS, then run:

[root@nmds ~]# lctl --device nlustre-MDT0000 changelog_register
nlustre-MDT0000: Registered changelog userid 'cl1'
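
The user id printed at registration (cl1 here) is needed by every later command. As a small sketch, it could be captured in a script instead of being copied by hand. The helper name changelog_userid is hypothetical, and the pattern assumes the output line looks like the one shown above.

```shell
#!/bin/sh
# Hypothetical helper: read changelog_register output on stdin and
# print the user id found between the quotes (cl1 in the example above).
changelog_userid() {
    sed -n "s/.*Registered changelog userid '\([^']*\)'.*/\1/p"
}

# Possible use on the MDS (not executed here):
#   CL_USER=$(lctl --device nlustre-MDT0000 changelog_register | changelog_userid)
```

If the output format differs on another Lustre version, the function prints nothing, so a script should check for an empty result before continuing.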

Apart from lctl commands such as changelog_deregister, which run on the MDS, all operations hereafter must be done from a Lustre client, since the MDT does not know about mount points etc.

To display the metadata changes on an MDT (the changelog records), run:

lfs changelog fsname-MDTnumber [startrec [endrec]]

The start and end records are optional.

For example: lfs changelog nlustre-MDT0000
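
On a busy filesystem the raw listing gets long. The following is a minimal sketch for summarising it, assuming each record line starts with the record number followed by the operation type (for example 01CREAT). The function name changelog_type_counts is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count changelog records per operation type.
# Assumption: field 1 is the numeric record id, field 2 the type.
changelog_type_counts() {
    awk '$1 ~ /^[0-9]+$/ { count[$2]++ }
         END { for (t in count) print t, count[t] }' | sort
}

# Possible use on a Lustre client (not executed here):
#   lfs changelog nlustre-MDT0000 | changelog_type_counts
```

Lines that do not begin with a number are ignored, so stray header or error lines do not distort the counts.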

To deregister (unregister) a changelog user, run:

lctl --device nlustre-MDT0000 changelog_deregister cl1

changelog_deregister cl1 effectively does a changelog_clear cl1 0 as it deregisters.


Clearing Changelog Records

To notify a device that a specific user (cl1) no longer needs records (up to and including 3) -:

$ lfs changelog_clear nlustre-MDT0000 cl1 3

To confirm that the changelog_clear operation was successful, run lfs changelog; only records with an id greater than 3 should be listed:

$ lfs changelog nlustre-MDT0000
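
That check can also be scripted. The sketch below assumes, as in typical lfs changelog output, that each record line begins with its numeric record id. Both function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the lowest record id in changelog text
# read from stdin (nothing is printed if there are no records).
first_record_id() {
    awk '$1 ~ /^[0-9]+$/ { if (min == "" || $1 + 0 < min) min = $1 + 0 }
         END { if (min != "") print min }'
}

# Succeed only if every record on stdin has an id above the cleared id ($1).
cleared_up_to() {
    lowest=$(first_record_id)
    # An empty changelog means nothing is left to check.
    [ -z "$lowest" ] || [ "$lowest" -gt "$1" ]
}

# Possible use on a Lustre client (not executed here):
#   lfs changelog nlustre-MDT0000 | cleared_up_to 3 && echo "cleared"
```

Note that records are only purged once every registered changelog user has cleared them, so with more than one user the check can fail even though the clear for cl1 went through.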


To clear all records -:

$ lfs changelog_clear nlustre-MDT0000 cl1 0


Get source and target as closely in sync with each other as possible:

Remember that lustre_rsync does not look at the target at all, only at the source's changelog, so unlike rsync it cannot tell whether the source and target are identical.

So we have to clear the changelog and rsync source and target manually, one immediately after the other. From then on, lustre_rsync will work at any time thereafter by referring to the changelog.

Of course, a few files will always be missed: those written on the source during the manual rsync (after its scan), and the manual rsync can take a long time.

So:

Clear the changelog -:

lfs changelog_clear nlustre-MDT0000 cl1 0

Immediately start a manual rsync -:

rsync -raH --progress /nlustre/users/* /backup/replication/nlustre/users

rsync -raH --progress /nlustre/users/* /backup/replication/nlustre/users

(Running it twice in a row causes fewer files to be missed.)
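
The steps above can be sketched as one script so that no time is lost between the clear and the first rsync. DRY_RUN and the function names are hypothetical additions; with DRY_RUN=1 (the default here) the commands are only printed. The sketch uses the source path with a trailing slash instead of /nlustre/users/*, which also copies top-level dot files; that is a deliberate difference from the commands above.

```shell
#!/bin/sh
# Sketch of the initial sync: clear the changelog, then rsync twice.
# MDT name, changelog user and paths are the ones used on this page.
MDT=nlustre-MDT0000
CL_USER=cl1
SRC=/nlustre/users
DST=/backup/replication/nlustre/users
DRY_RUN=${DRY_RUN:-1}   # hypothetical switch: 1 = print only, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

initial_sync() {
    run lfs changelog_clear "$MDT" "$CL_USER" 0 || return 1
    run rsync -raH --progress "$SRC/" "$DST" || return 1
    run rsync -raH --progress "$SRC/" "$DST"
}

# Call initial_sync to preview the commands; set DRY_RUN=0 to run them.
```

The function stops at the first failing step, so a failed changelog_clear does not lead to an rsync that lustre_rsync cannot later build on.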


From now on, lustre_rsync can be used to sync source and target with a script, nlustre_replicate.sh -:

#!/bin/tcsh

# Script to replicate /nlustre/users to /backup/replication/nlustre/users

lustre_rsync --source=/nlustre/users --target=/backup/replication/nlustre/users --mdt=nlustre-MDT0000 --user=cl1 --statuslog /root/sysadmin/nlustre_rsync.log --verbose

# remove --verbose after debug

# Notify

echo "lustre_rsync cronjob completed on Tux"|mail -s "Nlustre rsync cron" johann.swart@up.ac.za
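
The script above mails "completed" whether or not lustre_rsync succeeded. A possible refinement, sketched here in /bin/sh instead of tcsh, is to build the mail subject from the exit status. This assumes lustre_rsync exits non-zero on failure, which should be verified on the installed version; the function name notify_subject is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: turn an exit status into a mail subject line.
notify_subject() {
    if [ "$1" -eq 0 ]; then
        echo "Nlustre rsync cron: OK"
    else
        echo "Nlustre rsync cron: FAILED (exit $1)"
    fi
}

# Possible use after the lustre_rsync call (not executed here):
#   status=$?
#   echo "lustre_rsync finished on Tux" | mail -s "$(notify_subject "$status")" <address>
```

With this, a failed run is visible from the subject line alone, without opening the status log.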


And cronjob -:

# JWS - Run lustre_rsync every day at 18h00

0 18 * * * /root/sysadmin/lustre_rsync/nlustre_replicate.sh
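
If a replication run could ever take longer than the 24 hours between cron invocations, two copies would overlap. As a hedged sketch, a small /bin/sh wrapper based on an atomic mkdir could guard against that. The function name with_lock and the lock path in the usage comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: run a command only if the lock directory can be
# created; mkdir is atomic, so two concurrent runs cannot both succeed.
with_lock() {
    lockdir=$1
    shift
    if mkdir "$lockdir" 2>/dev/null; then
        status=0
        "$@" || status=$?
        rmdir "$lockdir"
        return "$status"
    fi
    echo "lock $lockdir is held, skipping this run" >&2
    return 75   # arbitrary non-zero status chosen for "skipped"
}

# Possible use from a wrapper script called by cron (not executed here):
#   with_lock /var/lock/nlustre_replicate.lock /root/sysadmin/lustre_rsync/nlustre_replicate.sh
```

If a run is killed before rmdir executes, the lock directory stays behind and has to be removed by hand before the next run can start.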


All these scripts, cronjobs and temporary tools are at tux.bi.up.ac.za:/root/sysadmin/lustre_rsync.