GCP
This page documents the gcp released in fre/arkansas-11, which is available on the analysis nodes at GFDL and on Gaea at ORNL. gcp functions under the conditions listed in the test matrices further down this page. Additional functionality and simplifications are planned. For more information or specific questions about gcp, please email Tara.McQueen@noaa.gov or open a help desk ticket.
We currently recommend using smartsites for both the source and the destination of every gcp transfer. See the Syntax section for details.
Overview
GCP (general copy) is a convenience wrapper for copying data between NOAA RDHPCS sites. It mirrors the functionality and usage of standard copy tools such as cp, rcp, and scp. Initially, GCP knows only about the CMRS (Gaea) and GFDL sites; support for the other NOAA RDHPCS sites is planned throughout 2011 and 2012 as infrastructure is upgraded. The program encodes site-specific details and the ways each filesystem can be reached, such as high-throughput GridFTP data transfer nodes, low-throughput tunneled scp transfers, and rsync. It hides the details, possibly via a MOAB submission, of using globus-url-copy between striped data transfer nodes, scp -P through SSH tunnels, or rsync via GSISSH. Finally, it aims to accommodate transfer requests involving GFDL filesystems that are not HPCS-visible (/net2) as well as CMRS compute-node-only filesystems (FS), which involve multiple transfers.
Note: GCP is located in the fre module, so users must first run module load fre (or load a specific version of fre). To display the available options, type gcp -h. To suppress the output generated by gcp, use the -q option.
Syntax
If you have used rcp or scp to copy data, the GCP syntax will be familiar. GCP uses two smarthosts: cmrs:/ (or gaea:/), which points to Gaea, and gfdl:/, which points to GFDL. Use a smarthost whenever the source or destination is not mounted where the gcp command is issued. For instance, any transfer between Gaea and GFDL that is started on Gaea must use gfdl:/ for the GFDL side. The basic transfer syntax is as follows:
gcp /path/to/source /path/to/destination # for moving data locally
gcp /path/to/source site:/path/to/destination # for pushing data from local to remote
gcp site:/path/to/source /path/to/destination # for pulling data from remote to local
Users do not need to specify a smarthost when the path is mounted where the gcp call is made. As of version fre/test, users can use smartsites for every source to ensure files transfer appropriately. This is recommended for users who want to run common scripts on multiple platforms. GCP currently recognizes smarthosts only for gaea and gfdl. On Gaea, use gaea: followed by your path. At GFDL, use gfdl: followed by your path. Below are some examples.
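The smartsite convention described above can be illustrated with a small shell sketch. The function names gcp_site and gcp_path are hypothetical helpers written for this page, not part of gcp. They only show how an argument such as gaea:/path separates into a site and a path, assuming that cmrs: and ncrc: are treated as aliases for gaea: as noted elsewhere on this page.

```shell
#!/bin/bash
# Hypothetical helpers (not part of gcp): split a gcp-style argument
# into its smartsite and its path.

# Print the site an argument refers to, or "local" when no smartsite is given.
gcp_site() {
    case "$1" in
        gaea:*|cmrs:*|ncrc:*) echo "gaea" ;;    # the three Gaea aliases
        gfdl:*)               echo "gfdl" ;;
        *:*)                  echo "unknown"; return 1 ;;  # unsupported prefix
        *)                    echo "local" ;;   # plain path, no smartsite
    esac
}

# Print the path portion with any smartsite prefix removed.
gcp_path() {
    case "$1" in
        *:/*) echo "${1#*:}" ;;
        *)    echo "$1" ;;
    esac
}
```

For example, gcp_site ncrc:/ncrc/home1/Tara.McQueen/ prints gaea, and gcp_path gfdl:/archive/tlm/ prints /archive/tlm/.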
Running local transfers on Gaea (all equivalent):
gcp gaea:/lustre/ltfs/scratch/Tara.McQueen/fileA gaea:/ncrc/home1/Tara.McQueen/
gcp ncrc:/lustre/ltfs/scratch/Tara.McQueen/fileA ncrc:/ncrc/home1/Tara.McQueen/
gcp /lustre/ltfs/scratch/Tara.McQueen/fileA /ncrc/home1/Tara.McQueen/
Running local transfers on PP&AN at GFDL (all equivalent):
gcp gfdl:/work/Tara.McQueen/fileD gfdl:/home/tlm/
gcp /work/Tara.McQueen/fileD /home/tlm/
Running remote transfers on PP&AN within GFDL (all equivalent):
gcp gfdl:/archive/Tara.McQueen/fileE gfdl:/net/tlm/
gcp /archive/Tara.McQueen/fileE gfdl:/net/tlm/
Running remote "push" transfers from PP&AN to Gaea (all equivalent):
gcp gfdl:/work/Tara.McQueen/fileC gaea:/ncrc/home1/Tara.McQueen/
gcp gfdl:/work/Tara.McQueen/fileC ncrc:/ncrc/home1/Tara.McQueen/
gcp /work/Tara.McQueen/fileC gaea:/ncrc/home1/Tara.McQueen/
gcp /work/Tara.McQueen/fileC ncrc:/ncrc/home1/Tara.McQueen/
Running remote "push" transfers from Gaea to PP&AN (all equivalent):
gcp gaea:/lustre/fs/scratch/Tara.McQueen/fileA gfdl:/archive/tlm/
gcp ncrc:/lustre/fs/scratch/Tara.McQueen/fileA gfdl:/archive/tlm/
gcp /lustre/fs/scratch/Tara.McQueen/fileA gfdl:/archive/tlm/
What is mounted where?
On Gaea Login node
home = /ncrc/home1|2/$USER/
ltfs = /lustre/ltfs/scratch/$USER/
fs = /lustre/fs/scratch/$USER/
On Gaea Batch node
fs = /lustre/fs/scratch/$USER/
home = /ncrc/home1|2/$USER/
On PP&AN at GFDL
home = /home/$USER/
work = /work/$USER/
archive = /archive/$USER/
ptmp = /ptmp/$USER/
vftmp = /vftmp/$USER/
On Workstations at GFDL
home = /home/$USER/
net = /net/$USER/
net2 = /net2/$USER/
work = /work/$USER/ (read only)
archive = /archive/$USER/ (read only)
Filesystem notes:
- cmrs and ncrc are aliases for gaea. All three work as smarthosts and are used interchangeably in this documentation.
- Transfers from CMRS filesystems (/home, /lustre/ltfs) to GFDL filesystems (/archive, /work, /home) can be a one-step rsync.
- Transfers from GFDL filesystems (/archive, /work, /home) to CMRS filesystems (/home, /lustre/ltfs) can be a one-step GridFTP transfer.
- Transfers to or from the CMRS filesystem /lustre/fs are a two-step transfer through /lustre/ltfs/stage/$USER via a batch job submission to the CMRS LDTNs.
- Transfers involving GFDL workstation filesystems (/net, /net2) happen via GFDL LDTN nodes or via scp through an SSH tunnel.
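The routing rules in the notes above can be summarized in a short sketch. The function gaea_route is a hypothetical helper, not part of gcp, and it only restates the notes: for a transfer between Gaea and GFDL, a Gaea-side path under /lustre/fs is staged through /lustre/ltfs/stage/$USER, while paths under /lustre/ltfs or the Gaea home directories go in one step. How gcp makes this decision internally is not documented here.

```shell
#!/bin/bash
# Hypothetical helper (not part of gcp): report how a Gaea <-> GFDL transfer
# would be routed for a given Gaea-side path, following the filesystem notes.
gaea_route() {
    local path
    path="${1#*:}"    # drop a smartsite prefix such as gaea: if present
    case "$path" in
        /lustre/fs/*)               echo "two-step" ;;  # staged via /lustre/ltfs/stage/$USER
        /lustre/ltfs/*|/ncrc/home*) echo "one-step" ;;  # direct rsync or GridFTP
        *)                          echo "unknown"; return 1 ;;
    esac
}
```

For example, gaea_route gaea:/lustre/fs/scratch/Tara.McQueen/fileA prints two-step.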
Examples
Local
gcp /lustre/fs/scratch/$USER/file $CHOME
gcp /lustre/fs/scratch/$USER/file /lustre/ltfs/scratch/$USER/
Push
gcp /lustre/ltfs/scratch/$USER/filename gfdl:/archive/$USER
gcp $CHOME/filename gfdl:/home/$USER
Pull
gcp gfdl:/archive/$USER/file /lustre/fs/scratch/$USER
gcp gfdl:/work/$USER/file $CHOME
Output example without -q option:
Running command while on Gaea login node.
Tara.McQueen> gcp $CHOME/transfile /lustre/ltfs/scratch/Tara.McQueen/
gcp $Revision: 1.18 $ on gaea2.ncrc.gov by McQueen
GCP::args: 1 local files to transfer in ll direction, 37 bytes
GCP::determine_checksum_method: computing local checksums using /usr/bin/md5sum
GCP::determine_transfer_method: found valid certificate, may be using GridFTP or GSISSH + RSYNC usage
GCP::determine_transfer_method: Found gsissh at /sw/xt6/globus/default/bin/gsissh, may be using that for connectivity.
GCP::determine_transfer_method: source and destination on locally accessible filesystems, bypassing any further transfer forcing.
GCP::determine_transfer_method: copying to local path '/lustre/ltfs/scratch/Tara.McQueen' via /bin/cp
GCP::Inventory::start_checksums: checksumming disabled, returning.
Starting transfer using cp
GCP::Inventory::end_transfer: transfer complete; synchronize inventory files
GCP::Inventory::end_checksums: checksumming disabled, returning.
GCP::Inventory::start_checksums: checksumming disabled, returning.
GCP::Inventory::end_checksums: checksumming disabled, returning.
GCP::Inventory::validate_checksums: checksumming disabled, returning.
gcp options
More options are expected. The list below is a starting point, and not all of these options are fully implemented yet.
-h or --help  for more details/information
-v or --verbose  verbose mode
-d or --debug  debug output
-x [desired transfer method]  Override the defaults for the transfer method. gridftp, scp, cp and rsync are supported.
-r or --recursive  recursive (not supported on initial deployment)
-v or --version  Displays version number and exits.
-cd or --create-dirs  Create destination directories as needed. (NOT FULLY IMPLEMENTED YET...read notes below)
Note: the -cd/--create-dirs option changes the interpretation of the destination path, forcing it to be interpreted as a directory pathname. This will work for creating a directory on gaea from the analysis nodes here at GFDL. Below is an example from AN003:
gcp -v -d -cd /home/tlm/RTS/rts.xml gaea:/ncrc/home1/Tara.McQueen/NewDirName/
This creates the directory NewDirName/ on gaea, which does not exist yet. For this to work, the user must include the new directory names in the destination path and give gcp a file to transfer. gcp will not create these directories unless a file of size > 0 is being transferred. Users can create new subdirectories the same way. For instance:
gcp -v -d -cd /home/tlm/RTS/rts.xml gaea:/ncrc/home1/Tara.McQueen/NewDirName/foo/boo/
This command creates the directories NewDirName/, foo/ and boo/ if they do not already exist in the path.
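Because -cd only creates directories when a file larger than zero bytes is transferred, a script may want to check the source before calling gcp. The sketch below shows one way to do that. The function cd_source_ok is a hypothetical helper, not part of gcp, and relies only on the standard shell tests -f and -s.

```shell
#!/bin/bash
# Hypothetical pre-check (not part of gcp): succeed only when the source is a
# regular file with a size greater than zero, the condition under which
# gcp -cd is described as creating destination directories.
cd_source_ok() {
    [ -f "$1" ] && [ -s "$1" ]
}

# Usage, with the gcp command taken from the example on this page:
#   cd_source_ok /home/tlm/RTS/rts.xml &&
#       gcp -v -d -cd /home/tlm/RTS/rts.xml gaea:/ncrc/home1/Tara.McQueen/NewDirName/
```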
GCP working cases with 14M file (as of fre version 2.2)
Wildcards and recursive copies
GCP is capable of performing multiple-file and subdirectory transfers in most cases. Below are working examples:
To transfer from login node on gaea between fs, ltfs and home1|2 -
gcp -x rsync -r -v -d /path/to/dir/* /path/to/destination/
gcp -x rsync -r -v -d /ncrc/home1/Tara.McQueen/new/* /lustre/ltfs/scratch/Tara.McQueen/recdir/desdir/
To transfer from a gaea filesystem to gfdl -
gcp -r -v -d /path/to/dir/on/gaea/* gfdl:/path/to/destination/at/gfdl/
gcp -r -v -d /ncrc/home1/Tara.McQueen/new/* /lustre/ltfs/scratch/Tara.McQueen/recdir/desdir/
To pull from gfdl to gaea while on gaea you need to use * -
gcp -r -v -d gfdl:/path/to/files* gaea:/path/to/destination/
gcp -r -v -d gfdl:/archive/van/fms/INPUT/IPCC_AR5/emissions/RCP45_v1_21_12_2009/2030_1x1.fms* gaea:/lustre/ltfs/scratch/Tara.McQueen/
Wildcards work with gcp under the following conditions:
Gaea Login Nodes
| Source | gaea:/lustre/fs | gaea:/lustre/ltfs | gaea:/ncrc/home | gfdl:/archive | gfdl:/work | gfdl:/home |
|---|---|---|---|---|---|---|
| /ncrc/home | gcp | gcp | gcp | gcp | gcp | gcp |
| /lustre/ltfs | gcp | gcp | gcp | gcp | gcp | gcp |
Analysis Nodes
| Source | gaea:/lustre/fs | gaea:/lustre/ltfs | gaea:/ncrc/home | gfdl:/archive | gfdl:/work | gfdl:/home | gfdl:/net | gfdl:/net2 |
|---|---|---|---|---|---|---|---|---|
| /home | gcp | gcp | gcp | gcp | gcp | gcp | gcp | gcp |
| /archive | gcp | gcp | gcp | gcp | gcp | gcp | gcp | gcp |
| /work | gcp | gcp | gcp | gcp | gcp | gcp | gcp | gcp |
GFDL Workstation
| Source | gfdl:/archive | gfdl:/work | gfdl:/home | gfdl:/net | gfdl:/net2 |
|---|---|---|---|---|---|
| /archive | gcp | gcp | gcp | gcp | gcp |
| /work | gcp | gcp | gcp | gcp | gcp |
| /home | gcp | gcp | gcp | gcp | gcp |
Current Caveats
- The only smartsites supported by gcp are gfdl:, cmrs:, gaea:, and ncrc:. cmrs:, gaea:, and ncrc: are all equivalent. Use of any other sequence of characters for a smartsite will produce an error like:
GCP::smartsite: Smartsite specified, but no data transfer nodes defined for gridftp transfers!
- gcp to or from /ptmp is not currently supported from the workstation. Rather than failing outright, the error mode is that gcp keeps retrying the transfer:
gcp testme3 gfdl:/ptmp/Jeffrey.Durachta/testme4
GCP::Inventory::start_checksums: checksumming disabled, returning.
Starting transfer using gridftp
.
.
.
GCP::Inventory::transfer_gridftp: error from gridftp; retrying, effort 2 of 9999
GCP::Inventory::transfer_gridftp: error from gridftp; retrying, effort 3 of 9999
- The gaea /lustre/fs file system is not yet accessible to the workstation via gcp
- Thus you can copy between the file systems mounted read/write on the analysis and post-processing cluster and gaea home and long-term scratch, as well as within home, archive, work, your vftmp file system, and your net and net2 workstation file systems. The syntax for access to your net drive is shown below. Until a less verbose version is available, you will see a "failover" to a transfer protocol called "mcp":
an005 107> gcp testme3 /net/jwd
.
.
GCP::determine_transfer_method: copying to local path '/net/jwd' via /app/mpscp-1.3a/bin/gmpscp
GCP::Inventory::start_checksums: checksumming disabled, returning.
Starting transfer using gmpscp
gmpscp failed, exit status 1
WARNING: source or destination does not support mpscp transfers
WARNING: could not transfer /home/Jeffrey.Durachta/testme3 to /net/rab using mpscp, trying mcp
INFO: executing command: /usr/local/bin/mcp --threads=4 --buffer-size=1 /home/Jeffrey.Durachta/testme3 /net/jwd
- The gaea /lustre/fs file system is not yet accessible to the gfdl analysis / post-processing cluster via gcp
- gcp to or from /ptmp is not currently supported from the gaea nodes.
- The options gcp provides have not been tested for every transfer in the test matrices, and in some cases they will not work. The test cases and described functionality are based on single-file transfers to and from complete path locations.
- gcp does not fully handle symbolic link transfers. We will attempt to address this in version 1.2.3 at the earliest. More details to come.
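One caveat above notes that an unsupported /ptmp transfer keeps retrying instead of failing. A script can bound how long it waits by running the command under a time limit. The sketch below assumes timeout(1) from GNU coreutils is installed on the host. The function run_limited is a hypothetical wrapper, not part of gcp, and the limit value is chosen by the caller.

```shell
#!/bin/bash
# Hypothetical wrapper (not part of gcp): run a command under a time limit
# using timeout(1) from GNU coreutils. timeout exits with status 124 when
# the limit is reached; otherwise the command's own status is returned.
run_limited() {
    local limit status
    limit="$1"
    shift
    timeout "$limit" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "run_limited: gave up after $limit: $*" >&2
    fi
    return "$status"
}

# Usage (the 10m limit is an arbitrary illustration):
#   run_limited 10m gcp testme3 gfdl:/ptmp/Jeffrey.Durachta/testme4
```

Note that timeout signals only the command it started, so any helper processes that command launched may need separate cleanup.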
Bug Fixes per release
Version 1.1 fixes
- smartsites can be used anywhere without causing gcp to fail. Before, gcp only worked if smartsites were used for sources or destinations whose paths were not mounted where the gcp command was being called.
- spdcp is invoked for all transfers between ltfs and fs on gaea. Originally gcp used cp unless at least one smarthost was defined. It now uses spdcp regardless of whether a smarthost is defined.
- gcp now uses the gmpscp wrapper internally to do mpscp transfers on the gfdl side.
Version 1.2
- gcp functionality on the workstations. This will include a matrix for what works with gcp, what is not supported and what is not implemented yet.
- quiet mode is on by default for less verbosity
- wildcards (* only) supported for most transfers. Currently, any gcp transfer that uses gridftp will not work with wildcards. Most others work with * . A list of examples will be provided.
- Added flags for full word description. For instance -d is equivalent to --debug.
- fixed -help options and descriptions output
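Since wildcards are reported not to work with gridftp transfers, a script that forces a method with -x could check for that combination first. The sketch below is a hypothetical pre-check, not part of gcp. The function name wildcard_ok and its two-argument form (method, then source) are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical pre-check (not part of gcp): fail when a source containing '*'
# is combined with the gridftp transfer method, which the notes above
# describe as unsupported.
wildcard_ok() {
    local method source_arg
    method="$1"
    source_arg="$2"
    case "$source_arg" in
        *'*'*) [ "$method" != "gridftp" ] ;;  # wildcard present: reject gridftp
        *)     return 0 ;;                    # no wildcard: any method is fine
    esac
}

# Usage:
#   wildcard_ok rsync '/path/to/dir/*' && gcp -x rsync -r /path/to/dir/* /path/to/destination/
```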
Version 1.2.1 (currently in fre)
- deployed on March 23, 2011.
- rsync partial flag removed. This will prevent partial files from being leftover at the destination if a gcp transfer fails while using rsync.
- net3 support from analysis nodes and workstations (still in testing)
Emergency patch: to fix spdcp error where it will not overwrite a previously transferred file.
- Applied in gcp Version 1.2.1.
- Problem identified and patched on March 23, 2011.
- gcp is now detecting this error and using cp as a work-around.
- Impacts transfers between FS and LTS on Gaea.
- The bug appears to be in the spdcp utility that gcp relies on.
- hsmget was not susceptible to this failure as it does not rely on spdcp.
- This error would only have been reported in stdout if users were running gcp with the -v or -d options, in which case an MPI error message would be reported.
- The patch detects the MPI error and does the transfer with cp.
- The error seen from gcp would look similar to:
MOAB based transfer using spdcp has apparently failed, detected via mpirun noticed that process rank 2 with PID 15449 on node gaea-ldtn9 exited on signal 6 (Aborted).
- It is uncertain how long this failure mode has existed.
- In this failure-mode, gcp would have reported a successful exit status of "0".
- This means that shell scripts would not have caught the error.
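Because an exit status of 0 did not guarantee a successful copy in this failure mode, a script can verify a transfer independently when both the source and the destination are mounted where it runs, as with FS and LTFS on a Gaea login node. The sketch below compares md5sum digests. The function same_md5 is a hypothetical helper, not part of gcp.

```shell
#!/bin/bash
# Hypothetical post-check (not part of gcp): compare the md5sum digests of a
# source file and its copy. Both paths must be readable from this host.
# Returns 0 when the digests match, 1 when they differ, 2 on a read error.
same_md5() {
    local a b
    a=$(md5sum < "$1") || return 2
    b=$(md5sum < "$2") || return 2
    [ "$a" = "$b" ]
}

# Usage after a local transfer:
#   gcp /lustre/fs/scratch/$USER/file /lustre/ltfs/scratch/$USER/ &&
#       same_md5 /lustre/fs/scratch/$USER/file /lustre/ltfs/scratch/$USER/file ||
#       echo "transfer could not be verified" >&2
```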
Outstanding issues and bugs to be resolved
Version 1.2.2 (not released yet and fixes to be incorporated are tentative)
Soon to be put into fre/test.
- have gcp use the correct spdcp wrapper
- gcp to better manage the transfer of a small versus large file when using spdcp wrapper
- end-to-end checksumming within gcp
- typo fix for /autofs/na2_home2/$USER/ directory specification
- improved error messages
- /net3 support
Version 1.2.2+ (requested features)
- softlink transfers
- end-to-end checksums within gcp
- ability to disable checksumming
- a preserve permissions flag
- a preserve time-stamp flag




