Name

pod-ssh — Submits, retrieves the status of, and cleans PoD workers using SSH connections.

Synopsis

pod-ssh [-h, --help] [-v, --version] [-d, --debug] [-c file, --config=file] [-e arg, --exec=arg] [-t arg, --threads=arg] [--logs] [--for-worker arg] {[submit] | [clean] | [fast-clean] | [status]}

Description

The pod-ssh command can be used to submit, check the status of, and clean PoD workers using an SSH connection.

[Important]Important

The current implementation requires users to have public-key (passwordless) access to the destination remote hosts (worker nodes).

The pod-ssh command takes PoD's SSH plug-in configuration file as input. The configuration file is a comma-separated values (CSV) file. Fields are normally separated by commas. If you want to put a comma in a field, you need to put quotes around the field. Also 3 escape sequences are supported.
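For illustration, a field that itself contains a comma can be wrapped in double quotes. The following entry is purely hypothetical (host, path, and id invented here) and only demonstrates the quoting:

```
r3, user@host.example.org, , "/tmp/pod,test", 8
```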

Table 10.1. PoD's ssh plug-in configuration fields

 1. an id (must be any unique string).

    This id string is used only to distinguish different PoD workers in the SSH plug-in.

 2. a host name with or without a login, in the form: login@host.fqdn

 3. additional SSH params (could be empty)

 4. a remote working directory

 5. a desired number of PROOF workers (could be empty).

    If this parameter is empty, PoD will spawn as many PROOF workers on that host as there are CPU cores.


An example of a configuration file:

r1, anar@lxg0527.gsi.de, -p24, /tmp/test, 4
# this is a comment
r2, user@lxi001.gsi.de,,/home/user/pod,16
125, user2@host, , /tmp/test,
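As a rough sketch of how such a file can be processed with standard tools (this is an illustration, not part of PoD itself; hosts and ids are taken from the example above, and quoted commas are assumed absent), the following shell fragment prints the id and host of every worker entry, skipping comments and blank lines:

```shell
#!/bin/sh
# Write a sample PoD SSH plug-in configuration file (example entries).
cat > pod_ssh_config_file <<'EOF'
r1, anar@lxg0527.gsi.de, -p24, /tmp/test, 4
# this is a comment
r2, user@lxi001.gsi.de,,/home/user/pod,16
125, user2@host, , /tmp/test,
EOF

# Print "<id>: <host>" for each worker entry.
# Field 1 is the id, field 2 is the host; gsub trims surrounding spaces.
awk -F',' '/^[[:space:]]*(#|$)/ {next}
           {id=$1; host=$2
            gsub(/^[[:space:]]+|[[:space:]]+$/, "", id)
            gsub(/^[[:space:]]+|[[:space:]]+$/, "", host)
            print id ": " host}' pod_ssh_config_file
```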

The pod-ssh command remembers the last entered config-file pathname, so the next time you want to use pod-ssh with the same config file, you can simply call pod-ssh without the --config option. The command always uses the most recently given setting. In order to use this feature, BOOST 1.41.0 (or higher) is required.

Environment on Worker Nodes

With the SSH plug-in it is very often the case that PoD can't start workers because xproofd/ROOT is not in the PATH on the worker nodes. This can happen because, on some systems, a non-interactive (batch) SSH login does not call your /etc/profile login script, so the environment variables of a normal login session are missing. A symptom of this problem is a PoD job that shows the DONE status immediately after submission. You may want to check the remote log files from the worker nodes (see the section called “Examples”); if they say that there were problems starting xproofd, then you need to customize the environment on the WNs. To solve this issue, users can specify the full path to the desired ROOT version on the worker nodes in PoD.cfg, in case all WNs have the same version of ROOT installed at the same path. The more advisable solution, however, is to use an inline BASH script.

Inline BASH script

Users can define the remote environment for PoD SSH worker nodes via a so-called inline BASH script. To define a script, just use the @bash_begin@ and @bash_end@ tags in your PoD SSH configuration file. For example:

@bash_begin@    
    # GSI
    . /etc/profile.d/gsi.sh
    . rootlogin 527-06b-xrd
@bash_end@
    
r1, anar@lxg0527.gsi.de, -p24, /tmp/test, 4
# this is a comment
r2, user@lxi001.gsi.de,,/home/user/pod,16
125, user2@host, , /tmp/test,

Everything that PoD finds between those tags is considered an environment script and will be sourced on each worker node listed in that configuration file.
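Conceptually, the extraction works like this (an illustrative sketch, not PoD's actual implementation; the configuration content and file names here are invented):

```shell
#!/bin/sh
# Sample configuration with an inline BASH script (hypothetical content).
cat > pod_ssh_config_file <<'EOF'
@bash_begin@
    # set up the remote environment
    export PATH=/opt/root/bin:$PATH
@bash_end@
r1, user@host.example.org, , /tmp/test, 4
EOF

# Keep only the lines between @bash_begin@ and @bash_end@ (tags excluded);
# this is the part that would be sourced on every worker node.
awk '/@bash_end@/   {inblk=0}
     inblk          {print}
     /@bash_begin@/ {inblk=1}' pod_ssh_config_file > worker_env.sh
```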

By using this feature, users are able to define different configuration files for different clusters, each of which can define its own list of worker nodes and a corresponding environment script.

Be advised that if an inline BASH script is found, PoD will not use user_worker_env.sh.

The pod-ssh utility exits 0 on success, and >0 if an error occurs.

Options

-h, --help

Show summary of options.

-v, --version

Version information.

-d, --debug

Show debug messages. This option enables a debug mode and helps in some cases to understand what is going wrong.

-c file, --config=file

PoD's ssh plug-in configuration file. A workers description file.

-e arg, --exec=arg

Execute a local shell script on the remote worker nodes.

-t arg, --threads=arg

Defines the number of threads in pod-ssh's thread pool. The minimum value is 1; the maximum is (number of CPU cores * 2). Default: 5.

--logs

Download all log files from the worker nodes. Can be used only together with the clean command. Logs are copied to the PoD log directory, the path of which is configurable via PoD user defaults.

--for-worker arg

Perform an action on the specified worker nodes (arg is a space-separated list of WN names). Can only be used in connection with "submit", "clean", "fast-clean", and "exec".

submit

Submit PoD workers according to the entries in the configuration file.

clean

Clean all PoD workers according to the entries in the configuration file.

fast-clean

The fast version of the clean procedure. It only shuts the PoD workers down; it doesn't actually clean the workers' directories.

status

Request status of PoD workers listed in the configuration file.

The status can take the following values:

  • RUN - the PoD job is running,
  • DONE - the PoD job is done, meaning the PoD worker is not running on that worker node; it could also be the case that the worker failed to start,
  • CLEAN - the PoD worker has been cleaned,
  • UNKNOWN - it is not possible to retrieve the status of that worker.

Examples

Example 10.5. Submit PoD jobs via SSH

pod-ssh -c pod_ssh_config_file submit


Example 10.6. Check the status of PoD jobs submitted via SSH

pod-ssh status

Check the number of available PROOF workers:

pod-info -n

or

pod-info -l


Example 10.7. Clean PoD jobs submitted via SSH

pod-ssh clean

You can also clean and download all log files from the WNs:

pod-ssh clean --logs


Example 10.8. Clean only specific worker nodes

pod-ssh --for-worker r1 r2 clean