Releasing jobsub_lite to all experiment interactive nodes Wednesday, Jan. 18

Dear Jobsub users,

WHAT ARE WE DOING?

The Computational Science and Artificial Intelligence Directorate will be releasing jobsub_lite, the rewrite of jobsub_client/jobsub_server, to FIFE (FabrIc for Frontier Experiments) experiment interactive machines.

WHEN WILL THIS OCCUR?

Wednesday, Jan. 18, 2023, from 8 a.m. to 2 p.m. Central Time

WHAT IS THE IMPACT TO YOU?

There are two major impacts:

  1. jobsub_lite uses SciTokens (https://scitokens.org) to authorize users to submit jobs, write to storage, etc. The first time users submit jobs, they will have to authenticate with the token issuer (CILogon) using their Services account username and password. Detailed instructions are available here: https://fifewiki.fnal.gov/wiki/Getting_started_with_jobsub_lite#Authentication

 

  2. Users will no longer be required to source UPS-environment-building scripts or set up jobsub_client. By simply logging in to an experiment interactive node, they will be able to run all the normal jobsub commands (jobsub_submit, jobsub_q, etc.).

 

We have tried to make jobsub_lite as close to a direct replacement for jobsub_client as possible, though there will be some slight differences from the current jobsub_client. For a short time, you will be able to submit and manage jobs with both the old jobsub_client and the new jobsub_lite, though jobs submitted with jobsub_lite will not be manageable from jobsub_client, and vice versa. If you explicitly set up an old version of jobsub_client from UPS (UNIX Product Support), that is, a version that is not the “current” one after go-live, you will get jobsub_client. If you do nothing, you will get jobsub_lite.

When jobsub_lite is released, a new version of jobsub_client, v_lite, will be made current in UPS. This new version of jobsub_client will simply point to the jobsub_lite executables. This is being done so scripts that set up jobsub_client will automatically begin to use jobsub_lite. For the aforementioned short period of time when both jobsub_client and jobsub_lite are usable on all interactive nodes, older versions of jobsub_client can be set up by passing the old version to the setup command. We strongly discourage this, but we understand that there may be a few corner cases where using the old jobsub_client might be needed during the transition period.
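Scripts that need to know which client they picked up can inspect where jobsub_submit resolves. The following is a minimal sketch, not an official tool. It assumes jobsub_lite lives under /opt/jobsub_lite, as in the sample session later in this message; the prefix may differ on your node. The function name jobsub_flavor is hypothetical. Any other location is reported as unknown rather than guessed, since the v_lite UPS product may expose the jobsub_lite executables through a different path.

```shell
#!/bin/sh
# Hypothetical helper: classify a jobsub_submit path by its install prefix.
# ASSUMPTION: jobsub_lite is installed under /opt/jobsub_lite (taken from
# the sample session in this announcement); adjust if your node differs.
jobsub_flavor() {
    case "$1" in
        "")                 echo "not-found" ;;
        /opt/jobsub_lite/*) echo "jobsub_lite" ;;
        *)                  echo "unknown" ;;
    esac
}

# Classify whichever jobsub_submit is first in PATH right now.
jobsub_flavor "$(command -v jobsub_submit 2>/dev/null)"
```

A result of "unknown" only means the path did not match the assumed prefix; it does not by itself show that the old jobsub_client is in use.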

WHAT DO YOU NEED TO DO?

  • Please test your workflows with jobsub_lite by following the instructions below.
  • If you are able, try to attend one of the training sessions the FIFE Group will be holding in January to learn more about jobsub_lite. See details below.

 

Testing jobsub_lite

Several experiments’ offline coordinators/liaisons have already requested that jobsub_lite be installed on a single experiment interactive node so users can test. Please reach out to your offline coordinator or liaison to see if jobsub_lite is installed on an interactive node for your experiment, and if so, test your workflows with jobsub_lite. If your experiment does not have a node with jobsub_lite and you want to test, please contact your liaison or the FIFE Group, and we will find a place for you to test.

jobsub_lite is deployed on the test interactive nodes and will be deployed everywhere via RPM, with the jobsub_lite executables installed into users’ PATHs at login. So, to use jobsub_lite on a node on which it is installed, simply log in to the node, and run the various jobsub commands like before. You don’t need to run the “setup” command from UPS to use jobsub_lite. For example, this is what running jobsub_lite commands would look like on novagpvm03.fnal.gov:

 

$ kinit -f yourusername@FNAL.GOV
$ ssh novagpvm03.fnal.gov

… MOTD for novagpvm03

-bash-4.2$ which jobsub_submit
/opt/jobsub_lite/bin/jobsub_submit
-bash-4.2$ jobsub_submit -G nova file:///usr/bin/sleep 300
Attempting kerberos auth with https://htvaultprod.fnal.gov:8200 … succeeded
Attempting to get token from https://htvaultprod.fnal.gov:8200 … succeeded
Storing vault token in /tmp/vt_u10610
Storing bearer token in /tmp/bt_token_nova_Analysis_10610
Submitting job(s).
1 job(s) submitted to cluster 57107298.
Use job id 57107298.0@jobsub01.fnal.gov to retrieve output
-bash-4.2$

 

As mentioned above, the first time you do any grid operations using jobsub_lite, you will need to authenticate with our token issuer, CILogon. More information about authentication and submitting jobs using jobsub_lite can be found in this tutorial:

https://fifewiki.fnal.gov/wiki/Getting_started_with_jobsub_lite
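When testing a workflow from a script, it can help to capture the job id that jobsub_submit prints. The following is a minimal sketch. It assumes the "Use job id ... to retrieve output" line keeps the form shown in the sample session above and is written to standard output; neither is guaranteed to stay stable. The helper name extract_job_id is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read jobsub_submit output on stdin, print the job id.
# ASSUMPTION: the output contains a line of the form
#   Use job id <id> to retrieve output
# as in the sample session in this announcement.
extract_job_id() {
    sed -n 's/^Use job id \([^ ]*\) to retrieve output.*$/\1/p'
}

# Example using the line from the sample session:
echo "Use job id 57107298.0@jobsub01.fnal.gov to retrieve output" | extract_job_id
```

In a real test you would pipe the submission itself, for example `jobsub_submit -G nova file:///usr/bin/sleep 300 | extract_job_id`. If nothing is printed, check the raw output rather than assuming the submission failed.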

 

Training sessions for jobsub_lite 

We plan to hold four more jobsub_lite training sessions in January 2023, one session per week. Two sessions will be held before Jan. 18 and two after Jan. 18. (The first training session was held during a FIFE meeting in December 2022.) Please stay tuned. As soon as we finalize the training dates, we will communicate them to users.

 

If you have any questions, please open a Service Desk ticket to be routed to the Distributed Computing Support Group, and we will be happy to answer them.