offline_processing.py
This script adds jobs to the processing database and is intended to run continuously during data taking. It checks the data-flow database and, once a run is complete (that is, once a new run is sensed), adds jobs to the data-processing database according to the run type stored in the RATDB RUN table.
Example usage (for now, recommended via a screen session):
usage: offline_processing.py [-h]
                             (-r RUN | -b START_RUN END_RUN | -R START_RUN END_RUN | -A RUN | -1 RUN [RUN ...] | -L FILE)
                             [-t RATDB_TAG] [-N MODULES] [-o FILE] [-s] [-f]
                             [--force-prereq-pass MODULE PASSNO]
                             [--testing-nocal] [--testing-nodb]
                             config version filetype

Submits reprocessing jobs, and loops to sense new runs and submit jobs
automatically.

positional arguments:
  config                Specify path to config file
  version               Provide a RAT version
  filetype              L1 or L2 input files?

optional arguments:
  -h, --help            show this help message and exit
  -r RUN, --start-run RUN
                        Starts a processing loop at the last run greater than
                        this argument not yet processed.
  -b START_RUN END_RUN, --block-run START_RUN END_RUN
                        Starts a processing loop over a block of runs without
                        advancing past the end.
  -R START_RUN END_RUN, --reprocess-range START_RUN END_RUN
                        Reprocesses a range of runs (inclusive). This mode
                        does not loop.
  -A RUN, --reprocess-after RUN
                        Reprocesses all runs after the specified run
                        (inclusive). This mode does not loop.
  -1 RUN [RUN ...], --one-shot RUN [RUN ...]
                        Reprocesses a specific list of runs given after this
                        argument. This mode does not loop.
  -L FILE, --runlist FILE
                        Reprocesses a specific list of runs listed in the file
                        specified. Format is run numbers separated by
                        newlines. This mode does not loop.
  -t RATDB_TAG, --ratdb-tag RATDB_TAG
                        Use a tagged RATDB state instead of the live version.
  -N MODULES, --nearline MODULES
                        Run a specific (set of) module(s) instead of the
                        default list. Make sure you know what you're doing!
  -o FILE, --outlist FILE
                        File to append the created passes to. Useful to track
                        which passes are created for reprocessing runlists
  -s, --skip-ratdb-check
                        Do not check for ratdb tables before submitting runs
                        (useful when producing tables)
  -f, --find-prereq     Find the prerequisite. Required to run a module without
                        also running the prerequisite in this submission.
                        Optionally takes one parameter (ratv) to find a
                        prerequisite with a different ratv.
  --force-prereq-pass MODULE PASSNO
                        Force a prereq to use a particular pass instead
  --testing-nocal       Test mode, force ignore of calibration locks
  --testing-nodb        Test mode, no database updates
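
As a rough illustration of how the usage text fits together, the following Python sketch declares the three positional arguments and the run-selection options with argparse. It is not the script's actual source: the names build_parser and loops, the integer types, and the restriction of filetype to L1 or L2 are assumptions made for this example, and the remaining options are left out.

```python
import argparse


def build_parser():
    """Illustrative parser mirroring the usage text (hypothetical, not the real source)."""
    parser = argparse.ArgumentParser(prog="offline_processing.py")
    parser.add_argument("config", help="Specify path to config file")
    parser.add_argument("version", help="Provide a RAT version")
    # Assumption: only the two values named in the help text are accepted.
    parser.add_argument("filetype", choices=["L1", "L2"], help="L1 or L2 input files?")

    # Exactly one run-selection option must be given.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("-r", "--start-run", type=int, metavar="RUN")
    mode.add_argument("-b", "--block-run", type=int, nargs=2,
                      metavar=("START_RUN", "END_RUN"))
    mode.add_argument("-R", "--reprocess-range", type=int, nargs=2,
                      metavar=("START_RUN", "END_RUN"))
    mode.add_argument("-A", "--reprocess-after", type=int, metavar="RUN")
    mode.add_argument("-1", "--one-shot", type=int, nargs="+", metavar="RUN")
    mode.add_argument("-L", "--runlist", metavar="FILE")
    return parser


def loops(args):
    """True for the looping options (-r, -b); the reprocessing options run once."""
    return args.start_run is not None or args.block_run is not None
```

In this sketch the positional arguments are given before -1, because an argparse option declared with nargs="+" consumes every non-option argument that follows it.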
This script has two main running modes: offline processing of detector data and reprocessing of detector data.
Offline processing is selected with the --start-run (-r) or --block-run (-b) argument. In this mode the script senses runs it has not yet considered, either after the given run or between the given pair of runs. Once a new run is identified, its job documents are created according to the processing information available for the specified RAT version. This mode loops indefinitely, continuing to sense new runs; with --block-run it does not advance past the end run.
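
The run-sensing step can be pictured with a small Python sketch. It rests on assumptions not stated in this document: the function and parameter names are hypothetical, the run bounds are treated as inclusive, and the data-flow database is replaced by a plain list of run numbers. The one rule taken from the description is that a run counts as complete only once a later run has been seen.

```python
def sense_new_runs(seen_runs, considered, start_run, end_run=None):
    """Return runs that are complete, within bounds, and not yet considered.

    A run is treated as complete once a later run has been seen, so the
    highest run number in seen_runs is always held back.
    """
    if not seen_runs:
        return []
    latest = max(seen_runs)
    new_runs = []
    for run in sorted(set(seen_runs)):
        if run == latest:
            continue  # no later run seen yet, so this one may still be in progress
        if run < start_run:
            continue  # before the requested start (assumed inclusive)
        if end_run is not None and run > end_run:
            continue  # block mode: never advance past the end run
        if run in considered:
            continue  # jobs for this run were already created
        new_runs.append(run)
    return new_runs
```

A looping driver would call such a function periodically, create the job documents for each returned run, and add those runs to the considered set.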
Reprocessing is selected with the --runlist (-L), --reprocess-range (-R), --reprocess-after (-A), or --one-shot (-1) argument. These take, respectively, a file containing run numbers separated by newlines, an inclusive range of runs, a start run (inclusive), or a list of runs, and create a new pass for each run using the specified RAT version. These modes do not loop, since there are no new runs to sense.
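
To make the four reprocessing inputs concrete, here is a minimal Python sketch of how each could be turned into a list of run numbers. The function names, the mode strings, and the known_runs parameter (a stand-in for whatever source lists existing runs) are hypothetical. Skipping blank lines in the runlist is also an assumption.

```python
def parse_runlist(text):
    """Parse run numbers separated by newlines, skipping blank lines."""
    return [int(line) for line in text.splitlines() if line.strip()]


def runs_to_reprocess(mode, value, known_runs=()):
    """Resolve a reprocessing argument into an explicit list of runs."""
    if mode == "runlist":
        return parse_runlist(value)  # value: contents of the runlist file
    if mode == "reprocess-range":
        start, end = value
        return list(range(start, end + 1))  # both ends inclusive
    if mode == "reprocess-after":
        # Start run inclusive; needs a list of runs that actually exist.
        return sorted(run for run in known_runs if run >= value)
    if mode == "one-shot":
        return list(value)
    raise ValueError("unknown reprocessing mode: %s" % mode)
```

For example, a runlist file containing 100, 105 and 110 on separate lines resolves to the same three runs as passing them directly after --one-shot.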
Particular modules can be run instead of the default modules chosen by run type. To do this, specify them with --nearline (-N).
The --outlist (-o) argument specifies a file to which the newly created passes are appended. This is especially useful when reprocessing, as it records which passes were created for each run (and module). The processing_list program in rat-tools/GridTools can take this file as input with its --exactly argument.
If a major reprocessing is needed, there is also the option of processing against a frozen RATDB state instead of the live version, by giving a tag with --ratdb-tag (-t).
Currently, if a calibration lock is detected, or is set by the script because a PCA/ECA calibration run type is detected, the script processes no later runs until the calibration lock is released and the validto field is set for that calibration type. This is the responsibility of the PMT calibrations group.
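
The effect of a calibration lock on a batch of runs can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name, the is_calibration callback and the lock_held flag are hypothetical, and the sketch assumes that the calibration run itself is still submitted while every later run waits.

```python
def submittable_runs(runs, is_calibration, lock_held=False):
    """Return the runs that may be submitted now, in ascending order.

    runs           -- candidate run numbers, all later than any run that set a lock
    is_calibration -- callable returning True for PCA/ECA calibration runs
    lock_held      -- True if a calibration lock is already in place
    """
    if lock_held:
        return []  # nothing proceeds until the lock is released
    allowed = []
    for run in sorted(runs):
        allowed.append(run)
        if is_calibration(run):
            break  # assumption: the lock is set here, so later runs wait
    return allowed
```

Once the lock is released and validto is set, the held runs become candidates again on the next pass.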