site_pipeline
The site_pipeline submodule contains programs supporting quick-turnaround data processing at the observatory.
Note
Documentation of interfaces is currently held separately. The focus here is on operation: command line parameters and config file syntax.
Command line interface
Usage
To execute a pipeline element from the command line, use the
so-site-pipeline command. For example, make-source-flags can
be invoked as:
so-site-pipeline make-source-flags [options]
To configure tab-completion of element names, in bash, run:
eval `so-site-pipeline --bash-completion`
Wrapping a pipeline script
In order to plug in nicely to the command line wrapper, element
submodules should expose functions called main() and
get_parser().
The get_parser() function should look like this:
import argparse

def get_parser(parser=None):
    if parser is None:
        parser = argparse.ArgumentParser()
    # element-specific args:
    parser.add_argument('obs_id', help="The obs_id to analyze.")
    parser.add_argument('--config-file', help="Config file.")
    return parser
When called by the so-site-pipeline wrapper, a parser will be
passed in.
The main() function is the entry point to be called from the CLI
or from Prefect. The arguments should include the arguments defined
through the ArgumentParser, as well as any additional support for
Prefect (such as a logger argument). For example:
def main(obs_id=None, config_file=None, logger=None):
    ...
(Alternately a special entry point, cli_main, can be defined and that will be used instead.)
If you want the submodule to be executable directly as a script (or
through python -m), add a __main__ handling block like this
one:
if __name__ == '__main__':
    from sotodlib.site_pipeline.utils.pipeline import main_launcher
    main_launcher(main, get_parser)
To register a properly organized submodule in the so-site-pipeline
command line wrapper, edit cli.py and see comments inline.
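To make the interplay between get_parser() and main() concrete, here is a minimal sketch of how a wrapper could dispatch to an element. It is not the actual cli.py implementation: the ELEMENTS registry and the dispatch() helper are hypothetical names introduced only for illustration, and main() is a placeholder body.

```python
import argparse


def get_parser(parser=None):
    # Same shape as the get_parser() described above.
    if parser is None:
        parser = argparse.ArgumentParser()
    parser.add_argument('obs_id', help="The obs_id to analyze.")
    parser.add_argument('--config-file', help="Config file.")
    return parser


def main(obs_id=None, config_file=None, logger=None):
    # Placeholder element body; a real element would do its processing here.
    return f"analyzing {obs_id} with {config_file}"


# Hypothetical registry mapping element names to (get_parser, main).
ELEMENTS = {'make-source-flags': (get_parser, main)}


def dispatch(argv):
    """Build one sub-parser per element, parse argv, and call the element."""
    top = argparse.ArgumentParser(prog='so-site-pipeline')
    sub = top.add_subparsers(dest='_element', required=True)
    for name, (make_parser, _) in ELEMENTS.items():
        # The wrapper creates the sub-parser and passes it in to be populated.
        make_parser(sub.add_parser(name))
    args = vars(top.parse_args(argv))
    _, entry = ELEMENTS[args.pop('_element')]
    # argparse turns --config-file into the keyword config_file,
    # which is why main() uses that argument name.
    return entry(**args)
```

Calling dispatch(['make-source-flags', 'obs_123', '--config-file', 'c.yaml']) invokes main(obs_id='obs_123', config_file='c.yaml'). Keeping the parser destinations identical to the keyword names of main() is what lets the same function be called from the CLI or from Prefect.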
Pipeline Elements
update-g3tsmurf-db
This script is set up to create and maintain g3tsmurf databases. See details here.
update-book-plan
This script is designed to help with the bookbinding. It will search a given Level 2 G3tSmurf database for observations that overlap in time. The different optional arguments will let us pass information from something like the sorunlib database to further filter the observations.
check-book
For a description and documentation of the config file format, see
sotodlib.site_pipeline.check_book module autodocumentation below.
Command line arguments
Scan an “obs” or “oper” book and check for schema compliance; optionally update an obsfiledb.
usage: check-book [-h] [--config CONFIG] [--add] [--overwrite] [--verbose]
book_dir
Positional Arguments
- book_dir
Path to the Book.
Named Arguments
- --config, -c
Path to config file with work-arounds and ObsFileDb config.
- --add
After inspecting the book, add it to the ObsFileDb.
Default:
False
- --overwrite
If adding to ObsFileDb, remove existing references to this obs first (prevents ‘UNIQUE constraint’ error).
Default:
False
- --verbose, -v
Default:
False
Module documentation
check_book.py
This module is an entry point to io.check_book, for checking obs/oper Books for internal consistency and proper schema. It may also be used to create/update an ObsFileDb for such Books.
A configuration file can be used to set the ObsFileDb filename and root path for ObsFileDb entries.
The config file can also be used to enable work-arounds and bypass certain exceptions (which should not be necessary on compliant books.)
At the time of this writing a minimal config file might be simply:
# Database setup (this is the default).
obsfiledb: './obsfiledb.sqlite'
# For obsdb filenames, path relative to which those names should
# be specified. (/ is the default.)
root_path: '/'
# Work-arounds
extra_extra_files: ['frame_splits.txt']
But here is a more complete example, with lots of work-arounds enabled:
# Database setup (this is the default).
obsfiledb: './obsfiledb.sqlite'
# For obsdb filenames, path relative to which those names should
# be specified. (/ is the default.)
root_path: '/'
# Work-arounds
stream_file_pattern: 'D_obs_{stream_id}_{index:03d}.g3'
extra_extra_files: ['frame_splits.txt']
sample_range_inclusive_hack: True
tolerate_missing_ancil: True
tolerate_missing_ancil_timestamps: True
tolerate_timestamps_value_discrepancy: False
# Tolerate arbitrary extra files, except explicitly named ones
tolerate_stray_files: True
banned_files: ['frame_splits.txt']
# If stream_ids are not provided in metadata, list them here.
stream_ids:
  ufm_mv14
  ufm_mv18
  ufm_mv19
  ufm_mv22
  ufm_mv6
  ufm_mv7
  ufm_mv9
# If detset names are not provided in metadata, provide a map from
# stream_id to detset name here.
detset_map:
  ufm_mv14: sch_mv14
  ufm_mv18: sch_mv18
  ufm_mv19: sch_mv19
  ufm_mv22: sch_mv22
  ufm_mv6: sch_mv6
  ufm_mv7: sch_mv7
  ufm_mv9: sch_mv9
- sotodlib.site_pipeline.check_book.scan_book_dir(book_dir, logger, config, prep_obsfiledb=False)[source]
Run the BookScanner on book_dir.
- Returns:
- bool
True only if the book passed the checks.
- obsfiledb_infodict or None
If prep_obsfiledb, and ok, then this dict has the info needed for updating obsfiledb (see add_to_obsfiledb).
- Return type:
ok
(Note this function is used by update-obsdb, as well.)
- sotodlib.site_pipeline.check_book.add_to_obsfiledb(info, logger, config, overwrite=False)[source]
Add observation info to the obsfiledb specified in config. The “info” dict is the one returned by scan_book_dir. If overwrite, then file entries in the obsfiledb, for this obs_id, will be replaced.
(Note this function is used by update-obsdb, as well.)
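The two functions above can be chained in the way the --add option describes. The following is a minimal sketch, not the module's own code: check_and_add is a hypothetical helper, the scan and add callables stand in for scan_book_dir and add_to_obsfiledb so that the sketch runs without sotodlib, and it assumes the two documented return values come back as a tuple (ok, info).

```python
import logging


def check_and_add(book_dir, config, scan, add, overwrite=False):
    """Scan a book and, only if it passes, add it to the obsfiledb.

    `scan` and `add` are expected to have the signatures documented for
    scan_book_dir and add_to_obsfiledb; `config` is whatever configuration
    object those functions expect.
    """
    logger = logging.getLogger('check_book_sketch')
    # prep_obsfiledb=True asks the scan to also return the info dict.
    ok, info = scan(book_dir, logger, config, prep_obsfiledb=True)
    if not ok:
        logger.error("Book %s failed the checks; not adding.", book_dir)
        return False
    add(info, logger, config, overwrite=overwrite)
    return True
```

With sotodlib installed, the callables would be sotodlib.site_pipeline.check_book.scan_book_dir and sotodlib.site_pipeline.check_book.add_to_obsfiledb.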
update-obsdb
For a description and documentation of the config file format, see
sotodlib.site_pipeline.update_obsdb module autodocumentation below.
Command line arguments
usage: update-obsdb [-h] --config CONFIG [--recency RECENCY]
[--verbosity VERBOSITY] [--booktype BOOKTYPE]
[--overwrite] [--fastwalk] [--limit LIMIT]
[--filter FILTER]
Named Arguments
- --config
ObsDb, ObsfileDb configuration file
- --recency
Days to subtract from now to set as minimum ctime. If None, no minimum
- --verbosity
Increase output verbosity. 0:Error, 1:Warning, 2:Info(default), 3:Debug
Default:
2
- --booktype
Select book type to look for: obs, oper, both(default)
Default:
'both'
- --overwrite
If true, writes over existing entries
Default:
False
- --fastwalk
Assume known directory tree shape and speed up walkthrough
Default:
False
- --limit
Limit processing to only this number of book candidates.
- --filter
Limit processing to books (full path) matching this fnmatch wildcard string.
Module documentation
update_obsdb.py
Create and/or update an obsdb and obsfiledb based on some books. The config file could be of the form:
base_dir: path_to_base_directories. Can be a list or a single string.
obsdb_cols:
  start_time: float
  stop_time: float
  n_samples: int
  telescope: str
  tube_slot: str
  type: str
  subtype: str
obsdb: dummyobsdb.sqlite
obsfiledb: dummyobsfiledb.sqlite
lat_tube_list_file: path to yaml dict matching tubes and bands
tolerate_stray_files: True
known_bad_books_file: path to file listing bad books (one obs_id per line)
extra_extra_files:
- Z_bookbinder_log.txt
extra_files:
- M_index.yaml
- M_book.yaml
- sotodlib.site_pipeline.update_obsdb.telescope_lookup(telescope: str)[source]
Set a number of common queries given a telescope name
- Parameters:
telescope (str) – Name of telescope in M_index
- sotodlib.site_pipeline.update_obsdb.main(config: str, recency: float | None = None, booktype: str | None = 'both', verbosity: int | None = 2, overwrite: bool | None = False, fastwalk: bool | None = False, limit: int | None = None, filter: str | None = None)[source]
Create or update an obsdb for observation or operations data.
- Parameters:
config (str) – Path to config file
recency (float) – How far back in time to look for databases, in days. If None, goes back to the UNIX start date (default: None)
booktype (str) – Look for observations or operations data or both (default: both)
verbosity (int) – Output verbosity. 0:Error, 1:Warning, 2:Info(default), 3:Debug
overwrite (bool) – if False, do not re-check existing entries
fastwalk (bool) – if True, assume the directories have a structure /base_dir/obs|oper/d{5}/… Then replace base_dir with only the directories where d{5} is greater or equal to recency.
limit (int) – Limit the number of books to process; helpful for debugging. If None or 0, there is no limit.
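The recency and fastwalk parameters can be illustrated with a short sketch. This is not the code used by update_obsdb: the helper names are hypothetical, and it assumes that the five-digit directory name d{5} is the leading five digits of a ten-digit unix timestamp.

```python
import time

SECONDS_PER_DAY = 86400


def min_ctime_from_recency(recency, now=None):
    """Convert a recency in days to a minimum ctime (0 means no minimum)."""
    if recency is None:
        return 0.0  # go back to the UNIX start date
    if now is None:
        now = time.time()
    return now - recency * SECONDS_PER_DAY


def five_digit_code(ctime):
    # Assumption: directory names are the leading five digits of the ctime.
    return int(ctime) // 100000


def select_dirs(dir_names, min_ctime):
    """Keep only the d{5} directories at or after the minimum ctime."""
    threshold = five_digit_code(min_ctime)
    return [d for d in dir_names if int(d) >= threshold]
```

For example, with min_ctime = 1759512345 the threshold code is 17595, so a directory named 17594 would be skipped during the walk.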
update-obsdb-ancil
This script may be used to update ancillary data archives and then to
update an ObsDb with reduced statistics from those archives. The
script requires a configuration file, which includes a list of
“datasets”; each dataset corresponds to some ancillary data source and
a method for reducing that data to be summarized in ObsDb columns. Each dataset is mapped to a
class in the sotodlib.io.ancil module; the code there defines
how the data are to be obtained, stored, and reduced.
The main processing commands are update-base-data and
update-obsdb. For each of these, the caller can specify which
configured datasets they want to run on, and what time range to
consider for the update.
For example, to perform a bulk update of all ancillary data archives, considering data from the past week, run:
so-site-pipeline update-obsdb-ancil \
update-base-data -c ancil.yaml --lookback-days=7
To then update the specified obsdb, run:
so-site-pipeline update-obsdb-ancil \
update-obsdb -c ancil.yaml --lookback-days=7
In automation (e.g. from cronjob) users will most likely use the
run-job command, which can run a pre-configured series of
operations (different combinations of update-base-data and
update-obsdb). That might look like:
so-site-pipeline update-obsdb-ancil \
run-job -c ancil.yaml standard_7day_update
This script complements the basic Book checking and indexing done by
site_pipeline.update_obsdb. To import records from some upstream
obsdb into the “target_obsdb”, use the import-records command
(this requires upstream_obsdb to be set in the config file). To
trigger a run of the site_pipeline.update_obsdb script, the
update-books command can be used. Note this command should be
provided with a config file suitable for
site_pipeline.update_obsdb. The CLI arguments config_file,
lookback_days, and redo are translated to config,
recency and overwrite, respectively.
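The argument translation described above amounts to a simple renaming. A minimal sketch follows; the function name is hypothetical and this is not the script's own code.

```python
# Names used by update-obsdb-ancil -> names used by site_pipeline.update_obsdb
ARG_MAP = {
    'config_file': 'config',
    'lookback_days': 'recency',
    'redo': 'overwrite',
}


def translate_update_books_args(cli_args):
    """Rename the update-books arguments for site_pipeline.update_obsdb."""
    return {ARG_MAP[k]: v for k, v in cli_args.items() if k in ARG_MAP}
```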
For testing there are check and test commands. The check
command simply instantiates an engine for each dataset – basically
checking for configuration problems and summarizing some easily
determined state. E.g.:
>>> so-site-pipeline update-obsdb-ancil \
check --dataset apex-pwv
Loading config file ancil.yaml ...
dataset=apex-pwv
output_dir: apex/
files_found: 33
The test command can be used to load data and run a reduction
operation, targeting a specific obs in the ObsDb. For example:
>>> so-site-pipeline update-obsdb-ancil \
test --dataset apex-pwv \
--query "type == 'obs' and timestamp > 1760100000"
apex-pwv
obs_1760103123_satp1_1111111 {'mean': 0.48, 'start': 0.44, 'end': 0.51, 'span': 0.13}
obs_1760106068_satp1_1111111 {'mean': 0.6, 'start': 0.58, 'end': 0.82, 'span': 0.31}
obs_1760108443_satp1_1111111 {'mean': 0.78, 'start': 0.78, 'end': 0.78, 'span': 0.0}
...
Or with a specific obs_id:
>>> so-site-pipeline update-obsdb-ancil \
test obs_1760168230_satp1_1111111
apex-pwv
obs_1760168230_satp1_1111111 {'mean': 0.56, 'start': 0.57, 'end': 0.55, 'span': 0.06}
toco-pwv
obs_1760168230_satp1_1111111 {'mean': 0.439, 'start': 0.459, 'end': 0.463, 'span': 0.091}
pwv-combo
obs_1760168230_satp1_1111111 {'mean': 0.44, 'std': 0.018, 'qual': 1.0}
weather-station
obs_1760168230_satp1_1111111 {'wind_speed': 1.2, 'wind_dir': 228.7, 'uv': nan, 'ambient_temp': -2.83}
...
Configuration
The configuration file is yaml. In addition to some general settings
at root level, two large sub-blocks are used to define the data
archives (datasets) and the different job definitions
(job_defs).
In the datasets block, each entry will be parsed as the
configuration for one of the classes registered in
sotodlib.io.ancil.ANCIL_ENGINES; find links to each
engine’s configuration dataclass in the io.ancil Engines
section. Note that the key of each entry is
normally used to find the engine in ANCIL_ENGINES, but you can call
the datasets entries whatever you want as long as the class is
defined in the block (i.e. instead of "apex-pwv": {...} you could
have "my-weird-name": {"class": "apex-pwv", ...}).
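The lookup rule for dataset entries can be sketched as follows. This is an illustration of the rule as described, not the code in sotodlib; engine_name is a hypothetical helper.

```python
def engine_name(entry_key, entry_cfg):
    """Return the engine name for a datasets entry.

    An explicit "class" setting wins; otherwise the entry key itself
    is used as the name to look up among the registered engines.
    """
    return entry_cfg.get('class', entry_key)


datasets = {
    'apex-pwv': {'data_dir': 'apex/'},
    'my-weird-name': {'class': 'apex-pwv', 'data_dir': 'apex2/'},
}
names = {key: engine_name(key, cfg) for key, cfg in datasets.items()}
```

Here both entries resolve to the apex-pwv engine, even though only the first one is named after it.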
Here’s an annotated example:
# ObsDb to use for update-obsdb jobs
target_obsdb: /so/metadata/obsdb.sqlite
# ObsDb to use as a source for "import-records" (optional; will not
# be modified).
upstream_obsdb: /so/metadata/obsdb0.sqlite
# This (optionally) sets the default data_prefix used for base data
# in all datasets (can be overridden by setting data_prefix in a
# dataset config block).
data_prefix: /data/ancil
# Datasets -- the entry names are looked up in
# the io.ancil registered engines.
datasets:
  apex-pwv:
    data_dir: apex/
  toco-pwv:
    data_dir: toco/
    hkdb_config: "/db_configs/hkdb-site.cfg"
  pwv-combo: {}
# job_defs -- a list of named jobs; this is like
# a shortcut for running common operations.
job_defs:
  - name: basic
    steps:
      - command: "update-base-data"
        lookback_days: 7
        carry_fail: true
      - command: "update-obsdb"
        lookback_days: 3
        redo: true
      - command: "update-obsdb"
        lookback_days: 7
Each block in the job_defs list will be run, in sequence. The dict is passed through to the command being run unaltered, except as noted:
- config_file, if not specified explicitly, will be set to the path of this file.
- command provides the name of the sub-command to run (as if it had been specified on command line).
- ignore_fail, if set to true, will cause the step to not block completion of subsequent steps if it fails; the script may exit successfully in the end.
- carry_fail, if set to true, will permit subsequent steps to run, if this step fails, but will cause the whole job to fail, in the end.
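The failure handling of job steps can be summarized in a short sketch. This is not the script's implementation: run_job and run_step are hypothetical names, failure is assumed to be signalled by an exception, and giving ignore_fail precedence when both flags are set is an assumption.

```python
def run_job(steps, run_step, config_path):
    """Run the steps in order; return True if the job as a whole succeeded.

    `run_step(command, **params)` is a caller-supplied function that
    raises an exception when the step fails.
    """
    job_failed = False
    for step in steps:
        params = dict(step)
        command = params.pop('command')
        ignore_fail = params.pop('ignore_fail', False)
        carry_fail = params.pop('carry_fail', False)
        # config_file defaults to the path of the job config file itself.
        params.setdefault('config_file', config_path)
        try:
            run_step(command, **params)
        except Exception:
            if ignore_fail:
                continue  # failure is forgotten; the job can still succeed
            job_failed = True
            if not carry_fail:
                break  # an ordinary failure stops the job here
            # carry_fail: keep going, but the job fails in the end
    return not job_failed
```

With the "basic" job from the annotated example, a failure in update-base-data would still let both update-obsdb steps run, but the job would be reported as failed.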
Command line arguments
usage: update-obsdb-ancil [-h] [--verbose]
{update-base-data,update-obsdb,update-books,import-records,check,test,run-job}
...
Named Arguments
- --verbose, -v
Pass one or more times to increase logging verbosity.
Commands
- command
Possible choices: update-base-data, update-obsdb, update-books, import-records, check, test, run-job
Sub-commands
update-base-data
Update the base data for an ancillary data archive. This will typically trigger loading of the data from the source, and then reduction and storage in the local archive.
update-obsdb-ancil update-base-data [-h] [--config-file CONFIG_FILE]
[--dataset DATASET]
[--time-range TIME_RANGE TIME_RANGE]
[--lookback-days LOOKBACK_DAYS]
[--full-scan]
Named Arguments
- --config-file, -c
Path to config file.
Default:
'ancil.yaml'
- --dataset, -d
Limit processing to only the specified dataset. If passed multiple times, the chosen datasets are processed in the corresponding order.
- --time-range
Restrict processing to items falling between the two provided unix timestamps.
- --lookback-days
As an alternative to --time-range, restrict processing to items falling between the present time and the specified number of days in the past.
- --full-scan
Default:
False
update-obsdb
Update obsdb records based on data already in the archives. This is typically run after refreshing the base data with update-base-data.
update-obsdb-ancil update-obsdb [-h] [--config-file CONFIG_FILE]
[--dataset DATASET]
[--time-range TIME_RANGE TIME_RANGE]
[--lookback-days LOOKBACK_DAYS] [--redo]
Named Arguments
- --config-file, -c
Path to config file.
Default:
'ancil.yaml'
- --dataset, -d
Limit processing to only the specified dataset. If passed multiple times, the chosen datasets are processed in the corresponding order.
- --time-range
Restrict processing to items falling between the two provided unix timestamps.
- --lookback-days
As an alternative to --time-range, restrict processing to items falling between the present time and the specified number of days in the past.
- --redo
Default:
False
update-books
Update obsdb and obsfiledb by scanning data directories for new books. This calls site_pipeline.update_obsdb.
update-obsdb-ancil update-books [-h] [--config-file CONFIG_FILE]
[--lookback-days LOOKBACK_DAYS] [--redo]
Named Arguments
- --config-file, -c
Path to config file.
Default:
'ancil.yaml'
- --lookback-days
As an alternative to --time-range, restrict processing to items falling between the present time and the specified number of days in the past.
- --redo
Default:
False
import-records
Prepare the target obsdb for updates by pulling in rows from a primary “upstream” obsdb.
update-obsdb-ancil import-records [-h] [--config-file CONFIG_FILE]
Named Arguments
- --config-file, -c
Path to config file.
Default:
'ancil.yaml'
check
Read the config file and test datasets by attempting to instantiate them. This should not normally touch the underlying data or attempt to load new data from base sources.
update-obsdb-ancil check [-h] [--config-file CONFIG_FILE] [--dataset DATASET]
[--time-range TIME_RANGE TIME_RANGE]
[--lookback-days LOOKBACK_DAYS]
Named Arguments
- --config-file, -c
Path to config file.
Default:
'ancil.yaml'
- --dataset, -d
Limit processing to only the specified dataset. If passed multiple times, the chosen datasets are processed in the corresponding order.
- --time-range
Restrict processing to items falling between the two provided unix timestamps.
- --lookback-days
As an alternative to --time-range, restrict processing to items falling between the present time and the specified number of days in the past.
test
Perform the obsdb data reduction on a specified obs_id. Results are not saved to obsdb.
update-obsdb-ancil test [-h] [--config-file CONFIG_FILE] [--dataset DATASET]
[--query QUERY] [--compare]
[obs_id ...]
Positional Arguments
- obs_id
obs_id (pass multiple) to compute results for.
Named Arguments
- --config-file, -c
Path to config file.
Default:
'ancil.yaml'
- --dataset, -d
Limit processing to only the specified dataset. If passed multiple times, the chosen datasets are processed in the corresponding order.
- --query, -q
Use this obsdb query to select records for computation, instead of passing specific obs_id.
- --compare
If passed, the computed values will be compared to values already present in the obsdb.
Default:
False
run-job
Run a job defined in the config file.
update-obsdb-ancil run-job [-h] [--config-file CONFIG_FILE] job_name
Positional Arguments
- job_name
Name of job to run (matched to entry in config[‘job_defs’]).
Named Arguments
- --config-file, -c
Path to config file.
Default:
'ancil.yaml'
Module documentation
update-obsdb-ancil
This script uses io.ancil modules to update archives of reduced ancillary data and to update an obsdb with quantities computed based on those ancillary data.
The entry point is main() but the different command functions
can be called directly.
- sotodlib.site_pipeline.update_obsdb_ancil.update_base_data(config_file, time_range=None, datasets=None, full_scan=False, verbose=None)[source]
- sotodlib.site_pipeline.update_obsdb_ancil.update_obsdb(config_file, time_range=None, datasets=None, redo=False, verbose=None)[source]
- sotodlib.site_pipeline.update_obsdb_ancil.main(command: str | None = None, verbose: bool | None = None, config_file: str | None = None, time_range: list | None = None, dataset: list | None = None, lookback_days: float | None = None, query: str | None = None, obs_id: list | None = None, full_scan: bool | None = None, redo: bool | None = None, compare: bool | None = None, job_name: str | None = None, chain_count: int | None = 0)[source]
Entry point for CLI or job runner. Some parameters are only used by some commands.
update-smurf-caldbs
This update script is used to add detset and calibration metadata to manifest dbs.
Module Docs
Script to import tuning and readout id channel mapping, and detector calibration information into manifest dbs for book loading. At present this just works in the configuration where it has access to both level 2 and level 3 indexing. This is technically possible with just level 3 data / indexing but requires some tools that do not exist yet.
Configuration file required:
config = {
    'archive': {
        'detset': {
            'root_dir': '/path/to/detset/root',
            'index': 'detset.sqlite',
            'h5file': 'detset.h5',
            'context': 'context.yaml',
            'write_relpath': True
        },
        'det_cal': {
            'root_dir': '/path/to/det_cal/root',
            'index': 'det_cal.sqlite',
            'h5file': 'det_cal.h5',
            'context': 'context.yaml',
            'failed_obsid_cache': 'failed_obsids.yaml',
            'write_relpath': True
        },
    },
    'g3tsmurf': 'g3tsmurf_hwp_config.yaml',
    'imprinter': 'imprinter.yaml',
}
The calibration info described below is used to populate the calibration db. For more information on how calibration info is computed in sodetlib, check out the following docs and source code:
- class sotodlib.site_pipeline.update_smurf_caldbs.CalInfo(readout_id: str = '', r_tes: float = nan, r_frac: float = nan, p_bias: float = nan, s_i: float = nan, phase_to_pW: float = nan, v_bias: float = nan, tau_eff: float = nan, bg: int = -1, polarity: int = 1, r_n: float = nan, p_sat: float = nan)[source]
Class that contains detector calibration information that will go into the caldb.
- r_tes
Detector resistance [ohms], determined through bias steps while the detector is biased
- Type: float
- phase_to_pW
Phase to power conversion factor [pW/rad] computed using s_i, pA_per_phi0, and detector polarity
- Type: float
- tau_eff
Effective thermal time constant [sec] of the detector, measured from bias steps
- Type: float
- bg
Bias group of the detector. Taken from IV curve data, which contains bgmap data taken immediately prior to IV. This will be -1 if the detector is unassigned
- Type: int
- polarity
Polarity of the detector response for a positive change in bias current while the detector is superconducting. This is needed to correct for detectors that have reversed response.
- Type: int
Command line arguments
usage: update_smurf_caldbs.py [-h] [--config CONFIG] [--skip-detset]
[--skip-detcal] [--overwrite]
Named Arguments
- --config
configuration file
- --skip-detset
Skip detset update
Default:
False
- --skip-detcal
Skip detcal update
Default:
False
- --overwrite
Overwrite existing entries
Default:
False
update-det-cal
This module is used to compute detector calibration parameters from sodetlib data products.
The naive computation is described in the sodetlib documentation.
Details about the RP and loopgain correction can be found on our confluence.
CalInfo object
- class sotodlib.site_pipeline.update_det_cal.CalInfo(readout_id: str = '', r_tes: float = nan, r_frac: float = nan, p_bias: float = nan, s_i: float = nan, phase_to_pW: float = nan, v_bias: float = nan, tau_eff: float = nan, loopgain: float = nan, tes_param_correction_success: bool = False, bg: int = -1, polarity: int = 1, r_n: float = nan, p_sat: float = nan, naive_r_tes: float = nan, naive_r_frac: float = nan, naive_p_bias: float = nan, naive_s_i: float = nan, bandpass: str = 'NC')[source]
Class that contains detector calibration information that will go into the caldb.
- r_tes
Detector resistance [ohms], determined through bias steps while the detector is biased.
- Type: float
- phase_to_pW
Phase to power conversion factor [pW/rad] computed using s_i, pA_per_phi0, and detector polarity.
- Type: float
- v_bias
Commanded bias voltage [V] on the bias line of the detector for the observation.
- Type: float
- tau_eff
Effective thermal time constant [sec] of the detector, measured from bias steps.
- Type: float
- tes_param_correction_success
True if TES parameter corrections were successfully applied.
- Type: bool
- bg
Bias group of the detector. Taken from IV curve data, which contains bgmap data taken immediately prior to IV. This will be -1 if the detector is unassigned.
- Type: int
- polarity
Polarity of the detector response for a positive change in bias current while the detector is superconducting. This is needed to correct for detectors that have reversed response.
- Type: int
- p_sat
“saturation power” of the TES [W] calculated from IV curve data. This is defined as the electrical bias power at which the TES resistance is 90% of the normal resistance.
- Type: float
- naive_r_tes
Detector resistance [ohms]. This is based on the naive bias step estimation without any additional corrections.
- Type: float
- naive_r_frac
Fractional resistance of TES, given by r_tes / r_n. This is based on the naive bias step estimation without any additional corrections.
- Type: float
- naive_p_bias
Bias power on the TES [W] computed using bias steps at the bias point. This is based on the naive bias step estimation without any additional corrections.
- Type: float
- naive_s_i
Current responsivity of the TES [1/V] computed using bias steps at the bias point. This is based on the naive bias step estimation without using any additional corrections.
- Type: float
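Two of the definitions above, the fractional resistance (r_tes / r_n) and the saturation power (bias power where the TES resistance reaches 90% of the normal resistance), can be illustrated with a small sketch. This is not the sodetlib or sotodlib algorithm: the function names are hypothetical, the IV curve is assumed to be given as plain lists sorted by increasing resistance, and simple linear interpolation is used.

```python
import math


def r_frac(r_tes, r_n):
    """Fractional resistance of the TES."""
    return r_tes / r_n


def p_sat_from_iv(r, p, r_n, level=0.9):
    """Bias power at which the resistance equals level * r_n.

    r and p are equal-length sequences of TES resistance [ohms] and
    bias power [W] from an IV curve, with r increasing. Returns NaN
    if the curve never crosses the target resistance.
    """
    target = level * r_n
    for i in range(1, len(r)):
        r0, r1 = r[i - 1], r[i]
        if r0 <= target <= r1 and r1 != r0:
            # Linear interpolation between the two bracketing samples.
            frac = (target - r0) / (r1 - r0)
            return p[i - 1] + frac * (p[i] - p[i - 1])
    return math.nan
```

Returning NaN when no crossing is found mirrors the nan defaults of the CalInfo fields, so a missing value stays distinguishable from a measured one.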
Configuration
Configuration of the update_det_cal script is done by supplying a yaml file.
usage: update_smurf_caldbs.py [-h] config_file
Positional Arguments
- config_file
yaml file with configuration for update script.
The possible configuration parameters are defined by the DetCalCfg class:
- class sotodlib.site_pipeline.update_det_cal.DetCalCfg(root_dir: str, context_path: str, *, raise_exceptions: bool = False, fit_tau: bool = False, apply_cal_correction: bool = True, hwpss_subtraction: bool = False, metadata_list: str | List[str] = 'all', index_path: str = 'det_cal.sqlite', h5_path: str = 'det_cal.h5', h5_unix_digits: int = 0, cache_failed_obsids: bool = True, failed_cache_file: str = 'failed_obsids.yaml', show_pb: bool = True, param_correction_config: Dict[str, Any] | None | sodetlib.tes_param_correction.AnalysisCfg = None, run_method: str = 'site', nprocs_obs_info: int = 1, nprocs_result_set: int = 10, num_obs: int | None = None, log_level: str = 'DEBUG', multiprocess_start_method: Literal['spawn', 'fork'] = 'spawn')[source]
Class for configuring the behavior of the det-cal update script.
- Parameters:
root_dir (str) – Path to the root of the results directory.
context_path (str) – Path to the context file to use.
raise_exceptions (bool) – If Exceptions should be raised in the get_cal_resset function. Defaults to False.
fit_tau (bool) – If True, re-fit the biasstep tau. Defaults to False.
apply_cal_correction (bool) – If True, apply the RP calibration correction, and use corrected results for Rtes, Si, Pj, and loopgain when successful. Defaults to True.
hwpss_subtraction (bool) – If True, reanalyze biasstep with hwpss subtraction. Defaults to False.
metadata_list (str or List of str) – List of metadata labels to load. Defaults to ‘all’.
index_path (str) – Path to the index file to use for the det_cal database. Defaults to “det_cal.sqlite”.
h5_path (str) – Path to the HDF5 file to use for the det_cal database. Defaults to “det_cal.h5”.
h5_unix_digits (int) – Number of digits of unixtime to be added to h5_path. For example if h5_unix_digits = 4, h5_path will be modified to “det_cal_1700.h5”. Defaults to 0.
cache_failed_obsids (bool) – If True, will cache failed obs-ids to avoid re-running them. Defaults to True.
failed_cache_file (str) – Path to the yaml file that will store failed obsids. Defaults to “failed_obsids.yaml”.
show_pb (bool) – If True, show progress bar in the run_update function. Defaults to True.
param_correction_config (dict) – Configuration for the TES param correction. If None, default values are used.
run_method (str) – Must be “site” or “nersc”. If “site”, this function will not parallelize SQLite access, and will only parallelize the TES parameter correction. If “nersc”, this will parallelize both SQLite access and the TES param correction, using nprocs_obs_info and nprocs_result_set processes respectively.
nprocs_obs_info (int) – Number of processes to use to acquire observation info from the file system. Defaults to 1.
nprocs_result_set (int) – Number of parallel processes used to compute the TES parameters and to run the TES parameter correction.
num_obs (Optional[int]) – Max number of observations to process per run_update call. If not set, will run on all available observations.
log_level (str) – Logging level for the logger.
multiprocess_start_method (str) – Method to use to start child processes. Can be “spawn” or “fork”.
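The effect of h5_unix_digits on h5_path can be sketched as follows. This is an illustration consistent with the “det_cal_1700.h5” example in the parameter description, not the library's code; the helper name is hypothetical and it assumes the leading digits of the unix time are the ones used.

```python
import os


def h5_path_with_unix_digits(h5_path, unixtime, digits):
    """Insert the leading `digits` digits of unixtime before the extension."""
    if digits <= 0:
        return h5_path  # the default of 0 leaves the path unchanged
    stem, ext = os.path.splitext(h5_path)
    return f"{stem}_{str(int(unixtime))[:digits]}{ext}"
```

With four digits, all observations within the same 10^6 second span (about 11.6 days) would share one HDF5 file, which keeps individual files from growing without bound.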
Detector and Readout ID Mapping
These processes are interrelated and use a combination of the DetMap software package and sotodlib. The two scripts below are designed to use the same config files for simplicity and can be run with the level 2 G3tSmurf setup. The resulting ManifestDbs should work for both level 2 and level 3 SMuRF data.
make_det_info_wafer
This script uses basic array construction inputs to assemble a table
of information about a set of UFMs and save it to an HDF5 file. The
resulting datasets may be used to populate det_info.wafer, once
the det_id of the readout channels is known. The detector info
mapping created by this script is re-usable as long as the UFM
continues to be associated with the same stream_id.
Here is a basic configuration file:
output_dir: ./satp3_wafer_info_240305
array_info_dir: "/home/pipeline/site-pipeline-configs/shared/detmapping/design/"
stream_ids:
- ufm_mv5
- ufm_mv12
- ufm_mv17
- ufm_mv23
- ufm_mv27
- ufm_mv33
- ufm_mv35
Check the det_ids for sensibility… and if you need to force bandpass values, add a config file entry like this:
bandpass_remap:
90: 220
150: 280
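As a rough illustration of what the bandpass_remap entry expresses, the sketch below replaces listed bandpass values and leaves all others untouched. How the script applies the mapping internally is not documented here, so this is an assumption; remap_bandpass is a hypothetical helper and the keys are assumed to be integers as in the YAML above.

```python
def remap_bandpass(bandpass, remap):
    """Return the forced bandpass value, or the original if not remapped."""
    return remap.get(bandpass, bandpass)


bandpass_remap = {90: 220, 150: 280}
remapped = [remap_bandpass(bp, bandpass_remap) for bp in (90, 150, 30)]
```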
For LF arrays, the inputs and outputs are somewhat different and the
config should include mode: LF; e.g.:
output_dir: ./lat_wafer_info_260211
array_info_dir: "/home/pipeline/site-pipeline-configs/shared/detmapping/design/"
mode: LF
stream_ids:
- mfm_lot1
The output database wafer_info.sqlite and HDF5 file
wafer_info.h5 are written to the output_dir, which is created
if it does not exist.
get_brightsrc_pointing_part1
The two-part get_brightsrc_pointing_part{} script set will solve for the xieta
coordinates of detectors that observe a bright source during an observation.
It is a two part process that requires a map step and then a TOD step. The scripts require the settings and preprocessing config files described below.
For job submission and parallelization, see example NERSC slurm submission config at the end of this section.
The code will process all wafers unless otherwise specified.
It is recommended to run with parallel_job: True in the config files if analyzing
multiple wafers at once.
Otherwise, specify a wafer slot or restrict detectors in command line args to debug.
Command line arguments are defined by the get_parser function of the sotodlib.site_pipeline.get_brightsrc_pointing_part1 module.
There are options to include min_ctime and max_ctime arguments, which will process all observations in the time frame; this is not recommended unless severely restricting the detectors for debugging.
Generated results
The Step 1 map-based analysis scripts will generate the following outputs in the specified directory:
1. Single detector maps in
/results/single_det_maps/<obs_id>_<ws#>.hdf.
All single maps are packaged in a single hdf file, with detector readout_id as the keys in the h5py file.
2. Fitted xi-eta focal plane position results saved as ResultSet in
/path/to/results/map_based_results, as specified in the Step 1 config file. The script will append ‘force_zero_roll’ to the specified results_dir if it is True in the config file. Load the ResultSet with keyword ‘focal_plane’.
Contents:
ResultSet<[dets:readout_id, xi, eta, gamma, R2], N rows>
The Step 2 TOD-based analysis scripts will use the map-based results as a starting point
and then generate the finalized outputs in the specified directory:
1. Fitted xi-eta focal plane position results saved as ResultSet in
/path/to/results/tod_based_results, as specified in the Step 2 config file. The script will append ‘force_zero_roll’ to the specified results_dir if it is True in the config file. Load the ResultSet with keyword ‘focal_plane’. The median boresight values from the small time range during which the source was visible to each detector are included.
Contents:
ResultSet<[dets:readout_id, xi, eta, gamma, xi_err, eta_err, R2, redchi2, az, el, roll], N rows>
Configuration Files
The configuration files to be input as configs in the command line should
have the following arguments as well as any desired preprocessing steps.
Only processing steps that are agnostic of det-match can be used to do
initial analyses without formalized metadata.
The parameters in these examples could be used for SAT mid-freq moon observations.
Step 1 Config:
context_file: /path/to/context.yaml
query_tags: ['moon=1'] #(alternatively specify --sso_name in kwargs)
optics_config_fn: '/global/cfs/cdirs/sobs/users/elleshaw/process_brightsrc/ufm_to_fp.yaml'
single_det_maps_dir: /path/to/results/single_det_maps
results_dir: /path/to/results/map_based_results
parallel_job: True #For job submission. Parallel across wafers.
wafer_mask_det: 8. # (degrees) mask around detector to cut TOD when source too far away.
res_deg: 0.3
xieta_bs_offset: [0., 0.] #Good to input xieta offset in radians. (!!! for satp2)
save_normal_roll: False #false for SAT, true for LAT
save_force_zero_roll: True #true for SAT, false for LAT
hit_circle_r_deg: 7. # radial mask to decide which UFMs are hit by source and should be analyzed.
hit_time_threshold: 600 #seconds, if hit_time not met then UFM does not get analyzed.
process_pipe:
- name: 'detrend'
process:
count: 2000
method: 'linear'
- name: 'apodize'
process:
apodize_samps: 2000
- name: 'fourier_filter'
process:
signal_name: "signal"
wrap_name: null
filt_function: "low_pass_sine2"
trim_samps: null
filter_params:
cutoff: 1.9
width: 0.2
- name: 'fourier_filter'
process:
signal_name: "signal"
wrap_name: null
filt_function: "high_pass_sine2"
trim_samps: 2000
filter_params:
cutoff: 0.05
width: 0.1
Part 2 is the TOD-based step. Its config file should look like the following. The parameters in these examples are used for SAT mid-freq moon observations.
context_file: /path/to/context.yaml
query_tags: ['moon=1'] #(alternatively specify --sso_name in kwargs)
optics_config_fn: '/global/cfs/cdirs/sobs/users/elleshaw/process_brightsrc/ufm_to_fp.yaml'
fp_hdf_dir: /path/to/results/map_based_results # results_dir from the Step 1 config file.
# If force_zero_roll was True, then append _force_zero_roll to the end.
# Just make sure it matches where the results from Step 1 are.
result_dir: /path/to/results/tod_based_results #Where you want the final Step 2 results to show up.
parallel_job: True
force_zero_roll: True #Results will show up rotated in the xi-eta results as they are on the sky.
ds_factor: 40
mask_deg: 2.5 # (degrees) size for circular mask around SSO (helps exclude focal plane reflections too)
fit_func_name: 'gaussian2d_nonlin'
max_non_linear_order: 3 #Suggested to use 1 for jupiter or sso's
#that do not saturate.
fwhm_init_deg: 0.5 # (degrees) Lower for SATp2
error_estimation_method: 'force_one_redchi2'
flag_name_rms_calc: 'around_source'
flag_rms_calc_exclusive: False
process_pipe:
- name: 'detrend'
process:
count: 2000
method: 'linear'
- name: 'fourier_filter'
process:
signal_name: 'signal'
filt_function: 'iir_filter'
trim_samps: null
filter_params:
invert: True
- name: 'apodize'
process:
apodize_samps: 2000
- name: 'fourier_filter'
process:
signal_name: "signal"
wrap_name: null
filt_function: "low_pass_sine2"
trim_samps: null
filter_params:
cutoff: 1.9
width: 0.2
- name: 'source_flags'
source_flags_name: 'source_wide'
save: True
calc:
mask:
shape: circle
xyr: [0., 0., 5.0]
merge: True
max_pix: 10000000000
- name: 'source_flags'
source_flags_name: 'source_narrow'
save: True
calc:
mask:
shape: circle
xyr: [0., 0., 3.0]
merge: True
max_pix: 10000000000
- name: 'combine_flags'
process:
flag_labels: ['source_wide.moon', 'source_narrow.moon']
method: 'except'
total_flags_label: 'around_source'
- name: 'flag_turnarounds'
process:
truncate: True
- name: 'sub_polyf'
process:
method: 'legendre'
degree: 2
mask: 'around_source'
exclusive: False
Example NERSC slurm job submission config file
Submit the job submission file with the following commands:
For Step 1 map-based
sbatch submit_moon_job_script.sh <platform> <obs_id> 1 map
For Step 2 TOD based
sbatch submit_moon_job_script.sh <platform> <obs_id> 0 tod
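When many observations need to be submitted, the two commands above can be generated programmatically. The sketch below only builds the argument lists. The helper name and the example obs_id values are hypothetical, and the third positional argument ("1" for map, "0" for tod) is copied from the commands above without assuming what it means.

```python
import shlex

# Positional arguments copied from the two example commands.
STEP_ARGS = {"map": ("1", "map"), "tod": ("0", "tod")}


def build_sbatch_command(platform, obs_id, step, script="submit_moon_job_script.sh"):
    """Return the sbatch argument list for one observation and one step."""
    if step not in STEP_ARGS:
        raise ValueError(f"step must be one of {sorted(STEP_ARGS)}")
    return ["sbatch", script, platform, obs_id, *STEP_ARGS[step]]


# Hypothetical obs_ids, for illustration only.
for obs_id in ["obs_0000000001", "obs_0000000002"]:
    print(shlex.join(build_sbatch_command("satp1", obs_id, "map")))
```

Each list can be passed to `subprocess.run(cmd, check=True)` on a system where sbatch is available.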
get_brightsrc_pointing_part2
See Part 1 for description
make_read_det_match
This script generates the readout ID to detector ID mapping required to
translate between the detector hardware information (ex: pixel position) and the
readout IDs of the resonators used to index the SMuRF data. The script uses the
G3tSmurf database to generate a list of TuneSets (tune files) for a set of
arrays / stream ids and runs the DetMap mapping software to generate a mapping
between detectors and resonators. The saved metadata is formatted so that, with a
correctly formatted Context file, the detector ids can be automatically loaded into
the det_info AxisManager.
Config file format
Here’s an example configuration file. Many of these values depend on hardware
setup and readout software setup. Making the detector ID info only requires a
subset of these parameters but the processes are linked so it is probably worth
always having the same configuration file. Tested mapping strategies include
assignment and map_by_freq.
data_prefix : "/path/to/level2/data/"
g3tsmurf_db: "/path/to/g3tsmurf.db"
read_db: "/path/to/readout_2_detector_manifest.db"
read_info: "/path/to/readout_2_detector_hdf5.h5"
det_db : "/path/to/det_info/wafer/det_info_manifest.db"
det_info : "/path/to/det_info/wafer/det_info_hdf5.h5"
arrays:
# name must match DetMap array names
- name: "Cv4"
stream_id: "ufm_cv4"
# Based on hardware config
north_is_highband: False
dark_bias_lines: []
# how we want to call DetMap
mapping :
version : 0
strategy: "assignment"
# parameters for mapping strategy
params: {
"output_parent_dir":"/writable/path/",
"do_csv_output": False,
"verbose": False,
"save_layout_plot": False,
"show_layout_plot": False,
}
Context file format
To load these metadata with context, these entries must be part of the context
file. Since the detector hardware information loads off of the det_id field,
which is loaded from the readout to detector mapping, the order of the metadata
entries matters.
imports:
- sotodlib.io.load_smurf
- sotodlib.io.metadata
obs_loader_type: 'g3tsmurf'
metadata:
- db: "/path/to/readout_2_detector_manifest.db"
det_info: true
- db: "/path/to/det_info/wafer/det_info_manifest.db"
det_info: true
multi: true
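Because the ordering of the metadata entries matters, a quick sanity check on a parsed context can catch mistakes early. This is a minimal sketch operating on a plain dict (for example the result of loading the context YAML); the helper name is hypothetical and nothing here calls sotodlib.

```python
def readout_map_comes_first(context, readout_db, det_info_db):
    """True if the readout-to-detector entry precedes the det_info entry.

    Raises ValueError if either database is missing from the metadata list.
    """
    dbs = [entry.get("db") for entry in context.get("metadata", [])]
    return dbs.index(readout_db) < dbs.index(det_info_db)


context = {
    "metadata": [
        {"db": "/path/to/readout_2_detector_manifest.db", "det_info": True},
        {"db": "/path/to/det_info/wafer/det_info_manifest.db",
         "det_info": True, "multi": True},
    ]
}

print(readout_map_comes_first(
    context,
    "/path/to/readout_2_detector_manifest.db",
    "/path/to/det_info/wafer/det_info_manifest.db",
))
```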
update_det_match
The update_det_match script will run the det_match module on any new
detsets with available calibration metadata. It loads smurf and resonator
information from the AxisManager metadata, and matches resonators against a
solution file in the site-pipeline-configs.
To run, this script requires the config file described below. If run without the
--all flag, it will only run one detset at a time. If run with the
--all flag, it will continue running until all detsets have been matched.
Generated results
This generates the following data in the specified results directory:
A match file, with the path <results_path>/matches/<detset>.h5, is written for every detset.
The file <results_path>/assignment.sqlite is a manifestdb that contains the mapping from readout-id to detector-id. This is compatible with the det_info_wafer and focal_plane metadata.
The <results_path>/det_match.sqlite file, which contains the det_match.Resonator data from the match for each resonator.
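The layout above can be written out as paths. This is a minimal sketch using only the file names listed here; the helper name and the example detset name are hypothetical.

```python
from pathlib import PurePosixPath


def det_match_outputs(results_path, detsets):
    """Map each documented output to its expected location."""
    root = PurePosixPath(results_path)
    return {
        "matches": {d: root / "matches" / f"{d}.h5" for d in detsets},
        "assignment": root / "assignment.sqlite",
        "det_match": root / "det_match.sqlite",
    }


outputs = det_match_outputs("/path/to/results", ["example_detset"])
print(outputs["matches"]["example_detset"])
print(outputs["assignment"])
print(outputs["det_match"])
```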
Configuration
This script takes in a config yaml file, which corresponds directly to the
UpdateDetMatchesConfig class (see docs below).
For example, this can run simply with the config file:
results_path: /path/to/results
context_path: /path/to/context.yaml
Note that by default, this will run a scan of frequency offsets between the solution and the resonator metadata to find the freq-offset with the best match. To disable this, you can run a config file like the following:
results_path: /path/to/results
context_path: /path/to/context.yaml
freq_offset_range_args: null
Below is a more complex config used for SATp1 matching:
results_path: /so/metadata/satp1/manifests/det_match/satp1_det_match_240220m
context_path: /so/metadata/satp1/contexts/smurf_detcal.yaml
show_pb: False
freq_offset_range_args: NULL
apply_solution_pointing: False
solution_type: resonator_set
resonator_set_dir: /so/metadata/satp1/ancillary/detmatch_solutions/satp1_detmatch_solutions_240219r1
match_pars:
freq_width: 0.2
Below is the full documentation of the configuration class.
- class sotodlib.site_pipeline.update_det_match.UpdateDetMatchesConfig(results_path: str, context_path: str, site_pipeline_root: str | None = None, wafer_map_path: str | None = None, freq_offset_range_args: tuple[float, float, float] | None = (-4, 4, 0.3), match_pars: Dict | None = None, detset_meta_name: str = 'smurf', detcal_meta_name: str = 'det_cal', ufms: List[str] | None = None, show_pb: bool = False, apply_solution_pointing: bool = True, write_relpath: bool = True, solution_type: str = 'kaiwen_handmade', resonator_set_dir: str | None = None, time_before_cache_failure: float = 604800.0, start_time: int = 0, stop_time: int = 4294967296)[source]
Configuration for update script
- Parameters:
results_path (str) – Path to directory where results such as matches, manifestdbs, and h5 files will be stored.
context_path (str) – Path to context file. This must contain detset and det_cal metadata.
site_pipeline_root (str) – Path to root of site-pipeline-configs. If $SITE_PIPELINE_CONFIG_DIR is set in the environment, that will be used as the default.
wafer_map_path (str) – Path to wafer-map to be used to find det-match solution files. If not specified, defaults to <site_pipeline_root>/shared/detmapping/wafer_map.yaml.
freq_offset_range_args (Optional[Tuple[float, float, float]]) – If this is not None, for each match, we will scan over a range of freq-offsets to determine the optimal offset to use. If set, must contain a tuple of floats, containing ([start,] stop, [step,]) that will be passed directly to np.arange. If it is None, will just run with the match with freq_offset_mhz=0.
match_pars (Optional[Dict]) – If not None, will be passed directly to det_match.MatchParams that is used by the det-match function.
detset_meta_name (str) – Name of the metadata entry in the context that contains detset info.
detcal_meta_name (str) – Name of the metadata entry in the context that contains det_cal info.
ufms (Optional[List[str]]) – List of ufm names to run update_det_match on. Will run on all ufms for which detsets exist if None.
show_pb (bool) – Will show progress bar when scanning freq-offset.
apply_solution_pointing (bool) – If True, pointing information computed from design-detector positions will be used in the merged detset of the match.
write_relpath (bool) – If True, will use the relative path to the h5 file (relative to the db path) when writing to the manifestdb.
solution_type (str) – Type of solutions to use. Must be one of [‘kaiwen_handmade’, ‘resonator_set’]. If ‘kaiwen_handmade’, will use the handmade solutions from Kaiwen pulled from the wafer_map file in the site-pipeline-configs. If ‘resonator_set’, must also specify the resonator_set_dir to pull solutions from.
resonator_set_dir (Optional[str]) – If solution_type is ‘resonator_set’, this must be specified and contain the path to the resonator-set solutions. This directory must have a res-set npy file for each stream_id that is expected in the matching, formatted like <resonator_set_dir>/<stream_id>.npy, which contains the result from np.save(fname, match.merged.as_array()).
time_before_cache_failure (float) – Time in seconds before a failed detset will be added to the cache. This is to prevent new detsets still acquiring data from being added right away.
- freq_offsets
If not None, contains freq_offsets determined by freq_offset_range_args which will be scanned over.
- Type:
Optional[np.ndarray]
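To make the freq_offset_range_args description concrete, the sketch below expands the default tuple (-4, 4, 0.3) with np.arange, as the parameter description says is done. The helper name is hypothetical, and returning a single zero offset for the None case is only one way of expressing "run with freq_offset_mhz=0".

```python
import numpy as np


def offsets_to_scan(freq_offset_range_args):
    """Return the frequency offsets that would be tried for a match."""
    if freq_offset_range_args is None:
        # No scan: a single match at zero offset.
        return np.array([0.0])
    # The tuple is unpacked straight into np.arange, so the stop value
    # itself is excluded.
    return np.arange(*freq_offset_range_args)


default_offsets = offsets_to_scan((-4, 4, 0.3))
print(len(default_offsets), default_offsets[0], default_offsets[-1])
print(offsets_to_scan(None))
```

With the default arguments this gives 27 offsets, starting at -4 and ending just below 4, so a narrower range or a coarser step reduces the number of matches run per detset.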
update-hkdb
The update_hkdb site-pipeline script is used to scan through housekeeping files, and update the index database. Configuration for this script must be passed in through a config file, with fields that map to the HkConfig dataclass, described below:
- class sotodlib.io.hkdb.HkConfig(hk_root: str, db_file: str | None = None, db_url: str | ~sqlalchemy.engine.url.URL | None = None, echo_db: bool = False, file_idx_lookback_time: float | None = None, show_index_pb: bool = True, aliases: ~typing.Dict[str, str] = <factory>)[source]
Configuration object for indexing and loading from an HK archive.
If instantiating from a nested dictionary, the from_dict class method can be used to convert fields to their proper data types.
- Parameters:
hk_root (str) – Root directory for the HK archive
db_file (Optional[str]) – Path to the hk index database if the database is an sqlite file. Either this or db_url must be set
db_url (Optional[Union[str, db.URL]]) – URL used for db engine. Either this or db_file must be set
echo_db (bool) – Whether database operations should be echoed
file_idx_lookback_time (Optional[float]) – Time [sec] to look back when scanning for new files to index
show_index_pb (bool) – If true, shows progress bar when indexing
aliases (Dict[str, str]) – Aliases for hk fields. In this dict, the key is the alias name, and the value is the field descriptor, in the format of agent.feed.field. For example: { 'fp_temp': 'cryo-ls372-lsa21yc.temperatures.Channel_02_T', }
These aliases are only used on load, and do not affect how data is stored in the hkdb. Aliases should be valid python identifiers, since they will be set as attributes in the HkResult object.
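Since aliases become attribute names, it can help to check them before writing the config. A minimal sketch using only the standard library follows. The helper name is hypothetical, and rejecting Python keywords is an extra precaution of this sketch rather than something the hkdb documentation states.

```python
import keyword


def invalid_aliases(aliases):
    """Return alias names that could not be used as attribute names."""
    return [
        name for name in aliases
        # Keywords such as "class" pass isidentifier() but cannot be
        # accessed with normal attribute syntax.
        if not name.isidentifier() or keyword.iskeyword(name)
    ]


field = "cryo-ls372-lsa21yc.temperatures.Channel_02_T"
aliases = {"fp_temp": field, "fp-temp": field, "class": field}
print(invalid_aliases(aliases))
```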
analyze-bright-ptsrc
This script analyzes an observation of a bright point source (such as a planet) and performs per-detector fitting of beam parameters including amplitude and FWHM.
usage: __main__.py [-h] [--ctx_file CTX_FILE] --obs_id OBS_ID --sso_name
SSO_NAME --band BAND --tele TELE --tube TUBE --ufm UFM
--max_samps MAX_SAMPS [--scan_rate SCAN_RATE]
--config_file_path CONFIG_FILE_PATH [--outdir OUTDIR]
[--test-mode] [--highpass] [--cutoff_high CUTOFF_HIGH]
[--lowpass] [--cutoff_low CUTOFF_LOW]
[--threshold_src THRESHOLD_SRC] [--distance DISTANCE]
[--do_abs_cal] [--fit_pointing] [--fit_beam]
[--pointing_dict POINTING_DICT]
[--representative_dets REPRESENTATIVE_DETS [REPRESENTATIVE_DETS ...]]
[--plot_results]
Named Arguments
- --ctx_file
The location of the context file.
- --obs_id
Observation id in the context file.
- --sso_name
Source name.
- --band
Frequency band.
- --tele
Telescope name (SAT/LAT).
- --tube
Telescope tube.
- --ufm
Telescope ufm.
- --max_samps
Number of maximum samples to load
- --scan_rate
Scan rate, required for simulations
Default:
200
- --config_file_path
Location of configuration file that contains beam size and fitting parameters.
- --outdir
The location for the .h5 output files to be stored.
- --test-mode
Run analysis on a subset of detectors, to quickly check for problems.
Default:
False
- --highpass
If True, use highpass sine filter.
Default:
False
- --cutoff_high
The cutoff frequency to be used in the filtering.
- --lowpass
If True, use lowpass sine filter.
Default:
False
- --cutoff_low
The cutoff frequency to be used in the filtering.
- --threshold_src
The max amplitude required for the peak finding
Default:
10
- --distance
The max amplitude required for the peak finding
Default:
50000
- --do_abs_cal
Do absolute calibration fit.
Default:
False
- --fit_pointing
Fit the pointing parameters.
Default:
False
- --fit_beam
Fit the beam parameters.
Default:
False
- --pointing_dict
Path to a pickle file of a pointing dictionary.
- --representative_dets
Representative detectors across the focal plane whose raw and fitted data should be plotted, given as a list of readout ids.
Default:
'no_detectors'
- --plot_results
Make plots of planet footprint and fitted results.
Default:
False
finalize-focal-plane
This element produces a finalized focal plane for a given array.
It consumes the output of pointing fits (ie from analyze-bright-ptsrc)
with a detector map to combine results across multiple tuning epochs.
It works by averaging the provided analyze-bright-ptsrc results using weights,
determined by how well each fit matches the nominal template, to produce a final focal plane.
An affine transformation that lines up the template focal plane computed with physical optics
is then computed to create a “noise-free” focal plane.
This element also computes the receiver and optics tube “common mode” transformation. The optics tube common mode is how all of the arrays in one optics tube move together, and the receiver common mode is how all of the optics tubes move together. In the case of the SATs where there is only one tube, the optics tube common mode is always taken to be the identity. Given the smaller number of data points, these common modes are simple rigid transforms (shift and rotation) rather than a full affine transform.
finalize_focal_plane can optionally be run in a “per obs” mode where no averaging is done;
in this case the output database is indexed by obs_id.
- sotodlib.site_pipeline.finalize_focal_plane.gamma_fit(src, dst)[source]
Fit the transformation for gamma. Note that the periodicity here assumes things are in radians.
- Parameters:
src – Source gamma in radians
dst – Destination gamma in radians
- Returns:
scale: Scale applied to src
shift: Shift applied to scale*src
Config file format
Here’s an annotated example:
# This is the time range that this focal plane is valid for
# These are ctime
start_time: 0 # default is 0
stop_time: 1726605779 # default is 2**32
# There are two options to get the data in
# One is to pass in ResultSets like so:
resultsets:
obs_1: # obs_id associated with this data
# There are 3 possible ResultSets you can pass
# pointing is mandatory
pointing:
- "PATH/TO/FITS.h5" # The path to the ResultSet
- "focalplane" # The name of the ResultSet in the h5 file
# polarization and detmap are optional
polarization :
- "PATH/TO/FITS.h5"
- "polarization"
detmap:
- "PATH/TO/DETMAP.h5"
- "merged"
obs_2: ...
# When using results sets you also need to pass in additional metadata like
stream_id: "ufm_mv29"
wafer_slot: "ws0"
telescope_flavor: "SAT"
tube_slot: "st1"
# Note that in the ResultSets case only single wafer fits are supported
# You can also load the data in with context like so
context:
path: PATH/TO/CONTEXT
# There are two pointing fields in case we have both a tod and map fit for one obs_id
# This may change down the line
map_pointing: "map_pointing" # The name of the map based pointing metadata field
tod_pointing: "tod_pointing" # The name of the TOD based pointing metadata field
polarization: "polarization" # The name of the polarization metadata field (optional)
# There are two ways to specify the observation, obs_id and query
# Both can be provided
obs_id: [obs_1, obs_2] # Pass in the obs_id directly
query: QUERY # Pass in a query
# You can pass in detector restrictions here as well
dets: {} # Should be a dict you would pass to the dets arg of ctx.get_meta
# If neither option is passed, all obs from start_time to stop_time with valid metadata are used
per_obs: False # Set to true if you want to run in per obs mode
weight_factor: 1000 # Weights are computed with sigma=template_spacing/weight_factor.
# This is an advanced feature and should be used with caution.
# There are a few ways to pass in a template as well
template: "PATH/TO/TEMPLATE.h5" # As a h5 file with a ResultSet named the same as the UFM
gen_template: False # Or by setting this true to generate the template on the fly
# You also will need to provide some information for using the optics code
pipeline_config_dir : "PATH/TO/PIPELINE/CONFIGS" # If not provided the sysvar $PIPELINE_CONFIG_DIR is used
zemax_path: "PATH/TO/ID9_checked_trace_data.npz" # Only needed for the LAT
# Plotting info
plot: True # Set to output plot
plot_dir: "./plots" # Where to save plots
# Output info
outdir: "."
append: "test" # Will have a "_" before it.
Output file format
The results of finalize_focal_plane are stored in an HDF5 file containing
multiple datasets. The datasets are made using the ResultSet class and can be
loaded back as such but metadata stored as attributes require h5py.
The datasets and attributes are organized by tube and array as seen below:
focal_plane.h5
- (group) result # For combined results this is the start of the validity period
# For per-obs this is the obs_id
- (attr) center # The nominal center of the receiver on sky
- (attr) center_transformed # The center with the common mode transform applied
- (group) transform # The receiver common mode
- (group) tube1 # The first tube (ie st1, oti1, etc.)
- (attr) center # The nominal center of the tube on sky
- (attr) center_transformed # The center with the common mode transform applied
- (group) transform # The tube common mode
- (group) ufm_1 # The first ufm for this tube (ie ufm_mv29)
- (attr) template_centers # The nominal center for this array
- (attr) fit_centers # The fit center for this array
- (group) transform # The transform for the ufm, includes parameters with and without the common mode
- (dataset) focal_plane # The focal_plane with just fit positions
- (attr) measured_gamma # If gamma was actually measured
- (dataset) focal_plane_full # Also includes avg positions, weights, and counts
- (group) ufm_2
...
...
...
The focal_plane dataset contains four columns:
dets:det_id: The detector id
xi: The transformed template xi in radians
eta: The transformed template eta in radians
gamma: The transformed template gamma in radians.
If no polarization angles are provided then gamma will be populated
with the nominal values from physical optics.
There is an attribute called measured_gamma that will be False in this case.
The focal_plane_full dataset contains the following columns:
dets:det_id: The detector id
xi_t: The transformed template xi in radians
eta_t: The transformed template eta in radians
gamma_t: The transformed template gamma in radians.
xi_m: The measured xi in radians
eta_m: The measured eta in radians
gamma_m: The measured gamma in radians.
weights: The average weights of the measurements for this det.
r2: The fit weight passed in from the get_brightsrc_pointing dataset
az: The median Az value in radians from source-detector crossing
el: The median El value in radians from source-detector crossing
roll: The median Roll value in radians from source-detector crossing
n_point: The number of pointing fits used for the det.
n_gamma: The number of gamma fits used for this det.
All the attributes having to do with the centers of things are (1,3) arrays
in the form ((xi), (eta), (gamma)) in radians.
This transformation for xi and eta is an affine transformation defined as
\(m = An + t\), where:
m is the measured xi-eta pointing
n is the nominal xi-eta pointing
A is the 2x2 affine matrix
t is the final translation
A is then decomposed into a rotation of the xi-eta plane, a shear parameter,
and a scale along each axis.
This decomposition is done assuming the order as A = rotation*shear*scale.
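As a worked illustration of the ordering A = rotation*shear*scale, the sketch below composes a 2x2 matrix from a rotation angle, a shear parameter, and per-axis scales, then recovers them again. It assumes the shear matrix has the form [[1, k], [0, 1]]; that convention, and all function names here, are assumptions for illustration and may differ from what finalize_focal_plane actually stores.

```python
import math


def compose_2x2(rot, shear, scale):
    """Build A = rotation @ shear @ scale as nested lists."""
    c, s = math.cos(rot), math.sin(rot)
    sx, sy = scale
    # shear @ scale, with shear = [[1, k], [0, 1]] and scale = diag(sx, sy)
    upper = [[sx, shear * sy], [0.0, sy]]
    rotation = [[c, -s], [s, c]]
    return [[sum(rotation[i][k] * upper[k][j] for k in range(2))
             for j in range(2)] for i in range(2)]


def decompose_2x2(a):
    """Recover (rot, shear, (sx, sy)); assumes sx > 0 and sy != 0."""
    # The first column of A is sx * (cos(rot), sin(rot)).
    rot = math.atan2(a[1][0], a[0][0])
    c, s = math.cos(rot), math.sin(rot)
    sx = math.hypot(a[0][0], a[1][0])
    # Undo the rotation on the second column: U = R^T A.
    u01 = c * a[0][1] + s * a[1][1]
    sy = -s * a[0][1] + c * a[1][1]
    return rot, u01 / sy, (sx, sy)


def apply_affine(a, t, n):
    """Evaluate m = A n + t for a single (xi, eta) point."""
    return (a[0][0] * n[0] + a[0][1] * n[1] + t[0],
            a[1][0] * n[0] + a[1][1] * n[1] + t[1])


A = compose_2x2(rot=0.01, shear=0.002, scale=(1.01, 0.99))
m = apply_affine(A, t=(1e-4, -2e-4), n=(0.05, -0.03))
print(decompose_2x2(A))
print(m)
```

Because the factors do not commute, the same matrix decomposed in a different order (for example scale*shear*rotation) would give different parameter values, which is why the order is fixed.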
For gamma the transformation is also technically affine, but since it is in just one dimension it can be described by a single shift and scale.
All of these results are stored as attributes in the transform groups.
These nominally are:
affine: The full affine matrix
shift: The shift in (xi, eta, gamma) in radians
scale: The scale along (xi, eta, gamma) in radians
rot: The rotation of the xi-eta plane
shear: The shear of the xi-eta plane
The transform groups for the arrays also include these attributes with
the common mode removed; the names have _nocm appended (ie rot_nocm).
Since the common mode transformations are fit as rigid transforms (shift and rotation only),
their scale will always be (1, 1, 1) and their shear will be 0.
finalize_focal_plane will also output a ManifestDb as a file called db.sqlite
in the output directory.
By default this will be indexed by stream_id and obs:timestamp and will point to the focal_plane dataset.
If you are running in per_obs mode then it will be indexed by obs_id and will point
to the results associated with that observation.
Be warned that in this case there will only be entries for observations with pointing fits,
so design your context accordingly.
Focal planes can be loaded directly from the hdf5 files if you require information other than
the focal_plane dataset. This can be done like so:
from sotodlib.coords import fp_containers as fpc
rxs = fpc.Receiver.load_file(PATH)
This will give you a dict of Receiver dataclasses with all the focal plane data.
The keys of this dict are the start times for combined focal planes and the obs_id for per-obs.
preprocess-tod
This script is set up to run a preprocessing pipeline using the preprocess module. See details here for how to build a preprocessing pipeline.
This module includes the functions designed to be run as part of a batch script for automated analysis as well as options for loading AxisManagers that have all the preprocessing steps applied to them.
usage: __main__.py [-h] [--query QUERY] [--obs-id OBS_ID] [--overwrite]
[--min-ctime MIN_CTIME] [--max-ctime MAX_CTIME]
[--update-delay UPDATE_DELAY] [--tags [TAGS ...]]
[--planet-obs] [--verbosity VERBOSITY] [--nproc NPROC]
[--compress] [--run-from-jobdb] [--raise-error RAISE_ERROR]
[--pb-path PB_PATH]
configs
Positional Arguments
- configs
Preprocessing Configuration File
Named Arguments
- --query
Query to pass to the observation list. Use 'string' to pass in strings within the query.
- --obs-id
obs-id of particular observation if we want to run on just one
- --overwrite
If true, overwrites existing entries in the database
Default:
False
- --min-ctime
Minimum timestamp for the beginning of an observation list
- --max-ctime
Maximum timestamp for the beginning of an observation list
- --update-delay
Number of days in the past to start observation list.
- --tags
Observation tags. Ex: --tags 'jupiter' 'setting'
- --planet-obs
If true, takes all planet tags as logical OR and adjusts related configs
Default:
False
- --verbosity
increase output verbosity. 0:Error, 1:Warning, 2:Info(default), 3:Debug
Default:
2
- --nproc
Number of parallel processes to run on.
Default:
4
- --compress
Compress preprocessing database.
Default:
False
- --run-from-jobdb
If True, use open jobs in jobdb as the run_list.
Default:
False
- --raise-error
Raise an error upon completion if any obsids or groups fail.
Default:
False
- --pb-path
Path to where to save progress bar.
preprocess-obs
This script is set up to run a preprocessing pipeline using the preprocess module. See details here for how to build an obs preprocessing pipeline.
This module is similar to preprocess_tod but removes grouping by detset so
that the entire observation is loaded, without signal.
usage: __main__.py [-h] [--query QUERY] [--obs-id OBS_ID] [--overwrite]
[--min-ctime MIN_CTIME] [--max-ctime MAX_CTIME]
[--update-delay UPDATE_DELAY] [--tags [TAGS ...]]
[--planet-obs] [--verbosity VERBOSITY] [--lat]
configs
Positional Arguments
- configs
Preprocessing Configuration File
Named Arguments
- --query
Query to pass to the observation list. Use 'string' to pass in strings within the query.
- --obs-id
obs-id of particular observation if we want to run on just one
- --overwrite
If true, overwrites existing entries in the database
Default:
False
- --min-ctime
Minimum timestamp for the beginning of an observation list
- --max-ctime
Maximum timestamp for the beginning of an observation list
- --update-delay
Number of days in the past to start observation list.
- --tags
Observation tags. Ex: --tags 'jupiter' 'setting'
- --planet-obs
If true, takes all planet tags as logical OR and adjusts related configs
Default:
False
- --verbosity
increase output verbosity. 0:Error, 1:Warning, 2:Info(default), 3:Debug
Default:
2
- --lat
If true, filter the obs list to keep only the first obs_id among those with timestamps within 10 seconds of each other, since LAT obs_ids are split by optics tube. Only one is needed because preprocess_obs loads the entire aman without signal.
Default:
False
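The --lat filtering described above can be pictured with a small sketch. It is not the script's implementation: the helper name and the example obs_ids are hypothetical, and comparing each timestamp against the last kept observation (rather than the immediately preceding one) is an assumption.

```python
def keep_first_per_group(observations, window=10.0):
    """Keep the first obs_id of each group of near-simultaneous observations.

    observations: iterable of (obs_id, timestamp) pairs.
    """
    kept = []
    last_kept_time = None
    for obs_id, timestamp in sorted(observations, key=lambda item: item[1]):
        # Start a new group only when this observation begins more than
        # `window` seconds after the last one that was kept.
        if last_kept_time is None or timestamp - last_kept_time > window:
            kept.append(obs_id)
            last_kept_time = timestamp
    return kept


# Hypothetical obs_ids: two tubes starting 3 s apart, then a later scan.
observations = [
    ("obs_1700000000_tube_a", 1700000000),
    ("obs_1700000003_tube_b", 1700000003),
    ("obs_1700003600_tube_a", 1700003600),
]
print(keep_first_per_group(observations))
```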
make-source-flags
Command line arguments
usage: make-source-flags [-h] [-c CONFIG_FILE] [-v] obs_id
Positional Arguments
- obs_id
Observation for which to generate flags.
Named Arguments
- -c, --config-file
Configuration file.
- -v, --verbose
Pass multiple times to increase.
Default:
0
Config file format
Here’s an annotated example:
# Context for <whatever>
context_file: ./context4_b.yaml
# How to subdivide observations (by detset, but call it "wafer_slot")
subobs:
use: detset
label: wafer_slot
# Metadata index & archive filenaming
archive:
index: 'archive.sqlite'
policy:
type: 'simple'
filename: 'archive.h5'
# Mask parameters
mask_params:
mask_res: [2, 'arcmin']
default: {'xyr': [0., 0., 0.1]}
make-uncal-beam-map
This module produces maps for a single observation of a bright point source. The observation is identified by an obs_id. The data for the observation may be divided into different detector groups, and each ‘group’ will be loaded and mapped independently (this will normally be associated with a “detset”). The data for each observation in each group may be further subdivided into ‘data splits’; this normally corresponds to frequency “band”.
- sotodlib.site_pipeline.make_uncal_beam_map.plot_map(bundle, filename=None, tod=None, obs_info=None, det_info=None, focal_plane=None, det_mask=None, group=None, subset=None, zoom_size=None, title=None, **kwargs)[source]
- sotodlib.site_pipeline.make_uncal_beam_map.main(config_file=None, obs_id=None, verbose=0, test=False)[source]
Entry point.
Command line arguments
usage: make-uncal-beam-map [-h] [-c CONFIG_FILE] [-v] [--test] obs_id
Positional Arguments
- obs_id
Observation for which to make source map.
Named Arguments
- -c, --config-file
Configuration file.
- -v, --verbose
Pass multiple times to increase.
Default:
0
- --test
Reduce detector count for quick tests.
Default:
False
Config file format
Here’s an annotated example:
# Data source
context_file: ./act_uranus/context.yaml
# Sub-observation data grouping
subobs:
use: detset
label: wafer_slot
# Database of results
archive:
index: 'archive.sqlite'
policy:
type: 'directory'
root_dir: './'
pattern: 'maps/{product_id}'
# Output selection and naming
output:
map_codes: ['solved', 'weights']
pattern: '{product_id}_{split}_{map_code}.fits'
# Plot generation
plotting:
zoom:
f090: [10, arcmin]
f150: [10, arcmin]
# Preprocessing
preprocessing:
cal_keys: ['abscal', 'relcal']
pointing_keys: ['boresight_offset']
# mapmaking parameters
mapmaking:
force_source: Uranus
res:
f090: [15, arcsec]
f150: [15, arcsec]
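To see how the output.pattern entry expands, the sketch below fills the pattern for every combination of split and map code. It assumes the placeholders are substituted in the manner of Python's str.format, which is not confirmed here. The helper name and the example product_id are hypothetical, and the split names are borrowed from the band keys in the example config.

```python
def output_filenames(pattern, product_id, splits, map_codes):
    """Expand the filename pattern for every (split, map_code) pair."""
    return [
        pattern.format(product_id=product_id, split=split, map_code=code)
        for split in splits
        for code in map_codes
    ]


names = output_filenames(
    "{product_id}_{split}_{map_code}.fits",
    product_id="example_product",      # hypothetical identifier
    splits=["f090", "f150"],
    map_codes=["solved", "weights"],
)
for name in names:
    print(name)
```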
Inputs
The Context should cause the TOD to be loaded with all supporting metadata loaded into the AxisManager. Here are key members that will be processed:
Deconvolution step:
'timeconst'
'iir_params'
Calibration:
Whatever is listed in preprocessing.cal_keys
Pointing correction:
'boresight_offset'
Demodulation and downsampling:
not implemented
Planet mapmaking:
'source_flags'
'glitch_flags' - optional
update-hwp-angle
Script for running updates on (or creating) a hwp angle g3 file. This script will run periodically even when the hwp is not spinning, meaning it is designed to work from something like a cronjob. The output hwp angle should be synchronized to SMuRF timing outside this script. See details here.
Command line arguments
Analyze HWP encoder data from level-2 HK data, and produce HWP angle solution for all times.
usage: update_hwp_angle [-h] -c CONFIG_FILE [-d DATA_DIR] [-o OUTPUT_DIR]
[--update-delay UPDATE_DELAY] [--file FILE]
[--verbose VERBOSE]
Named Arguments
- -c, --config-file
Configuration File for running update_hwp_angle
- -d, --data-dir
input data directory, overwrite config data_dir
- -o, --output-dir
output data directory, overwrite config output_dir
- --update-delay
Days to subtract from now to set as minimum ctime. Set to 0 to build from scratch
Default:
2
- --file
Force processing of a specific file, overriding the standard selection process. The file must be in the usual data tree, though. You may specify either the file basename (1234567890.g3) or the full path.
- --verbose
increase output verbosity. 0: Error, 1: Warning, 2: Info(default), 3: Debug
Default:
2
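The --update-delay option above translates a number of days into a minimum ctime. A minimal sketch of that arithmetic follows; the helper name is hypothetical, and treating a delay of 0 as "no lower bound" is only one reading of "build from scratch", not confirmed behaviour of the script.

```python
import time

SECONDS_PER_DAY = 86400


def min_ctime_from_delay(update_delay_days, now=None):
    """Return the minimum ctime implied by an update delay in days."""
    if update_delay_days == 0:
        # Assumed meaning of "build from scratch": no lower time bound.
        return None
    now = time.time() if now is None else now
    return now - update_delay_days * SECONDS_PER_DAY


# With the default delay of 2 days and a fixed reference time:
print(min_ctime_from_delay(2, now=1_700_000_000))
```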
make-hwp-solutions
This element generates HWP angle-related metadata, which contains the calibrated HWP angle and flags. The HWP angle is synchronized with the input SMuRF timestamp. See details here.
Command line arguments
usage: make_hwp_solutions [-h] [-o SOLUTION_OUTPUT_DIR]
[--encoder-output-dir ENCODER_OUTPUT_DIR]
[--verbose VERBOSE] [--overwrite] [--query QUERY]
[--min-ctime MIN_CTIME] [--max-ctime MAX_CTIME]
[--obs-id OBS_ID [OBS_ID ...]] [--off-site]
[--nprocs NPROCS]
context HWPconfig
Positional Arguments
- context
Path to context yaml file to define observation for which to generate hwp angle.
- HWPconfig
Path to HWP configuration yaml file.
Named Arguments
- -o, --solution-output-dir
output directory of solution metadata, overwrite config solution_output_dir
- --encoder-output-dir
output directory of encoder metadata, overwrite config encoder_output_dir
- --verbose
increase output verbosity. 0: Error, 1: Warning, 2: Info(default), 3: Debug
Default:
2
- --overwrite
If true, overwrites existing entries in the database
Default:
False
- --query
Query to pass to the observation list. Use 'string' to pass in strings within the query.
- --min-ctime
Minimum timestamp for the beginning of an observation list
- --max-ctime
Maximum timestamp for the beginning of an observation list
- --obs-id
List of obs-ids of particular observations that you want to run
- --off-site
This must be true when you run this off site. If true, this does not try to load L2 data
Default:
False
- --nprocs
number of processes to use. We use nprocs=1 at site computing
Default:
1
make-cosamp-hk
This element generates house-keeping data with timestamps co-sampled with detector timestamps.
Command line arguments
usage: make-cosamp-hk [-h] [-o OUTPUT_DIR] [--verbose VERBOSE] [--overwrite]
[--query_text QUERY_TEXT]
[--query_tags [QUERY_TAGS ...]] [--min_ctime MIN_CTIME]
[--max_ctime MAX_CTIME] [--obs_id OBS_ID]
[--update-delay UPDATE_DELAY]
config
Positional Arguments
- config
configuration yaml file.
Named Arguments
- -o, --output_dir
output data directory, overwrite config output_dir
- --verbose
increase output verbosity. 0: Error, 1: Warning, 2: Info(default), 3: Debug
Default:
2
- --overwrite
If true, overwrites existing entries in the database
Default:
False
- --query_text
Query text
- --query_tags
Query tags
- --min_ctime
Minimum timestamp for the beginning of an observation list
- --max_ctime
Maximum timestamp for the beginning of an observation list
- --obs_id
obs_id of particular observation if we want to run on just one
- --update-delay
Number of days (unit is days) in the past to start observation list.
An example config for wiregrid is shown below.
With this config file, wiregrid.sqlite and wiregrid_XXXX.h5,
where XXXX is the first four digits of the timestamps,
are generated in /path/to/manifests/wiregrid. context_file,
input_dir, output_dir, fields, aliases, and output_prefix
are required:
context_file: '/path/to/context.yaml'
query_text: 'type == "obs"'
min_ctime: 1700000000
max_ctime: null
query_tags: ['wiregrid=1']
input_dir: '/path/to/level2/hk'
output_dir: '/path/to/manifests/wiregrid'
fields:
  ['satpX.wg-encoder.feeds.wgencoder_full.reference_count',
   'satpX.wg-actuator.feeds.wgactuator.limitswitch_LSL1',
   'satpX.wg-actuator.feeds.wgactuator.limitswitch_LSL2',
   'satpX.wg-actuator.feeds.wgactuator.limitswitch_LSR1',
   'satpX.wg-actuator.feeds.wgactuator.limitswitch_LSR2',]
aliases:
  ['encoder',
   'LSL1',
   'LSL2',
   'LSR1',
   'LSR2']
output_prefix: 'wiregrid'
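As a rough illustration of the output file naming described above, the sketch below derives the .h5 file name from a timestamp. The helper name h5_name is hypothetical; the actual script may build the name differently.

```python
def h5_name(output_prefix: str, ctime: float) -> str:
    """Return '<prefix>_XXXX.h5', where XXXX is the first four digits of ctime."""
    first_four = str(int(ctime))[:4]
    return f"{output_prefix}_{first_four}.h5"


print(h5_name("wiregrid", 1700000000))  # wiregrid_1700.h5
```

With this convention, all observations whose timestamps share the same first four digits would land in the same .h5 file.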
If you specify any of the save_mean, save_median, save_rms, or save_ptp booleans
in the config file, those values are calculated for the first entry of fields and
stored in sqlite columns with names like “{output_prefix}_mean”. If you specify
min_valid_value, max_valid_value, or max_valid_dvalue_dt in the config file, values
outside the valid range are set to np.nan and {output_prefix}_nan_fraction is added to
the sqlite columns. The config below is an example for PWV data whose valid range is
0.3 < pwv < 3.0 mm and whose maximum valid time derivative is 0.01 mm/s:
context_file: '/path/to/context.yaml'
query_text: null
min_ctime: null
max_ctime: null
update_delay: 1
query_tags: null
input_dir: '/path/to/level2/hk'
output_dir: '/path/to/manifests/pwv_clas'
fields:
  ['site.env-radiometer-class.feeds.pwvs.pwv',]
aliases: ['pwv_class',]
output_prefix: 'pwv_class'
save_mean: True
save_median: True
save_rms: True
save_ptp: True
min_valid_value: 0.3
max_valid_value: 3.0
max_valid_dvalue_dt: 0.01
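The masking and summary columns described above can be pictured with the following minimal sketch. It is not the make-cosamp-hk implementation: the function name summarize is hypothetical, the handling of values exactly on the range boundary is an assumption, and the time derivative is assumed to be a simple finite difference. The rms column is left out because its exact definition is not stated here.

```python
import numpy as np


def summarize(times, values, output_prefix, min_valid_value=None,
              max_valid_value=None, max_valid_dvalue_dt=None):
    """Mask invalid samples with NaN and return summary columns as a dict."""
    times = np.asarray(times, dtype=float)
    values = np.array(values, dtype=float)  # copy, so the input is left untouched
    bad = np.zeros(values.shape, dtype=bool)
    if min_valid_value is not None:
        bad |= values < min_valid_value
    if max_valid_value is not None:
        bad |= values > max_valid_value
    if max_valid_dvalue_dt is not None:
        # Assumption: the time derivative is estimated with finite differences.
        bad |= np.abs(np.gradient(values, times)) > max_valid_dvalue_dt
    values[bad] = np.nan
    return {
        f"{output_prefix}_mean": float(np.nanmean(values)),
        f"{output_prefix}_median": float(np.nanmedian(values)),
        f"{output_prefix}_ptp": float(np.nanmax(values) - np.nanmin(values)),
        f"{output_prefix}_nan_fraction": float(bad.mean()),
    }


cols = summarize(times=[0, 1, 2, 3], values=[0.1, 1.0, 2.0, 5.0],
                 output_prefix="pwv_class",
                 min_valid_value=0.3, max_valid_value=3.0)
print(cols)
```

In this toy input the first and last samples fall outside the valid range, so half of the samples are masked before the statistics are computed.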
make-ml-map
This submodule can be used to call the maximum likelihood mapmaker.
The mapmaker will produce bin, div and sky maps. The mapmaker
has several different flags (see the example config file below) that can
be passed via the CLI or a config.yaml file. If an argument is not
specified, a value is selected from a set of defaults.
The arguments freq, area and context are required; they should
either be supplied through the CLI or the config.yaml.
Command line arguments
usage: __main__.py [-h] [--comps COMPS] [-W WAFERS] [-B BANDS] [-C CONTEXT]
[--tods TODS] [-n NTOD] [-N NMAT] [--max-dets MAX_DETS]
[-S SITE] [-v] [-q] [-@ CENTER_AT] [-w WINDOW] [-i INJECT]
[--nocal] [--nmat-dir NMAT_DIR] [--nmat-mode NMAT_MODE]
[-d DOWNSAMPLE] [--maxiter MAXITER] [--interpol INTERPOL]
[-T TILED] [--srcsamp SRCSAMP] [--unit UNIT]
query area odir [prefix]
Positional Arguments
- query
- area
- odir
- prefix
Named Arguments
- --comps
List of components to solve for. T, QU or TQU, but only TQU is consistent with the actual data
Default:
'TQU'
- -W, --wafers
Detector wafer subsets to map with. ,-sep
- -B, --bands
Bandpasses to map. ,-sep
- -C, --context
Default:
'/mnt/so1/shared/todsims/pipe-s0001/v4/context.yaml'
- --tods
Arbitrary slice to apply to the list of tods to analyse
- -n, --ntod
Keep at most this many tods
- -N, --nmat
Noise model to use. corr or uncorr
Default:
'corr'
- --max-dets
Keep at most this many detectors
- -S, --site
Default:
'so_lat'
- -v, --verbose
Default:
0
- -q, --quiet
Default:
0
- -@, --center-at
- -w, --window
Default:
0.0
- -i, --inject
Path to map to inject. Equatorial coordinates
- --nocal
Disable calibration. Useful for sims
Default:
False
- --nmat-dir
Default:
'{odir}/nmats'
- --nmat-mode
How to build the noise matrix. ‘build’: Always build from tod. ‘cache’: Use if available in nmat-dir, otherwise build and save. ‘load’: Load from nmat-dir, error if missing. ‘save’: Build from tod and save.
Default:
'build'
- -d, --downsample
Downsample TOD by this factor. ,-sep
Default:
'1'
- --maxiter
Max number of CG steps per pass. ,-sep
Default:
'500'
- --interpol
Pmat interpol per pass. ,-sep
Default:
'nearest'
- -T, --tiled
0: untiled maps. Nonzero: tiled maps
Default:
1
- --srcsamp
path to mask file where True regions indicate where bright object mitigation should be applied. Mask is in equatorial coordinates. Not tiled, so should be low-res to not waste memory.
- --unit
Unit of the maps
Default:
'uK'
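The four --nmat-mode behaviours listed above can be summarized in a short sketch. This is only an illustration under stated assumptions: get_nmat, build_fn and the nmat_<obs_id>.pkl file layout are hypothetical, and pickle stands in for whatever format the mapmaker really uses.

```python
import os
import pickle
import tempfile


def get_nmat(obs_id, nmat_dir, nmat_mode, build_fn):
    """Return a noise model for obs_id following the four nmat-mode behaviours."""
    path = os.path.join(nmat_dir, f"nmat_{obs_id}.pkl")  # hypothetical file layout
    if nmat_mode == "build":
        # Always build from the TOD, never touch the disk.
        return build_fn(obs_id)
    if nmat_mode == "load" or (nmat_mode == "cache" and os.path.exists(path)):
        # 'load' raises FileNotFoundError if the file is missing.
        with open(path, "rb") as f:
            return pickle.load(f)
    if nmat_mode in ("cache", "save"):
        # Build from the TOD and store the result for later runs.
        nmat = build_fn(obs_id)
        os.makedirs(nmat_dir, exist_ok=True)
        with open(path, "wb") as f:
            pickle.dump(nmat, f)
        return nmat
    raise ValueError(f"Unknown nmat_mode: {nmat_mode!r}")


calls = []


def fake_build(obs_id):
    """Stand-in for the expensive noise-model estimation."""
    calls.append(obs_id)
    return {"obs_id": obs_id, "ivar": 1.0}


with tempfile.TemporaryDirectory() as d:
    get_nmat("obs1", d, "cache", fake_build)  # not on disk yet: builds and saves
    get_nmat("obs1", d, "cache", fake_build)  # found on disk: loads
    print(len(calls))  # 1
```

The 'cache' mode is the one that avoids rebuilding on repeated runs, while 'load' is the strict variant that fails if the noise model was never saved.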
Default Mapmaker Values
The following code block contains the hard-coded default values for non-essential
mapmaker arguments. They can be overridden in the CLI or in the
config.yaml.
defaults = {"query": "1",
"comps": "T",
"ntod": None,
"tods": None,
"nset": None,
"site": 'so_lat',
"nmat": "corr",
"max_dets": None,
"verbose": 0,
"quiet": 0,
"center_at": None,
"window": 0.0,
"inject": None,
"nocal": True,
"nmat_dir": "/nmats",
"nmat_mode": "build",
"downsample": 1,
"maxiter": 500,
"tiled": 1,
"wafer": None,
}
Config file format
Example of a config file:
# Query
query: "1"
# Context file containing TODs
context: 'context.yaml'
# Telescope info
freq: 'f150'
site: 'so_lat'
# Mapping area footprint
area: 'geometry.fits'
# Output Directory and file name prefix
odir: './output/'
prefix: 'my_maps'
# Detectors info. null by default
tods: [::100] # Restrict TOD selections by index
ntod: 3 # Special case of `tods` above. Implemented as follows: [:ntod]
nset: 10 # Number of detsets kept
max-dets: 200 # Maximum dets kept
wafer: 'w17' # Restrict which wafers are mapped. Can do multiple wafers
# Mapmaking meta
comps: 'T' # TQU
inject: null
nocal: True # No relcal or abscal
downsample: 1 # Downsample TOD by this factor
tiled: 0 # Tiling boolean (0 or 1)
nmat-dir: './nmats/' # Dir to save or load nmat
nmat: 'corr' # 'corr' or 'uncorr'
maxiter: 500 # Max number of iterative steps
nmat_mode: 'build' # 'cache', 'build', 'load' or 'save'
center_at: null
window: 0.0
# Scripting tools
verbose: True
quiet: False
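To make the relation between tods and ntod in the config above concrete, here is a small sketch of how a slice expression could be applied to a list of observations. The helper parse_slice and the obs_ids list are hypothetical; the mapmaker may parse the expression differently.

```python
def parse_slice(text):
    """Turn a string such as '[::100]' or ':3' into a slice object."""
    parts = text.strip().strip("[]").split(":")
    if not 1 < len(parts) <= 3:
        raise ValueError(f"Not a slice expression: {text!r}")
    return slice(*(int(p) if p.strip() else None for p in parts))


obs_ids = [f"obs_{i:04d}" for i in range(1000)]  # hypothetical observation list

every_100th = obs_ids[parse_slice("[::100]")]  # tods: [::100]
first_three = obs_ids[:3]                      # ntod: 3 is the same as tods: [:3]

print(len(every_100th), first_three)
```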
make-atomic-filterbin-map
This script creates atomic maps (maps of individual observations by wafer and
frequency, and associated splits). These maps are HWP-demodulated, filtered,
and binned. Each atomic map consists of a weights map, a wmap (weighted map),
and a hits map, as well as an information file that is used for adding the map
to an atomic map database.
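As a toy picture of how the three products relate, the sketch below bins a temperature-only timestream into hits, weights and wmap arrays, and recovers a binned map by dividing wmap by weights. It is a one-dimensional illustration with made-up numbers; the real maps carry polarization components and are produced by the sotodlib mapmaking code, not by this snippet.

```python
import numpy as np

npix = 5
pix = np.array([0, 1, 1, 3, 3, 3])              # pixel hit by each sample
tod = np.array([2.0, 1.0, 3.0, 4.0, 4.0, 4.0])  # sample values
ivar = np.full(tod.shape, 0.5)                  # inverse-variance weight per sample

hits = np.bincount(pix, minlength=npix)                      # samples per pixel
weights = np.bincount(pix, weights=ivar, minlength=npix)     # summed weights
wmap = np.bincount(pix, weights=ivar * tod, minlength=npix)  # weighted map

# Dividing the weighted map by the weights gives the binned map where weights > 0.
binned = np.zeros(npix)
np.divide(wmap, weights, out=binned, where=weights > 0)

print(hits, weights, wmap, binned)
```

Keeping wmap and weights separate, rather than only their ratio, is what allows several atomic maps to be coadded later by summing each product.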
Configuration yaml file
The mapmaker is configured by supplying a yaml file with --config_file.
- class sotodlib.site_pipeline.make_atomic_filterbin_map.Cfg(context: str, preprocess_config: str, area: str | None = None, nside: int | None = None, query: str = "type == 'obs' and subtype == 'cmb'", odir: str = './output', update_delay: float | None = None, site: str = 'so_sat3', hk_data_path: str | None = None, nproc: int = 1, atomic_db: str | None = None, comps: str = 'TQU', singlestream: bool = False, only_hits: bool = False, all_splits: bool = False, det_in_out: bool = False, det_left_right: bool = False, det_upper_lower: bool = False, scan_left_right: bool = False, ntod: int | None = None, tods: str | None = None, nset: int | None = None, wafer: str | None = None, freq: str | None = None, center_at: str | None = None, max_dets: int | None = None, fixed_time: int | None = None, min_dur: int | None = 300, verbose: int = 0, quiet: int = 0, window: float | None = None, dtype_tod: str = 'float32', dtype_map: str = 'float64', unit: str = 'K', use_psd: bool = True, wn_label: str = 'preprocess.noiseQ_mapmaking.std', apply_wobble: bool = True, compress: bool = True)[source]
Class to configure make-atomic-filterbin-map
- Parameters:
context (str) – Path to context file
preprocess_config (str) – Path to config file(s) to run the preprocessing pipeline. If 2 files, representing 2 layers of preprocessing, they should be separated by a comma.
area (str) – WCS kernel for rectangular pixels. Filename to map/geometry or valid string for coords.get_wcs_kernel.
nside (int) – Nside for HEALPIX pixels
query (str) – Query, can be a file (list of obs_id) or selection string (will select only CMB scans by default)
odir (str) – Output directory
update_delay (float) – Number of days in the past to start obs list
site (str)
hk_data_path (str) – Path to housekeeping data
nproc (int) – Number of procs for the multiprocessing pool
atomic_db (str) – Path to the atomic map database
comps (str) – Components to map, only TQU implemented
singlestream (bool) – Map without demodulation (e.g. with a static HWP)
only_hits (bool) – Only create a hits map
all_splits (bool) – If True, map all implemented splits
det_in_out (bool) – Make focal plane split: inner vs outer detector
det_left_right (bool) – Make focal plane split: left vs right detector
det_upper_lower (bool) – Make focal plane split: upper vs lower detector
scan_left_right (bool) – Make samples split: left-going vs right-going scans
ntod (int) – Run the first ntod observations in your query
tods (str) – Run a specific obs
nset (int) – Run the first nset wafers
wafer (str) – Run a specific wafer
freq (str) – Run a specific frequency band
unit (str) – Unit of data. Default is K
use_psd (bool) – True by default. Use white noise measured by PSD as the weights for mapmaking. Must be provided by the preprocessing
wn_label (str) – Path where to find the white noise per det by the preprocessing
apply_wobble (bool) – Correct wobble deflection. This requires aman.wobble_params metadata in the context
compress (bool) – When running preprocess from scratch, whether to compress the files
center_at (str)
max_dets (int)
fixed_time (int)
min_dur (int)
verbose (int)
quiet (int)
window (float)
dtype_tod (str) – Data type for timestreams
dtype_map (str) – Data type for maps
The only mandatory parameters are context for a context file and preprocess_config,
a preprocess database configuration file that will tell the script how to process the
timestreams. A typical configuration file could look like this:
context: /global/cfs/projectdirs/sobs/metadata/satp1/contexts/use_this_local.yaml
# Use a pixell area file for rectangular pixel maps or use an nside value for Healpix maps.
# Only use one of these options
area: band_car_fejer1_5arcmin.fits
#nside: 512
# A query can be a file with a list of obs, or an obsdb query
query: obs_list.txt
#query: "subtype == 'cmb' and timestamp >= 1708743600 and timestamp < 1713672000"
odir: output_directory
preprocess_config: preprocess_config.yaml
# Limit the number of obs, map a specific wafer or band
#ntod: 3
#wafer: ws0
#freq: f090
# Platform to map
site: so_sat1
# Path to housekeeping data (this is used for extracting pwv)
hk_data_path: /global/cfs/cdirs/sobs/data/site/hk/
make-depth1-map
This script will create depth1 maps.
Configuration yaml file
The mapmaker is configured by supplying a yaml file with --config-file.
usage: make-depth1-map [-h] [--config-file CONFIG_FILE] [--query QUERY]
[--area AREA] [--odir ODIR]
[--preprocess-config PREPROCESS_CONFIG] [-C COMPS]
[-c CONTEXT] [-n NTOD] [--tods TODS] [--nset NSET]
[-N NMAT] [--max-dets MAX_DETS] [-S SITE] [-v] [-q]
[--cont] [-@ CENTER_AT] [-w WINDOW]
[--nmat-dir NMAT_DIR] [--nmat-mode NMAT_MODE]
[-d DOWNSAMPLE] [--maxiter MAXITER] [-T TILED]
[-W WAFER [WAFER ...]] [--freq FREQ [FREQ ...]]
[-g TASKS_PER_GROUP] [--rhs] [--bin]
[--srcsamp SRCSAMP] [--unit UNIT] [--min-dets MIN_DETS]
[--update-delay UPDATE_DELAY] [--min-dur MIN_DUR]
[--pretend-now-is PRETEND_NOW_IS]
Named Arguments
- --config-file
Path to mapmaker config.yaml file
- --query
- --area
Path to FITS file describing the mapping geometry
- --odir
Directory for saving output maps
- --preprocess-config
Preprocess configuration file
- -C, --comps
T,Q, and/or U
- -c, --context
Context containing TODs
- -n, --ntod
Special case of tods above. Implemented as follows: [:ntod]
- --tods
Restrict TOD selections by index
- --nset
Number of detsets kept
- -N, --nmat
‘corr’ or ‘uncorr’
- --max-dets
Maximum number of dets kept
- -S, --site
Observatory site
- -v, --verbose
- -q, --quiet
- --cont
continue a run
Default:
False
- -@, --center-at
- -w, --window
- --nmat-dir
Directory to where nmats are loaded from/saved to
- --nmat-mode
How to build the noise matrix. ‘build’: Always build from tod. ‘cache’: Use if available in nmat-dir, otherwise build and save. ‘load’: Load from nmat-dir, error if missing. ‘save’: Build from tod and save.
- -d, --downsample
Downsample TOD by this factor
- --maxiter
Maximum number of iterative steps
- -T, --tiled
- -W, --wafer
Detector wafer subset to map with
- --freq
Frequency band to map with
- -g, --tasks-per-group
number of tasks per group. By default it is 1, but can be higher if you want more than one MPI job working on a depth-1 map, e.g. if you don’t have enough memory for so many MPI jobs
- --rhs
Save the rhs maps
Default:
False
- --bin
Save the bin maps
Default:
False
- --srcsamp
Path to mask file where True regions indicate where bright object mitigation should be applied. Mask is in equatorial coordinates. Not tiled, so should be low-res to not waste memory.
- --unit
Unit of the maps
- --min-dets
Minimum number of detectors for an obs (per wafer per freq)
- --update-delay
For automatic mapmaking, how many days back to map
- --min-dur
Minimum duration of obs to be included, in seconds. Default is 5 minutes
- --pretend-now-is
Change current time for running with update-delay. Time in UTC and format %Y-%m-%d %H:%M:%S.
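The interaction between --update-delay and --pretend-now-is can be sketched as follows. This is an assumption about the intended arithmetic (the reference time minus a number of days), written with the standard library only; the function name min_ctime is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def min_ctime(update_delay_days, pretend_now_is=None):
    """Earliest ctime to map: now (or a pretended now, in UTC) minus update_delay days."""
    if pretend_now_is is None:
        now = datetime.now(timezone.utc)
    else:
        # Same format as documented for --pretend-now-is, interpreted as UTC.
        now = datetime.strptime(pretend_now_is, "%Y-%m-%d %H:%M:%S")
        now = now.replace(tzinfo=timezone.utc)
    return (now - timedelta(days=update_delay_days)).timestamp()


print(min_ctime(2, "2024-03-01 00:00:00"))  # 1709078400.0
```

Pretending a fixed "now" makes automatic runs reproducible, since the selected observation window no longer depends on when the job is launched.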
A typical configuration file could look like this:
# Query
query: "1"
# Context file containing TODs
context: 'context.yaml'
# A preprocess database configuration file that will tell the script how to process the timestreams
preprocess_config: ./lat_config_mf.yaml
# Telescope info
site: 'so_lat'
# Mapping area footprint
area: 'geometry.fits'
# Output Directory and file name prefix
odir: './output/'
# Mapmaking meta
comps: 'T' # TQU
downsample: 1 # Downsample TOD by this factor
tiled: 0 # Tiling boolean (0 or 1)
unit: 'K'
maxiter: 2
srcsamp: 'srcsamp_mask.fits'
# Scripting tools
verbose: True
quiet: False
tasks_per_group: 4
# Database integration
mapcat_database_name: './mapcat.sqlite'
mapcat_depth_one_parent: './output/'
update-mapviewer-dbs
This module maintains databases for mapviewer instances that show atomic/depth-1 maps per instrument.
Command line arguments
Generate and populate tilemaker databases for SAT instruments
usage: update-mapviewer-dbs [-h]
[--mapviewer-dbs-root-path MAPVIEWER_DBS_ROOT_PATH]
[--instrument-db-paths INSTRUMENT_DB_PATHS [INSTRUMENT_DB_PATHS ...]]
[--cutoff-days CUTOFF_DAYS]
Named Arguments
- --mapviewer-dbs-root-path
Path to root directory where mapviewer-compatible DBs are stored
Default:
'/so/services/mapviewer'
- --instrument-db-paths
Paths to an instrument’s atomic/depth-1 SQLite DB file
- --cutoff-days
Delete map groups older than this many days and repopulate with maps whose ‘ctime’ is greater than or equal to (now - cutoff_days)
Default:
1
QDS Monitor
The QDS Monitor is meant to be a simple-to-use class that allows users to publish the results of their calculations to a live monitor. The live monitor backend is an InfluxDB database, which is also used with the SO data acquisition system, known as the Observatory Control System. This allows us to use the same live monitoring interface, Grafana.
Overview
The Monitor class wraps the InfluxDB interface and provides a few simple
methods (check, record, and write), detailed in the
API section.
check is meant to be used to check if the calculation already has been
performed for the given observation/tag set. This can be used to ensure
expensive calculations are not repeated when running batch jobs. record
takes your calculations, timestamps, and a set of identifying tags, and queues
them for batch writing to the InfluxDB. Finally, write will write your
recorded results to the InfluxDB, clearing the queue.
This perhaps is best demonstrated with some examples, shown in the next section.
Examples
Simple Pseudocode
The general outline we’re aiming for is as follows:
from sotodlib.site_pipeline.monitor import Monitor
# Initialize DB Connection
monitor = Monitor('localhost', 8086, 'qdsDB')
# Load observation
tod = so_data_load.load_observation(context,
observation_id, detectors_list)
# Compute statistic
result = interesting_calculation(tod)
# Tag and write to DB
tags = {'telescope': 'LAT', 'wafer': wafer_name}
monitor.record('white_noise_level', result, timestamp, tags)
monitor.write()
Real World Example
The following is a real world example of the Monitor in action. We’ll walk
through the important parts, omitting some descriptive print statements. The
full script is included below.
To start, we will import the module and create our Monitor object.
You will need to know the address and port for your InfluxDB, as well as the
name of the database within InfluxDB that you want to write to:
from sotodlib.site_pipeline.monitor import Monitor
monitor = Monitor('localhost', 8086, 'qds')
Note
Secure connection to an external InfluxDB is supported. To connect to https://example.com/influxdb/, use:
monitor = Monitor(host='example.com',
port=443,
username=u'username',
password=u'ENTER PASSWORD HERE',
path='influxdb',
ssl=True)
Let’s say we want to load some of the sims. We’ll create our Context and get the observations with:
context = core.Context('pipe_s0001_v2.yaml')
observations = context.obsfiledb.get_obs()
Then we can, for example, loop over all observations, determining the detectors and wafers in each observation:
for obs_id in observations:
    c = context.obsfiledb.conn.execute('select distinct DS.name, DS.det from detsets DS '
                                       'join files on DS.name=files.detset '
                                       'where obs_id=?', (obs_id,))
    dets_in_obs = [tuple(r) for r in c.fetchall()]
    wafers = np.unique([x[0] for x in dets_in_obs])
We’ll run our calculation for each wafer, so let’s loop over those now, building a detector list for the wafer, and loading the TOD for just those detectors and computing their FFTs:
for wafer in wafers:
    det_list = build_det_list(dets_in_obs, wafer)
    tod = so_data_load.load_observation(context.obsfiledb, obs_id, dets=det_list)
    # Compute ffts
    ffts, freqs = rfft(tod)
    det_white_noise = calculate_noise(tod, ffts, freqs)
Now we want to save our results to the monitor. To do this, we’ll need two other lists, one for the timestamps associated with each noise value (in this case, these are all the same, and use the first timestamp in the TOD), and one for the tags for each noise value (in this example we tag each detector individually with their detector ID, along with the wafer it is on and what telescope we’re working with – this probably is in the context somewhere, but I’m just writing in SAT1):
timestamps = np.ones(len(det_white_noise))*tod.timestamps[0]
base_tags = {'telescope': 'SAT1', 'wafer': wafer}
tag_list = []
for det in det_list:
    det_tag = dict(base_tags)
    det_tag['detector'] = det
    tag_list.append(det_tag)
log_tags = {'observation': obs_id, 'wafer': wafer}
monitor.record('white_noise_level', det_white_noise, timestamps, tag_list, 'detector_stats', log_tags=log_tags)
monitor.write()
We also include a set of log tags; these record that we’ve completed
this calculation for this observation and wafer. Lastly, we record the
measurement, giving it the name “white_noise_level”, passing our three lists of
equal length (det_white_noise, timestamps, tag_list), writing them to
the “detector_stats” measurement, and logging the calculation as completed with
the observation ID and wafer log tags.
Where these log tags could come in handy is if we need to stop and restart our calculation and want to skip recomputing the results. Since we saved the wafer along with the observation ID it would make sense to check at the wafer level loop:
for wafer in wafers:
    # Check calculation completed for this wafer
    check_tags = {'wafer': wafer}
    if monitor.check('white_noise_level', obs_id, check_tags):
        continue
Adding this to the top of our wafer loop skips wafers that have already been recorded for this observation ID.
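The skip-on-restart behaviour can be tried without a running InfluxDB by using a stand-in. The FakeMonitor class below is hypothetical and only mimics the queueing and logging semantics described in the Overview; it is not the sotodlib Monitor and does not talk to any database.

```python
class FakeMonitor:
    """In-memory stand-in that mimics check/record/write; not the real Monitor."""

    def __init__(self):
        self.queue = []         # recorded but not yet written
        self.written = []       # points that "reached the database"
        self.log = set()        # completed (field, observation, tags) entries
        self._pending_log = []  # log entries waiting for write()

    def check(self, field, observation, tags):
        return (field, observation, tuple(sorted(tags.items()))) in self.log

    def record(self, field, values, timestamps, tags, measurement, log_tags):
        for value, timestamp, tag in zip(values, timestamps, tags):
            self.queue.append((measurement, field, value, timestamp, tag))
        remaining = dict(log_tags)
        observation = remaining.pop("observation")
        self._pending_log.append(
            (field, observation, tuple(sorted(remaining.items()))))

    def write(self):
        # Flush the queue and mark the logged calculations as completed.
        self.written.extend(self.queue)
        self.log.update(self._pending_log)
        self.queue.clear()
        self._pending_log.clear()


monitor = FakeMonitor()
computed = []
for attempt in range(2):  # the second pass simulates restarting the script
    for wafer in ["w00", "w01"]:
        if monitor.check("white_noise_level", "obs_1", {"wafer": wafer}):
            continue  # already done for this observation and wafer
        computed.append(wafer)
        monitor.record("white_noise_level", [1.0], [0.0], [{"wafer": wafer}],
                       "detector_stats",
                       log_tags={"observation": "obs_1", "wafer": wafer})
        monitor.write()

print(computed)  # ['w00', 'w01']
```

Each wafer is computed only once even though the loop runs twice, which is the effect the check call is meant to provide for expensive batch jobs.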
The example script in its entirety is shown here:
# Largely based on 20200514_FCT_Software_Example.ipynb from the pwg-fct
import numpy as np
from sotodlib import core
import sotodlib.io.load as so_data_load
from sotodlib.tod_ops import rfft
from sotodlib.site_pipeline.monitor import Monitor
monitor = Monitor('localhost', 56777, 'qds')
context = core.Context('pipe_s0001_v2.yaml')
observations = context.obsfiledb.get_obs()
print('Found {} Observations'.format(len(observations)))
o_list = range(len(observations)) # all observations
for o in o_list:
    obs_id = observations[o]
    print('Looking at observation #{} named {}'.format(o,obs_id))
    c = context.obsfiledb.conn.execute('select distinct DS.name, DS.det from detsets DS '
                                       'join files on DS.name=files.detset '
                                       'where obs_id=?', (obs_id,))
    dets_in_obs = [tuple(r) for r in c.fetchall()]
    wafers = np.unique([x[0] for x in dets_in_obs])
    print('There are {} detectors on {} wafers in this observation'.format(len(dets_in_obs), len(wafers)))
    for wafer in wafers:
        # Check calculation completed for this wafer
        check_tags = {'wafer': wafer}
        if monitor.check('white_noise_level', obs_id, check_tags):
            continue
        # Process Obs+Wafer
        # Build detector list for this wafer
        det_list = []
        for det in dets_in_obs:
            if det[0] == wafer:
                det_list.append(det[1])
        print('{} detectors on this wafer'.format(len(det_list)))
        tod = so_data_load.load_observation(context.obsfiledb, obs_id, dets=det_list )
        print('This observation is {} minutes long. Has {} detectors and {} samples'.format(round((tod.timestamps[-1]-tod.timestamps[0])/60.,2),
              tod.dets.count, tod.samps.count))
        print('This TOD AxisManager has Axes: ')
        for k in tod._axes:
            print('\t{} with {} entries'.format(tod[k].name, tod[k].count ) )
        print('This TOD AxisManager has fields : [axes]')
        for k in tod._fields:
            print('\t{} : {}'.format(k, tod._assignments[k]) )
            if type(tod._fields[k]) is core.AxisManager:
                for kk in tod[k]._fields:
                    print('\t\t {} : {}'.format(kk, tod[k]._assignments[kk] ))
        # Compute the FFT and detector white noise levels
        ffts, freqs = rfft(tod)
        tsamp = np.median(np.diff(tod.timestamps))
        norm_fact = (1.0/tsamp)*np.sum(np.abs(np.hanning(tod.samps.count))**2)
        fmsk = freqs > 10
        det_white_noise = 1e6*np.median(np.sqrt(np.abs(ffts[:,fmsk])**2/norm_fact), axis=1)
        # Publish to monitor
        timestamps = np.ones(len(det_white_noise))*tod.timestamps[0]
        base_tags = {'telescope': 'LAT', 'wafer': wafer}
        tag_list = []
        for det in det_list:
            det_tag = dict(base_tags)
            det_tag['detector'] = det
            tag_list.append(det_tag)
        log_tags = {'observation': obs_id, 'wafer': wafer}
        monitor.record('white_noise_level', det_white_noise, timestamps, tag_list, 'detector_stats', log_tags=log_tags)
        monitor.write()
API
- class sotodlib.site_pipeline.monitor.Monitor(host, port, database='qds', username='root', password='root', path='', ssl=False)[source]
- classmethod from_configs(configs)[source]
Create a monitor from a configuration file
- Parameters:
configs (dict or string) – configuration dictionary or string that’s a file name that can be loaded by yaml into a configuration dictionary
- Return type:
connected Monitor Instance
- check(field, observation, tags, log='obs_process_log')[source]
Check if monitored measurement has been recorded already.
All recorded measurement fields within the Monitor are tracked in a log within InfluxDB. This check will search this log with a search like:
SELECT {field} FROM "log" WHERE observation = {observation} AND {tag1} = '{value1}' AND {tag2} = '{value2}';
- Parameters:
- Returns:
True if calculation already performed, False otherwise
- Return type:
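For illustration, the search template shown for check can be filled in as in the sketch below. The helper build_check_query is hypothetical and is not part of the Monitor API; quoting the observation value with single quotes is an assumption based on how InfluxQL writes string literals.

```python
def build_check_query(field, observation, tags, log="obs_process_log"):
    """Format the log search for a given field, observation and tag set."""
    conditions = [f"observation = '{observation}'"]
    conditions += [f"{key} = '{value}'" for key, value in tags.items()]
    return f'SELECT {field} FROM "{log}" WHERE ' + " AND ".join(conditions) + ";"


print(build_check_query("white_noise_level", "obs_1", {"wafer": "w00"}))
```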
- record(field, values, timestamps, tags, measurement, log='obs_process_log', log_tags=None, qa_metrics=False, obs_id='')[source]
Record a monitored statistic to the InfluxDB. Values are not written to the DB until Monitor.write() is called.
- Parameters:
field (str) – Measurement field, i.e. “white_noise_level”
values (list or np.array) – Values for the field for each unique set of tags and timestamps
timestamps (list or np.array) – Timestamps for the field values
tags (list of dict) – List of dictionaries containing tags for the InfluxDB
measurement (str) – InfluxDB measurement to record to
log (str) – InfluxDB measurement to use for logging completed calculation
log_tags (list of dict) – Tags to use for the log, typically you won’t want to record you’ve completed a calculation for each individual detector, but maybe some higher level group. If this is None tags will be used.
qa_metrics (bool) – If called from QA metrics related scripts, use special logging. Default is false.
obs_id (str) – Obs_id to use for value of “observation” field in logging measurement.
Support
utils
Utilities for site_pipeline.
alerts
Alert and notification utilities for site_pipeline.
archive
Archive and storage policy utilities for site_pipeline.
config
Configuration and parsing utilities for site_pipeline.
- class sotodlib.site_pipeline.utils.config.ArgumentParser(*args, **kwargs)[source]
A variant of ArgumentParser that allows the defaults to be overridden by values in a yaml config file. Thus the priority order becomes, from highest to lowest:
Arguments passed on the command line
The config file
Defaults defined with add_argument()
The config file is specified using the --config-file option, which this class adds automatically. It should therefore not be added manually.
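The priority order above can be reproduced with the standard library alone, as in the following sketch. It is not how the sotodlib class is implemented; the config dict simply stands in for an already-parsed yaml file, and the option names are examples.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--maxiter", type=int, default=500)
parser.add_argument("--comps", default="TQU")
parser.add_argument("--unit", default="uK")

config = {"maxiter": 100, "comps": "T"}  # stands in for a parsed yaml config file
parser.set_defaults(**config)            # config beats add_argument() defaults

args = parser.parse_args(["--comps", "QU"])  # command line beats config
print(args.maxiter, args.comps, args.unit)   # 100 QU uK
```

Here maxiter comes from the config, comps from the command line, and unit from the add_argument() default.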
- sotodlib.site_pipeline.utils.config.parse_quantity(val, default_units=None)[source]
Convert an expression with units into an astropy Quantity.
- Parameters:
val – the expression (see Notes).
default_units – the units to assume if they are not provided in val.
- Returns:
The astropy Quantity decoded from the argument. Note the quantity is converted to the default_units, if they are provided.
Notes
The default_units, if provided, should be “unit-like”, by which we mean it is either:
An astropy Unit.
A string that astropy.units.Unit() can parse.
The val can be any of the following:
A tuple (x, u) or list [x, u], where x is a float and u is unit-like.
A string (x), where x can be parsed by astropy.units.Quantity.
A float (x), but only if default_units is not None.
Examples
>>> parse_quantity('100 arcsec')
<Quantity 100. arcsec>
>>> parse_quantity([12., 'deg'])
<Quantity 12. deg>
>>> parse_quantity('15 arcmin', 'deg')
<Quantity 0.25 deg>
>>> parse_quantity(100, 'm')
<Quantity 100. m>
- sotodlib.site_pipeline.utils.config.lookup_conditional(source, key, tags=None, default=<class 'KeyError'>)[source]
Lookup a value in a dict, with the possibility of descending through nested dictionaries using tags provided by the user.
This function returns source[key] unless source[key] is a dict, in which case the tags (a list of strings) are each tested in the dict to see if they lead to a sub-setting.
For example, if the source dictionary is {‘number’: {‘a’: 1, ‘b’: 2}} and the user requests key ‘number’, with tags=[‘a’], then the returned value will be 1.
If you want a dict to be returned literally, and not crawled further, include a dummy key ‘_stop_here’, with arbitrary value (this key will be removed from the result before returning to the user).
The key ‘_default’ will always cause a match, even if none of the other tags match. (This _default value also becomes the default if further recursion fails to yield an exact match.)
- Parameters:
source (dict) – The parameter tree to search.
key (str) – The key to terminate the search on.
tags (list of str or None) – tags that may be auto-descended.
default – Value to return if the search does not resolve. The special value KeyError will instead cause a KeyError to be raised if the search is not resolved.
Examples:
source = {
    'my_param': {
        '_default': 100.,
        'f150': 90.
    }
}
lookup_conditional(source, 'my_param')
  => 100.
lookup_conditional(source, 'my_param', tags=['f090'])
  => 100.
lookup_conditional(source, 'my_param', tags=['f150'])
  => 90.
lookup_conditional(source, 'my_other_param')
  KeyError!
lookup_conditional(source, 'my_other_param', default=0)
  => 0

# Note _default takes precedence over default argument.
lookup_conditional(source, 'my_param', default=0)
  => 100.

# Nested example:
source = {
    'fit_params': {
        '_default': {
            'a': 12,
            'b': 100,
            '_stop_here': None,  # don't descend any further.
        },
        'f150': {
            'SAT': {
                'a': 1000,
                'b': 1200,
                '_stop_here': None,
            },
            'LAT': {
                'a': 1,
                'b': 2,
                '_stop_here': None,
            },
        },
    },
}
lookup_conditional(source, 'fit_params', tags=['f150', 'LAT'])
  => {'a': 1, 'b': 2}
lookup_conditional(source, 'fit_params', tags=['LAT'])
  => {'a': 12, 'b': 100}
lookup_conditional(source, 'fit_params', tags=['f150'])
  => {'a': 12, 'b': 100}
constants
depth1_utils
- sotodlib.site_pipeline.utils.depth1_utils.sensitivity_cut(rms_uKrts: ndarray, sens_lim: float, med_tol: float = 0.2, max_lim: float = 100) ndarray[source]
Sensitivity cuts for mapmakers, based on white noise of individual detectors
- Parameters:
- Returns:
good – A boolean array with good and bad detectors.
- Return type:
ndarray
- sotodlib.site_pipeline.utils.depth1_utils.measure_rms(tod: ndarray, dt: float = 1, bsize: int = 32, nblock: int = 10) ndarray[source]
- sotodlib.site_pipeline.utils.depth1_utils.tele2equ(coords: List[ndarray], ctime: float, detoffs: List[int] = [0, 0], site: str = 'so_sat1') ndarray[source]
- sotodlib.site_pipeline.utils.depth1_utils.find_scan_profile(tods, infos, comm: Comm = <pixell.mpiutils.FakeCommunicator object>, npoint: int = 100) ndarray[source]
- sotodlib.site_pipeline.utils.depth1_utils.find_footprint(tods, ref_wcs, comm: Comm = <pixell.mpiutils.FakeCommunicator object>, return_pixboxes: bool = False, pad: int = 1) Tuple[Any, Any, ndarray | None][source]
Find an enmap geometry (shape, wcs) that encompasses all TODs composing a depth-1 map. Useful for limiting the size of the depth-1 map to only what is necessary.
Caveats: there might be edge cases covering a wide range of R.A. where the combined depth-1 map ends up wrapping on the other edge. The end result is a map which covers 360deg in R.A. The reason is that a priori you don’t know where the center of the overall depth-1 map is. So if the depth-1 map exposes an area >180deg wide in R.A., you might end up with this issue. In the future we might add some mechanism that finds the middle-point of the hypothetical perfectly unwrapped patch, which would allow us to find the optimal wrapping.
- Parameters:
tods (list) – List of AxisManagers containing the observations in a depth-1 map. These do not carry signal, only the ancillary data required (boresight, focal_plane, timestamps, etc.).
ref_wcs (WCS dict) – Reference wcs to build the geometry of the map.
comm (MPI communicator, optional)
return_pixboxes (bool, optional) – Whether to also return the pixboxes of the observations.
pad (int, optional)
- Returns:
shape (tuple) – Shape of the geometry, (N pixels in X, N pixels in Y).
wcs (WCS dict) – WCS of the map.
pixboxes (list, optional) – List of pixboxes for each input obs.
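The wrapping caveat above can be illustrated with a small standalone sketch. It is not part of sotodlib and does not reflect how find_footprint works internally; the function names ra_width_naive and ra_width_unwrapped are hypothetical. It compares a naive min/max width with a width measured around the circular mean of the R.A. values, which is one possible way to locate the middle of a patch that straddles R.A. = 0. The approach is only meaningful for patches narrower than 180 deg.

```python
import cmath
import math


def ra_width_naive(ra_deg):
    """Width from plain min/max after folding into [0, 360)."""
    ra = [r % 360.0 for r in ra_deg]
    return max(ra) - min(ra)


def ra_width_unwrapped(ra_deg):
    """Width measured relative to the circular mean of the samples."""
    # Circular mean: sum the unit vectors and take the angle of the result.
    mean_vec = sum(cmath.exp(1j * math.radians(r)) for r in ra_deg)
    center = cmath.phase(mean_vec)
    # Offset of each sample from the center, wrapped into (-pi, pi].
    offsets = [
        cmath.phase(cmath.exp(1j * (math.radians(r) - center)))
        for r in ra_deg
    ]
    return math.degrees(max(offsets) - min(offsets))


# A 20 deg wide patch that straddles R.A. = 0.
ra = [350.0, 355.0, 0.0, 5.0, 10.0]
print(ra_width_naive(ra))       # spans almost the full circle
print(ra_width_unwrapped(ra))   # close to the true 20 deg width
```

The naive estimate reports a patch nearly 360 deg wide, while the estimate taken around the circular mean recovers the actual extent.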
- sotodlib.site_pipeline.utils.depth1_utils.read_tods(context, obslist, inds=None, comm=<pixell.mpiutils.FakeCommunicator object>, no_signal=False, site='so', L=None, min_dets=50)[source]
- sotodlib.site_pipeline.utils.depth1_utils.calibrate_obs(obs, band, site='so', dtype_tod=<class 'numpy.float32'>, nocal=True, unit='K', L=None, min_dets=50) Tuple[Any | None, ndarray | None][source]
- sotodlib.site_pipeline.utils.depth1_utils.create_mapmaker_config(defaults: dict = {'bin': False, 'center_at': None, 'comps': 'T', 'cont': False, 'downsample': 1, 'freq': None, 'mapcat_database_name': 'mapcat.db', 'mapcat_database_type': 'sqlite', 'mapcat_depth_one_parent': './', 'max_dets': None, 'maxiter': 100, 'min_dets': 50, 'min_dur': 300, 'nmat': 'corr', 'nmat_dir': '/nmats', 'nmat_mode': 'build', 'nset': None, 'ntod': None, 'odir': './outputs', 'pretend_now_is': None, 'query': "type == 'obs' and subtype == 'cmb'", 'quiet': 0, 'rhs': False, 'site': 'so_lat', 'srcsamp': None, 'tasks_per_group': 1, 'tiled': 1, 'tods': None, 'unit': 'K', 'update_delay': None, 'verbose': 0, 'wafer': None, 'window': 0.0}, config_file: str | None = None, args: dict = {}) dict[source]
- sotodlib.site_pipeline.utils.depth1_utils.write_depth1_map(prefix: str, data: ndarray, dtype: DTypeLike = <class 'numpy.float32'>, binned: bool = False, rhs: bool = False, unit: str = 'K')[source]
exceptions
io
logging
Logging utilities for site_pipeline.
mapcat
- sotodlib.site_pipeline.utils.mapcat.map_to_calculate(map_name: str, inds_to_use: List[int], mapcat_settings: Dict[str, str]) bool[source]
Check whether a depth-1 map needs to be (re)calculated.
Compares the total number of wafers already recorded in the map catalog against the number of indices requested. Returns True if the existing TOD count is less than what is requested.
- Parameters:
- Returns:
True if the map should be calculated, False otherwise.
- Return type:
bool
- sotodlib.site_pipeline.utils.mapcat.commit_depth1_tods(map_name: str, obslist: Dict[Tuple[int, str, str], List[Tuple[str, str, str, int]]], obs_infos: recarray, band: str, inds: List[int], mapcat_settings: Dict[str, str]) List[mapcat.database.TODDepthOneTable][source]
Commit TOD entries for a depth-1 map to the map catalog.
For each unique observation id, creates a TODDepthOneTable row (if one does not already exist) and associates it with the given map name if possible.
- Parameters:
map_name (str) – Unique name identifying the depth-1 map.
obslist (dict) – Mapping from index to list of (obs_id, …) tuples describing the observations that contribute to the map.
obs_infos (np.recarray) – Record array of observation metadata, keyed by obs_id.
band (str) – Frequency band identifier (e.g. 'f150').
inds (list of int) – Indices into obslist selecting the TODs to commit.
mapcat_settings (dict) – Connection settings forwarded to mapcat.helper.Settings.
- Returns:
The TOD entries.
- Return type:
list of TODDepthOneTable
- sotodlib.site_pipeline.utils.mapcat.commit_depth1_map(map_name: str, prefix: str, detset: str, band: str, ctime: float, start_time: float, stop_time: float, tods: List[mapcat.database.TODDepthOneTable], mapcat_settings: Dict[str, str]) None[source]
Commit or update a depth-1 map entry in the map catalog.
Creates a DepthOneMapTable row with paths to the map, inverse-variance, and time FITS files. If a row with the same map_name already exists it is merged (updated) rather than duplicated.
- Parameters:
map_name (str) – Unique name identifying the depth-1 map.
prefix (str) – File path prefix; _map.fits, _ivar.fits, and _time.fits are appended to form the output paths.
detset (str) – Tube slot / detector set identifier.
band (str) – Frequency band identifier (e.g. 'f150').
ctime (float) – Representative ctime for the map.
start_time (float) – Start time of the earliest contributing observation.
stop_time (float) – Stop time of the latest contributing observation.
tods (list of TODDepthOneTable) – TOD entries to associate with this map.
mapcat_settings (dict) – Connection settings forwarded to mapcat.helper.Settings.
obsdb
Observation database query utilities for site_pipeline.
- sotodlib.site_pipeline.utils.obsdb.get_obslist(context, query=None, obs_id=None, min_ctime=None, max_ctime=None, update_delay=None, tags=None, planet_obs=False)[source]
Query the obs database with a given query.
- Parameters:
context (core.Context) – The context to use for the obsdb.
query (str, optional) – A query string for the obsdb.
obs_id (str, optional) – The specific obsid to retrieve.
min_ctime (int, optional) – The minimum ctime of obs to retrieve.
max_ctime (int, optional) – The maximum ctime of obs to retrieve.
update_delay (int, optional) – The number of days to subtract from the current time to set the minimum ctime.
tags (list of str, optional) – A list of tags to use for the query.
planet_obs (bool, optional) – If True, format query and tags for planet obs.
- Returns:
obs_list – The list of obs found from the query.
- Return type:
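To illustrate the update_delay parameter of get_obslist described above, the following standalone sketch converts a delay in days into a minimum ctime. The helper name min_ctime_from_delay and the optional now argument are hypothetical and exist only for this illustration; how get_obslist combines update_delay with an explicit min_ctime is not specified here.

```python
import time

SECONDS_PER_DAY = 86400


def min_ctime_from_delay(update_delay, now=None):
    """Return the ctime lying `update_delay` days before `now`.

    `now` defaults to the current time; it is a parameter only so the
    example is reproducible.
    """
    if now is None:
        now = time.time()
    return int(now - update_delay * SECONDS_PER_DAY)


# Two days before a fixed reference time.
print(min_ctime_from_delay(2, now=1_700_000_000))
```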
pipeline
Pipeline execution utilities for site_pipeline.
jobdb
jobdb
Connect to a database
With sqlite:
from sotodlib.site_pipeline import jobdb
jdb = jobdb.JobManager(sqlite_file='jobsdb-test.db')
With postgres:
import sqlalchemy as sqy
db_url = sqy.engine.URL.create(
    "postgresql",
    username="abc",
    password="def",
    host="ghi",
    database="jkl")
jdb = jobdb.JobManager(url=db_url)
Create new jobs
Check if job exists; create:
jclass = 'my_analysis'
tags = {'obs_id': '1234', 'wafer_slot': 10}
if not jdb.get_jobs(jclass=jclass, tags=tags):
    jdb.create_job(jclass, tags)
Identifying work
Find jobs to do:
all_jobs = jdb.get_jobs(jclass=jclass, jstate='open')
done_everything = len(all_jobs) == 0
Filter jobs to ones we might want to work on now:
import time
recent_memory = time.time() - 3600  # Don't retry until 1 hour has passed.
to_do = [j for j in all_jobs if j.jstate == 'open' and j.visit_time <= recent_memory]
Lock a job and work on it:
with jdb.locked(to_do) as job:
    if job is not None:
        job.mark_visited()
        ok = do_a_job(job.tags)
        if ok:
            job.jstate = 'done'
        elif job.visit_count > 5:
            # Give up after repeated failed attempts.
            job.jstate = 'failed'
Fixing things
Forcibly unlock all jobs (though feel free to be more targeted):
for j in jdb.get_jobs(jclass='my_analysis'):
    jdb.unlock(j.id)
Delete some jobs:
for j in jdb.get_jobs(jclass='my_analysis'):
    jdb.remove_job(j.id)
- class sotodlib.site_pipeline.jobdb.JState(value)[source]
An enumeration.
- open = 'open'
- done = 'done'
- failed = 'failed'
- ignored = 'ignored'
- class sotodlib.site_pipeline.jobdb.Job(**kwargs)[source]
- id
- jclass
- jstate
- lock
- lock_owner
- creation_time
- visit_time
- visit_count
- property tags
- class sotodlib.site_pipeline.jobdb.JobManager(engine=None, url=None, sqlite_file=None)[source]
- commit_jobs(jobs)[source]
Commit jobs (such as those created using create_job(…, commit=False)) to the jobs table.
- create_job(jclass=None, tags={}, jstate=None, creation_time=None, visit_count=None, visit_time=None, check_existing=True, commit=True)[source]
Create a new job and optionally commit to the jobs table.
Return the job.
- get_jobs(jclass=None, tags=None, jstate=None, locked=None, job_id=None)[source]
Get a list of jobs meeting particular criteria.
Note that jclass and jstate can each be a string, a list (match any item in the list), or None (match all values).
The returned objects are detached from any database session, and should not be modified. To operate on one or more of the jobs returned here, pass them first to the locked() context manager, which will check the records out from the database (if possible) for use in your process.
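The matching rule for jclass and jstate can be sketched with a toy in-memory filter. This is not the JobManager implementation (which queries a database); ToyJob, _matches, and toy_get_jobs are hypothetical names used only to show the string / list / None semantics described above.

```python
from dataclasses import dataclass


@dataclass
class ToyJob:
    """Minimal stand-in carrying only the two fields being filtered."""
    jclass: str
    jstate: str


def _matches(value, criterion):
    # None matches everything; a list matches any member; else exact match.
    if criterion is None:
        return True
    if isinstance(criterion, (list, tuple, set)):
        return value in criterion
    return value == criterion


def toy_get_jobs(jobs, jclass=None, jstate=None):
    """Return the jobs whose jclass and jstate both satisfy the criteria."""
    return [
        j for j in jobs
        if _matches(j.jclass, jclass) and _matches(j.jstate, jstate)
    ]


jobs = [
    ToyJob('my_analysis', 'open'),
    ToyJob('my_analysis', 'failed'),
    ToyJob('other', 'done'),
]
print(toy_get_jobs(jobs, jstate=['open', 'failed']))
```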
- lock(job_id, owner=None, force=False)[source]
Lock a Job record by id. If the Job is already locked, a JobLockedError is raised.
Returns a Job object that has been expunged from the database session. The object attributes can be modified, but won’t be written back to the database unless the object is merged into a new session.
- locked(jobs, count=None, owner=None)[source]
Context manager granting exclusive access to one or more Jobs. Each Job record is marked as locked, and this process may freely work on the job and alter the job data. When execution leaves the context, the Job is marked as unlocked. Note that the _database_ is only explicitly locked while this lock is being acquired and released; in between, other entities can perform other database operations.
- Parameters:
jobs (int, Job, or list) – The Job to lock, or list of Jobs from which to try to draw lockable ones.
count (int, None) – The number of jobs to lock. If specified as an integer, a list of up to that many jobs will be yielded. If None, then a single job will be locked and yielded directly (if possible), otherwise None is yielded.
owner (str) – Override lock_owner string.
Notes
If the jobs argument is a list, the function will try to yield one of the jobs from the list, skipping any that are locked by another session. If no unlocked jobs are available, the usual exception will be raised or else a None yielded, as per none_if_locked argument.