io.ancil

This module supports the retrieval and repackaging of various ancillary data pertinent to processing of Simons Observatory survey and calibration data. These ancillary data include such things as radiometer readings (from off-site or on-site devices), weather information, details about the behavior of the platform (such as scan speed) and other instrumentation (half-wave plate; wire grid).

The two main roles of the submodule are:

  1. The assembly of datasets for faster and more convenient access to information that is not already made available in a convenient way.

  2. Reduction of ancillary data to the level of per-observation statistics that can be stored in an observation database and used to classify or select telescope data.

The main user-facing interface in this module is a set of classes, each responsible for assembling and reducing a specific dataset. They expose common interfaces for updating the base data archive (if such is needed) and for computing per-observation statistics from the data for obsdb purposes.

sotodlib.io.ancil.ANCIL_ENGINES = {'apex-pwv': <class 'sotodlib.io.ancil.apex.ApexPwv'>, 'pwv-combo': <class 'sotodlib.io.ancil.pwv.PwvCombo'>, 'toco-pwv': <class 'sotodlib.io.ancil.so_hk.TocoPwv'>}

Map from engine identifier to engine class. This is constructed dynamically on module load.

Module configuration

Configuration dataclasses for the various AncilEngine subclasses.

These dataclasses are constructed from dicts, often loaded from yaml. Each AncilEngine subclass registers its preferred config class in self.config_class.
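As a minimal sketch of that pattern, a dict of settings (for example, one parsed from a YAML file) can be unpacked directly into a config dataclass. DemoConfig below is a hypothetical stand-in that mirrors the AncilEngineConfig fields, and the paths are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DemoConfig:
    """Hypothetical stand-in mirroring the AncilEngineConfig fields."""
    dataset_name: str = None
    data_prefix: str = None
    data_dir: str = None
    friends: list = field(default_factory=list)
    obsdb_format: str = None
    obsdb_query: str = None


# Settings as they might look after parsing a YAML file (values are made up).
raw = {
    "dataset_name": "apex_pwv",
    "data_prefix": "/data/ancil",
    "data_dir": "apex_pwv",
}

# Unpack the dict into keyword arguments; unspecified fields keep defaults.
cfg = DemoConfig(**raw)
print(cfg.dataset_name, cfg.friends)
```

An unknown key in the dict raises TypeError here, which is one reason to build configs this way: misspelled settings fail early.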

sotodlib.io.ancil.configcls.register_engine(token: str, config_class)

Class decorator that registers an AncilEngine in ANCIL_ENGINES, sets the config_class class variable, and annotates the config_class docstring with a reference to the engine class. Use like this:

@cc.register_engine('my-precious', cc.MyPreciousDataConfig)
class MyPreciousData(base.AncilEngine):
   ...
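
The registration mechanism can be illustrated with a self-contained sketch. This is not the sotodlib implementation: REGISTRY stands in for ANCIL_ENGINES, the decorator body and the exact docstring wording are assumptions, and MyPreciousData / MyPreciousDataConfig are the hypothetical names from the usage above.

```python
from dataclasses import dataclass

REGISTRY = {}  # hypothetical stand-in for ANCIL_ENGINES


def register_engine(token, config_class):
    """Return a class decorator that registers an engine under `token`."""
    def decorator(engine_class):
        # 1. Make the engine discoverable by its identifier.
        REGISTRY[token] = engine_class
        # 2. Record which config class the engine prefers.
        engine_class.config_class = config_class
        # 3. Annotate the config docstring with a reference to the engine.
        ref = f"{engine_class.__module__}.{engine_class.__qualname__}"
        original = config_class.__doc__ or ""
        config_class.__doc__ = f"Config class for {ref}.\n\n{original}"
        return engine_class
    return decorator


@dataclass
class MyPreciousDataConfig:
    """Hypothetical config for the example."""
    dataset_name: str = "my_precious"


@register_engine("my-precious", MyPreciousDataConfig)
class MyPreciousData:
    def __init__(self, cfg):
        self.cfg = cfg


# Look up the engine by token and build it from its preferred config.
engine_class = REGISTRY["my-precious"]
engine = engine_class(engine_class.config_class())
```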

class sotodlib.io.ancil.configcls.AncilEngineConfig(dataset_name: str = None, data_prefix: str = None, data_dir: str = None, friends: field(default_factory=list) = None, obsdb_format: str = None, obsdb_query: str = None)

Bases: object

The base class from which all engine configs inherit.

dataset_name: str = None

Name for the dataset (subclasses usually specify a default value).

data_prefix: str = None

Prefix (directory name) for base data archive storage. This is intended to be set externally to some top-level storage location for some set of archives.

data_dir: str = None

Directory where data archive should be stored. This is taken relative to data_prefix, unless an absolute path is specified. Subclasses will usually recommend a value.
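The relative-versus-absolute rule matches the behaviour of os.path.join, which discards earlier components when a later one is absolute. The helper below is a hypothetical sketch of that resolution, not the module's own code, and the paths are invented.

```python
import os


def resolve_data_dir(data_prefix, data_dir):
    """Resolve data_dir against data_prefix (hypothetical helper)."""
    if data_prefix is None:
        return data_dir
    # os.path.join returns data_dir unchanged when it is absolute.
    return os.path.join(data_prefix, data_dir)


relative = resolve_data_dir("/data/ancil", "apex_pwv")
absolute = resolve_data_dir("/data/ancil", "/scratch/apex_pwv")
```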

friends: field(default_factory=list) = None

List of “friend” definitions that the class expects to have registered.

obsdb_format: str = None

Format string to use when turning internal field values into obsdb column names. This must be None or else include {field}; it may also include {dataset}, which will be replaced with the dataset_name.
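For instance, with the format '{dataset}_{field}' used by several subclasses, a field named mean in the apex_pwv dataset would map to the column apex_pwv_mean. The helper name below is hypothetical; the sketch only shows the string substitution and leaves the None case aside.

```python
def obsdb_column(obsdb_format, dataset_name, field):
    """Turn an internal field name into an obsdb column name (sketch)."""
    if "{field}" not in obsdb_format:
        raise ValueError("obsdb_format must include {field}")
    # {dataset} is optional; str.format ignores unused keyword arguments.
    return obsdb_format.format(dataset=dataset_name, field=field)


column = obsdb_column("{dataset}_{field}", "apex_pwv", "mean")
plain = obsdb_column("{field}", "apex_pwv", "mean")
```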

obsdb_query: str = None

Query string to use on the obsdb for identifying records that require an update. Instead of field names, put each internal fieldname as a variable (e.g. {mean} is null) and it will be reprocessed according to obsdb_format spec.
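One way to picture the reprocessing is as a second round of string formatting, in which each internal field name is replaced by its obsdb column name. The function below is a hypothetical sketch under that assumption, not the library's implementation.

```python
def render_obsdb_query(obsdb_query, obsdb_format, dataset_name, fields):
    """Expand {field} placeholders in a query into obsdb column names."""
    columns = {
        f: obsdb_format.format(dataset=dataset_name, field=f) for f in fields
    }
    return obsdb_query.format(**columns)


query = render_obsdb_query(
    "{mean} is null and {start} is null and {end} is null",
    "{dataset}_{field}",
    "toco_pwv",
    ["mean", "start", "end"],
)
print(query)
```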

class sotodlib.io.ancil.configcls.LowResTableConfig(dataset_name: str = None, data_prefix: str = None, data_dir: str = None, friends: field(default_factory=list) = None, obsdb_format: str = None, obsdb_query: str = None, archive_block_seconds: int = 2000000, gap_size: float = 300.0, filename_pattern: str = '{dataset_name}_{timestamp}.h5', dtypes: list = None)

Bases: AncilEngineConfig

Helper class for data archives that consist of a few gathered fields, indexed by time.

archive_block_seconds: int = 2000000

Number of seconds in archive block.
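To make the block size concrete, the sketch below maps a timestamp to an archive file name using the default filename_pattern. It assumes that blocks start at integer multiples of archive_block_seconds and that {timestamp} is the block start; the source does not state either, so treat both as illustrative assumptions.

```python
def block_filename(timestamp, dataset_name,
                   archive_block_seconds=2000000,
                   filename_pattern="{dataset_name}_{timestamp}.h5"):
    """Name of the archive block containing `timestamp` (assumed layout)."""
    # Assumption: blocks are aligned to multiples of the block length.
    block_start = int(timestamp) // archive_block_seconds * archive_block_seconds
    return filename_pattern.format(dataset_name=dataset_name,
                                   timestamp=block_start)


name = block_filename(1710123456, "apex_pwv")
```

With the default block length of 2000000 seconds, each file spans roughly 23 days.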

dataset_time_range = (1704000000, None)

Time range for this dataset to consider.

gap_size: float = 300.0

filename_pattern: str = '{dataset_name}_{timestamp}.h5'

dtypes: list = None

data_dir: str = None

Directory where data archive should be stored. This is taken relative to data_prefix, unless an absolute path is specified. Subclasses will usually recommend a value.

data_prefix: str = None

Prefix (directory name) for base data archive storage. This is intended to be set externally to some top-level storage location for some set of archives.

dataset_name: str = None

Name for the dataset (subclasses usually specify a default value).

friends: field(default_factory=list) = None

List of “friend” definitions that the class expects to have registered.

obsdb_format: str = None

Format string to use when turning internal field values into obsdb column names. This must be None or else include {field}; it may also include {dataset}, which will be replaced with the dataset_name.

obsdb_query: str = None

Query string to use on the obsdb for identifying records that require an update. Instead of field names, put each internal fieldname as a variable (e.g. {mean} is null) and it will be reprocessed according to obsdb_format spec.

class sotodlib.io.ancil.configcls.ApexPwvConfig(dataset_name: str = 'apex_pwv', data_prefix: str = None, data_dir: str = None, friends: field(default_factory=list) = None, obsdb_format: str = '{dataset}_{field}', obsdb_query: str = '{mean} is null', archive_block_seconds: int = 2000000, gap_size: float = 300.0, filename_pattern: str = '{dataset_name}_{timestamp}.h5', dtypes: list = None)

Bases: LowResTableConfig

Config class for sotodlib.io.ancil.apex.ApexPwv.

dataset_name: str = 'apex_pwv'

Name for the dataset (subclasses usually specify a default value).

obsdb_format: str = '{dataset}_{field}'

Format string to use when turning internal field values into obsdb column names. This must be None or else include {field}; it may also include {dataset}, which will be replaced with the dataset_name.

obsdb_query: str = '{mean} is null'

Query string to use on the obsdb for identifying records that require an update. Instead of field names, put each internal fieldname as a variable (e.g. {mean} is null) and it will be reprocessed according to obsdb_format spec.

archive_block_seconds: int = 2000000

Number of seconds in archive block.

data_dir: str = None

Directory where data archive should be stored. This is taken relative to data_prefix, unless an absolute path is specified. Subclasses will usually recommend a value.

data_prefix: str = None

Prefix (directory name) for base data archive storage. This is intended to be set externally to some top-level storage location for some set of archives.

dataset_time_range = (1704000000, None)

Time range for this dataset to consider.

dtypes: list = None

filename_pattern: str = '{dataset_name}_{timestamp}.h5'

friends: field(default_factory=list) = None

List of “friend” definitions that the class expects to have registered.

gap_size: float = 300.0

class sotodlib.io.ancil.configcls.TocoPwvConfig(dataset_name: str = 'toco_pwv', data_prefix: str = None, data_dir: str = None, friends: field(default_factory=list) = None, obsdb_format: str = '{dataset}_{field}', obsdb_query: str = '{mean} is null and {start} is null and {end} is null', archive_block_seconds: int = 2000000, gap_size: float = 300.0, filename_pattern: str = '{dataset_name}_{timestamp}.h5', dtypes: list = None, hkdb_config: str = None)

Bases: LowResTableConfig

Config class for sotodlib.io.ancil.so_hk.TocoPwv.

dataset_name: str = 'toco_pwv'

Name for the dataset (subclasses usually specify a default value).

obsdb_format: str = '{dataset}_{field}'

Format string to use when turning internal field values into obsdb column names. This must be None or else include {field}; it may also include {dataset}, which will be replaced with the dataset_name.

obsdb_query: str = '{mean} is null and {start} is null and {end} is null'

Query string to use on the obsdb for identifying records that require an update. Instead of field names, put each internal fieldname as a variable (e.g. {mean} is null) and it will be reprocessed according to obsdb_format spec.

hkdb_config: str = None

Config file for accessing the site hkdb.

archive_block_seconds: int = 2000000

Number of seconds in archive block.

data_dir: str = None

Directory where data archive should be stored. This is taken relative to data_prefix, unless an absolute path is specified. Subclasses will usually recommend a value.

data_prefix: str = None

Prefix (directory name) for base data archive storage. This is intended to be set externally to some top-level storage location for some set of archives.

dataset_time_range = (1704000000, None)

Time range for this dataset to consider.

dtypes: list = None

filename_pattern: str = '{dataset_name}_{timestamp}.h5'

friends: field(default_factory=list) = None

List of “friend” definitions that the class expects to have registered.

gap_size: float = 300.0

class sotodlib.io.ancil.configcls.PwvComboConfig(dataset_name: str = 'pwv', data_prefix: str = None, data_dir: str = None, friends: field(default_factory=list) = None, obsdb_format: str = '{dataset}_{field}', obsdb_query: str = '{mean} is null', toco_dataset: str = 'toco-pwv', apex_dataset: str = 'apex-pwv')

Bases: AncilEngineConfig

Config class for sotodlib.io.ancil.pwv.PwvCombo.

dataset_name: str = 'pwv'

Name for the dataset (subclasses usually specify a default value).

obsdb_format: str = '{dataset}_{field}'

Format string to use when turning internal field values into obsdb column names. This must be None or else include {field}; it may also include {dataset}, which will be replaced with the dataset_name.

obsdb_query: str = '{mean} is null'

Query string to use on the obsdb for identifying records that require an update. Instead of field names, put each internal fieldname as a variable (e.g. {mean} is null) and it will be reprocessed according to obsdb_format spec.

toco_dataset: str = 'toco-pwv'

Name of the friend dataset from which to retrieve Toco radiometer data.

apex_dataset: str = 'apex-pwv'

Name of the friend dataset from which to retrieve APEX radiometer data.

data_dir: str = None

Directory where data archive should be stored. This is taken relative to data_prefix, unless an absolute path is specified. Subclasses will usually recommend a value.

data_prefix: str = None

Prefix (directory name) for base data archive storage. This is intended to be set externally to some top-level storage location for some set of archives.

friends: field(default_factory=list) = None

List of “friend” definitions that the class expects to have registered.

Modules

Housekeeping Data

This ancil submodule specializes in datastreams saved to SO HK data feeds, including:

  • the Toco radiometer data

class sotodlib.io.ancil.so_hk.TocoPwv(cfg)

getter(targets=None, results=None, raw=False, **kwargs)

Generator that yields results, one by one, for each entry in targets. If results is provided, it must be a list of the same length as targets, containing dicts that will be updated in place and yielded.
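The calling convention can be sketched with a toy generator. The reduction here (a "duration" entry) is a placeholder invented for the example, and demo_getter is a hypothetical name; only the targets/results contract follows the description above.

```python
def demo_getter(targets=None, results=None, **kwargs):
    """Yield one result dict per target, updating `results` in place."""
    targets = list(targets or [])
    if results is None:
        results = [{} for _ in targets]
    if len(results) != len(targets):
        raise ValueError("results must have the same length as targets")
    for (t0, t1), result in zip(targets, results):
        # Placeholder reduction: a real engine would compute statistics here.
        result.update({"duration": t1 - t0})
        yield result


# Pre-populated dicts are updated in place and yielded back.
records = [{"obs_id": "obs_a"}, {"obs_id": "obs_b"}]
out = list(demo_getter([(0, 10), (10, 30)], results=records))
```

Because this is a generator, nothing is computed until it is iterated, so a caller can process and store each record before the next one is reduced.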

config_class

alias of TocoPwvConfig

APEX radiometer

This ancil submodule specializes in radiometer data from the APEX telescope.

class sotodlib.io.ancil.apex.ApexPwv(cfg)

getter(targets=None, results=None, **kwargs)

Compute reduced APEX PWV statistics for a set of time ranges. Each entry in targets is a time range.

config_class

alias of ApexPwvConfig

PWV combinations

This ancil submodule specializes in combining PWV readings from the APEX and Toco radiometers.

sotodlib.io.ancil.pwv.combine_pwv(rs_toco, rs_apex, time_range)

Combine Toco and Apex PWV datasets (provided as ResultSet with timestamp and pwv columns) and produce statistics covering the stated time_range (timestamp, timestamp).

The provided datasets should extend at least a few minutes beyond the time_range boundaries, so as to facilitate interpolation.

A PWV correction model is applied to the data before combination. Toco points are then used wherever possible, with APEX points filling in whenever there is a gap of more than 10 minutes in the Toco stream.
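The gap-filling rule can be sketched on plain lists of (timestamp, pwv) pairs. This is an illustration only: it omits the correction model and the interpolation, uses lists rather than ResultSet objects, and the function name and the treatment of the range edges are assumptions.

```python
def fill_toco_gaps(toco, apex, time_range, max_gap=600.0):
    """Keep all Toco points; add APEX points that fall in long Toco gaps."""
    t0, t1 = time_range
    times = sorted(t for t, _ in toco)
    # Treat the range edges as boundaries if Toco does not reach them.
    if not times or times[0] > t0:
        times.insert(0, t0)
    if times[-1] < t1:
        times.append(t1)
    # Stretches between consecutive Toco points longer than max_gap.
    gaps = [(a, b) for a, b in zip(times[:-1], times[1:]) if b - a > max_gap]
    filler = [(t, v) for t, v in apex
              if t0 <= t <= t1 and any(a < t < b for a, b in gaps)]
    return sorted(list(toco) + filler)


toco = [(0, 1.0), (60, 1.1), (1200, 1.3)]
apex = [(30, 0.9), (600, 1.2), (1230, 1.4)]
combined = fill_toco_gaps(toco, apex, (0, 1300))
```

Only the APEX point at t=600 is used, because it is the only one inside a Toco gap longer than 10 minutes (between t=60 and t=1200).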

class sotodlib.io.ancil.pwv.PwvCombo(cfg)

Combine and reduce PWV measurements from two data sources (APEX and Toco). Note the values ultimately recorded are a combination of all available readings (from both sources) in the relevant time range. Each source is corrected, using a model, to make them intercompatible and descriptive of the site.

There is no explicitly managed “base data” for this module, as it relies on base data from Apex and Toco modules.

The fields computed for obsdb are:

  • “mean”: average corrected radiometer readings in mm.

  • “std”: stdev of such readings.

  • “qual”: a quality indicator (0 to 1) indicating roughly what fraction of the requested time interval is covered by the datasets.
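
A rough sketch of how such fields could be derived from a list of (timestamp, pwv) samples follows. The coverage estimate for "qual" (time spanned by consecutive samples no more than max_gap apart, divided by the interval length) and the use of the population standard deviation are assumptions made for illustration; the source does not specify either.

```python
import statistics


def reduce_pwv(samples, time_range, max_gap=600.0):
    """Compute mean, std and a coverage-like quality value (sketch)."""
    t0, t1 = time_range
    inside = sorted((t, v) for t, v in samples if t0 <= t < t1)
    if not inside:
        return {"mean": None, "std": None, "qual": 0.0}
    times = [t for t, _ in inside]
    values = [v for _, v in inside]
    # Assumed coverage: sum of sample spacings that are not gaps.
    covered = sum(b - a for a, b in zip(times[:-1], times[1:])
                  if b - a <= max_gap)
    return {
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),
        "qual": covered / (t1 - t0),
    }


stats = reduce_pwv([(0, 1.0), (300, 2.0), (600, 3.0)], (0, 1200))
```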

getter(targets=None, results=None, raw=False, **kwargs)

Generator that yields results, one by one, for each entry in targets. If results is provided, it must be a list of the same length as targets, containing dicts that will be updated in place and yielded.

config_class

alias of PwvComboConfig

Support

sotodlib.io.ancil.utils.denumpy(output)

Traverse a container, recursively, and convert any numpy scalars to generic python scalars. Note ndarrays are not converted – the denumpification is only applied to scalars nested in standard python containers (dict/list/tuple).

sotodlib.io.ancil.utils.intersect_ranges(r1, r2, empty_as_none=False)

Return the intersection of semi-open intervals r1 = [a, b) and r2 = [c, d). If empty_as_none, then empty intervals are returned as None instead of [e, e).
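The semantics can be sketched as follows. This is an illustrative reimplementation under a different name, not the library code; it assumes both intervals have finite endpoints and chooses the larger lower bound as the position e of an empty result.

```python
def intersect_semi_open(r1, r2, empty_as_none=False):
    """Intersection of semi-open intervals [a, b) and [c, d) (sketch)."""
    lo = max(r1[0], r2[0])
    hi = min(r1[1], r2[1])
    if hi <= lo:
        # Empty intersection: report None or a zero-length interval.
        return None if empty_as_none else (lo, lo)
    return (lo, hi)


overlap = intersect_semi_open((0, 10), (5, 20))
touching = intersect_semi_open((0, 5), (5, 10))
nothing = intersect_semi_open((0, 5), (5, 10), empty_as_none=True)
```

Because the intervals are semi-open, ranges that merely touch at an endpoint do not overlap.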

sotodlib.io.ancil.utils.merge_rs(rs0, rs1)

Merge rows from rs1 into rs0 (both ResultSets), sorting rows by field ‘timestamp’. Because the primary application is to update some dataset based on expanding source data, rows that repeat some ‘timestamp’ value are dropped, with precedence given to data from rs0.
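The merge rule can be illustrated with lists of dicts standing in for ResultSet rows; the function name and the row representation are assumptions made for this sketch.

```python
def merge_rows(rows0, rows1):
    """Merge rows by 'timestamp'; rows0 wins when a timestamp repeats."""
    merged = {row["timestamp"]: row for row in rows1}
    # Applying rows0 last gives it precedence over rows1.
    merged.update({row["timestamp"]: row for row in rows0})
    return [merged[t] for t in sorted(merged)]


archived = [{"timestamp": 1, "pwv": 1.0}, {"timestamp": 3, "pwv": 3.0}]
incoming = [{"timestamp": 2, "pwv": 2.0}, {"timestamp": 3, "pwv": 9.9}]
result = merge_rows(archived, incoming)
```

The incoming row at timestamp 3 is dropped because the archived data already has a row for that time.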

class sotodlib.io.ancil.utils.LowResTable(cfg)

Helper class for archiving fairly low resolution data into a set of HDF5 files, as the “base data”.

update_base(time_range=None, reset=False)

Attempt to patch any gaps in the dataset, for time_range, by grabbing data from the source.

If reset is True, then the grabbed data replaces any archived data in that time_range.

See _get_raw for the specific activity of the subclass.
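
As a rough illustration of gap patching, the sketch below lists the sub-ranges of a requested time_range that are not covered by archived timestamps. It assumes a gap is any spacing larger than the config's gap_size (300 seconds by default); the source does not define gap_size, and the helper name is hypothetical.

```python
def find_gaps(timestamps, time_range, gap_size=300.0):
    """Return (start, end) stretches in time_range lacking archived data."""
    t0, t1 = time_range
    inside = [t for t in sorted(timestamps) if t0 <= t < t1]
    # Include the range edges so leading and trailing gaps are detected.
    edges = [t0] + inside + [t1]
    return [(a, b) for a, b in zip(edges[:-1], edges[1:]) if b - a > gap_size]


gaps = find_gaps([0, 100, 200, 900, 1000], (0, 1000))
```

Each returned stretch is a candidate range to request from the data source.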

check_base()

Check the base data – determine output directory, count files therein, take note of extra files, etc.

getter(targets=None, **kwargs)

Generator that yields results, one by one, for each entry in targets. If results is provided, it must be a list of the same length as targets, containing dicts that will be updated in place and yielded.