Loading L3 Housekeeping Data

The sotodlib.io.hkdb module provides methods for indexing and quickly loading housekeeping data from an archive. The index database stores the following information:

  • List of all housekeeping files, with start and stop times

  • List of every frame in each file, with the start and stop times of each frame, the agent instance-id, the feed name, and the byte offset of the frame in the file.
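
To make the two-level index concrete, here is a minimal sketch using the standard-library sqlite3 module. The table names, the file name a.g3, and the helper frames_in_range are hypothetical; the columns follow the HkFile and HkFrame descriptions in the API section, but this is not the actual sotodlib schema.

```python
import sqlite3

# Hypothetical schema for illustration only; not the actual sotodlib tables.
SCHEMA = """
CREATE TABLE hk_files (
    id INTEGER PRIMARY KEY,
    path TEXT NOT NULL,
    start_time REAL,
    end_time REAL
);
CREATE TABLE hk_frames (
    id INTEGER PRIMARY KEY,
    file_id INTEGER NOT NULL REFERENCES hk_files(id),
    agent TEXT NOT NULL,
    feed TEXT NOT NULL,
    byte_offset INTEGER NOT NULL,
    start_time REAL NOT NULL,
    end_time REAL NOT NULL
);
"""


def frames_in_range(conn, agent, start, end):
    """Return (path, byte_offset) for frames of `agent` overlapping [start, end]."""
    query = """
        SELECT f.path, fr.byte_offset
        FROM hk_frames AS fr
        JOIN hk_files AS f ON f.id = fr.file_id
        WHERE fr.agent = ? AND fr.start_time <= ? AND fr.end_time >= ?
        ORDER BY fr.start_time
    """
    return conn.execute(query, (agent, end, start)).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# 'a.g3' is a made-up file name.
conn.execute("INSERT INTO hk_files VALUES (1, 'a.g3', 0.0, 100.0)")
conn.executemany(
    "INSERT INTO hk_frames"
    " (file_id, agent, feed, byte_offset, start_time, end_time)"
    " VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "acu", "acu_status", 0, 0.0, 50.0),
        (1, "acu", "acu_status", 4096, 50.0, 100.0),
    ],
)
hits = frames_in_range(conn, "acu", 60.0, 80.0)
```

Storing the byte offset of each frame is what allows a loader to seek directly to the relevant frames instead of reading whole files.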

To index and load housekeeping information, you will need to create an HkConfig object, documented in the API section below. For example, the following yaml file can be loaded directly with HkConfig.from_yaml:

hk_root: /so/data/satp1/hk
db_file: satp1_hk.db
aliases:
    fp_temp: cryo-ls372-lsa21yc.temperatures.Channel_02_T

In the example above, db_file is the path to the sqlite index database. To use a general database engine instead of sqlite, replace db_file with the db_url parameter, which should be a valid sqlalchemy database URL, as described in the sqlalchemy documentation. For example, if the index is saved in a postgres database, you can use the following config:

hk_root: /so/data/satp1/hk
db_url: postgresql://<user>:${psql_password}@storage1/hkdb_satp1
aliases:
    fp_temp: cryo-ls372-lsa21yc.temperatures.Channel_02_T

where ${psql_password} will be replaced by the value of the psql_password environment variable. The db_url can also be set as a dictionary with fields matching the URL fields from the sqlalchemy docs, for example:

hk_root: /so/data/satp1/hk
db_url:
    drivername: postgresql
    username: <user>
    password: ${psql_password}
    host: storage1
    database: hkdb_satp1
aliases:
    fp_temp: cryo-ls372-lsa21yc.temperatures.Channel_02_T
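
As a rough illustration of the environment-variable expansion described above, the sketch below uses os.path.expandvars from the Python standard library. Whether hkdb uses this exact function is an assumption; the helper name expand_env and the placeholder password are hypothetical.

```python
import os


def expand_env(value):
    """Recursively expand ${VAR} references in strings and dict values."""
    if isinstance(value, str):
        return os.path.expandvars(value)
    if isinstance(value, dict):
        return {key: expand_env(val) for key, val in value.items()}
    return value


# Placeholder value, for illustration only.
os.environ["psql_password"] = "example-secret"

# String form of db_url.
url = expand_env("postgresql://<user>:${psql_password}@storage1/hkdb_satp1")

# Dictionary form of db_url.
url_fields = expand_env({
    "drivername": "postgresql",
    "password": "${psql_password}",
    "host": "storage1",
    "database": "hkdb_satp1",
})
```

Keeping the password in an environment variable means the yaml file can be shared or committed without exposing credentials.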

In addition to the hk_root and database paths, an “aliases” dictionary can be defined here to provide a mapping from a human-readable name to the field-id of the housekeeping data, where field-ids are always of the form <agent-instance>.<feed>.<field>. These aliases are only used for data loading, and do not affect how the index data is stored in the database. This means aliases can also be added or modified, either in the config file or programmatically on the HkConfig object, to assist with data loading after the hk data has already been indexed.
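
Because aliases are just a name-to-field-id mapping applied at load time, the idea can be sketched with a plain dictionary. The helper resolve_fields and the fp_res alias below are hypothetical, and the real lookup inside hkdb may differ.

```python
def resolve_fields(fields, aliases):
    """Replace any alias in `fields` with its field-id; pass other specs through."""
    return [aliases.get(spec, spec) for spec in fields]


aliases = {"fp_temp": "cryo-ls372-lsa21yc.temperatures.Channel_02_T"}

# Aliases can be added after indexing, since they never touch the database.
aliases["fp_res"] = "cryo-ls372-lsa21yc.temperatures.Channel_02_R"

resolved = resolve_fields(["fp_temp", "acu.acu_status.*"], aliases)

# Reverse lookup: find the alias (if any) attached to a loaded field-id.
alias_for = {field_id: name for name, field_id in aliases.items()}
```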

Loading Data

To load data from the hk archive, you can pass a LoadSpec object to the load_hk() function, which defines the time range and fields to load.

For example, the code below will load the focal-plane temperature for the last week:

import time

from sotodlib.io import hkdb
cfg = hkdb.HkConfig.from_yaml('satp1_hk.yaml')
t1 = time.time()
t0 = t1 - 24 * 3600 * 7
lspec = hkdb.LoadSpec(
    cfg=cfg, start=t0, end=t1,
    fields=['fp_temp'],
)
result = hkdb.load_hk(lspec, show_pb=True)

The result object will be an HkResult object, where result.data contains the mapping from field-id to a tuple of two numpy arrays. The first array is the timestamps and the second array is data values. If any of the requested fields have an alias defined in the cfg object, that alias will be set as an attribute in the HkResult object:

>> print(result.data['cryo-ls372-lsa21yc.temperatures.Channel_02_T'][0].shape)
(4160195,)
>> print(result.fp_temp[0].shape)  # exactly the same as above
(4160195,)
>> plt.plot(*result.fp_temp)  # plots focal-plane temp
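
The start and end values in the example above are unix timestamps. To load a fixed date range rather than the last week, a small conversion helper can be used. This is a minimal sketch with the standard library; the helper name to_ctime is hypothetical, and it assumes input strings without a timezone should be read as UTC.

```python
from datetime import datetime, timezone


def to_ctime(iso_string):
    """Convert an ISO-8601 date/time string to a unix timestamp.

    Strings without timezone information are interpreted as UTC.
    """
    dt = datetime.fromisoformat(iso_string)
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.timestamp()


start = to_ctime("2024-01-01T00:00:00")
end = to_ctime("2024-01-08T00:00:00")
```

The resulting start and end floats can be passed to LoadSpec in place of t0 and t1.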

The fields argument to LoadSpec is a list of field-specifications, which can either be:

  • An alias in the HkConfig

  • A field-id of the form <agent-instance>.<feed>.<field>

Field-id specifications also allow wildcards in the feed or field, which makes it easy to load all the fields of an agent or a feed. For example, you can run:

import time

from sotodlib.io import hkdb
cfg = hkdb.HkConfig.from_yaml('satp1_hk.yaml')
t1 = time.time()
t0 = t1 - 24 * 3600 * 7
lspec = hkdb.LoadSpec(
    cfg=cfg, start=t0, end=t1,
    fields=['cryo-ls372-lsa21yc.*.*'],
)
result = hkdb.load_hk(lspec, show_pb=True)
print(result.data.keys())

>> dict_keys(['cryo-ls372-lsa21yc.temperatures.Channel_02_R',
              'cryo-ls372-lsa21yc.temperatures.Channel_02_T',
              'cryo-ls372-lsa21yc.temperatures.sample_heater_out'])

# Note that the fp_temp alias will still be found, and assigned to the
# correct field.
print(result.fp_temp.shape)
>> (2, 4157762)
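
The wildcard behaviour can be approximated with shell-style pattern matching. The sketch below uses fnmatchcase from the standard library; the helper names are hypothetical, and the matching rules actually used by hkdb (see Field.matches in the API section) may differ, for example in how characters such as ? or [ are treated.

```python
from fnmatch import fnmatchcase


def split_field_id(field_id):
    """Split '<agent-instance>.<feed>.<field>' into its three parts."""
    agent, feed, field = field_id.split(".", 2)
    return agent, feed, field


def matches(spec, field_id):
    """True if `field_id` matches `spec`; wildcards are allowed in feed and field."""
    s_agent, s_feed, s_field = split_field_id(spec)
    agent, feed, field = split_field_id(field_id)
    return (
        s_agent == agent
        and fnmatchcase(feed, s_feed)
        and fnmatchcase(field, s_field)
    )


available = [
    "cryo-ls372-lsa21yc.temperatures.Channel_02_R",
    "cryo-ls372-lsa21yc.temperatures.Channel_02_T",
    "acu.acu_status.Azimuth_current_position",
]
selected = [f for f in available if matches("cryo-ls372-lsa21yc.*.*", f)]
```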

Inspecting Data

To determine what feeds (agent and feed combinations) are available in a database, use the get_feed_list() function. For example:

feeds = hkdb.get_feed_list(lspec)
print(feeds)
>> [Field(agent='acu', feed='acu_error', field='*'),
    Field(agent='acu', feed='acu_status', field='*'),
    Field(agent='acu', feed='acu_udp_stream', field='*'),
    ...]

To get a list of all fields, use get_field_list(). For example:

fields = hkdb.get_field_list(lspec, fields=['acu.*.*Az*'])
print(len(fields), fields[0])
>> 61 acu.acu_status.Azimuth_current_position

API

Module for loading L3 housekeeping data using a database that indexes an archive of HK files.

class sotodlib.io.hkdb.HkConfig(hk_root: str, db_file: str | None = None, db_url: str | URL | None = None, echo_db: bool = False, file_idx_lookback_time: float | None = None, show_index_pb: bool = True, aliases: Dict[str, str] = <factory>)[source]

Bases: object

Configuration object for indexing and loading from an HK archive.

If instantiating from a nested dictionary, the from_dict class method can be used to convert fields to their proper data types.

Parameters:
  • hk_root (str) – Root directory for the HK archive

  • db_file (Optional[str]) – Path to the hk index database if the database is an sqlite file. Either this or db_url must be set

  • db_url (Optional[Union[str, db.URL]]) – URL used for db engine. Either this or db_file must be set

  • echo_db (bool) – Whether database operations should be echoed

  • file_idx_lookback_time (Optional[float]) – Time [sec] to look back when scanning for new files to index

  • show_index_pb (bool) – If true, shows progress bar when indexing

  • aliases (Dict[str, str]) –

    Aliases for hk fields. In this dict, the key is the alias name, and the value is the field descriptor, in the format of agent.feed.field. For example:

    {
        'fp_temp': 'cryo-ls372-lsa21yc.temperatures.Channel_02_T',
    }
    

    These aliases are only used on load, and do not affect how data is stored in the hkdb. Aliases should be valid python identifiers, since they will be set as attributes in the HkResult object.
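
Since aliases end up as attribute names, it can be useful to check them before loading. The following is a minimal sketch; invalid_aliases is a hypothetical helper, not part of hkdb, and the second alias is deliberately malformed.

```python
import keyword


def invalid_aliases(aliases):
    """Return alias names that cannot be used with dot access on an object."""
    return [
        name for name in aliases
        if not name.isidentifier() or keyword.iskeyword(name)
    ]


aliases = {
    "fp_temp": "cryo-ls372-lsa21yc.temperatures.Channel_02_T",
    # A hyphen makes this name unusable as result.<alias>.
    "fp-temp": "cryo-ls372-lsa21yc.temperatures.Channel_02_T",
}
bad = invalid_aliases(aliases)
```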

hk_root: str
db_file: str | None = None
db_url: str | URL | None = None
echo_db: bool = False
file_idx_lookback_time: float | None = None
show_index_pb: bool = True
aliases: Dict[str, str]
classmethod from_dict(data: Dict[str, Any]) HkConfig[source]

Generates an HkConfig object from a dictionary whose keys are fields of the HkConfig dataclass.

If the db_url is specified, it can be set as a string, a dictionary, or a sqlalchemy URL object. If it is of type dict, it will be converted to a URL object by passing it through to the keyword arguments of sqlalchemy’s URL.create function. Environment variables will be expanded in both the string and dict representations.

classmethod from_yaml(path)[source]

class sotodlib.io.hkdb.HkFile(**kwargs)[source]

Bases: Base

Database entry for a hk file.

Parameters:
  • path (str) – Path to the hk file.

  • start_time (float) – Starting ctime of the file

  • end_time (float) – Ending ctime of the file

  • size (int) – Size of file in bytes

  • mod_time (float) – Modification time of file

  • index_status (str) – Index status of file. Can be “unindexed”, “indexed”, or “failed”

id
path
start_time
end_time
size
mod_time
index_status

class sotodlib.io.hkdb.HkFrame(**kwargs)[source]

Bases: Base

Database entry for an hk frame.

Parameters:
  • file_id (int) – ID of the file containing the frame

  • agent (str) – instance-id of the OCS agent for the frame

  • feed (str) – Name of the OCS feed for the frame

  • fields_hash (str) – Hash to identify the combination of fields included in this frame.

  • byte_offset (int) – Offset of the frame in the G3 file in bytes

  • start_time (float) – Starting ctime of the frame

  • end_time (float) – Ending ctime of the frame

id
file_id
file
agent
feed
fields_hash
byte_offset
start_time
end_time

class sotodlib.io.hkdb.HkDb(cfg: HkConfig | str)[source]

Bases: object

Helper class for creating database sessions

Parameters:

cfg (Union[HkConfig, str]) – Configuration object or path to configuration file

sotodlib.io.hkdb.update_file_index(hkcfg: HkConfig, session=None, subdirs=None, retry_failed=False)[source]

Updates HkFiles database with new files on disk.

If subdirs is specified, it must be a list of (full path) sub-directories of hk_root; those will be scanned and any new files added to database.

Otherwise, all subdirs of hk_root will be scanned, subject to any restriction from hkcfg.file_idx_lookback_time.

sotodlib.io.hkdb.get_frames_from_file(hkcfg: HkConfig, file: HkFile, return_on_fail=True) List[HkFrame][source]

Returns HkFile and HkFrame objects corresponding to a given hk file.

Parameters:
  • file (HkFile) – HkFile object corresponding to the file

  • return_on_fail (bool) – If True, the function still returns the frames parsed so far when a runtime error occurs while reading the g3 file (usually caused by a forced shutdown).

Returns:

frames – List of all HkFrames in the file

Return type:

List[HkFrame]

sotodlib.io.hkdb.update_frame_index(hkcfg: HkConfig, session=None)[source]

Updates HkFrames database with frames from unindexed files

sotodlib.io.hkdb.update_index_all(cfg: HkConfig | str, subdirs=None, retry_failed: bool = False)[source]

Updates all HK index databases, and returns a report.

sotodlib.io.hkdb.purge_unindexed_files(hkcfg: HkConfig)[source]

Remove any ‘unindexed’ files from the database. This can be used to ignore files that disappeared from filesystem before being indexed.

sotodlib.io.hkdb.reset_failed_files(hkcfg: HkConfig, pattern=None)[source]

Reset “failed” files to the “unindexed” state. If pattern is specified, it should be an sql-compatible matching string, which will be matched against the path. Returns the number of changed rows.
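
An “sql-compatible matching string” here presumably means a LIKE pattern, where % matches any run of characters. The sketch below shows the idea against a throwaway in-memory sqlite table. The table layout, the paths, and the helper reset_failed are hypothetical and are not the hkdb implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table for illustration only; not the actual sotodlib schema.
conn.execute("CREATE TABLE hk_files (path TEXT, index_status TEXT)")
conn.executemany(
    "INSERT INTO hk_files VALUES (?, ?)",
    [
        ("/hk/16000/a.g3", "failed"),   # made-up paths
        ("/hk/16001/b.g3", "failed"),
        ("/hk/16001/c.g3", "indexed"),
    ],
)


def reset_failed(conn, pattern=None):
    """Set 'failed' rows back to 'unindexed'; return the number of changed rows."""
    sql = (
        "UPDATE hk_files SET index_status = 'unindexed'"
        " WHERE index_status = 'failed'"
    )
    params = ()
    if pattern is not None:
        sql += " AND path LIKE ?"
        params = (pattern,)
    return conn.execute(sql, params).rowcount


# Only failed files under the 16001 directory are reset.
n_changed = reset_failed(conn, pattern="%/16001/%")
```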

sotodlib.io.hkdb.get_files_info(hkcfg: HkConfig, index_status=None, limit=None)[source]

Query file rows; sorted by path. Optionally filter by index_status (list of str) and limit (int). Returns list of HkFile.

class sotodlib.io.hkdb.Field(agent: str, feed: str, field: str)[source]

Bases: object

agent: str
feed: str
field: str
matches(other)[source]
classmethod from_str(s)[source]

class sotodlib.io.hkdb.LoadSpec(cfg: HkConfig, fields: List[str], start: float, end: float, downsample_factor: int = 1, hkdb: HkDb | None = None)[source]

Bases: object

HK loading specification

Parameters:
  • cfg (HkConfig) – Configuration object

  • fields (List[str]) – List of field specifications to load. Each can be either a field descriptor, of the format agent.feed.field, or an alias defined in the config. Field descriptors can contain wildcards in the feed and field portions; for instance, agent.*.* will load all fields belonging to the specified agent. agent.feed.* and agent.*.*word* will also work as expected.

  • start (float) – Start time to load

  • end (float) – End time to load

  • downsample_factor (int) – Downsample factor for data

  • hkdb (Optional[HkDb]) – HkDb instance to use. If not specified, will create a new one from the cfg. This should be set manually if you are calling load_hk in a loop to prevent connection build-up.

cfg: HkConfig
fields: List[str]
start: float
end: float
downsample_factor: int = 1
hkdb: HkDb | None = None

class sotodlib.io.hkdb.HkResult(data, aliases=None)[source]

Bases: object

Helper class for storing results of load_hk. If aliases are set for any of the keys, they will be set as attributes of the object.

data

Dict where the key is the field descriptor, and the value is a list where val[0] are timestamps, and val[1] is data.

Type:

dict

save(path)[source]
classmethod load(path)[source]

sotodlib.io.hkdb.load_hk(load_spec: LoadSpec | dict, show_pb=False, fields=None, start=None, end=None, _field_list_scan=False)[source]

Loads hk data

Parameters:
  • load_spec (LoadSpec) – Load specification. See docstrings of the LoadSpec class.

  • show_pb (bool) – If true, will show a progressbar :)

  • fields (List[str]) – Fields to load (overrides load_spec.fields).

  • start (float) – Starting timestamp (overrides load_spec.start).

  • end (float) – Ending timestamp (overrides load_spec.end).

  • _field_list_scan (bool) – Run in special mode to support get_field_list.
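
The start and end overrides make it possible to reuse one LoadSpec while loading a long time range in smaller pieces. The sketch below only generates the time windows; iter_time_chunks is a hypothetical helper, not part of hkdb. Whether samples exactly on a window boundary appear in both neighbouring loads is not specified here, so duplicates at the edges may need to be handled.

```python
def iter_time_chunks(start, end, chunk_seconds):
    """Yield consecutive (t0, t1) windows that together cover [start, end]."""
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    t0 = start
    while t0 < end:
        t1 = min(t0 + chunk_seconds, end)
        yield t0, t1
        t0 = t1


# One week split into two-day windows; the last window is shorter.
windows = list(iter_time_chunks(0.0, 7 * 86400.0, 2 * 86400.0))

# Each window could then be loaded with something like:
#     result = load_hk(load_spec, start=t0, end=t1)
# using a LoadSpec whose hkdb argument is set, as recommended for loops.
```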

sotodlib.io.hkdb.get_feed_list(load_spec: LoadSpec | dict) List[str][source]

Return the list of feeds present in the db and for the time range specified by a LoadSpec.

Parameters:

load_spec (LoadSpec) – Load specification. See docstrings of the LoadSpec class.

Returns:

The list of feeds, as field spec strings, with wildcard for the field e.g. “an_agent.a_feed.*”.

Return type:

List[str]

Notes

The .start and .end times are respected in the query, but the .fields entry is ignored in the search.

sotodlib.io.hkdb.get_field_list(load_spec: LoadSpec | dict, fields: List[Field] | None = None) List[str][source]

Inspect the HK files to get the field names associated with each feed covered by the load_spec (or by the fields argument). This is a shallow search, in that only a single frame from every agent.feed combination matching the fields list is inspected.

Parameters:
  • load_spec (LoadSpec) – Load specification. See docstrings of the LoadSpec class.

  • fields – List of fields (which may include wildcards) to match against.

Returns:

The list of fields, as field spec strings.

Return type:

List[str]

Notes

If fields is not specified, then it is taken from load_spec, but normally you’d want to take it from get_feed_list. The .start and .end times are respected in the query, especially in the sense that the shallow data search will begin at .start.