Loading L3 Housekeeping Data
The sotodlib.io.hkdb module provides methods for indexing and quickly
loading housekeeping data from an archive.
The index databases store the following information:
A list of all housekeeping files, with their start and stop times
A list of every frame in each file, with the start and stop times of each frame, the agent instance-id, the feed name, and the byte-offset of the frame in the file.
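To make the role of the two lists concrete, here is a minimal sketch of a file/frame index in sqlite. The table names, the reduced set of columns, and the sample rows are assumptions made for illustration; they are not the actual hkdb schema. The point is that a time-range query on the index returns file paths and byte offsets, so only the relevant frames need to be read.

```python
import sqlite3

# Hypothetical, reduced schema for illustration only; the real hkdb tables may differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE files (
    id INTEGER PRIMARY KEY,
    path TEXT,
    start_time REAL,
    end_time REAL
);
CREATE TABLE frames (
    id INTEGER PRIMARY KEY,
    file_id INTEGER REFERENCES files(id),
    agent TEXT,
    feed TEXT,
    byte_offset INTEGER,
    start_time REAL,
    end_time REAL
);
""")

# Made-up sample rows: one file containing three frames.
conn.execute("INSERT INTO files VALUES (1, 'a.g3', 0.0, 600.0)")
conn.executemany(
    "INSERT INTO frames VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 1, "cryo-ls372-lsa21yc", "temperatures", 0, 0.0, 300.0),
        (2, 1, "cryo-ls372-lsa21yc", "temperatures", 4096, 300.0, 600.0),
        (3, 1, "acu", "acu_status", 8192, 0.0, 600.0),
    ],
)


def find_frames(conn, agent, feed, start, end):
    """Return (path, byte_offset) for frames of one feed overlapping [start, end]."""
    query = """
        SELECT files.path, frames.byte_offset
        FROM frames JOIN files ON frames.file_id = files.id
        WHERE frames.agent = ? AND frames.feed = ?
          AND frames.end_time >= ? AND frames.start_time <= ?
        ORDER BY frames.start_time
    """
    return conn.execute(query, (agent, feed, start, end)).fetchall()


hits = find_frames(conn, "cryo-ls372-lsa21yc", "temperatures", 350.0, 500.0)
print(hits)
```

Only the second temperature frame overlaps the requested window, so a loader working from this index could seek directly to that byte offset instead of scanning the whole file.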
To index and load housekeeping information, you will need to create a
HkConfig object, documented in the API section below. For example, the
following yaml file can be loaded in directly with HkConfig.from_yaml:
hk_root: /so/data/satp1/hk
db_file: satp1_hk.db
aliases:
  fp_temp: cryo-ls372-lsa21yc.temperatures.Channel_02_T
In the example above, db_file is the path to the sqlite index database.
It is possible to use a general database engine instead of sqlite by setting the
db_url parameter instead of db_file. It should be a valid sqlalchemy
database URL, as described in the sqlalchemy documentation. For
example, if the index is stored in a postgres database, you can use the following
config:
hk_root: /so/data/satp1/hk
db_url: postgresql://<user>:${psql_password}@storage1/hkdb_satp1
aliases:
  fp_temp: cryo-ls372-lsa21yc.temperatures.Channel_02_T
where psql_password will be expanded with the value of the environment
variable. Similarly, the db_url can also be set as a dictionary with fields
matching the URL fields from the sqlalchemy docs, for example:
hk_root: /so/data/satp1/hk
db_url:
  drivername: postgresql
  username: <user>
  password: ${psql_password}
  host: storage1
  database: hkdb_satp1
aliases:
  fp_temp: cryo-ls372-lsa21yc.temperatures.Channel_02_T
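The environment-variable expansion described above can be illustrated with the standard library. This is a sketch under assumptions: it uses os.path.expandvars as a stand-in, and the mechanism hkdb itself uses is not stated here. The user name hkuser and the password value are made up for the example.

```python
import os

# Stand-in value so the sketch runs; in practice the variable is set in the shell.
os.environ["psql_password"] = "example-secret"

# String form of db_url ("hkuser" is a made-up user name).
url_template = "postgresql://hkuser:${psql_password}@storage1/hkdb_satp1"
expanded_url = os.path.expandvars(url_template)
print(expanded_url)

# Dict form: expand each string value and leave other values untouched.
url_fields = {
    "drivername": "postgresql",
    "username": "hkuser",
    "password": "${psql_password}",
    "host": "storage1",
    "database": "hkdb_satp1",
}
expanded_fields = {
    key: os.path.expandvars(val) if isinstance(val, str) else val
    for key, val in url_fields.items()
}
print(expanded_fields["password"])
```

Note that os.path.expandvars leaves a reference unchanged when the variable is not set, so with this approach a missing variable would only surface later as a connection error.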
In addition to the hk-root and database paths, an “alias” dictionary can
be defined here to provide a mapping from a human-readable name to the
field-id of the housekeeping data, where field-ids are always
of the form <agent-instance>.<feed>.<field>. These aliases are only used
for data loading and do not affect how the index data is stored in the database.
This means aliases can also be added or modified after the hk data has already
been indexed, either in the config file or programmatically on the HkConfig
object, to assist with data loading.
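As an illustration of how an alias relates to a field-id, the sketch below resolves a name to its agent, feed, and field parts. The FieldId container and the resolve helper are hypothetical names invented for this example, not part of hkdb.

```python
from typing import Dict, NamedTuple


class FieldId(NamedTuple):
    """Hypothetical container for the three parts of a field-id."""
    agent: str
    feed: str
    field: str


def resolve(spec: str, aliases: Dict[str, str]) -> FieldId:
    """Map an alias or a literal field-id to its (agent, feed, field) parts."""
    # Fall back to the spec itself when it is not a known alias.
    field_id = aliases.get(spec, spec)
    # Split on the first two dots only (a choice made for this sketch).
    parts = field_id.split(".", 2)
    if len(parts) != 3:
        raise ValueError(f"not an alias or a valid field-id: {spec!r}")
    return FieldId(*parts)


aliases = {"fp_temp": "cryo-ls372-lsa21yc.temperatures.Channel_02_T"}
print(resolve("fp_temp", aliases))
```

Because the lookup happens only at resolution time, changing the aliases dictionary does not touch anything that was stored earlier, which mirrors the point that aliases can be edited after indexing.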
Loading Data
To load data from the hk archive, you can pass a LoadSpec
object to the load_hk() function, which defines the time range
and fields to load.
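The loading examples in this document build the time range from time.time(), so the sketch below assumes that start and end are Unix timestamps in seconds. Under that assumption, a fixed calendar range can be converted with the standard library (the dates are arbitrary):

```python
from datetime import datetime, timedelta, timezone

# Fixed one-week window in UTC (the dates are arbitrary examples).
start_dt = datetime(2024, 1, 1, tzinfo=timezone.utc)
end_dt = start_dt + timedelta(days=7)

# .timestamp() returns seconds since the Unix epoch as a float.
t0 = start_dt.timestamp()
t1 = end_dt.timestamp()
print(t0, t1 - t0)
```

Attaching an explicit timezone avoids the result depending on the local timezone of the machine running the code.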
For example, the code below will load the focal-plane temp for the last week:
import time

from sotodlib.io import hkdb

cfg = hkdb.HkConfig.from_yaml('satp1_hk.yaml')

# Load the last seven days of data
t1 = time.time()
t0 = t1 - 24 * 3600 * 7
lspec = hkdb.LoadSpec(
    cfg=cfg, start=t0, end=t1,
    fields=['fp_temp'],  # alias defined in the config
)
result = hkdb.load_hk(lspec, show_pb=True)
The result object will be an HkResult object, where
result.data contains the mapping from field-id to a tuple of two
numpy arrays. The first array is the timestamps and the second array
is data values. If any of the requested fields have an alias defined
in the cfg object, that alias will be set as an attribute in the
HkResult object:
>> print(result.data['cryo-ls372-lsa21yc.temperatures.Channel_02_T'][0].shape)
(4160195,)
>> print(result.fp_temp[0].shape) # same exact thing as above
(4160195,)
>> plt.plot(*result.fp_temp) # plots focal-plane temp (with matplotlib.pyplot imported as plt)
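The relationship between result.data and the alias attributes can be sketched with a small stand-in class. ResultSketch is a hypothetical name and this is not the sotodlib implementation; the timestamps and temperatures are made-up sample values, and plain lists are used in place of numpy arrays to keep the example dependency-free.

```python
from typing import Dict, Optional, Sequence, Tuple

Series = Tuple[Sequence[float], Sequence[float]]  # (timestamps, values)


class ResultSketch:
    """Illustrative stand-in for HkResult; not the sotodlib implementation."""

    def __init__(self, data: Dict[str, Series],
                 aliases: Optional[Dict[str, str]] = None):
        self.data = data
        for name, field_id in (aliases or {}).items():
            # Aliases become attributes, so they must be valid identifiers.
            if not name.isidentifier():
                raise ValueError(f"alias {name!r} is not a valid identifier")
            if field_id in data:
                # The attribute and the dict entry refer to the same object.
                setattr(self, name, data[field_id])


key = "cryo-ls372-lsa21yc.temperatures.Channel_02_T"
result = ResultSketch(
    data={key: ([0.0, 1.0, 2.0], [0.101, 0.102, 0.100])},  # made-up samples
    aliases={"fp_temp": key},
)
times, temps = result.fp_temp
print(len(times), result.fp_temp is result.data[key])
```

In this sketch the alias is a second name for the same data rather than a copy, which is why both access paths report the same shape in the session above.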
The fields argument to LoadSpec is a list of field-specifications, which
can either be:
An alias in the HkConfig
A field-id of the form
<agent-instance>.<feed>.<field>
Field-id specifications also allow wildcards in the feed or field, making it easy to load all the fields of an agent or a feed. For example, you can run:
import time

from sotodlib.io import hkdb

cfg = hkdb.HkConfig.from_yaml('satp1_hk.yaml')
t1 = time.time()
t0 = t1 - 24 * 3600 * 7
lspec = hkdb.LoadSpec(
    cfg=cfg, start=t0, end=t1,
    fields=['cryo-ls372-lsa21yc.*.*'],
)
result = hkdb.load_hk(lspec, show_pb=True)
print(result.data.keys())
>> dict_keys(['cryo-ls372-lsa21yc.temperatures.Channel_02_R',
              'cryo-ls372-lsa21yc.temperatures.Channel_02_T',
              'cryo-ls372-lsa21yc.temperatures.sample_heater_out'])
# Note that the fp_temp alias will still be found, and assigned to the
# correct field.
print(result.fp_temp.shape)
>> (2, 4157762)
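The wildcard behaviour can be approximated with shell-style matching from the standard library. This is a sketch under assumptions: match_fields is a hypothetical helper, and hkdb's own matching rules may differ from fnmatch (which also treats ? and [...] as special).

```python
from fnmatch import fnmatchcase
from typing import List


def match_fields(spec: str, field_ids: List[str]) -> List[str]:
    """Return the field-ids whose feed and field match a wildcard spec."""
    agent, feed_pat, field_pat = spec.split(".", 2)
    matches = []
    for field_id in field_ids:
        a, feed, field = field_id.split(".", 2)
        # The agent must match exactly; feed and field may contain wildcards.
        if a == agent and fnmatchcase(feed, feed_pat) and fnmatchcase(field, field_pat):
            matches.append(field_id)
    return matches


available = [
    "cryo-ls372-lsa21yc.temperatures.Channel_02_R",
    "cryo-ls372-lsa21yc.temperatures.Channel_02_T",
    "cryo-ls372-lsa21yc.temperatures.sample_heater_out",
    "acu.acu_status.Azimuth_current_position",
]
print(match_fields("cryo-ls372-lsa21yc.*.*", available))
print(match_fields("cryo-ls372-lsa21yc.temperatures.*_T", available))
print(match_fields("acu.*.*Az*", available))
```

The first call selects every field of the agent, the second only the temperature channel, and the third only the azimuth field.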
Inspecting Data
To determine what feeds (agent and feed combinations) are available in
a database, use the get_feed_list() function. For example:
feeds = hkdb.get_feed_list(lspec)
print(feeds)
>> [Field(agent='acu', feed='acu_error', field='*'),
Field(agent='acu', feed='acu_status', field='*'),
Field(agent='acu', feed='acu_udp_stream', field='*'),
...]
To get a list of all fields, use get_field_list(). For example:
fields = hkdb.get_field_list(lspec, fields=['acu.*.*Az*'])
print(len(fields), fields[0])
>> 61 acu.acu_status.Azimuth_current_position
API
Module for loading L3 housekeeping data using a database that indexes an archive of HK files.
- class sotodlib.io.hkdb.HkConfig(hk_root: str, db_file: str | None = None, db_url: str | sqlalchemy.engine.url.URL | None = None, echo_db: bool = False, file_idx_lookback_time: float | None = None, show_index_pb: bool = True, aliases: Dict[str, str] = <factory>)
Bases: object
Configuration object for indexing and loading from an HK archive.
If instantiating from a nested dictionary, the from_dict class method can be used to convert fields to their proper data types.
- Parameters:
hk_root (str) – Root directory for the HK archive
db_file (Optional[str]) – Path to the hk index database if the database is an sqlite file. Either this or db_url must be set
db_url (Optional[Union[str, db.URL]]) – URL used for db engine. Either this or db_file must be set
echo_db (bool) – Whether database operations should be echoed
file_idx_lookback_time (Optional[float]) – Time [sec] to look back when scanning for new files to index
show_index_pb (bool) – If true, shows progress bar when indexing
aliases (Dict[str, str]) – Aliases for hk fields. In this dict, the key is the alias name, and the value is the field descriptor, in the format of agent.feed.field. For example: { 'fp_temp': 'cryo-ls372-lsa21yc.temperatures.Channel_02_T', }
These aliases are only used on load, and do not affect how data is stored in the hkdb. Aliases should be valid python identifiers, since they will be set as attributes in the HkResult object.
- classmethod from_dict(data: Dict[str, Any]) -> HkConfig
Generates an HkConfig object from a dictionary whose keys are fields of the HkConfig dataclass.
If db_url is specified, it can be set as a string, a dictionary, or a sqlalchemy URL object. If it is of type dict, it will be converted to a URL object by passing it through to the keyword arguments of sqlalchemy’s URL.create function. Environment variables will be expanded in both the string and dict representations.
- class sotodlib.io.hkdb.HkFile(**kwargs)
Bases: Base
Database entry for an hk file.
- Parameters:
- id
- path
- start_time
- end_time
- size
- mod_time
- index_status
- class sotodlib.io.hkdb.HkFrame(**kwargs)
Bases: Base
Database entry for an hk frame.
- Parameters:
file_id (int) – ID of the file containing the frame
agent (str) – instance-id of the OCS agent for the frame
feed (str) – Name of the OCS feed for the frame
fields_hash (str) – Hash to identify the combination of fields included in this frame.
byte_offset (int) – Offset of the frame in the G3 file in bytes
start_time (float) – Starting ctime of the frame
end_time (float) – Ending ctime of the frame
- id
- file_id
- file
- agent
- feed
- fields_hash
- byte_offset
- start_time
- end_time
- class sotodlib.io.hkdb.HkDb(cfg: HkConfig | str)
Bases: object
Helper class for creating database sessions
- Parameters:
cfg (Union[HkConfig, str]) – Configuration object or path to configuration file
- sotodlib.io.hkdb.update_file_index(hkcfg: HkConfig, session=None, subdirs=None, retry_failed=False)
Updates HkFiles database with new files on disk.
If subdirs is specified, it must be a list of (full path) sub-directories of hk_root; those will be scanned and any new files added to database.
Otherwise, all subdirs of hk_root will be scanned, subject to any restriction from hkcfg.file_idx_lookback_time.
- sotodlib.io.hkdb.get_frames_from_file(hkcfg: HkConfig, file: HkFile, return_on_fail=True) -> List[HkFrame]
Returns the HkFrame objects corresponding to a given hk file.
- Parameters:
- Returns:
frames – List of all HkFrames in the file
- Return type:
List[HkFrame]
- sotodlib.io.hkdb.update_frame_index(hkcfg: HkConfig, session=None)
Updates HkFrames database with frames from unindexed files
- sotodlib.io.hkdb.update_index_all(cfg: HkConfig | str, subdirs=None, retry_failed: bool = False)
Updates all HK index databases, and returns a report.
- sotodlib.io.hkdb.purge_unindexed_files(hkcfg: HkConfig)
Remove any ‘unindexed’ files from the database. This can be used to ignore files that disappeared from filesystem before being indexed.
- sotodlib.io.hkdb.reset_failed_files(hkcfg: HkConfig, pattern=None)
Reset “failed” files to the “unindexed” state. If pattern is specified, it should be an sql-compatible matching string, which will be matched against the path. Returns the number of changed rows.
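One reading of "sql-compatible matching string" is a SQL LIKE pattern. Assuming so, the sketch below shows how such a pattern selects rows by path and how the number of changed rows can be obtained. The table layout, the sub-directory names, and the 'indexed' status value are assumptions made for illustration, and the statement is not the one hkdb runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, reduced table; only the columns needed for the sketch.
conn.execute("CREATE TABLE files (path TEXT, index_status TEXT)")
conn.executemany(
    "INSERT INTO files VALUES (?, ?)",
    [
        ("/so/data/satp1/hk/16000/a.g3", "failed"),
        ("/so/data/satp1/hk/16000/b.g3", "indexed"),
        ("/so/data/satp1/hk/17000/c.g3", "failed"),
    ],
)

# In SQL LIKE, '%' matches any run of characters and '_' matches one character.
pattern = "%/16000/%"
cur = conn.execute(
    "UPDATE files SET index_status = 'unindexed' "
    "WHERE index_status = 'failed' AND path LIKE ?",
    (pattern,),
)
n_changed = cur.rowcount
print(n_changed)
```

Only the failed file under the matching sub-directory is reset; the failed file elsewhere and the already indexed file are left alone.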
- sotodlib.io.hkdb.get_files_info(hkcfg: HkConfig, index_status=None, limit=None)
Query file rows; sorted by path. Optionally filter by index_status (list of str) and limit (int). Returns list of HkFile.
- class sotodlib.io.hkdb.LoadSpec(cfg: HkConfig, fields: List[str], start: float, end: float, downsample_factor: int = 1, hkdb: HkDb | None = None)
Bases: object
HK loading specification
- Parameters:
cfg (HkConfig) – Configuration object
fields (List[str]) – List of field specifications to load. Each can be either a field descriptor, of the format agent.feed.field, or an alias defined in the config. Field descriptors can contain wildcards in the feed and field portions; for instance, agent.*.* will load all fields belonging to the specified agent. agent.feed.* and agent.*.*word* will also work as expected.
start (float) – Start time to load
end (float) – End time to load
downsample_factor (int) – Downsample factor for data
hkdb (Optional[HkDb]) – HkDb instance to use. If not specified, a new one will be created from the cfg. This should be set manually if you are calling load_hk in a loop, to prevent connection build-up.
- class sotodlib.io.hkdb.HkResult(data, aliases=None)
Bases: object
Helper class for storing results of load_hk. If aliases are set for any of the keys, they will be set as attributes of the object.
- data
Dict where the key is the field descriptor, and the value is a list where val[0] are timestamps, and val[1] is data.
- Type:
- sotodlib.io.hkdb.load_hk(load_spec: LoadSpec | dict, show_pb=False, fields=None, start=None, end=None, _field_list_scan=False)
Loads hk data
- Parameters:
load_spec (LoadSpec) – Load specification. See docstrings of the LoadSpec class.
show_pb (bool) – If true, will show a progressbar :)
fields (List[str]) – Fields to load (overrides load_spec.fields).
start (float) – Starting timestamp (overrides load_spec.start).
end (float) – Ending timestamp (overrides load_spec.end).
_field_list_scan (bool) – Run in special mode to support get_field_list.
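Since start and end can override the values in the load specification, a long range can be processed window by window. The sketch below only generates the windows; iter_windows is a hypothetical helper, and the commented call shows how each window might be passed to load_hk when an archive is available. How samples that fall exactly on a window boundary are handled is not specified here.

```python
from typing import Iterator, Tuple


def iter_windows(start: float, end: float, step: float) -> Iterator[Tuple[float, float]]:
    """Yield consecutive (t0, t1) windows covering [start, end]."""
    if step <= 0:
        raise ValueError("step must be positive")
    t0 = start
    while t0 < end:
        # The last window is shortened so it never runs past `end`.
        t1 = min(t0 + step, end)
        yield t0, t1
        t0 = t1


# One week split into two-day windows.
windows = list(iter_windows(0.0, 7 * 86400.0, 2 * 86400.0))
print(windows)

# With a real archive, each window could then be loaded with something like:
#     result = hkdb.load_hk(lspec, start=w0, end=w1)
# where lspec was built once with a shared hkdb=... instance.
```

Building the LoadSpec once and reusing it across iterations follows the advice in the LoadSpec documentation about setting hkdb manually when calling load_hk in a loop.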
- sotodlib.io.hkdb.get_feed_list(load_spec: LoadSpec | dict) -> List[str]
Return the list of feeds present in the db and for the time range specified by a LoadSpec.
- Parameters:
load_spec (LoadSpec) – Load specification. See docstrings of the LoadSpec class.
- Returns:
The list of feeds, as field spec strings, with wildcard for the field e.g. “an_agent.a_feed.*”.
- Return type:
List[str]
Notes
The .start_time and .end_time are respected in the query, but the .fields entry is ignored in the search.
- sotodlib.io.hkdb.get_field_list(load_spec: LoadSpec | dict, fields: List[Field] | None = None) -> List[str]
Inspect the HK files to get the field names associated with each feed covered by the load_spec (or by the fields argument). This is a shallow search, in that only a single frame from every agent.feed combination matching the fields list is inspected.
- Parameters:
load_spec (LoadSpec) – Load specification. See docstrings of the LoadSpec class.
fields – List of fields (which may include wildcards) to match against.
- Returns:
The list of fields, as field spec strings.
- Return type:
List[str]
Notes
If fields is not specified, then it is taken from load_spec. But normally you’d want it from get_feed_list(). The .start_time and .end_time are respected in the query, especially in the sense that the shallow data search will begin at .start_time.