Input and output data
The datastore module provides an abstraction layer around data storage,
allowing different methods of storing simulation/analysis results (local
filesystem, remote filesystem, database, etc.) to provide a common interface.
The interface is built around three types of object: a DataStore may
contain many DataItems, each of which is identified by a
DataKey.
There is a single DataKey class. DataStore and
DataItem are abstract base classes, and must be subclassed to provide
different functionality.
Base classes
- class sumatra.datastore.DataKey(path, digest, creation, **metadata)
Identifies a DataItem, and may be used to retrieve a DataItem from a DataStore.
May also be used to store metadata (e.g. file size, mimetype) and be used as a proxy for the DataItem on a system where the actual data is not available.
- path
a token used to retrieve a DataItem. For filesystem-based DataStores, this will be a relative path. For database-backed stores (none of which have been implemented yet :-) it could be a primary key or an object encapsulating a query.
- digest
the SHA1 digest of the contents of the associated DataItem. This attribute is calculated on creation of the DataKey.
- metadata
a dict containing metadata, such as file size and mimetype.
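As an illustration of the attributes described above, the following is a minimal sketch of how a SHA1 digest and some basic metadata could be gathered for a file under a store root. It uses only the Python standard library and does not call Sumatra; the helper names `sha1_of_file` and `describe_file`, and the use of a plain dict in place of a DataKey, are assumptions made for this example.

```python
import hashlib
import os
import tempfile


def sha1_of_file(path, chunk_size=65536):
    """Return the hex SHA1 digest of the file at `path`, read in chunks."""
    sha1 = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha1.update(chunk)
    return sha1.hexdigest()


def describe_file(root, relative_path):
    """Build a dict mirroring the DataKey attributes (hypothetical helper)."""
    full_path = os.path.join(root, relative_path)
    return {
        "path": relative_path,              # token relative to the store root
        "digest": sha1_of_file(full_path),  # SHA1 of the file contents
        "metadata": {"size": os.path.getsize(full_path)},
    }


# Usage: write a small file into a temporary "store root" and describe it.
root = tempfile.mkdtemp()
with open(os.path.join(root, "output.dat"), "wb") as f:
    f.write(b"hello")
key_info = describe_file(root, "output.dat")
```

Because the digest depends only on the file contents, two files with identical contents yield the same digest regardless of their paths.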
- class sumatra.datastore.base.DataItem
Base class for data item classes, which may represent files or database records.
- property digest
docstring
- get_content(max_length=None)
Return the contents of the data item as a string.
If max_length is specified, return that number of bytes; otherwise return the entire content.
- save_copy(path)
Save a copy of the data to a local file.
If path is an existing directory, the data item path will be appended to it; otherwise path is treated as a full path including filename, either absolute or relative to the working directory.
Return the full path of the final file.
- sorted_content()
Return the contents of the data item, sorted by line.
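The method descriptions above can be sketched with a small in-memory class. This is not Sumatra code: `InMemoryDataItem` is a hypothetical stand-in that does not subclass DataItem, and holding the content as bytes is an assumption made so that max_length can be applied as a byte count.

```python
import os
import tempfile


class InMemoryDataItem:
    """Hypothetical stand-in for a DataItem subclass; not part of Sumatra."""

    def __init__(self, path, content):
        self.path = path          # relative path identifying the item
        self._content = content   # bytes

    def get_content(self, max_length=None):
        # Return at most max_length bytes, or everything if it is None.
        if max_length is None:
            return self._content
        return self._content[:max_length]

    def sorted_content(self):
        # Sort the content line by line, keeping the line endings.
        lines = self._content.splitlines(keepends=True)
        return b"".join(sorted(lines))

    def save_copy(self, path):
        # An existing directory gets the item path appended; anything else
        # is treated as the full target filename.
        if os.path.isdir(path):
            full_path = os.path.join(path, self.path)
        else:
            full_path = path
        target_dir = os.path.dirname(full_path)
        if target_dir:
            os.makedirs(target_dir, exist_ok=True)
        with open(full_path, "wb") as f:
            f.write(self._content)
        return full_path


# Usage with a three-line item saved into a temporary directory.
item = InMemoryDataItem("results/run1.txt", b"b\na\nc\n")
head = item.get_content(max_length=3)
ordered = item.sorted_content()
saved = item.save_copy(tempfile.mkdtemp())
```

Since the target passed to save_copy is an existing directory, the item's own path is appended to it and any missing subdirectories are created.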
- class sumatra.datastore.base.DataStore
Base class for data storage abstractions.
- contains_path(path)
Does the store contain a data item with the given path?
- copy()
- delete(*keys)
Delete the files corresponding to the given keys.
- find_new_data(timestamp)
Find newly created/changed data items.
- generate_keys(*paths)
Given a number of “paths”, return a list of keys enabling the data at those paths to be retrieved from this store later.
- get_content(key, max_length=None)
Return the contents of a file identified by a key.
If max_length is given, the return value will be truncated.
- get_data_item(key)
Return the file that matches the given key.
- get_type()
- required_attributes = ('find_new_data', 'get_data_item', 'delete')
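To make the subclassing contract more concrete, here is a minimal sketch of a store that supplies the three names listed in required_attributes. It is self-contained and does not import Sumatra: `DataStoreSketch` and `DictDataStore` are hypothetical classes, the check in `__init__` is only one possible way such a tuple could be used, and using plain paths as keys and treating "new" as "timestamp at or after the given one" are both assumptions.

```python
from datetime import datetime


class DataStoreSketch:
    """Hypothetical stand-in for the abstract base class."""

    required_attributes = ("find_new_data", "get_data_item", "delete")

    def __init__(self):
        # Fail early if a subclass lacks one of the required methods.
        missing = [name for name in self.required_attributes
                   if not callable(getattr(self, name, None))]
        if missing:
            raise NotImplementedError("missing: " + ", ".join(missing))


class DictDataStore(DataStoreSketch):
    """Keeps items in a dict; plain paths act as keys (an assumption)."""

    def __init__(self):
        super().__init__()
        self._items = {}  # path -> (timestamp, content as bytes)

    def add(self, path, content, timestamp):
        self._items[path] = (timestamp, content)

    def contains_path(self, path):
        return path in self._items

    def find_new_data(self, timestamp):
        # Paths of items created or changed at or after `timestamp`.
        return sorted(path for path, (changed, _) in self._items.items()
                      if changed >= timestamp)

    def get_data_item(self, key):
        return self._items[key][1]

    def get_content(self, key, max_length=None):
        content = self.get_data_item(key)
        return content if max_length is None else content[:max_length]

    def delete(self, *keys):
        for key in keys:
            self._items.pop(key, None)


# Usage: two items, only one of which is newer than the cut-off.
store = DictDataStore()
store.add("old.txt", b"old", datetime(2020, 1, 1))
store.add("new.txt", b"new data", datetime(2020, 1, 3))
new_paths = store.find_new_data(datetime(2020, 1, 2))
```

A subclass of `DataStoreSketch` that omits any of the three methods raises NotImplementedError on instantiation, which surfaces an incomplete implementation before any data is touched.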
Storing data on the local filesystem
Automatic archiving of data written to the local filesystem
- class sumatra.datastore.ArchivingFileSystemDataStore(root, archive='.smt/archive')
Bases: FileSystemDataStore
Represents a locally-mounted filesystem that archives any new files created in it. The root of the data store will generally be a subdirectory of the real filesystem.
- archive_store
Directory within which data will be archived.
Mirroring data to a remote webserver
- class sumatra.datastore.MirroredFileSystemDataStore(root, mirror_base_url)
Bases: FileSystemDataStore
Represents a locally-mounted filesystem whose contents are mirrored on a webserver, so that the files can be accessed via an HTTP URL.
- mirror_base_url
URL to which the file path will be appended to obtain the final URL of a file.
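The way a file path is appended to mirror_base_url can be sketched as follows. The function `mirrored_url` is a hypothetical helper, not a Sumatra API, and the slash normalisation and percent-encoding are assumptions about what a robust implementation might want; the example URL is likewise made up.

```python
from urllib.parse import quote


def mirrored_url(mirror_base_url, path):
    """Append a relative file path to the base URL (hypothetical helper)."""
    # Normalise slashes so exactly one separates the base from the path,
    # and percent-encode characters that are not safe in a URL path.
    return mirror_base_url.rstrip("/") + "/" + quote(path.lstrip("/"))


# Usage: a path containing a space is encoded; slashes are preserved.
url = mirrored_url("http://example.com/data/", "run 1/output.dat")
```

This sketch assumes forward slashes in the relative path; a path using another separator would need converting before being appended.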