Input and output data
The datastore module provides an abstraction layer around data storage, allowing different methods of storing simulation/analysis results (local filesystem, remote filesystem, database, etc.) to provide a common interface.
The interface is built around three types of object: a DataStore may contain many DataItems, each of which is identified by a DataKey.
There is a single DataKey class. DataStore and DataItem are abstract base classes, and must be subclassed to provide different functionality.
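The relationship between the three object types can be sketched without Sumatra itself. The block below is a minimal illustration only: `Key`, `Item`, `Store`, `DictItem` and `DictStore` are hypothetical stand-ins invented for this example, not Sumatra classes, and the in-memory backend is an assumption chosen to keep the sketch self-contained.

```python
from abc import ABC, abstractmethod
from collections import namedtuple

# Hypothetical stand-in for a key: just a path and a digest.
Key = namedtuple("Key", ["path", "digest"])


class Item(ABC):
    """Stand-in for an abstract data item."""

    @abstractmethod
    def get_content(self, max_length=None):
        """Return the item's content, optionally truncated."""


class Store(ABC):
    """Stand-in for an abstract data store."""

    @abstractmethod
    def get_data_item(self, key):
        """Return the item identified by key."""


class DictItem(Item):
    """Concrete item holding its content in memory."""

    def __init__(self, content):
        self.content = content

    def get_content(self, max_length=None):
        if max_length is None:
            return self.content
        return self.content[:max_length]


class DictStore(Store):
    """Toy in-memory backend: a dict mapping paths to content."""

    def __init__(self, data):
        self.data = data

    def get_data_item(self, key):
        return DictItem(self.data[key.path])


store = DictStore({"results/output.dat": b"1 2 3\n"})
key = Key(path="results/output.dat", digest=None)
item = store.get_data_item(key)
```

Code that only calls `get_data_item` and `get_content` does not need to know which backend is behind the store, which is the point of the common interface.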
Base classes
class sumatra.datastore.DataKey(path, digest, creation, **metadata)
Identifies a DataItem, and may be used to retrieve a DataItem from a DataStore. May also be used to store metadata (e.g. file size, mimetype) and be used as a proxy for the DataItem on a system where the actual data is not available.
- path: a token used to retrieve a DataItem. For filesystem-based DataStores, this will be a relative path. For database-backed stores (none of which have been implemented yet :-) it could be a primary key or an object encapsulating a query.
- digest: the SHA1 digest of the contents of the associated DataItem. This attribute is calculated on creation of the DataKey.
- metadata: a dict containing metadata, such as file size and mimetype.
- next()
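As an illustration of the kind of information a key carries, the following sketch computes a SHA1 digest and a file size for a file under a root directory using only the standard library. The helper names `sha1_of_file` and `describe`, the chunked reading, and the plain dict are assumptions made for this example; how Sumatra itself computes the digest is not shown here.

```python
import hashlib
import os
import tempfile


def sha1_of_file(path, chunk_size=65536):
    """Return the hex SHA1 digest of a file, reading it in chunks."""
    sha = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha.update(chunk)
    return sha.hexdigest()


def describe(root, relative_path):
    """Build a plain dict holding the same kind of information as a key."""
    full_path = os.path.join(root, relative_path)
    return {
        "path": relative_path,  # relative path, as for filesystem-based stores
        "digest": sha1_of_file(full_path),
        "metadata": {"size": os.path.getsize(full_path)},
    }


with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "output.dat"), "wb") as f:
        f.write(b"hello")
    record = describe(root, "output.dat")
```

Because the record holds the path, digest and metadata but not the data, it can be passed around on a machine where the file itself is absent.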
class sumatra.datastore.base.DataItem
Base class for data item classes that may represent files or database records.
- digest: docstring
- get_content(max_length=None): Return the contents of the data item as a string. If max_length is specified, return that number of bytes, otherwise return the entire content.
- next()
- save_copy(path): Save a copy of the data to a local file. If path is an existing directory, the data item path will be appended to it, otherwise path is treated as a full path including filename, either absolute or relative to the working directory. Return the full path of the final file.
- sorted_content(): Return the contents of the data item, sorted by line.
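The destination rule described for save_copy can be sketched as a small standalone function. This is an illustration of the documented behaviour under stated assumptions, not Sumatra's code: `resolve_copy_path` and the free function `save_copy` below are hypothetical helpers, and creating missing parent directories is a choice made for this example.

```python
import os
import shutil
import tempfile


def resolve_copy_path(item_path, target):
    """Apply the documented rule for choosing the destination file."""
    if os.path.isdir(target):
        # Existing directory: append the data item's own path.
        return os.path.join(target, item_path)
    # Otherwise the target already names the file.
    return target


def save_copy(source_file, item_path, target):
    """Copy source_file to the resolved destination and return that path."""
    destination = os.path.abspath(resolve_copy_path(item_path, target))
    os.makedirs(os.path.dirname(destination), exist_ok=True)
    shutil.copy(source_file, destination)
    return destination


with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "source.dat")
    with open(source, "w") as f:
        f.write("b\na\n")
    out_dir = os.path.join(tmp, "copies")
    os.mkdir(out_dir)
    # Target is an existing directory: the item path is appended.
    in_directory = save_copy(source, "run1/output.dat", out_dir)
    # Target is not a directory: it is used as the full file path.
    as_full_path = save_copy(source, "run1/output.dat",
                             os.path.join(tmp, "renamed.dat"))
    copied_ok = os.path.exists(in_directory) and os.path.exists(as_full_path)
```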
class sumatra.datastore.base.DataStore
Base class for data storage abstractions.
- contains_path(path): Does the store contain a data item with the given path?
- copy()
- delete(*keys): Delete the files corresponding to the given keys.
- find_new_data(timestamp): Finds newly created/changed data items.
- generate_keys(*paths): Given a number of “paths”, return a list of keys enabling the data at those paths to be retrieved from this store later.
- get_content(key, max_length=None): Return the contents of a file identified by a key. If max_length is given, the return value will be truncated.
- get_data_item(key): Return the file that matches the given key.
- next()
- required_attributes = (u'find_new_data', u'get_data_item', u'delete')
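The reference does not say how required_attributes is used. One plausible reading is that it names the methods every concrete store must supply. The sketch below shows how such a tuple could be checked against a class; `missing_attributes`, `IncompleteStore` and `CompleteStore` are hypothetical names for this example, and the check itself is an assumption rather than Sumatra's mechanism.

```python
# Same names as the documented tuple, written as plain strings.
REQUIRED = ("find_new_data", "get_data_item", "delete")


def missing_attributes(cls, required=REQUIRED):
    """Return the required names that cls does not define as callables."""
    return [name for name in required
            if not callable(getattr(cls, name, None))]


class IncompleteStore:
    """Defines only one of the three required methods."""

    def get_data_item(self, key):
        raise NotImplementedError


class CompleteStore(IncompleteStore):
    """Adds the remaining two required methods."""

    def find_new_data(self, timestamp):
        return []

    def delete(self, *keys):
        pass


incomplete_report = missing_attributes(IncompleteStore)
complete_report = missing_attributes(CompleteStore)
```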
Storing data on the local filesystem
class sumatra.datastore.FileSystemDataStore(root)
Bases: sumatra.datastore.base.DataStore
Represents a locally-mounted filesystem. The root of the data store will generally be a subdirectory of the real filesystem.
- root: The absolute path on the underlying file system to the root directory of the data store.
class sumatra.datastore.filesystem.DataFile(path, store, creation=None)
Bases: sumatra.datastore.base.DataItem
A file-like object that represents a file in a local filesystem.
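For a store rooted in a directory, finding newly created or changed data can be approximated by comparing file modification times against a cutoff. The sketch below is one possible approach under stated assumptions: `find_new_files` is a hypothetical helper, the cutoff is given as seconds since the epoch, and the use of modification times is a choice made for this example rather than a description of Sumatra's find_new_data.

```python
import os
import tempfile
import time


def find_new_files(root, since):
    """Return paths (relative to root) of files modified at or after `since`.

    `since` is a POSIX timestamp in seconds.
    """
    new_files = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            if os.path.getmtime(full_path) >= since:
                new_files.append(os.path.relpath(full_path, root))
    return sorted(new_files)


with tempfile.TemporaryDirectory() as root:
    old = os.path.join(root, "old.dat")
    new = os.path.join(root, "new.dat")
    for path in (old, new):
        with open(path, "w") as f:
            f.write("data")
    now = time.time()
    os.utime(old, (now - 3600, now - 3600))  # pretend it was written an hour ago
    os.utime(new, (now, now))
    found = find_new_files(root, now - 60)
```

The returned paths are relative to the root, matching the description of path as a relative path for filesystem-based stores.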
Automatic archiving of data written to the local filesystem
class sumatra.datastore.ArchivingFileSystemDataStore(root, archive=u'.smt/archive')
Bases: sumatra.datastore.filesystem.FileSystemDataStore
Represents a locally-mounted filesystem that archives any new files created in it. The root of the data store will generally be a subdirectory of the real filesystem.
- archive_store: Directory within which data will be archived.
class sumatra.datastore.archivingfs.ArchivedDataFile(path, store, creation=None)
Bases: sumatra.datastore.base.DataItem
A file-like object that represents a file inside a tar archive.
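Storing files in a tar archive and reading one back by its path can be sketched with the standard tarfile module. The helpers `archive_files` and `read_archived`, the gzip compression, and the archive file name are all assumptions for illustration; the reference does not state how Sumatra names or compresses its archives.

```python
import os
import tarfile
import tempfile


def archive_files(root, relative_paths, archive_path):
    """Write the given files (relative to root) into a gzipped tar archive."""
    with tarfile.open(archive_path, "w:gz") as archive:
        for rel in relative_paths:
            # arcname keeps the path inside the archive relative to root.
            archive.add(os.path.join(root, rel), arcname=rel)


def read_archived(archive_path, relative_path, max_length=None):
    """Read one member back out of the archive, optionally truncated."""
    with tarfile.open(archive_path, "r:gz") as archive:
        member = archive.extractfile(relative_path)
        if max_length is None:
            return member.read()
        return member.read(max_length)


with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "output.dat"), "wb") as f:
        f.write(b"archived result\n")
    archive_path = os.path.join(root, "archive.tar.gz")
    archive_files(root, ["output.dat"], archive_path)
    full = read_archived(archive_path, "output.dat")
    head = read_archived(archive_path, "output.dat", max_length=8)
```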
Mirroring data to a remote webserver
class sumatra.datastore.MirroredFileSystemDataStore(root, mirror_base_url)
Bases: sumatra.datastore.filesystem.FileSystemDataStore
Represents a locally-mounted filesystem whose contents are mirrored on a webserver, so that the files can be accessed via an HTTP URL.
- mirror_base_url: URL to which the file path will be appended to obtain the final URL of a file.
class sumatra.datastore.mirroredfs.MirroredDataFile(path, store, creation=None)
Bases: sumatra.datastore.base.DataItem
A file-like object that represents a file existing both on a local file system and on a webserver.
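Appending a file path to mirror_base_url can be sketched as below. The function `mirror_url` is a hypothetical helper, and both the slash normalisation and the percent-encoding of the path are assumptions made for this example; the reference only states that the path is appended to the base URL. The example.com address is a placeholder.

```python
from urllib.parse import quote


def mirror_url(mirror_base_url, path):
    """Append a relative file path to the base URL with one slash between.

    Characters that are unsafe in URLs (such as spaces) are percent-encoded;
    slashes in the path are left as they are.
    """
    return mirror_base_url.rstrip("/") + "/" + quote(path.lstrip("/"))


plain = mirror_url("http://example.com/data/", "run1/output.dat")
with_space = mirror_url("http://example.com/data", "run 1/out.dat")
```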