class Database(UserDict):
The Database object models the top-level container for a collection of tables and indexes. Each Database object maps to an LMDB database object. When you open a database, a variety of low-level (LMDB) settings can be applied. It is worth understanding what some of them do, because they will make a difference when you come to productionise your system. In particular, you will want to tweak 'map_size', which is the maximum allowable size of your database, and possibly 'max_dbs' if you are going to open large numbers of tables at the same time.
sync: If False, don’t flush system buffers to disk when committing a transaction. This optimization means
a system crash can corrupt the database or lose the last transactions if buffers are not yet flushed
to disk. The risk is governed by how often the system flushes dirty buffers to disk and how often
sync() is called. However, if the filesystem preserves write order and writemap=False, transactions
exhibit ACI (atomicity, consistency, isolation) properties and only lose D (durability). I.e.
database integrity is maintained, but a system crash may undo the final transactions. Note that
sync=False, writemap=True leaves the system with no hint for when to write transactions to disk,
unless sync() is called. map_async=True, writemap=True may be preferable.
lock: If False, don’t do any locking. If concurrent access is anticipated, the caller must manage all
concurrency itself. For proper operation the caller must enforce single-writer semantics, and must
ensure that no readers are using old transactions while a writer is active. The simplest approach is
to use an exclusive lock so that no readers may be active at all when a writer begins.
subdir: If True, path refers to a subdirectory to store the data and lock files in, otherwise it refers to
a filename prefix.
create: If False, do not create the directory path if it is missing.
writemap: If True, use a writeable memory map unless readonly=True. This is faster and uses fewer mallocs, but
loses protection from application bugs like wild pointer writes and other bad updates into the database.
Incompatible with nested transactions. Processes with and without writemap on the same environment do
not cooperate well.
metasync: If False, flush system buffers to disk only once per transaction, omit the metadata flush. Defer that
until the system flushes files to disk, or next commit or sync(). This optimization maintains database
integrity, but a system crash may undo the last committed transaction. I.e. it preserves the ACI
(atomicity, consistency, isolation) but not D (durability) database property.
readahead: If False, LMDB will disable the OS filesystem readahead mechanism, which may improve random read
performance when a database is larger than RAM.
map_async: When writemap=True, use asynchronous flushes to disk. As with sync=False, a system crash can then
corrupt the database or lose the last transactions. Calling sync() ensures on-disk database integrity
until next commit.
max_readers: Maximum number of simultaneous read transactions. Can only be set by the first process to open an
environment, as it affects the size of the lock file and shared memory area. Attempts to
simultaneously start more than this many read transactions will fail.
max_dbs: Maximum number of databases available. If 0, assume environment will be used as a single database.
map_size: Maximum size database may grow to; used to size the memory mapping. If database grows larger than
map_size, an exception will be raised and the user must close and reopen Environment. On 64-bit there
is no penalty for making this huge (say 1TB). Must be <2GB on 32-bit.
Default values for these settings come from the CONFIG class variable. Please note that the 'readonly' option is currently NOT supported. It does not appear to work properly with transactions on sub-databases, so until we can work out why, please avoid read-only databases.
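The map_size guidance above (no penalty for a huge value on 64-bit, must stay under 2GB on 32-bit) can be sketched as a small helper. This is only an illustration: `suggest_map_size` and its `is_64bit` parameter are hypothetical names and are not part of this library.

```python
import sys

GIB = 1024 ** 3
TIB = 1024 ** 4


def suggest_map_size(want_bytes: int, is_64bit: bool = sys.maxsize > 2 ** 32) -> int:
    """Clamp a requested map_size to what the platform can address.

    Hypothetical helper, not a library function.
    """
    if want_bytes <= 0:
        raise ValueError("map_size must be positive")
    if is_64bit:
        # On 64-bit there is no penalty for a very large mapping.
        return want_bytes
    # On 32-bit the mapping must stay below 2GB.
    return min(want_bytes, 2 * GIB - 1)
```

The result would then be supplied as the 'map_size' setting; how overrides are passed to the Database object is not covered by this section.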
CLASS PROPERTIES
CONFIG =
{
"sync": true,
"lock": true,
"subdir": true,
"create": true,
"writemap": true,
"metasync": false,
"readahead": true,
"map_async": true,
"max_readers": 64,
"max_dbs": 64,
"map_size": 2147483648
}
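As a minimal sketch of layering production settings over these defaults, the snippet below copies the values above into a plain Python dict and merges overrides into it. `DEFAULTS` and `build_config` are hypothetical names used for illustration. The library's own mechanism for overriding CONFIG is not described here and may differ.

```python
# Copy of the documented defaults, written as a Python dict.
DEFAULTS = {
    "sync": True,
    "lock": True,
    "subdir": True,
    "create": True,
    "writemap": True,
    "metasync": False,
    "readahead": True,
    "map_async": True,
    "max_readers": 64,
    "max_dbs": 64,
    "map_size": 2147483648,
}


def build_config(defaults: dict, **overrides) -> dict:
    """Return a new settings dict; hypothetical helper, not a library call."""
    if "readonly" in overrides:
        # The documentation asks users to avoid read-only databases for now.
        raise ValueError("'readonly' is not currently supported")
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f"unknown settings: {sorted(unknown)}")
    # Build a fresh dict so the defaults are left untouched.
    return {**defaults, **overrides}


config = build_config(DEFAULTS, map_size=1024 ** 4, max_dbs=256)
```

Rejecting unknown keys up front is a design choice of this sketch: it catches a misspelt setting before it reaches LMDB.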
PROPERTIES
def isopen(self) -> bool
Return True if the database is currently open
def map_size(self) -> int
Return the currently mapped database size
def name(self) -> None
Return the unique name (uuid) of this database for replication
def read_transaction(self) -> TXN
Return a read-only transaction for use with "with"
def storage_allocated(self) -> None
Return the amount of storage space pre-allocated to this database. If the underlying filesystem supports 'sparse' storage, this allocation will not reflect the amount of disk space actually used (see 'storage_used').
METHODS
def __getitem__(self, name) -> Table
Shortcut to self.table
PARAMETERS
- name [str] - the name of the table to recover
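The "shortcut to self.table" behaviour can be pictured with a stdlib-only stand-in built on UserDict. This is a sketch of the delegation pattern, not the library's implementation: `FakeTable` and `FakeDatabase` are hypothetical classes, and creating a table on first access is an assumption made so the example runs on its own.

```python
from collections import UserDict


class FakeTable:
    """Hypothetical placeholder for a Table object."""

    def __init__(self, name: str):
        self.name = name


class FakeDatabase(UserDict):
    """Hypothetical stand-in showing __getitem__ delegating to table()."""

    def table(self, table_name: str) -> FakeTable:
        # Assumption: a table is created the first time it is requested.
        if table_name not in self.data:
            self.data[table_name] = FakeTable(table_name)
        return self.data[table_name]

    def __getitem__(self, name: str) -> FakeTable:
        # db[name] is simply another spelling of db.table(name).
        return self.table(name)


db = FakeDatabase()
users = db["users"]
```

With this pattern, `db["users"]` and `db.table("users")` return the same object.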
def __init__(self) -> None
Initialise this object
def __repr__(self) -> str
Return a string representation of a Database object
def close(self) -> None
Close this database if it is open
def drop(self, name, [txn]) -> None
Drop (delete) a database table
PARAMETERS
- name [str] - name of table to drop
- txn [TXN / default=None] - an optional transaction
def open(self, path) -> Database
Open the database. Returns a reference to the Database object.
PARAMETERS
- path [str] - the location of the database files
def reopen(self) -> None
Reopen a database and all of its tables. After calling set_mapsize to resize the database, individual database handles are potentially invalidated, hence the safe option is to reopen everything.
def storage_used(self, [txn]) -> None
Return a tuple that represents the amount of storage space consumed by this database. The first entry in the tuple is the number of bytes occupied by data, the second is a tuple representing the size (in bytes) and name of each table in the database.
PARAMETERS
- txn [TXN / default=None] - an optional transaction to wrap this operation
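The shape of the value described above can be illustrated with a small formatter. The exact structure is an assumption read from the description: a total byte count followed by a sequence of (size, name) pairs. `format_storage` and the sample figures are hypothetical and are not real output from a database.

```python
def format_storage(report):
    """Turn an assumed (total, [(size, name), ...]) report into text lines."""
    total_bytes, per_table = report
    lines = [f"total: {total_bytes} bytes"]
    # Largest tables first, so the heaviest consumers are easy to spot.
    for size, name in sorted(per_table, reverse=True):
        lines.append(f"{name}: {size} bytes")
    return lines


# Hypothetical sample data in the assumed shape.
sample = (6144, [(2048, "orders"), (4096, "users")])
summary = format_storage(sample)
```

If the real return value uses a different layout, only the unpacking on the first line of the function should need to change.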
def sync(self, [force]) -> None
Force a database sync
PARAMETERS
- force [bool / default=True] - if True make the flush synchronous
def table(self, table_name, [codec], [integerkey], [txn]) -> Table
Returns the table associated with the supplied name
PARAMETERS
- table_name [str] - the name of the table to recover
def tables(self, [all], [txn]) -> None
Generate a list of tables available in this database
PARAMETERS
- all [bool / default=False] - if True also show hidden / structural tables
- txn [TXN / default=None] - an optional transaction