b.b.BTreeBuilder(index.GraphIndexBuilder) : class documentation

Part of bzrlib.btree_index View In Hierarchy

A Builder for B+Tree based Graph indices.

The resulting graph has the structure:

_SIGNATURE OPTIONS NODES _SIGNATURE := 'B+Tree Graph Index 1' NEWLINE OPTIONS := REF_LISTS KEY_ELEMENTS LENGTH REF_LISTS := 'node_ref_lists=' DIGITS NEWLINE KEY_ELEMENTS := 'key_elements=' DIGITS NEWLINE LENGTH := 'len=' DIGITS NEWLINE ROW_LENGTHS := 'row_lengths' DIGITS (COMMA DIGITS)* NODES := NODE_COMPRESSED* NODE_COMPRESSED:= COMPRESSED_BYTES{4096} NODE_RAW := INTERNAL | LEAF INTERNAL := INTERNAL_FLAG POINTERS LEAF := LEAF_FLAG ROWS KEY_ELEMENT := Not-whitespace-utf8 KEY := KEY_ELEMENT (NULL KEY_ELEMENT)* ROWS := ROW* ROW := KEY NULL ABSENT? NULL REFERENCES NULL VALUE NEWLINE ABSENT := 'a' REFERENCES := REFERENCE_LIST (TAB REFERENCE_LIST){node_ref_lists - 1} REFERENCE_LIST := (REFERENCE (CR REFERENCE)*)? REFERENCE := KEY VALUE := no-newline-no-null-bytes

Method __init__ See GraphIndexBuilder.__init__.
Method add_node Add a node to the index.
Method add_nodes Add nodes to the index.
Method finish Finalise the index.
Method iter_all_entries Iterate over all keys within the index
Method iter_entries Iterate over keys within the index.
Method iter_entries_prefix Iterate over keys within the index using prefix matching.
Method key_count Return an estimate of the number of keys in this index.
Method validate In memory index's have no known corruption at the moment.
Method _spill_mem_keys_to_disk Write the in memory keys down to disk to cap memory consumption.
Method _spill_mem_keys_without_combining Undocumented
Method _spill_mem_keys_and_combine Undocumented
Method _iter_mem_nodes Iterate over the nodes held in memory.
Method _iter_smallest Undocumented
Method _add_key Add a key to the current chunk.
Method _write_nodes Write node_iterator out as a B+Tree.
Method _get_nodes_by_key Undocumented

Inherited from GraphIndexBuilder:

Method clear_cache See GraphIndex.clear_cache()
Method set_optimize Change how the builder tries to optimize the result.
Method find_ancestry See CombinedGraphIndex.find_ancestry()
Method _check_key Raise BadIndexKey if key is not a valid key for this index.
Method _external_references Return references that are not present in this index.
Method _update_nodes_by_key Update the _nodes_by_key dict with a new key.
Method _check_key_ref_value Check that 'key' and 'references' are all valid.
def __init__(self, reference_lists=0, key_elements=1, spill_at=100000):
See GraphIndexBuilder.__init__.
Parametersspill_atOptional parameter controlling the maximum number of nodes that BTreeBuilder will hold in memory.
def add_node(self, key, value, references=()):
Add a node to the index.

If adding the node causes the builder to reach its spill_at threshold, disk spilling will be triggered.

ParameterskeyThe key. keys are non-empty tuples containing as many whitespace-free utf8 bytestrings as the key length defined for this index.
referencesAn iterable of iterables of keys. Each is a reference to another key.
valueThe value to associate with the key. It may be any bytes as long as it does not contain 0 or n.
def _spill_mem_keys_to_disk(self):
Write the in memory keys down to disk to cap memory consumption.

If we already have some keys written to disk, we will combine them so as to preserve the sorted order. The algorithm for combining uses powers of two. So on the first spill, write all mem nodes into a single index. On the second spill, combine the mem nodes with the nodes on disk to create a 2x sized disk index and get rid of the first index. On the third spill, create a single new disk index, which will contain the mem nodes, and preserve the existing 2x sized index. On the fourth, combine mem with the first and second indexes, creating a new one of size 4x. On the fifth create a single new one, etc.

def _spill_mem_keys_without_combining(self):
Undocumented
def _spill_mem_keys_and_combine(self):
Undocumented
def add_nodes(self, nodes):
Add nodes to the index.
ParametersnodesAn iterable of (key, node_refs, value) entries to add.
def _iter_mem_nodes(self):
Iterate over the nodes held in memory.
def _iter_smallest(self, iterators_to_combine):
Undocumented
def _add_key(self, string_key, line, rows, allow_optimize=True):
Add a key to the current chunk.
Parametersstring_keyThe key to add.
lineThe fully serialised key and value.
allow_optimizeIf set to False, prevent setting the optimize flag when writing out. This is used by the _spill_mem_keys_to_disk functionality.
def _write_nodes(self, node_iterator, allow_optimize=True):
Write node_iterator out as a B+Tree.
Parametersnode_iteratorAn iterator of sorted nodes. Each node should match the output given by iter_all_entries.
allow_optimizeIf set to False, prevent setting the optimize flag when writing out. This is used by the _spill_mem_keys_to_disk functionality.
ReturnsA file handle for a temporary file containing a B+Tree for the nodes.
def finish(self):
Finalise the index.
ReturnsA file handle for a temporary file containing the nodes added to the index.
def iter_all_entries(self):
Iterate over all keys within the index
ReturnsAn iterable of (index, key, value, reference_lists). There is no defined order for the result iteration - it will be in the most efficient order for the index (in this case dictionary hash order).
def iter_entries(self, keys):
Iterate over keys within the index.
ParameterskeysAn iterable providing the keys to be retrieved.
ReturnsAn iterable of (index, key, value, reference_lists). There is no defined order for the result iteration - it will be in the most efficient order for the index (keys iteration order in this case).
def iter_entries_prefix(self, keys):
Iterate over keys within the index using prefix matching.

Prefix matching is applied within the tuple of a key, not to within the bytestring of each key element. e.g. if you have the keys ('foo', 'bar'), ('foobar', 'gam') and do a prefix search for ('foo', None) then only the former key is returned.

ParameterskeysAn iterable providing the key prefixes to be retrieved. Each key prefix takes the form of a tuple the length of a key, but with the last N elements 'None' rather than a regular bytestring. The first element cannot be 'None'.
ReturnsAn iterable as per iter_all_entries, but restricted to the keys with a matching prefix to those supplied. No additional keys will be returned, and every match that is in the index will be returned.
def _get_nodes_by_key(self):
Undocumented
def key_count(self):
Return an estimate of the number of keys in this index.

For InMemoryGraphIndex the estimate is exact.

def validate(self):
In memory index's have no known corruption at the moment.
API Documentation for Bazaar, generated by pydoctor at 2022-06-16 00:25:16.