bzrlib.btree_index.BTreeBuilder : API documentation

A Builder for B+Tree based Graph indices.

The resulting graph has the structure:

_SIGNATURE OPTIONS NODES _SIGNATURE := 'B+Tree Graph Index 1' NEWLINE OPTIONS := REF_LISTS KEY_ELEMENTS LENGTH REF_LISTS := 'node_ref_lists=' DIGITS NEWLINE KEY_ELEMENTS := 'key_elements=' DIGITS NEWLINE LENGTH := 'len=' DIGITS NEWLINE ROW_LENGTHS := 'row_lengths' DIGITS (COMMA DIGITS)* NODES := NODE_COMPRESSED* NODE_COMPRESSED:= COMPRESSED_BYTES{4096} NODE_RAW := INTERNAL | LEAF INTERNAL := INTERNAL_FLAG POINTERS LEAF := LEAF_FLAG ROWS KEY_ELEMENT := Not-whitespace-utf8 KEY := KEY_ELEMENT (NULL KEY_ELEMENT)* ROWS := ROW* ROW := KEY NULL ABSENT? NULL REFERENCES NULL VALUE NEWLINE ABSENT := 'a' REFERENCES := REFERENCE_LIST (TAB REFERENCE_LIST){node_ref_lists - 1} REFERENCE_LIST := (REFERENCE (CR REFERENCE)*)? REFERENCE := KEY VALUE := no-newline-no-null-bytes

Method	__init__	See GraphIndexBuilder.__init__.
Method	add_node	Add a node to the index.
Method	add_nodes	Add nodes to the index.
Method	finish	Finalise the index.
Method	iter_all_entries	Iterate over all keys within the index
Method	iter_entries	Iterate over keys within the index.
Method	iter_entries_prefix	Iterate over keys within the index using prefix matching.
Method	key_count	Return an estimate of the number of keys in this index.
Method	validate	In memory index's have no known corruption at the moment.
Method	_spill_mem_keys_to_disk	Write the in memory keys down to disk to cap memory consumption.
Method	_spill_mem_keys_without_combining	Undocumented
Method	_spill_mem_keys_and_combine	Undocumented
Method	_iter_mem_nodes	Iterate over the nodes held in memory.
Method	_iter_smallest	Undocumented
Method	_add_key	Add a key to the current chunk.
Method	_write_nodes	Write node_iterator out as a B+Tree.
Method	_get_nodes_by_key	Undocumented

Inherited from GraphIndexBuilder:

Method	clear_cache	See GraphIndex.clear_cache()
Method	set_optimize	Change how the builder tries to optimize the result.
Method	find_ancestry	See CombinedGraphIndex.find_ancestry()
Method	_check_key	Raise BadIndexKey if key is not a valid key for this index.
Method	_external_references	Return references that are not present in this index.
Method	_update_nodes_by_key	Update the _nodes_by_key dict with a new key.
Method	_check_key_ref_value	Check that 'key' and 'references' are all valid.

def add_node(self, key, value, references=()):

overrides bzrlib.index.GraphIndexBuilder.add_node

Add a node to the index.

If adding the node causes the builder to reach its spill_at threshold, disk spilling will be triggered.

Parameters	key	The key. keys are non-empty tuples containing as many whitespace-free utf8 bytestrings as the key length defined for this index.
	references	An iterable of iterables of keys. Each is a reference to another key.
	value	The value to associate with the key. It may be any bytes as long as it does not contain 0 or n.

def _spill_mem_keys_to_disk(self):

Write the in memory keys down to disk to cap memory consumption.

If we already have some keys written to disk, we will combine them so as to preserve the sorted order. The algorithm for combining uses powers of two. So on the first spill, write all mem nodes into a single index. On the second spill, combine the mem nodes with the nodes on disk to create a 2x sized disk index and get rid of the first index. On the third spill, create a single new disk index, which will contain the mem nodes, and preserve the existing 2x sized index. On the fourth, combine mem with the first and second indexes, creating a new one of size 4x. On the fifth create a single new one, etc.

def _add_key(self, string_key, line, rows, allow_optimize=True):

Add a key to the current chunk.

Parameters	string_key	The key to add.
	line	The fully serialised key and value.
	allow_optimize	If set to False, prevent setting the optimize flag when writing out. This is used by the _spill_mem_keys_to_disk functionality.

def _write_nodes(self, node_iterator, allow_optimize=True):

Write node_iterator out as a B+Tree.

Parameters	node_iterator	An iterator of sorted nodes. Each node should match the output given by iter_all_entries.
	allow_optimize	If set to False, prevent setting the optimize flag when writing out. This is used by the _spill_mem_keys_to_disk functionality.
Returns	A file handle for a temporary file containing a B+Tree for the nodes.

def iter_entries(self, keys):

Iterate over keys within the index.

Parameters	keys	An iterable providing the keys to be retrieved.
Returns	An iterable of (index, key, value, reference_lists). There is no defined order for the result iteration - it will be in the most efficient order for the index (keys iteration order in this case).

def iter_entries_prefix(self, keys):

Iterate over keys within the index using prefix matching.

Prefix matching is applied within the tuple of a key, not to within the bytestring of each key element. e.g. if you have the keys ('foo', 'bar'), ('foobar', 'gam') and do a prefix search for ('foo', None) then only the former key is returned.

Parameters	keys	An iterable providing the key prefixes to be retrieved. Each key prefix takes the form of a tuple the length of a key, but with the last N elements 'None' rather than a regular bytestring. The first element cannot be 'None'.
Returns	An iterable as per iter_all_entries, but restricted to the keys with a matching prefix to those supplied. No additional keys will be returned, and every match that is in the index will be returned.

b.b.BTreeBuilder(index.GraphIndexBuilder) : class documentation