b.k.KnitVersionedFiles(VersionedFilesWithFallbacks) : class documentation

Part of bzrlib.knit View In Hierarchy

Storage for many versioned files using knit compression.

Backend storage is managed by indices and data objects.

Instance Variables
  _index: A _KnitGraphIndex or similar that can describe the parents, graph, compression and data location of entries in this KnitVersionedFiles. Note that this is only the index for this vfs; if there are fallbacks they must be queried separately.
Method __init__ Create a KnitVersionedFiles with index and data_access.
Method __repr__ Undocumented
Method without_fallbacks Return a clone of this object without any fallbacks configured.
Method add_fallback_versioned_files Add a source of texts for texts not present in this knit.
Method add_lines See VersionedFiles.add_lines().
Method annotate See VersionedFiles.annotate.
Method get_annotator Undocumented
Method check See VersionedFiles.check().
Method get_parent_map Get a map of the graph parents of keys.
Method get_record_stream Get a stream of records for keys.
Method get_sha1s See VersionedFiles.get_sha1s().
Method insert_record_stream Insert a record stream into this container.
Method get_missing_compression_parent_keys Return an iterable of keys of missing compression parents.
Method iter_lines_added_or_present_in_keys Iterate over the lines in the versioned files from keys.
Method keys See VersionedFiles.keys.
Method _add_text See VersionedFiles._add_text().
Method _add Add a set of lines on top of version specified by parents.
Method _logical_check Undocumented
Method _check_add Check that version_id and lines are safe to add.
Method _check_header Undocumented
Method _check_header_version Checks the header version on original format knit records.
Method _check_should_delta Iterate back through the parent listing, looking for a fulltext.
Method _build_details_to_components Convert a build_details tuple to a position tuple.
Method _get_components_positions Produce a map of position data for the components of keys.
Method _get_content Returns a content object that makes up the specified version.
Method _get_parent_map_with_sources Get a map of the parents of keys.
Method _get_record_map Produce a dictionary of knit records.
Method _raw_map_to_record_map Parse the contents of _get_record_map_unparsed.
Method _get_record_map_unparsed Get the raw data for reconstructing keys without parsing it.
Class Method _split_by_prefix For the given keys, split them up based on their prefix.
Method _group_keys_for_io For the given keys, group them into 'best-sized' requests.
Method _get_remaining_record_stream This function is the 'retry' portion for get_record_stream.
Method _make_line_delta Generate a line delta from delta_seq and new_content.
Method _merge_annotations Merge annotations for content and generate deltas.
Method _parse_record Parse an original format knit record.
Method _parse_record_header Parse a record header for consistency.
Method _parse_record_unchecked Undocumented
Method _read_records_iter Read text records from data file and yield result.
Method _read_records_iter_raw Read text records from data file and yield raw data.
Method _read_records_iter_unchecked Read text records from data file and yield raw data.
Method _record_to_data Convert key, digest, lines into a raw data block.
Method _split_header Undocumented

Inherited from VersionedFilesWithFallbacks:

Method get_known_graph_ancestry Get a KnownGraph instance with the ancestry of keys.

Inherited from VersionedFiles (via VersionedFilesWithFallbacks):

Method add_mpdiffs Add mpdiffs to this VersionedFile.
Static Method check_not_reserved_id Undocumented
Method clear_cache Clear whatever caches this VersionedFile holds.
Method make_mpdiffs Create multiparent diffs for specified keys.
Method _check_lines_not_unicode Check that lines being added to a versioned file are not unicode.
Method _check_lines_are_lines Check that the lines really are full lines without inline EOL.
Method _extract_blocks Undocumented
Method _transitive_fallbacks Return the whole stack of fallback versionedfiles.
def __init__(self, index, data_access, max_delta_chain=200, annotated=False, reload_func=None):
Create a KnitVersionedFiles with index and data_access.
Parameters
  index: The index for the knit data.
  data_access: The access object to store and retrieve knit records.
  max_delta_chain: The maximum number of deltas to permit during insertion. Set to 0 to prohibit the use of deltas.
  annotated: Set to True to cause annotations to be calculated and stored during insertion.
  reload_func: A function that can be called if we think we need to reload the pack listing and try again. See 'bzrlib.repofmt.pack_repo.AggregateIndex' for the signature.
def __repr__(self):
Undocumented
def without_fallbacks(self):
Return a clone of this object without any fallbacks configured.
def add_fallback_versioned_files(self, a_versioned_files):
Add a source of texts for texts not present in this knit.
Parameters
  a_versioned_files: A VersionedFiles object.
def add_lines(self, key, parents, lines, parent_texts=None, left_matching_blocks=None, nostore_sha=None, random_id=False, check_content=True):
See VersionedFiles.add_lines().
def _add_text(self, key, parents, text, nostore_sha=None, random_id=False):
See VersionedFiles._add_text().
def _add(self, key, lines, parents, parent_texts, left_matching_blocks, nostore_sha, random_id, line_bytes):
Add a set of lines on top of version specified by parents.

Any versions not present will be converted into ghosts.

We pass both lines and line_bytes because different routes bring the values to this function, and for memory efficiency we don't want to have to split/join on demand.

Parameters
  lines: A list of strings where each one is a single line (has a single newline at the end of the string). This is now optional (callers can pass None); it is kept in this position for backwards compatibility. When supplied, ''.join(lines) must equal line_bytes.
  line_bytes: A single string containing the content.
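The invariant above — when both forms are supplied, the joined lines must equal line_bytes — can be sketched with a small standalone helper. This is illustrative only (normalise_content is not a bzrlib function); it shows how either form can be derived from the other.

```python
def normalise_content(lines=None, line_bytes=None):
    """Return (lines, line_bytes), deriving whichever form is missing.

    Hypothetical helper illustrating the lines/line_bytes contract:
    callers may pass either form, and when both are present they must
    agree.
    """
    if lines is None and line_bytes is None:
        raise ValueError("need lines or line_bytes")
    if lines is None:
        # splitlines(True) keeps the trailing newline on each line.
        lines = line_bytes.splitlines(True)
    if line_bytes is None:
        line_bytes = ''.join(lines)
    # The documented invariant: both forms carry identical content.
    assert ''.join(lines) == line_bytes
    return lines, line_bytes
```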
def annotate(self, key):
See VersionedFiles.annotate.
def get_annotator(self):
Undocumented
def check(self, progress_bar=None, keys=None):
See VersionedFiles.check().
def _logical_check(self):
Undocumented
def _check_add(self, key, lines, random_id, check_content):
Check that version_id and lines are safe to add.
def _check_header(self, key, line):
Undocumented
def _check_header_version(self, rec, version_id):
Checks the header version on original format knit records.

These have the last component of the key embedded in the record.

def _check_should_delta(self, parent):
Iterate back through the parent listing, looking for a fulltext.

This is used when we want to decide whether to add a delta or a new fulltext. It searches for _max_delta_chain parents. When it finds a fulltext parent, it sees if the total size of the deltas leading up to it is large enough to indicate that we want a new full text anyway.

Return True if we should create a new delta, False if we should use a full text.
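The heuristic described above can be sketched in a standalone form. This is not the exact bzrlib arithmetic — the names, the chain representation, and the size comparison are illustrative assumptions — but it captures the decision: walk back at most max_delta_chain records, and when a fulltext is reached, store a delta only if the accumulated delta sizes are still smaller than a fresh fulltext would be.

```python
def should_delta(chain, fulltext_size, max_delta_chain=200):
    """Return True to store a delta, False to store a new fulltext.

    chain: (kind, size) pairs for stored records, walking back from
    the immediate parent; the walk stops at the first fulltext.
    fulltext_size: the size a new fulltext of this version would have.
    Illustrative sketch only, not the bzrlib implementation.
    """
    if max_delta_chain == 0:
        return False                # deltas prohibited entirely
    delta_total = 0
    for i, (kind, size) in enumerate(chain):
        if i >= max_delta_chain:
            return False            # chain too long: start a new fulltext
        if kind == 'fulltext':
            # If the deltas leading up to the fulltext already cost as
            # much as a fresh fulltext, a delta buys nothing.
            return delta_total < fulltext_size
        delta_total += size
    return False                    # no fulltext found (e.g. ghost chain)
```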

def _build_details_to_components(self, build_details):
Convert a build_details tuple to a position tuple.
def _get_components_positions(self, keys, allow_missing=False):

Produce a map of position data for the components of keys.

This data is intended to be used for retrieving the knit records.

A dict of key to (record_details, index_memo, next, parents) is returned.

  • method is the way referenced data should be applied.
  • index_memo is the handle to pass to the data access to actually get the data
  • next is the build-parent of the version, or None for fulltexts.
  • parents is the version_ids of the parents of this version
Parameters
  allow_missing: If True do not raise an error on a missing component, just ignore it.
def _get_content(self, key, parent_texts={}):
Returns a content object that makes up the specified version.
def get_parent_map(self, keys):
Get a map of the graph parents of keys.
Parameters
  keys: The keys to look up parents for.
Returns
  A mapping from keys to parents. Absent keys are absent from the mapping.
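The absent-key contract — missing keys are simply left out of the result, not raised as errors — can be shown with a plain dict standing in for the index (the graph shape here is invented for illustration):

```python
def get_parent_map_sketch(graph, keys):
    """Sketch of the get_parent_map contract over a plain-dict graph:
    requested keys absent from the store are omitted from the result."""
    return {key: graph[key] for key in keys if key in graph}

# Keys in this vfs are tuples; a ghost revision is simply not mentioned.
graph = {('rev-2',): (('rev-1',),), ('rev-1',): ()}
result = get_parent_map_sketch(graph, [('rev-2',), ('ghost',)])
```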
def _get_parent_map_with_sources(self, keys):
Get a map of the parents of keys.
Parameters
  keys: The keys to look up parents for.
Returns
  A tuple. The first element is a mapping from keys to parents; absent keys are absent from the mapping. The second element is a list with the locations each key was found in: the first element is the in-this-knit parents, the second the first fallback source, and so on.
def _get_record_map(self, keys, allow_missing=False):
Produce a dictionary of knit records.
Parameters
  keys: The keys to build a map for.
  allow_missing: If some records are missing, rather than error, just return the data that could be generated.
Returns

{key:(record, record_details, digest, next)}

  • record: data returned from read_records (a KnitContent object)

  • record_details: opaque information to pass to parse_record

  • digest: SHA1 digest of the full text after all steps are done

  • next: build-parent of the version, i.e. the leftmost ancestor.

    Will be None if the record is not a delta.

def _raw_map_to_record_map(self, raw_map):
Parse the contents of _get_record_map_unparsed.
Returns
  See _get_record_map.
def _get_record_map_unparsed(self, keys, allow_missing=False):
Get the raw data for reconstructing keys without parsing it.
Returns
  A dict suitable for parsing via _raw_map_to_record_map: key -> (raw_bytes, (method, noeol), compression_parent).
@classmethod
def _split_by_prefix(cls, keys):
For the given keys, split them up based on their prefix.

To keep memory pressure somewhat under control, split the requests back into per-file-id requests, otherwise "bzr co" extracts the full tree into memory before writing it to disk. This should be revisited if _get_content_maps() can ever cross file-id boundaries.

The keys for a given file_id are kept in the same relative order. Ordering between file_ids is not, though prefix_order will return the order that the key was first seen.

Parameters
  keys: An iterable of key tuples.
Returns
  (split_map, prefix_order)
    split_map: A dictionary mapping prefix => keys.
    prefix_order: The order that we saw the various prefixes.
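The split described above is straightforward to sketch. Assuming keys are tuples whose leading elements form the prefix (the file-id part), this keeps the relative order within each prefix and records first-seen prefix order; it is an illustration, not the bzrlib implementation.

```python
def split_by_prefix_sketch(keys):
    """Group key tuples by prefix, preserving relative order within a
    prefix and recording the order prefixes were first seen."""
    split_map = {}
    prefix_order = []
    for key in keys:
        prefix = key[:-1]           # everything but the trailing component
        if prefix not in split_map:
            split_map[prefix] = []
            prefix_order.append(prefix)
        split_map[prefix].append(key)
    return split_map, prefix_order
```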
def _group_keys_for_io(self, keys, non_local_keys, positions, _min_buffer_size=_STREAM_MIN_BUFFER_SIZE):
For the given keys, group them into 'best-sized' requests.

The idea is to avoid making 1 request per file, but to never try to unpack an entire 1.5GB source tree in a single pass. Also when possible, we should try to group requests to the same pack file together.

Returns
  A list of (keys, non_local) tuples that indicate what keys should be fetched next.
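The batching idea — neither one request per key nor one giant read — can be sketched by greedily packing keys until an estimated byte count passes a minimum buffer size. This is a simplification: the real method also groups by pack file and tracks non-local keys, and the size map here is an invented stand-in for the position data.

```python
def group_keys_for_io_sketch(keys, sizes, min_buffer_size=5 * 1024 * 1024):
    """Greedily pack keys into groups of at least min_buffer_size
    estimated bytes, so requests are neither tiny nor unbounded.
    sizes: key -> estimated bytes to read for that key (illustrative)."""
    groups, current, current_size = [], [], 0
    for key in keys:
        current.append(key)
        current_size += sizes[key]
        if current_size >= min_buffer_size:
            groups.append(current)
            current, current_size = [], 0
    if current:
        groups.append(current)      # flush the final partial group
    return groups
```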
def get_record_stream(self, keys, ordering, include_delta_closure):
Get a stream of records for keys.
Parameters
  keys: The keys to include.
  ordering: Either 'unordered' or 'topological'. A topologically sorted stream has compression parents strictly before their children.
  include_delta_closure: If True then the closure across any compression parents will be included (in the opaque data).
Returns
  An iterator of ContentFactory objects, each of which is only valid until the iterator is advanced.
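The 'topological' guarantee above — compression parents strictly before their children — amounts to a topological sort over the compression-parent graph. A minimal depth-first sketch (the map shape and names are illustrative, not bzrlib API):

```python
def topo_order_sketch(compression_parent):
    """Order keys so each compression parent precedes its children.

    compression_parent: key -> its compression parent, or None for a
    fulltext. Depth-first walk; fine for small illustrative graphs.
    """
    order, seen = [], set()

    def visit(key):
        if key in seen:
            return
        seen.add(key)
        parent = compression_parent.get(key)
        if parent is not None and parent in compression_parent:
            visit(parent)           # emit the parent first
        order.append(key)

    for key in compression_parent:
        visit(key)
    return order
```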
def _get_remaining_record_stream(self, keys, ordering, include_delta_closure):
This function is the 'retry' portion for get_record_stream.
def get_sha1s(self, keys):
See VersionedFiles.get_sha1s().
def insert_record_stream(self, stream):
Insert a record stream into this container.
Parameters
  stream: A stream of records to insert.
Returns
  None
def get_missing_compression_parent_keys(self):
Return an iterable of keys of missing compression parents.

Check this after calling insert_record_stream to find out if there are any missing compression parents. If there are, the records that depend on them are not able to be inserted safely. For atomic KnitVersionedFiles built on packs, the transaction should be aborted or suspended - commit will fail at this point. Nonatomic knits will error earlier because they have no staging area to put pending entries into.

def iter_lines_added_or_present_in_keys(self, keys, pb=None):

Iterate over the lines in the versioned files from keys.

This may return lines from other keys. Each item the returned iterator yields is a tuple of a line and the text version that line is present in (not introduced in).

Ordering of results is in whatever order is most suitable for the underlying storage format.

If a progress bar is supplied, it may be used to indicate progress. The caller is responsible for cleaning up progress bars (because this is an iterator).

NOTES:
  • Lines are normalised by the underlying store: they will all have \n terminators.
  • Lines are returned in arbitrary order.
  • If a requested key did not change any lines (or didn't have any lines), it may not be mentioned at all in the result.
Parameters
  pb: Progress bar supplied by caller.
Returns
  An iterator over (line, key).
def _make_line_delta(self, delta_seq, new_content):
Generate a line delta from delta_seq and new_content.
def _merge_annotations(self, content, parents, parent_texts={}, delta=None, annotated=None, left_matching_blocks=None):
Merge annotations for content and generate deltas.

This is done by comparing the annotations based on changes to the text and generating a delta on the resulting full texts. If annotations are not being created then a simple delta is created.

def _parse_record(self, version_id, data):
Parse an original format knit record.

These have the last element of the key only present in the stored data.

def _parse_record_header(self, key, raw_data):
Parse a record header for consistency.
Returns
  The header and the decompressor stream, as (stream, header_record).
def _parse_record_unchecked(self, data):
Undocumented
def _read_records_iter(self, records):
Read text records from data file and yield result.

The result will be returned in whatever order is fastest to read, not the order requested. Also, multiple requests for the same record will only yield 1 response.

Parameters
  records: A list of (key, access_memo) entries.
Returns
  Yields (key, contents, digest) in the order read, not the order requested.
def _read_records_iter_raw(self, records):

Read text records from data file and yield raw data.

This unpacks enough of the text record to validate the id is as expected, but that's all.

Each item the iterator yields is (key, bytes, expected_sha1_of_full_text).
def _read_records_iter_unchecked(self, records):
Read text records from data file and yield raw data.

No validation is done.

Yields tuples of (key, data).

def _record_to_data(self, key, digest, lines, dense_lines=None):
Convert key, digest, lines into a raw data block.
Parameters
  key: The key of the record. Currently keys are always serialised using just the trailing component.
  dense_lines: The bytes of lines but in a denser form. For instance, if lines is a list of 1000 bytestrings each ending in \n, dense_lines may be a list with one line in it, containing all 1000 lines and their \n's. Using dense_lines when it is already known is a win because the string join to create bytes in this function spends less time resizing the final string.
Returns
  (len, a StringIO instance with the raw data ready to read).
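The dense_lines optimisation above can be illustrated in isolation: joining one pre-concatenated chunk is cheaper than joining many small lines, and both forms must produce identical bytes. The helper name is hypothetical, not bzrlib API.

```python
def record_bytes(lines, dense_lines=None):
    """Join the record content into one string, preferring the denser
    form when the caller already has it (sketch of the dense_lines idea)."""
    return ''.join(dense_lines or lines)

lines = ['line %d\n' % i for i in range(1000)]
dense = [''.join(lines)]            # same content, pre-joined into one chunk
# The denser form must be byte-for-byte identical to joining the lines.
assert record_bytes(lines) == record_bytes(lines, dense)
```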
def _split_header(self, line):
Undocumented
def keys(self):
See VersionedFiles.keys.
API Documentation for Bazaar, generated by pydoctor at 2019-07-21 00:34:56.