Part of bzrlib.knit
Manages knit index files.

The index is kept in memory and read on startup, to enable fast lookups of revision information. The cursor of the index file always points to the end, making it easy to append entries.

_cache is a cache for fast mapping from version id to an Index object.
_history is a cache for fast mapping from indexes to version ids.

The index data format is dictionary compressed when it comes to parent references; an index entry may only have parents with a lower index number. As a result, the index is topologically sorted.

Duplicate entries may be written to the index for a single version id. If this is done, the latter one completely replaces the former: this allows updates to correct version and parent information. Note that the two entries may share the delta, and that successive annotations and references MUST point to the first entry.

The index file on disc contains a header, followed by one line per knit record. The same revision can be present in an index file more than once. The first occurrence gets assigned a sequence number starting from 0.

The format of a single line is

REVISION_ID FLAGS BYTE_OFFSET LENGTH( PARENT_ID|PARENT_SEQUENCE_ID)* :

REVISION_ID is a utf8-encoded revision id.
FLAGS is a comma separated list of flags about the record. Values include no-eol, line-delta, fulltext.
BYTE_OFFSET is the ascii representation of the byte offset in the data file at which the compressed data starts.
LENGTH is the ascii representation of the length of the compressed data in the data file.
PARENT_ID is a utf-8 revision id prefixed by a '.' that is a parent of REVISION_ID.
PARENT_SEQUENCE_ID is the ascii representation of the sequence number of a revision id already in the knit that is a parent of REVISION_ID.
The ' :' marker is the end of record marker.

Partial writes: when a write to the index file is interrupted, it will result in a line that does not end in ' :'. If the ' :' is not present at the end of a line, or at the end of the file, then the record that is missing it will be ignored by the parser.

When writing new records to the index file, the data is preceded by '\n' to ensure that records always start on new lines even if the last write was interrupted. As a result it is normal for the last line in the index to be missing a trailing newline. One can be added with no harmful effects.

:ivar _kndx_cache: dict from prefix to the old state of KnitIndex objects, where prefix is e.g. the (fileid,) for .texts instances or () for constant-mapped things like .revisions, and the old state is tuple(history_vector, cache_dict). This is used to prevent having an ABI change with the C extension that reads .kndx files.
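The line format and the partial-write rule described above can be illustrated with a minimal sketch. This is not the bzrlib parser: the function name parse_kndx_lines is hypothetical, it works on text strings rather than the bytes used on disc, and it assumes the header line has already been removed.

```python
def parse_kndx_lines(lines):
    """Parse record lines (header excluded) into (history, cache).

    Hypothetical helper for illustration only.
    history maps sequence number -> revision id.
    cache maps revision id -> (flags, offset, length, parents).
    """
    history = []
    cache = {}
    for line in lines:
        fields = line.split()
        # A record needs the four fixed fields plus the ':' end marker.
        # Anything else (blank line, interrupted write) is ignored.
        if len(fields) < 5 or fields[-1] != ":":
            continue
        rev_id, flags, offset, length = fields[:4]
        parents = []
        for ref in fields[4:-1]:
            if ref.startswith("."):
                # Full revision id, written with a '.' prefix.
                parents.append(ref[1:])
            else:
                # Sequence number of an earlier entry in this index.
                parents.append(history[int(ref)])
        if rev_id not in cache:
            # Only the first occurrence gets a sequence number.
            history.append(rev_id)
        # A later duplicate entry replaces the earlier details.
        cache[rev_id] = (flags.split(","), int(offset), int(length), tuple(parents))
    return history, cache


sample = (
    "rev-a fulltext 0 100 :\n"
    "rev-b line-delta,no-eol 100 40 0 .ghost :\n"
    "rev-c fulltext 140"  # interrupted write: no ' :' marker
)
history, cache = parse_kndx_lines(sample.split("\n"))
```

In this sample the third line lacks the end marker, so only rev-a and rev-b end up in the history and cache.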
Method | __init__ | Create a _KndxIndex on transport using mapper. |
Method | add_records | Add multiple records to the index. |
Method | scan_unvalidated_index | See _KnitGraphIndex.scan_unvalidated_index. |
Method | get_missing_compression_parents | See _KnitGraphIndex.get_missing_compression_parents. |
Method | check_header | Undocumented |
Method | get_build_details | Get the method, index_memo and compression parent for keys. |
Method | get_method | Return compression method of specified key. |
Method | get_options | Return a list representing options. |
Method | find_ancestry | See CombinedGraphIndex.find_ancestry() |
Method | get_parent_map | Get a map of the parents of keys. |
Method | get_position | Return details needed to access the version. |
Method | keys | Get all the keys in the collection. |
Method | _cache_key | Cache a version record in the history array and index cache. |
Method | _check_read | Undocumented |
Method | _check_write_ok | Raise an assertion if writes are not permitted. |
Method | _init_index | Initialize an index. |
Method | _load_prefixes | Load the indices for prefixes. |
Method | _partition_keys | Turn keys into a dict of prefix:suffix_list. |
Method | _dictionary_compress | Dictionary compress keys. |
Method | _reset_cache | Undocumented |
Method | _sort_keys_by_io | Figure out an optimal order to read the records for the given keys. |
Method | _split_key | Split key into a prefix and suffix. |
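As a rough illustration of the prefix and suffix handling summarised for _split_key and _partition_keys, the sketch below assumes that keys are tuples and that the suffix is the last element of the tuple. The function names are hypothetical stand-ins, not the bzrlib methods themselves.

```python
def split_key(key):
    """Split a key tuple into (prefix, suffix).

    Assumption for this sketch: the suffix is the last element.
    """
    return key[:-1], key[-1]


def partition_keys(keys):
    """Group keys into a dict of prefix -> list of suffixes."""
    result = {}
    for key in keys:
        prefix, suffix = split_key(key)
        result.setdefault(prefix, []).append(suffix)
    return result


partitioned = partition_keys([("file-1", "rev-a"), ("file-1", "rev-b"), ("rev-c",)])
```

A one-element key such as ("rev-c",) ends up under the empty prefix (), which matches the () prefix mentioned for constant-mapped instances.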
Parameters | records | A list of tuples: (key, options, access_memo, parents). |
random_id | If True the ids being added were randomly generated and no check for existence will be performed. |
missing_compression_parents | If True the records being added are only compressed against texts already in the index (or inside records). If False the records all refer to unavailable texts (or texts inside records) as compression parents. |
Cache a version record in the history array and index cache. This is inlined into _load_data for performance. KEEP IN SYNC. (It saves 60ms, 25% of the __init__ overhead on local 4000 record indexes).
Get the method, index_memo and compression parent for keys.

Ghosts are omitted from the result.

:param keys: An iterable of keys.
:return: A dict of key:(index_memo, compression_parent, parents, record_details).
    index_memo: opaque structure to pass to read_records to extract the raw data
    compression_parent: content that this record is built upon, may be None
    parents: logical parents of this node
    record_details: extra information about the content which needs to be passed to Factory.parse_record
Parameters | keys | The keys to look up parents for. |
Returns | A mapping from keys to parents. Absent keys are absent from the mapping. |
Returns | a tuple (key, data position, size) to hand to the access logic to get the record. |
Parameters | keys | The keys to generate references to. |
Returns | A string representation of keys. keys which are present are dictionary compressed, and others are emitted as fulltext with a '.' prefix. |
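The return value described above can be sketched as follows. This is a simplified illustration rather than the bzrlib implementation: the function name dictionary_compress and the history argument (a list of revision ids in sequence-number order) are assumptions, and plain strings stand in for real keys.

```python
def dictionary_compress(keys, history):
    """Return a space-separated string of parent references.

    Hypothetical helper: keys already present in history are written as
    their sequence number, all others as the full id with a '.' prefix.
    """
    positions = {rev_id: seq for seq, rev_id in enumerate(history)}
    parts = []
    for key in keys:
        if key in positions:
            parts.append(str(positions[key]))
        else:
            parts.append("." + key)
    return " ".join(parts)


refs = dictionary_compress(["rev-a", "ghost"], ["rev-a", "rev-b"])
```

Here rev-a is already in the index and is written as its sequence number, while ghost is not and is written in full with the '.' prefix.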
Sort keys, grouped by index and sorted by position.
Parameters | keys | A list of keys whose records we want to read. This will be sorted 'in-place'. |
positions | A dict, such as the one returned by _get_components_positions(). |
Returns | None |
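A minimal sketch of such an in-place sort is shown below. The layout of positions is an assumption made for this example: each value is taken to be (method, index_memo, compression_parent), with index_memo being the (key, data position, size) tuple that get_position is documented to return. The function name is a hypothetical stand-in.

```python
def sort_keys_by_io(keys, positions):
    """Sort keys in place: grouped by prefix, then by data position.

    Assumed layout: positions[key] == (method, index_memo, parent),
    where index_memo == (key, data_position, size).
    """
    def io_order(key):
        index_memo = positions[key][1]
        record_key, data_position = index_memo[0], index_memo[1]
        # Group by the key prefix (everything but the last element),
        # then order by where the record starts in the data file.
        return record_key[:-1], data_position

    keys.sort(key=io_order)


keys = [("f1", "c"), ("f2", "a"), ("f1", "b")]
positions = {
    ("f1", "c"): ("fulltext", (("f1", "c"), 300, 10), None),
    ("f2", "a"): ("fulltext", (("f2", "a"), 0, 25), None),
    ("f1", "b"): ("line-delta", (("f1", "b"), 50, 12), ("f1", "c")),
}
result = sort_keys_by_io(keys, positions)
```

Reading records in this order keeps accesses to each data file together and moving forward through the file, which is the point of sorting by prefix and then position.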