b.d.DirState(object) : class documentation

Part of bzrlib.dirstate

Known subclasses: bzrlib.tests.test_dirstate.InstrumentedDirState

Record directory and metadata state for fast access.

A dirstate is a specialised data structure for managing local working tree state information. It is not yet well defined whether it is platform specific and, if it is, how we detect or parameterize that.

Dirstates use the usual lock_write, lock_read and unlock mechanisms. Unlike most bzr disk formats, DirStates must be locked for reading, using lock_read. (This is an os file lock internally.) This is necessary because the file can be rewritten in place.

DirStates must be explicitly written with save() to commit changes; just unlocking them does not write the changes to disk.
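The lock, save, unlock protocol described above can be illustrated with a toy stand-in. This is a minimal sketch, not bzrlib code: the class `ToyDirState`, its list-based "disk", and the choice to drop unsaved state on unlock are all assumptions made for illustration. Only the method names `lock_write`, `add`, `save` and `unlock` come from this page.

```python
class ToyDirState:
    """Toy stand-in (NOT bzrlib) for the lock/save/unlock protocol."""

    def __init__(self):
        self.on_disk = []        # pretend file contents
        self._in_memory = None   # working copy, valid only while locked
        self._lock = None

    def lock_write(self):
        if self._lock is not None:
            raise RuntimeError("already locked")
        self._lock = "w"
        self._in_memory = list(self.on_disk)

    def add(self, path):
        if self._lock != "w":
            raise RuntimeError("write lock required")
        self._in_memory.append(path)

    def save(self):
        if self._lock != "w":
            raise RuntimeError("write lock required")
        self.on_disk = list(self._in_memory)

    def unlock(self):
        if self._lock is None:
            raise RuntimeError("not locked")
        # Unlocking never writes; unsaved changes are simply dropped
        # (an assumption of this toy model).
        self._lock = None
        self._in_memory = None


state = ToyDirState()

# Change made under a write lock but never saved: nothing reaches "disk".
state.lock_write()
try:
    state.add("foo")
finally:
    state.unlock()
lost = list(state.on_disk)

# Same change followed by an explicit save().
state.lock_write()
try:
    state.add("foo")
    state.save()
finally:
    state.unlock()
kept = list(state.on_disk)
```

The try/finally shape makes sure the lock is released even if the update fails, while the explicit `save()` call is what commits the change.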

Method __init__ Create a DirState object.
Method __repr__ Undocumented
Method add Add a path to be tracked.
Static Method from_tree Create a dirstate from a bzr Tree.
Method update_by_delta Apply an inventory delta to the dirstate for tree 0
Method update_basis_by_delta Update the parents of this tree after a commit.
Method get_ghosts Return a list of the parent tree revision ids that are ghosts.
Method get_lines Serialise the entire dirstate to a sequence of lines.
Method get_parent_ids Return a list of the parent tree ids for the directory state.
Class Method initialize Create a new dirstate on path.
Class Method on_file Construct a DirState on the file at path "path".
Method sha1_from_stat Find a sha1 given a stat lookup.
Method save Save any pending changes created during this session.
Method set_path_id Change the id of path to new_id in the current working tree.
Method set_parent_trees Set the parent trees for the dirstate.
Method set_state_from_inventory Set new_inv as the current state.
Method set_state_from_scratch Wipe the currently stored state and set it to something new.
Method update_minimal Update an entry to the state in tree 0.
Method lock_read Acquire a read lock on the dirstate.
Method lock_write Acquire a write lock on the dirstate.
Method unlock Drop any locks held on the dirstate.
Method _mark_modified Mark this dirstate as modified.
Method _mark_unmodified Mark this dirstate as unmodified.
Method _bisect Bisect through the disk structure for specific rows.
Method _bisect_dirblocks Bisect through the disk structure to find entries in given dirs.
Method _bisect_recursive Bisect for entries for all paths and their children.
Method _discard_merge_parents Discard any parent trees beyond the first.
Method _empty_parent_info Undocumented
Method _ensure_block Ensure a block for dirname exists.
Method _entries_to_current_state Load new_entries into self.dirblocks.
Method _split_root_dirblock_into_contents Split the root dirblocks into root and contents-of-root.
Method _entries_for_path Return a list with all the entries that match path for all ids.
Method _entry_to_line Serialize entry to a NULL delimited line ready for _get_output_lines.
Method _fields_per_entry How many null separated fields should be in each entry row.
Method _find_block Return the block that key should be present in.
Method _find_block_index_from_key Find the dirblock index for a key.
Method _find_entry_index Find the entry index for a key in a block.
Method _check_delta_is_valid Undocumented
Method _apply_removals Undocumented
Method _apply_insertions Undocumented
Method _check_delta_ids_absent Check that none of the file_ids in new_ids are present in a tree.
Method _raise_invalid Undocumented
Method _update_basis_apply_adds Apply a sequence of adds to tree 1 during update_basis_by_delta.
Method _update_basis_apply_changes Apply a sequence of changes to tree 1 during update_basis_by_delta.
Method _update_basis_apply_deletes Apply a sequence of deletes to tree 1 during update_basis_by_delta.
Method _after_delta_check_parents Check that parents required by the delta are all intact.
Method _observed_sha1 Note the sha1 of a file.
Method _sha_cutoff_time Return cutoff time.
Method _lstat Return the os.lstat value for this path.
Method _sha1_file_and_mutter Undocumented
Method _is_executable Is this file executable?
Method _is_executable_win32 On win32 the executable bit is stored in the dirstate.
Method _read_link Read the target of a symlink
Method _get_ghosts_line Create a line for the state file for ghost information.
Method _get_parents_line Create a line for the state file for parents information.
Method _get_entry_lines Create lines for entries.
Method _get_fields_to_entry Get a function which converts entry fields into an entry record.
Method _get_block_entry_index Get the coordinates for a path in the state structure.
Method _get_entry Get the dirstate entry for path in tree tree_index.
Static Method _inv_entry_to_details Convert an inventory entry (from a revision tree) to state details.
Method _iter_child_entries Iterate over all the entries that are children of path_utf8.
Method _iter_entries Iterate over all the entries in the dirstate.
Method _get_id_index Get an id index of self._dirblocks.
Method _add_to_id_index Add this entry to the _id_index mapping.
Method _remove_from_id_index Remove this entry from the _id_index mapping.
Method _get_output_lines Format lines for final output.
Method _make_deleted_row Return a deleted row for fileid_utf8.
Method _num_present_parents The number of parent entries in each record row.
Method _read_dirblocks_if_needed Read in all the dirblocks from the file if they are not in memory.
Method _read_header This reads in the metadata header, and the parent ids.
Method _read_header_if_needed Read the header of the dirstate file if needed.
Method _read_prelude Read in the prelude header of the dirstate file.
Method _get_packed_stat_index Get a packed_stat index of self._dirblocks.
Method _maybe_fdatasync Flush to disk if possible and if not configured off.
Method _worth_saving Is it worth saving the dirstate or not?
Method _set_data Set the full dirstate data in memory.
Method _sort_entries Given a list of entries, sort them into the right order.
Method _make_absent Mark current_old - an entry - as absent for tree 0.
Method _maybe_remove_row Remove index if it is absent or relocated across the row.
Method _validate Check that invariants on the dirblock are correct.
Method _wipe_state Forget all state information about the dirstate.
Method _requires_lock Check that a lock is currently held by someone on the dirstate.
def __init__(self, path, sha1_provider, worth_saving_limit=0):
Create a DirState object.
Parameters:
    path: The path at which the dirstate file on disk should live.
    sha1_provider: An object meeting the SHA1Provider interface.
    worth_saving_limit: When the exact number of hash-changed entries is known, only bother saving the dirstate if more than this count of entries have changed. -1 means never save hash changes; 0 means always save hash changes.
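The worth_saving_limit rule can be written out as a small predicate. This is a sketch of the documented semantics only, not the body of _worth_saving; the function name `worth_saving` and its arguments are hypothetical, and it models only the case where hash changes are the sole modification.

```python
def worth_saving(hash_changed_count, worth_saving_limit=0):
    """Decide whether hash-only changes justify rewriting the dirstate.

    Hypothetical helper following the parameter description:
    -1 means never save hash changes, 0 means save as soon as any
    entry changed, N means save only if more than N entries changed.
    """
    if worth_saving_limit == -1:
        return False
    return hash_changed_count > worth_saving_limit


decisions = {
    "limit 0, 1 change": worth_saving(1, 0),
    "limit -1, 5 changes": worth_saving(5, -1),
    "limit 10, 10 changes": worth_saving(10, 10),
    "limit 10, 11 changes": worth_saving(11, 10),
}
```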
def __repr__(self):
Undocumented
def _mark_modified(self, hash_changed_entries=None, header_modified=False):
Mark this dirstate as modified.
Parameters:
    hash_changed_entries: If non-None, mark just these entries as having their hash modified.
    header_modified: Mark the header modified as well, not just the dirblocks.
def _mark_unmodified(self):
Mark this dirstate as unmodified.
def add(self, path, file_id, kind, stat, fingerprint):
Add a path to be tracked.
Parameters:
    path: The path within the dirstate: '' is the root, 'foo' is the path foo within the root, 'foo/bar' is the path bar within foo within the root.
    file_id: The file id of the path being added.
    kind: The kind of the path, as a string like 'file', 'directory', etc.
    stat: The output of os.lstat for the path.
    fingerprint: The sha value of the file's canonical form (i.e. after any read filters have been applied), or the target of a symlink, or the referenced revision id for tree-references, or '' for directories.
def _bisect(self, paths):
Bisect through the disk structure for specific rows.
Parameters:
    paths: A list of paths to find.
Returns: A dict mapping path => entries for found entries. Missing entries will not be in the map. The list is not sorted, and entries will be populated based on when they were read.
def _bisect_dirblocks(self, dir_list):
Bisect through the disk structure to find entries in given dirs.

_bisect_dirblocks is meant to find the contents of directories, which differs from _bisect, which only finds individual entries.

Parameters:
    dir_list: A sorted list of directory names, e.g. ['', 'dir', 'foo'].
Returns: A map from dir => entries_for_dir.
def _bisect_recursive(self, paths):
Bisect for entries for all paths and their children.

This will use bisect to find all records for the supplied paths. It will then continue to bisect for any records which are marked as directories. (and renames?)

Parameters:
    paths: A sorted list of (dir, name) pairs, e.g. [('', 'a'), ('', 'f'), ('a/b', 'c')].
Returns: A dictionary mapping (dir, name, file_id) => [tree_info].
def _discard_merge_parents(self):
Discard any parent trees beyond the first.

Note that if this fails the dirstate is corrupted.

After this function returns the dirstate contains 2 trees, neither of which is ghosted.

def _empty_parent_info(self):
Undocumented
def _ensure_block(self, parent_block_index, parent_row_index, dirname):
Ensure a block for dirname exists.

This function exists to let callers which know that there is a directory dirname ensure that the block for it exists. This block can fail to exist because of demand loading, or because a directory had no children. In either case it is not an error. It is however an error to call this if there is no parent entry for the directory, and thus the function requires the coordinates of such an entry to be provided.

The root row is special cased and can be indicated with a parent block and row index of -1

Parameters:
    parent_block_index: The index of the block in which dirname's row exists.
    parent_row_index: The index in the parent block where the row exists.
    dirname: The utf8 dirname to ensure there is a block for.
Returns: The index for the block.
def _entries_to_current_state(self, new_entries):
Load new_entries into self.dirblocks.

Process new_entries into the current state object, making them the active state. The entries are grouped together by directory to form dirblocks.

Parameters:
    new_entries: A sorted list of entries. This function does not sort, to prevent unneeded overhead when callers have a sorted list already.
Returns: Nothing.
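Grouping already-sorted entries by directory can be sketched with itertools.groupby. This is only an illustration of the idea, not the method's implementation: the (key, details) entry shape with key = (dirname, basename, file_id) is assumed from the keys shown elsewhere on this page, and the special handling of the root block is ignored.

```python
from itertools import groupby


def group_into_dirblocks(sorted_entries):
    """Group (key, details) entries by key[0], the directory name.

    Hypothetical helper. The input must already be sorted so that all
    entries of one directory are adjacent; groupby does not sort.
    """
    return [
        (dirname, list(group))
        for dirname, group in groupby(sorted_entries, key=lambda e: e[0][0])
    ]


entries = [
    (("", "a", "a-id"), ["details"]),
    (("", "b", "b-id"), ["details"]),
    (("a", "c", "c-id"), ["details"]),
]
dirblocks = group_into_dirblocks(entries)
```

Because groupby only merges adjacent items, passing unsorted input would produce several blocks for the same directory, which is why the caller is required to supply a sorted list.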
def _split_root_dirblock_into_contents(self):
Split the root dirblocks into root and contents-of-root.

After parsing by path, we end up with root entries and contents-of-root entries in the same block. This loop splits them out again.

def _entries_for_path(self, path):
Return a list with all the entries that match path for all ids.
def _entry_to_line(self, entry):
Serialize entry to a NULL delimited line ready for _get_output_lines.
Parameters:
    entry: An entry_tuple as defined in the module docstring.
def _fields_per_entry(self):
How many null separated fields should be in each entry row.

Each line now has an extra '\n' field which is not used, so we just skip over it.

entry size:
    3 fields for the key
    + number of fields per tree_data (5) * tree count
    + newline
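
The entry size arithmetic above can be worked through as follows. This is a sketch; the function name is hypothetical, and treating the tree count as one working tree plus the number of present parents is an assumption not stated on this page.

```python
def fields_per_entry(num_present_parents):
    """Number of null-separated fields in one entry row (sketch)."""
    tree_count = 1 + num_present_parents  # assumed: working tree + parents
    key_fields = 3                        # fields for the key
    fields_per_tree = 5                   # fields per tree_data
    newline_field = 1                     # the unused trailing '\n' field
    return key_fields + fields_per_tree * tree_count + newline_field


no_parents = fields_per_entry(0)   # 3 + 5 * 1 + 1
one_parent = fields_per_entry(1)   # 3 + 5 * 2 + 1
```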
 
def _find_block(self, key, add_if_missing=False):
Return the block that key should be present in.
Parameters:
    key: A dirstate entry key.
Returns: The block tuple.
def _find_block_index_from_key(self, key):
Find the dirblock index for a key.
Returns: A pair (block index, present), where present is True if the block for the key is present.
def _find_entry_index(self, key, block):
Find the entry index for a key in a block.
Returns: A pair (entry index, present), where present is True if the entry for the key is present.
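The "index plus present flag" return shape used by the two lookups above is the usual result of a bisect search over a sorted sequence. A minimal sketch, assuming plain tuple ordering of the keys (the real dirstate ordering may differ) and a hypothetical helper name:

```python
from bisect import bisect_left


def find_index(sorted_keys, key):
    """Return (index, present) for key in a sorted list of keys.

    When present is False, index is the position at which key would
    have to be inserted to keep the list sorted.
    """
    index = bisect_left(sorted_keys, key)
    present = index < len(sorted_keys) and sorted_keys[index] == key
    return index, present


keys = [("", "a", "a-id"), ("", "c", "c-id")]
hit = find_index(keys, ("", "a", "a-id"))
miss = find_index(keys, ("", "b", "b-id"))
past_end = find_index(keys, ("", "z", "z-id"))
```

Returning the insertion point on a miss lets the caller add the missing item without searching a second time.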
@staticmethod
def from_tree(tree, dir_state_filename, sha1_provider=None):
Create a dirstate from a bzr Tree.
Parameters:
    tree: The tree which should provide parent information and inventory ids.
    sha1_provider: An object meeting the SHA1Provider interface. If None, a DefaultSHA1Provider is used.
Returns: A DirState object which is currently locked for writing (it was locked by DirState.initialize).
def _check_delta_is_valid(self, delta):
Undocumented
def update_by_delta(self, delta):
Apply an inventory delta to the dirstate for tree 0

This is the workhorse for apply_inventory_delta in dirstate based trees.

Parameters:
    delta: An inventory delta. See Inventory.apply_delta for details.
def _apply_removals(self, removals):
Undocumented
def _apply_insertions(self, adds):
Undocumented
def update_basis_by_delta(self, delta, new_revid):
Update the parents of this tree after a commit.

This gives the tree one parent, with revision id new_revid. The inventory delta is applied to the current basis tree to generate the inventory for the parent new_revid, and all other parent trees are discarded.

Note that an exception during the operation of this method will leave the dirstate in a corrupt state where it should not be saved.

Parameters:
    new_revid: The new revision id for the tree's parent.
    delta: An inventory delta (see apply_inventory_delta) describing the changes from the current left-most parent revision to new_revid.
def _check_delta_ids_absent(self, new_ids, delta, tree_index):
Check that none of the file_ids in new_ids are present in a tree.
def _raise_invalid(self, path, file_id, reason):
Undocumented
def _update_basis_apply_adds(self, adds):
Apply a sequence of adds to tree 1 during update_basis_by_delta.

They may be adds, or renames that have been split into add/delete pairs.

Parameters:
    adds: A sequence of adds. Each add is a tuple: (None, new_path_utf8, file_id, (entry_details), real_add). real_add is False when the add is the second half of a remove-and-reinsert pair created to handle renames and deletes.
def _update_basis_apply_changes(self, changes):
Apply a sequence of changes to tree 1 during update_basis_by_delta.
Parameters:
    changes: A sequence of changes. Each change is a tuple: (path_utf8, path_utf8, file_id, (entry_details)).
def _update_basis_apply_deletes(self, deletes):
Apply a sequence of deletes to tree 1 during update_basis_by_delta.

They may be deletes, or renames that have been split into add/delete pairs.

Parameters:
    deletes: A sequence of deletes. Each delete is a tuple: (old_path_utf8, new_path_utf8, file_id, None, real_delete). real_delete is True when the desired outcome is an actual deletion rather than the rename handling logic temporarily deleting a path during the replacement of a parent.
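The tuple shapes documented for the three _update_basis_apply_* helpers suggest how one delta item could be routed to them. The following is a sketch under stated assumptions: the function `split_delta_item`, its argument list, and the classification rules are hypothetical and are not taken from update_basis_by_delta itself; only the tuple layouts follow the parameter descriptions above.

```python
def split_delta_item(old_path, new_path, file_id, details):
    """Classify one (old_path, new_path, file_id, details) item.

    Returns (adds, changes, deletes) lists holding tuples in the
    documented shapes. A rename becomes a delete/add pair whose
    real_delete and real_add flags are both False.
    """
    adds, changes, deletes = [], [], []
    if old_path is None:
        # Genuine add: nothing existed at this id before.
        adds.append((None, new_path, file_id, details, True))
    elif new_path is None:
        # Genuine delete: the id is going away.
        deletes.append((old_path, None, file_id, None, True))
    elif old_path != new_path:
        # Rename: remove from the old path, reinsert at the new one.
        deletes.append((old_path, new_path, file_id, None, False))
        adds.append((None, new_path, file_id, details, False))
    else:
        # Same path, changed details.
        changes.append((old_path, new_path, file_id, details))
    return adds, changes, deletes


rename = split_delta_item("old", "new", "an-id", ("f", "sha"))
removal = split_delta_item("gone", None, "other-id", None)
```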
def _after_delta_check_parents(self, parents, index):
Check that parents required by the delta are all intact.
Parameters:
    parents: An iterable of (path_utf8, file_id) tuples which are required to be present in tree 'index' at path_utf8 with id file_id and be a directory.
    index: The column in the dirstate to check for parents in.
def _observed_sha1(self, entry, sha1, stat_value, _stat_to_minikind=_stat_to_minikind):
Note the sha1 of a file.
Parameters:
    entry: The entry the sha1 is for.
    sha1: The observed sha1.
    stat_value: The os.lstat for the file.
def _sha_cutoff_time(self):
Return cutoff time.

Files modified more recently than this time are at risk of being undetectably modified and so can't be cached.
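
The cutoff rule can be sketched as a pair of small functions. This is an illustration only: the 3-second window, the helper names, and the strict comparison against both mtime and ctime are assumptions rather than documented behaviour.

```python
import time

CUTOFF_WINDOW_SECONDS = 3  # assumed window length, for illustration


def sha_cutoff_time(now=None):
    """Return a timestamp; files touched after it are too fresh to trust."""
    if now is None:
        now = time.time()
    return int(now) - CUTOFF_WINDOW_SECONDS


def can_cache_sha1(mtime, ctime, cutoff):
    """A file's sha1 is cacheable only if it was last touched before cutoff.

    A file modified within the window could be modified again without
    its stat value visibly changing, so its hash is not cached.
    """
    return mtime < cutoff and ctime < cutoff


cutoff = sha_cutoff_time(now=1000.5)
old_file_ok = can_cache_sha1(900, 900, cutoff)
fresh_file_ok = can_cache_sha1(999, 900, cutoff)
```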

def _lstat(self, abspath, entry):
Return the os.lstat value for this path.
def _sha1_file_and_mutter(self, abspath):
Undocumented
def _is_executable(self, mode, old_executable):
Is this file executable?
def _is_executable_win32(self, mode, old_executable):
On win32 the executable bit is stored in the dirstate.
def _read_link(self, abspath, old_link):
Read the target of a symlink
def get_ghosts(self):
Return a list of the parent tree revision ids that are ghosts.
def get_lines(self):
Serialise the entire dirstate to a sequence of lines.
def _get_ghosts_line(self, ghost_ids):
Create a line for the state file for ghost information.
def _get_parents_line(self, parent_ids):
Create a line for the state file for parents information.
def _get_entry_lines(self):
Create lines for entries.
def _get_fields_to_entry(self):
Get a function which converts entry fields into an entry record.

This handles size and executable, as well as parent records.

Returns: A function which takes a list of fields, and returns an appropriate record for storing in memory.
def get_parent_ids(self):
Return a list of the parent tree ids for the directory state.
def _get_block_entry_index(self, dirname, basename, tree_index):
Get the coordinates for a path in the state structure.
Parameters:
    dirname: The utf8 dirname to look up.
    basename: The utf8 basename to look up.
    tree_index: The index of the tree for which this lookup should be attempted.
Returns: A tuple describing where the path is located, or should be inserted. The tuple contains four fields: the block index, the row index, whether the directory is present (boolean), and whether the entire path is present (boolean). There is no guarantee that either coordinate is currently reachable unless the found field for it is True. For instance, a directory not present in the searched tree may be returned with a value one greater than the current highest block offset. The directory present field will always be True when the path present field is True. The directory present field does NOT indicate that the directory is present in the searched tree; rather, it indicates that there are at least some files in some tree present there.
def _get_entry(self, tree_index, fileid_utf8=None, path_utf8=None, include_deleted=False):
Get the dirstate entry for path in tree tree_index.

If either file_id or path is supplied, it is used as the key to look up. If both are supplied, the fastest lookup is used, and an error is raised if they do not both point at the same row.

Parameters:
    tree_index: The index of the tree we wish to locate this path in. If the path is present in that tree, the entry containing its details is returned; otherwise (None, None) is returned. 0 is the working tree; higher indexes are successive parent trees.
    fileid_utf8: A utf8 file_id to look up.
    path_utf8: A utf8 path to be looked up.
    include_deleted: If True, and performing a lookup via fileid_utf8 rather than path_utf8, return an entry for deleted (absent) paths.
Returns: The dirstate entry tuple for path, or (None, None).
@classmethod
def initialize(cls, path, sha1_provider=None):
Create a new dirstate on path.

The new dirstate will be an empty tree - that is it has no parents, and only a root node - which has id ROOT_ID.

Parameters:
    path: The name of the file for the dirstate.
    sha1_provider: An object meeting the SHA1Provider interface. If None, a DefaultSHA1Provider is used.
Returns: A write-locked DirState object.
@staticmethod
def _inv_entry_to_details(inv_entry):
Convert an inventory entry (from a revision tree) to state details.
Parameters:
    inv_entry: An inventory entry whose sha1 and link targets can be relied upon, and which has a revision set.
Returns: A details tuple: the details for a single tree at a path + id.
def _iter_child_entries(self, tree_index, path_utf8):
Iterate over all the entries that are children of path_utf8.

This only returns entries that are present (not in 'a', 'r') in tree_index. tree_index data is not refreshed, so if tree 0 is used, results may differ from those obtained if paths were statted to determine which ones were directories.

Asking for the children of a non-directory will return an empty iterator.

def _iter_entries(self):
Iterate over all the entries in the dirstate.

Each yielded item is an entry in the standard format described in the docstring of bzrlib.dirstate.

def _get_id_index(self):
Get an id index of self._dirblocks.

This maps from file_id => [(directory, name, file_id)] entries where that file_id appears in one of the trees.
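
The mapping described above can be sketched as a dictionary built in one pass over the dirblocks. This is an illustration, not the method's implementation: the helper name is hypothetical, and the dirblock shape (dirname, [(key, details), ...]) is assumed from the descriptions elsewhere on this page.

```python
def build_id_index(dirblocks):
    """Map file_id => list of (directory, name, file_id) keys (sketch)."""
    id_index = {}
    for _dirname, entries in dirblocks:
        for key, _details in entries:
            file_id = key[2]
            id_index.setdefault(file_id, []).append(key)
    return id_index


# One id can appear under several paths, e.g. when a file was renamed
# between a parent tree and the working tree.
dirblocks = [
    ("", [(("", "new-name", "file-id"), []), (("", "other", "other-id"), [])]),
    ("sub", [(("sub", "old-name", "file-id"), [])]),
]
id_index = build_id_index(dirblocks)
```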

def _add_to_id_index(self, id_index, entry_key):
Add this entry to the _id_index mapping.
def _remove_from_id_index(self, id_index, entry_key):
Remove this entry from the _id_index mapping.

It is a programming error to call this when the entry_key is not already present.

def _get_output_lines(self, lines):
Format lines for final output.
Parameters:
    lines: A sequence of lines containing the parents list and the path lines.
def _make_deleted_row(self, fileid_utf8, parents):
Return a deleted row for fileid_utf8.
def _num_present_parents(self):
The number of parent entries in each record row.
@classmethod
def on_file(cls, path, sha1_provider=None, worth_saving_limit=0):
Construct a DirState on the file at path "path".
Parameters:
    path: The path at which the dirstate file on disk should live.
    sha1_provider: An object meeting the SHA1Provider interface. If None, a DefaultSHA1Provider is used.
    worth_saving_limit: When the exact number of hash-changed entries is known, only bother saving the dirstate if more than this count of entries have changed. -1 means never save.
Returns: An unlocked DirState object, associated with the given path.
def _read_dirblocks_if_needed(self):
Read in all the dirblocks from the file if they are not in memory.

This populates self._dirblocks, and sets self._dirblock_state to IN_MEMORY_UNMODIFIED. It is not currently ready for incremental block loading.

def _read_header(self):
This reads in the metadata header, and the parent ids.

After reading in, the file should be positioned at the null just before the start of the first record in the file.

Returns: (expected crc checksum, number of entries, parent list)
def _read_header_if_needed(self):
Read the header of the dirstate file if needed.
def _read_prelude(self):
Read in the prelude header of the dirstate file.

This only reads in the stuff that is not connected to the crc checksum. The position will be correct to read in the rest of the file and check the checksum after this point. The next entry in the file should be the number of parents and their ids, followed by a newline.

def sha1_from_stat(self, path, stat_result):
Find a sha1 given a stat lookup.
def _get_packed_stat_index(self):
Get a packed_stat index of self._dirblocks.
def save(self):
Save any pending changes created during this session.

We reuse the existing file, because that prevents race conditions with file creation, and use OS locks on it to prevent concurrent modification and reads, because dirstate's incremental data aggregation is not compatible with reading a modified file, and replacing a file in use by another process is impossible on Windows.

A dirstate in read-only mode should be smart enough, though, to validate that the file has not changed, and otherwise discard its cache and start over. This allows for fine-grained read lock duration so that, for example, 'status' won't block 'commit'.

def _maybe_fdatasync(self):
Flush to disk if possible and if not configured off.
def _worth_saving(self):
Is it worth saving the dirstate or not?
def _set_data(self, parent_ids, dirblocks):
Set the full dirstate data in memory.

This is an internal function used to completely replace the object's in-memory state. It puts the dirstate into state 'full-dirty'.

Parameters:
    parent_ids: A list of parent tree revision ids.
    dirblocks: A list containing one tuple for each directory in the tree. Each tuple contains the directory path and a list of entries found in that directory.
def set_path_id(self, path, new_id):
Change the id of path to new_id in the current working tree.
Parameters:
    path: The path inside the tree to set: '' is the root, 'foo' is the path foo in the root.
    new_id: The new id to assign to the path. This must be a utf8 file id (not unicode, and not None).
def set_parent_trees(self, trees, ghosts):
Set the parent trees for the dirstate.
Parameters:
    trees: A list of (revision_id, tree) tuples. tree must be provided even if the revision_id refers to a ghost: supply an empty tree in this case.
    ghosts: A list of the revision_ids that are ghosts at the time of setting.
def _sort_entries(self, entry_list):
Given a list of entries, sort them into the right order.

This is done when constructing a new dirstate from trees - normally we try to keep everything in sorted blocks all the time, but sometimes it's easier to sort after the fact.
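
What "the right order" might look like can be sketched with a sort key that compares directories component by component rather than as raw strings. This ordering is an assumption made for illustration and is not stated on this page; the helper name and the (key, details) entry shape are likewise hypothetical.

```python
def sort_key(entry):
    """Order by directory components, then basename, then file id."""
    dirname, basename, file_id = entry[0]
    return dirname.split("/"), basename, file_id


entries = [
    (("a-b", "x", "id3"), []),
    (("a/b", "y", "id2"), []),
    (("a", "b", "id1"), []),
]

# Component-wise ordering keeps a directory's subtree together.
by_components = [e[0][0] for e in sorted(entries, key=sort_key)]

# Plain string ordering lets 'a-b' sort between 'a' and 'a/b',
# because '-' compares lower than '/'.
by_string = [e[0][0] for e in sorted(entries, key=lambda e: e[0])]
```

The difference only shows up when a sibling directory name contains a character that sorts below '/', which is why a naive string sort can appear correct on simple trees.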

def set_state_from_inventory(self, new_inv):
Set new_inv as the current state.

This API is called by tree transform, and will usually occur with existing parent trees.

Parameters:
    new_inv: The inventory object to set current state from.
def set_state_from_scratch(self, working_inv, parent_trees, parent_ghosts):
Wipe the currently stored state and set it to something new.

This is a hard-reset for the data we are working with.

def _make_absent(self, current_old):
Mark current_old - an entry - as absent for tree 0.
Returns: True if this was the last details entry for the entry key: that is, if the underlying block has had the entry removed, thus shrinking in length.
def update_minimal(self, key, minikind, executable=False, fingerprint='', packed_stat=None, size=0, path_utf8=None, fullscan=False):
Update an entry to the state in tree 0.

This will either create a new entry at 'key' or update an existing one. It also makes sure that any other records which might mention this are updated as well.

If packed_stat and fingerprint are not given, they're invalidated in the entry.

Parameters:
    key: (dir, name, file_id) for the new entry.
    minikind: The type for the entry ('f' == 'file', 'd' == 'directory', etc.).
    executable: Should the executable bit be set?
    fingerprint: Simple fingerprint for new entry: canonical-form sha1 for files, referenced revision id for subtrees, etc.
    packed_stat: Packed stat value for new entry.
    size: Size information for new entry.
    path_utf8: key[0] + '/' + key[1], just passed in to avoid doing extra computation.
    fullscan: If True then a complete scan of the dirstate is being done and checking for duplicate rows should not be done. This should only be set by set_state_from_inventory and similar methods.
def _maybe_remove_row(self, block, index, id_index):
Remove index if it is absent or relocated across the row.

id_index is updated accordingly.

Returns: True if we removed the row, False otherwise.

def _validate(self):
Check that invariants on the dirblock are correct.

This can be useful in debugging; it shouldn't be necessary in normal code.

This must be called with a lock held.

def _wipe_state(self):
Forget all state information about the dirstate.
def lock_read(self):
Acquire a read lock on the dirstate.
def lock_write(self):
Acquire a write lock on the dirstate.
def unlock(self):
Drop any locks held on the dirstate.
def _requires_lock(self):
Check that a lock is currently held by someone on the dirstate.
API Documentation for Bazaar, generated by pydoctor at 2022-06-16 00:25:16.