Part of bzrlib.dirstate
Known subclasses: bzrlib.tests.test_dirstate.InstrumentedDirState
A dirstate is a specialised data structure for managing local working tree state information. It's not yet well defined whether it is platform specific, and if it is, how we detect/parameterize that.
Dirstates use the usual lock_write, lock_read and unlock mechanisms. Unlike most bzr disk formats, DirStates must be locked for reading, using lock_read. (This is an os file lock internally.) This is necessary because the file can be rewritten in place.
DirStates must be explicitly written with save() to commit changes; just unlocking them does not write the changes to disk.
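The lock/save contract described above can be illustrated with a toy model. This is only a sketch: `ToyState` is a hypothetical stand-in written for this example, not the bzrlib class, and it keeps its "disk" contents in a plain list instead of a file.

```python
class ToyState(object):
    """Hypothetical stand-in for a lock-protected, explicitly saved state."""

    def __init__(self):
        self._on_disk = []       # pretend file contents
        self._in_memory = None   # only populated while a lock is held
        self._lock = None        # None or 'w'

    def lock_write(self):
        if self._lock is not None:
            raise RuntimeError("already locked")
        self._lock = 'w'
        self._in_memory = list(self._on_disk)

    def add(self, path):
        if self._lock != 'w':
            raise RuntimeError("write lock required")
        self._in_memory.append(path)

    def save(self):
        if self._lock != 'w':
            raise RuntimeError("write lock required")
        self._on_disk = list(self._in_memory)

    def unlock(self):
        # Assumption of this toy: unlocking discards in-memory state
        # and never writes anything to disk.
        self._lock = None
        self._in_memory = None


state = ToyState()

# Changes made without save() are lost on unlock.
state.lock_write()
try:
    state.add('foo')
finally:
    state.unlock()
lost = list(state._on_disk)

# Changes followed by save() survive the unlock.
state.lock_write()
try:
    state.add('foo')
    state.save()
finally:
    state.unlock()
kept = list(state._on_disk)
```

The `try`/`finally` shape is the part worth copying: the lock is always released, and `save()` is called explicitly before it.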
Method | __init__ | Create a DirState object. |
Method | __repr__ | Undocumented |
Method | add | Add a path to be tracked. |
Static Method | from_tree | Create a dirstate from a bzr Tree. |
Method | update_by_delta | Apply an inventory delta to the dirstate for tree 0 |
Method | update_basis_by_delta | Update the parents of this tree after a commit. |
Method | get_ghosts | Return a list of the parent tree revision ids that are ghosts. |
Method | get_lines | Serialise the entire dirstate to a sequence of lines. |
Method | get_parent_ids | Return a list of the parent tree ids for the directory state. |
Class Method | initialize | Create a new dirstate on path. |
Class Method | on_file | Construct a DirState on the file at path "path". |
Method | sha1_from_stat | Find a sha1 given a stat lookup. |
Method | save | Save any pending changes created during this session. |
Method | set_path_id | Change the id of path to new_id in the current working tree. |
Method | set_parent_trees | Set the parent trees for the dirstate. |
Method | set_state_from_inventory | Set new_inv as the current state. |
Method | set_state_from_scratch | Wipe the currently stored state and set it to something new. |
Method | update_minimal | Update an entry to the state in tree 0. |
Method | lock_read | Acquire a read lock on the dirstate. |
Method | lock_write | Acquire a write lock on the dirstate. |
Method | unlock | Drop any locks held on the dirstate. |
Method | _mark_modified | Mark this dirstate as modified. |
Method | _mark_unmodified | Mark this dirstate as unmodified. |
Method | _bisect | Bisect through the disk structure for specific rows. |
Method | _bisect_dirblocks | Bisect through the disk structure to find entries in given dirs. |
Method | _bisect_recursive | Bisect for entries for all paths and their children. |
Method | _discard_merge_parents | Discard any parent trees beyond the first. |
Method | _empty_parent_info | Undocumented |
Method | _ensure_block | Ensure a block for dirname exists. |
Method | _entries_to_current_state | Load new_entries into self.dirblocks. |
Method | _split_root_dirblock_into_contents | Split the root dirblocks into root and contents-of-root. |
Method | _entries_for_path | Return a list with all the entries that match path for all ids. |
Method | _entry_to_line | Serialize entry to a NULL delimited line ready for _get_output_lines. |
Method | _fields_per_entry | How many null separated fields should be in each entry row. |
Method | _find_block | Return the block that key should be present in. |
Method | _find_block_index_from_key | Find the dirblock index for a key. |
Method | _find_entry_index | Find the entry index for a key in a block. |
Method | _check_delta_is_valid | Undocumented |
Method | _apply_removals | Undocumented |
Method | _apply_insertions | Undocumented |
Method | _check_delta_ids_absent | Check that none of the file_ids in new_ids are present in a tree. |
Method | _raise_invalid | Undocumented |
Method | _update_basis_apply_adds | Apply a sequence of adds to tree 1 during update_basis_by_delta. |
Method | _update_basis_apply_changes | Apply a sequence of changes to tree 1 during update_basis_by_delta. |
Method | _update_basis_apply_deletes | Apply a sequence of deletes to tree 1 during update_basis_by_delta. |
Method | _after_delta_check_parents | Check that parents required by the delta are all intact. |
Method | _observed_sha1 | Note the sha1 of a file. |
Method | _sha_cutoff_time | Return cutoff time. |
Method | _lstat | Return the os.lstat value for this path. |
Method | _sha1_file_and_mutter | Undocumented |
Method | _is_executable | Is this file executable? |
Method | _is_executable_win32 | On win32 the executable bit is stored in the dirstate. |
Method | _read_link | Read the target of a symlink |
Method | _get_ghosts_line | Create a line for the state file for ghost information. |
Method | _get_parents_line | Create a line for the state file for parents information. |
Method | _get_entry_lines | Create lines for entries. |
Method | _get_fields_to_entry | Get a function which converts entry fields into an entry record. |
Method | _get_block_entry_index | Get the coordinates for a path in the state structure. |
Method | _get_entry | Get the dirstate entry for path in tree tree_index. |
Static Method | _inv_entry_to_details | Convert an inventory entry (from a revision tree) to state details. |
Method | _iter_child_entries | Iterate over all the entries that are children of path_utf. |
Method | _iter_entries | Iterate over all the entries in the dirstate. |
Method | _get_id_index | Get an id index of self._dirblocks. |
Method | _add_to_id_index | Add this entry to the _id_index mapping. |
Method | _remove_from_id_index | Remove this entry from the _id_index mapping. |
Method | _get_output_lines | Format lines for final output. |
Method | _make_deleted_row | Return a deleted row for fileid_utf8. |
Method | _num_present_parents | The number of parent entries in each record row. |
Method | _read_dirblocks_if_needed | Read in all the dirblocks from the file if they are not in memory. |
Method | _read_header | This reads in the metadata header, and the parent ids. |
Method | _read_header_if_needed | Read the header of the dirstate file if needed. |
Method | _read_prelude | Read in the prelude header of the dirstate file. |
Method | _get_packed_stat_index | Get a packed_stat index of self._dirblocks. |
Method | _maybe_fdatasync | Flush to disk if possible and if not configured off. |
Method | _worth_saving | Is it worth saving the dirstate or not? |
Method | _set_data | Set the full dirstate data in memory. |
Method | _sort_entries | Given a list of entries, sort them into the right order. |
Method | _make_absent | Mark current_old - an entry - as absent for tree 0. |
Method | _maybe_remove_row | Remove index if it is absent or relocated across the row. |
Method | _validate | Check that invariants on the dirblock are correct. |
Method | _wipe_state | Forget all state information about the dirstate. |
Method | _requires_lock | Check that a lock is currently held by someone on the dirstate. |
Parameters | path | The path at which the dirstate file on disk should live. |
sha1_provider | an object meeting the SHA1Provider interface. | |
worth_saving_limit | when the exact number of hash changed entries is known, only bother saving the dirstate if more than this count of entries have changed. -1 means never save hash changes, 0 means always save hash changes. |
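The meaning of worth_saving_limit can be sketched as a small decision function. This is an illustration under assumptions, not the library's code: the function name and the `other_changes` flag are hypothetical, and the boundary comparison (`>=`) is chosen so that a limit of 0 always saves. The text above says "more than", so the exact boundary may differ.

```python
def worth_saving(hash_changed_count, worth_saving_limit, other_changes=False):
    """Decide whether pending changes justify rewriting the file.

    hash_changed_count: number of entries whose cached hash changed.
    worth_saving_limit: -1 never saves hash-only changes, 0 always does.
    other_changes: hypothetical flag for non-hash modifications, which
        this sketch always treats as worth saving.
    """
    if other_changes:
        return True
    if worth_saving_limit == -1:
        return False
    # Assumption: '>=' so that a limit of 0 means "always save".
    return hash_changed_count >= worth_saving_limit
```

For example, `worth_saving(3, 10)` is false under this sketch, while `worth_saving(3, 0)` is true.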
Parameters | hash_changed_entries | if non-None, mark just these entries as having their hash modified. |
header_modified | mark the header modified as well, not just the dirblocks. |
Parameters | path | The path within the dirstate - '' is the root, 'foo' is the path foo within the root, 'foo/bar' is the path bar within foo within the root. |
file_id | The file id of the path being added. | |
kind | The kind of the path, as a string like 'file', 'directory', etc. | |
stat | The output of os.lstat for the path. | |
fingerprint | The sha value of the file's canonical form (i.e. after any read filters have been applied), or the target of a symlink, or the referenced revision id for tree-references, or '' for directories. |
Parameters | paths | A list of paths to find |
Returns | A dict mapping path => entries for found entries. Missing entries will not be in the map. The list is not sorted, and entries will be populated based on when they were read. |
_bisect_dirblocks is meant to find the contents of directories, which differs from _bisect, which only finds individual entries.
Parameters | dir_list | A sorted list of directory names ['', 'dir', 'foo']. |
Returns | A map from dir => entries_for_dir |
This will use bisect to find all records for the supplied paths. It will then continue to bisect for any records which are marked as directories. (and renames?)
Parameters | paths | A sorted list of (dir, name) pairs eg: [('', 'a'), ('', 'f'), ('a/b', 'c')] |
Returns | A dictionary mapping (dir, name, file_id) => [tree_info] |
Note that if this fails the dirstate is corrupted.
After this function returns the dirstate contains 2 trees, neither of which is ghosted.
This function exists to let callers which know that there is a directory dirname ensure that the block for it exists. This block can fail to exist because of demand loading, or because a directory had no children. In either case it is not an error. It is however an error to call this if there is no parent entry for the directory, and thus the function requires the coordinates of such an entry to be provided.
The root row is special cased and can be indicated with a parent block and row index of -1.
Parameters | parent_block_index | The index of the block in which dirname's row exists. |
parent_row_index | The index in the parent block where the row exists. | |
dirname | The utf8 dirname to ensure there is a block for. | |
Returns | The index for the block. |
Process new_entries into the current state object, making them the active state. The entries are grouped together by directory to form dirblocks.
Parameters | new_entries | A sorted list of entries. This function does not sort to prevent unneeded overhead when callers have a sorted list already. |
Returns | Nothing. |
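The grouping of sorted entries into dirblocks described for _entries_to_current_state can be sketched as follows. This is a simplified illustration: `group_into_dirblocks` is a hypothetical helper, entries are modelled as `((dirname, basename, file_id), [tree_details])`, plain `str` is used where the real structure holds utf8 strings, and the special handling of the root block is left out.

```python
from itertools import groupby


def group_into_dirblocks(sorted_entries):
    """Group already-sorted entries by directory name.

    Returns a list of (dirname, [entries]) tuples, one per directory.
    Like the method described above, this does not sort its input.
    """
    return [
        (dirname, list(group))
        for dirname, group in groupby(sorted_entries,
                                      key=lambda entry: entry[0][0])
    ]


entries = [
    (('', 'a', 'a-id'), [('f', '', 0, False, '')]),
    (('', 'b', 'b-id'), [('d', '', 0, False, '')]),
    (('b', 'c', 'c-id'), [('f', '', 0, False, '')]),
]
dirblocks = group_into_dirblocks(entries)
```

Because `groupby` only merges adjacent items, the result is correct only when the input is already sorted by directory, which is the same precondition the parameter description states.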
After parsing by path, we end up with root entries and contents-of-root entries in the same block. This loop splits them out again.
Parameters | entry | An entry_tuple as defined in the module docstring. |
How many null separated fields should be in each entry row. Each line now has an extra '\n' field which is not used, so we just skip over it.
Entry size: 3 fields for the key + number of fields per tree_data (5) * tree count + newline.
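The field arithmetic can be checked against a small serialisation sketch. The helper names are hypothetical, the five per-tree fields are taken to be (minikind, fingerprint, size, executable, packed_stat), and the 'y'/'n' encoding of the executable flag is an assumption of this example.

```python
def entry_to_line(entry):
    """Join an entry's key and per-tree details with NUL separators."""
    key, trees = entry
    fields = list(key)  # 3 key fields: dirname, basename, file_id
    for minikind, fingerprint, size, executable, packed_stat in trees:
        fields.extend([
            minikind,
            fingerprint,
            str(size),
            'y' if executable else 'n',  # assumed encoding
            packed_stat,
        ])
    return '\0'.join(fields)


def fields_per_entry(tree_count):
    """3 key fields + 5 fields per tree + 1 unused newline field."""
    return 3 + 5 * tree_count + 1


entry = (
    ('dir', 'name', 'file-id'),
    [
        ('f', 'sha1-of-content', 12, False, 'packed-stat'),
        ('f', 'sha1-of-parent', 10, True, 'rev-id'),
    ],
)
line = entry_to_line(entry)
```

With two trees the serialised line carries 13 NUL-separated fields, and the count used when reading rows back is 14 once the trailing newline field is included.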
Parameters | key | A dirstate entry key. |
Returns | The block tuple. |
Returns | The block index, True if the block for the key is present. |
Returns | The entry index, True if the entry for the key is present. |
Parameters | tree | The tree which should provide parent information and inventory ids. |
sha1_provider | an object meeting the SHA1Provider interface. If None, a DefaultSHA1Provider is used. | |
Returns | a DirState object which is currently locked for writing. (it was locked by DirState.initialize) |
This is the workhorse for apply_inventory_delta in dirstate based trees.
Parameters | delta | An inventory delta. See Inventory.apply_delta for details. |
This gives the tree one parent, with revision id new_revid. The inventory delta is applied to the current basis tree to generate the inventory for the parent new_revid, and all other parent trees are discarded.
Note that an exception during the operation of this method will leave the dirstate in a corrupt state where it should not be saved.
Parameters | new_revid | The new revision id for the tree's parent. |
delta | An inventory delta (see apply_inventory_delta) describing the changes from the current left most parent revision to new_revid. |
They may be adds, or renames that have been split into add/delete pairs.
Parameters | adds | A sequence of adds. Each add is a tuple: (None, new_path_utf8, file_id, (entry_details), real_add). real_add is False when the add is the second half of a remove-and-reinsert pair created to handle renames and deletes. |
Parameters | changes | A sequence of changes. Each change is a tuple: (path_utf8, path_utf8, file_id, (entry_details)) |
They may be deletes, or renames that have been split into add/delete pairs.
Parameters | deletes | A sequence of deletes. Each delete is a tuple: (old_path_utf8, new_path_utf8, file_id, None, real_delete). real_delete is True when the desired outcome is an actual deletion rather than the rename handling logic temporarily deleting a path during the replacement of a parent. |
Parameters | parents | An iterable of (path_utf8, file_id) tuples which are required to be present in tree 'index' at path_utf8 with id file_id and be a directory. |
index | The column in the dirstate to check for parents in. |
Parameters | entry | The entry the sha1 is for. |
sha1 | The observed sha1. | |
stat_value | The os.lstat for the file. |
Files modified more recently than this time are at risk of being undetectably modified and so can't be cached.
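The cutoff rule can be sketched as below. The function names, the 3-second window, the check of both mtime and ctime, and the strict comparison are all assumptions made for illustration. The text above only states that recently modified files cannot safely be cached.

```python
import time


def sha_cutoff_time(now=None, window=3):
    """Return a cutoff timestamp a few seconds in the past.

    window is a hypothetical safety margin in seconds.
    """
    if now is None:
        now = time.time()
    return int(now) - window


def can_cache_sha1(mtime, ctime, cutoff):
    """Only cache a sha1 when the file has been stable since the cutoff."""
    return mtime < cutoff and ctime < cutoff


cutoff = sha_cutoff_time(now=1000.7)
```

The margin exists because a file rewritten within the same timestamp granularity as the stat call could change content without changing its recorded stat values.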
This handles size and executable, as well as parent records.
Returns | A function which takes a list of fields, and returns an appropriate record for storing in memory. |
Parameters | dirname | The utf8 dirname to lookup. |
basename | The utf8 basename to lookup. | |
tree_index | The index of the tree for which this lookup should be attempted. | |
Returns | A tuple describing where the path is located, or should be inserted. The tuple contains four fields: the block index, the row index, the directory is present (boolean), the entire path is present (boolean). There is no guarantee that either coordinate is currently reachable unless the found field for it is True. For instance, a directory not present in the searched tree may be returned with a value one greater than the current highest block offset. The directory present field will always be True when the path present field is True. The directory present field does NOT indicate that the directory is present in the searched tree, rather it indicates that there are at least some files in some tree present there. |
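The four-field result can be illustrated with a two-level bisect over a toy structure. This is a sketch under assumptions: the helper is hypothetical, it ignores tree_index, and it compares directory names as plain strings, which may not match the real block ordering.

```python
from bisect import bisect_left


def get_block_entry_index(dirblocks, dirname, basename):
    """Return (block_index, row_index, dir_present, path_present).

    dirblocks is a sorted list of (dirname, [entries]) tuples, where each
    entry starts with a (dirname, basename, file_id) key.
    """
    names = [block[0] for block in dirblocks]
    block_index = bisect_left(names, dirname)
    dir_present = block_index < len(names) and names[block_index] == dirname
    if not dir_present:
        # block_index is only an insertion point here; it may be one
        # past the last existing block.
        return block_index, 0, False, False
    basenames = [entry[0][1] for entry in dirblocks[block_index][1]]
    row_index = bisect_left(basenames, basename)
    path_present = (row_index < len(basenames)
                    and basenames[row_index] == basename)
    return block_index, row_index, True, path_present


dirblocks = [
    ('', [(('', 'a', 'a-id'), []), (('', 'c', 'c-id'), [])]),
    ('a', [(('a', 'x', 'x-id'), [])]),
]
found = get_block_entry_index(dirblocks, '', 'c')
missing_row = get_block_entry_index(dirblocks, '', 'b')
missing_dir = get_block_entry_index(dirblocks, 'z', 'f')
```

Note how `missing_dir` reports block index 2 for a list with two blocks: the coordinate is where the block would be inserted, so callers must check the present flags before indexing.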
If either file_id or path is supplied, it is used as the key to lookup. If both are supplied, the fastest lookup is used, and an error is raised if they do not both point at the same row.
Parameters | tree_index | The index of the tree we wish to locate this path in. If the path is present in that tree, the entry containing its details is returned, otherwise (None, None) is returned. 0 is the working tree, higher indexes are successive parent trees. |
fileid_utf8 | A utf8 file_id to look up. | |
path_utf8 | A utf8 path to be looked up. | |
include_deleted | If True, and performing a lookup via fileid_utf8 rather than path_utf8, return an entry for deleted (absent) paths. | |
Returns | The dirstate entry tuple for path, or (None, None) |
The new dirstate will be an empty tree - that is it has no parents, and only a root node - which has id ROOT_ID.
Parameters | path | The name of the file for the dirstate. |
sha1_provider | an object meeting the SHA1Provider interface. If None, a DefaultSHA1Provider is used. | |
Returns | A write-locked DirState object. |
Parameters | inv_entry | An inventory entry whose sha1 and link targets can be relied upon, and which has a revision set. |
Returns | A details tuple - the details for a single tree at a path + id. |
This only returns entries that are present (not in 'a', 'r') in tree_index. tree_index data is not refreshed, so if tree 0 is used, results may differ from those obtained if paths were statted to determine which ones were directories.
Asking for the children of a non-directory will return an empty iterator.
Each yielded item is an entry in the standard format described in the docstring of bzrlib.dirstate.
This maps from file_id => [(directory, name, file_id)] entries where that file_id appears in one of the trees.
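A minimal sketch of building such an index from dirblocks follows. `build_id_index` is a hypothetical helper, and a plain dict of lists is an assumption. The real container types may differ.

```python
def build_id_index(dirblocks):
    """Map each file_id to every (directory, name, file_id) key using it."""
    id_index = {}
    for dirname, entries in dirblocks:
        for key, tree_details in entries:
            file_id = key[2]
            id_index.setdefault(file_id, []).append(key)
    return id_index


# The same file_id can appear under two keys, for example when a file
# sits at different paths in different trees.
dirblocks = [
    ('', [
        (('', 'new-name', 'f-id'), []),
        (('', 'old-name', 'f-id'), []),
        (('', 'other', 'o-id'), []),
    ]),
]
id_index = build_id_index(dirblocks)
```

Looking up `id_index['f-id']` then yields both keys, which is what lets a lookup by file id find every row that mentions the file.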
It is a programming error to call this when the entry_key is not already present.
Parameters | lines | A sequence of lines containing the parents list and the path lines. |
Parameters | path | The path at which the dirstate file on disk should live. |
sha1_provider | an object meeting the SHA1Provider interface. If None, a DefaultSHA1Provider is used. | |
worth_saving_limit | when the exact number of hash changed entries is known, only bother saving the dirstate if more than this count of entries have changed. -1 means never save. | |
Returns | An unlocked DirState object, associated with the given path. |
This populates self._dirblocks, and sets self._dirblock_state to IN_MEMORY_UNMODIFIED. It is not currently ready for incremental block loading.
After reading in, the file should be positioned at the null just before the start of the first record in the file.
Returns | (expected crc checksum, number of entries, parent list) |
This only reads in the stuff that is not connected to the crc checksum. The position will be correct to read in the rest of the file and check the checksum after this point. The next entry in the file should be the number of parents, and their ids. Followed by a newline.
We reuse the existing file, because that prevents race conditions with file creation, and use oslocks on it to prevent concurrent modification and reads - because dirstate's incremental data aggregation is not compatible with reading a modified file, and replacing a file in use by another process is impossible on Windows.
A dirstate in read only mode should be smart enough though to validate that the file has not changed, and otherwise discard its cache and start over, to allow for fine grained read lock duration, so 'status' won't block 'commit' - for example.
This is an internal function used to completely replace the object's in-memory state. It puts the dirstate into state 'full-dirty'.
Parameters | parent_ids | A list of parent tree revision ids. |
dirblocks | A list containing one tuple for each directory in the tree. Each tuple contains the directory path and a list of entries found in that directory. |
Parameters | path | The path inside the tree to set - '' is the root, 'foo' is the path foo in the root. |
new_id | The new id to assign to the path. This must be a utf8 file id (not unicode, and not None). |
Parameters | trees | A list of revision_id, tree tuples. tree must be provided even if the revision_id refers to a ghost: supply an empty tree in this case. |
ghosts | A list of the revision_ids that are ghosts at the time of setting. |
This is done when constructing a new dirstate from trees - normally we try to keep everything in sorted blocks all the time, but sometimes it's easier to sort after the fact.
This API is called by tree transform, and will usually occur with existing parent trees.
Parameters | new_inv | The inventory object to set current state from. |
This is a hard-reset for the data we are working with.
Returns | True if this was the last details entry for the entry key: that is, if the underlying block has had the entry removed, thus shrinking in length. |
This will either create a new entry at 'key' or update an existing one. It also makes sure that any other records which might mention this are updated as well.
If packed_stat and fingerprint are not given, they're invalidated in the entry.
Parameters | key | (dir, name, file_id) for the new entry |
minikind | The type for the entry ('f' == 'file', 'd' == 'directory'), etc. | |
executable | Should the executable bit be set? | |
fingerprint | Simple fingerprint for new entry: canonical-form sha1 for files, referenced revision id for subtrees, etc. | |
packed_stat | Packed stat value for new entry. | |
size | Size information for new entry | |
path_utf8 | key[0] + '/' + key[1], just passed in to avoid doing extra computation. | |
fullscan | If True then a complete scan of the dirstate is being done and checking for duplicate rows should not be done. This should only be set by set_state_from_inventory and similar methods. |
id_index is updated accordingly.
Returns | True if we removed the row, False otherwise. |