b.osutils : module documentation

Part of bzrlib

No module docstring
Function get_unicode_argv Undocumented
Function make_readonly Make a filename read-only.
Function make_writable Undocumented
Function minimum_path_selection Return the smallest subset of paths which are outside paths.
Function quotefn Return a quoted filename.
Function get_umask Return the current umask
Function kind_marker Undocumented
Function lexists Undocumented
Function fancy_rename A fancy rename, when you don't have atomic rename.
Function rmtree Replacer for shutil.rmtree: could remove readonly dirs/files
Function get_terminal_encoding Find the best encoding for printing to the screen.
Function normalizepath Undocumented
Function isdir True if f is an accessible directory.
Function isfile True if f is a regular file.
Function islink True if f is a symlink.
Function is_inside True if fname is inside dir.
Function is_inside_any True if fname is inside any of given dirs.
Function is_inside_or_parent_of_any True if fname is a child or a parent of any of the given files.
Function pumpfile Copy contents of one file to another.
Function pump_string_file Write bytes to file_handle in many smaller writes.
Function file_iterator Undocumented
Function sha_file Calculate the hexdigest of an open file.
Function size_sha_file Calculate the size and hexdigest of an open file.
Function sha_file_by_name Calculate the SHA1 of a file by reading the full text
Function sha_strings Return the sha-1 of concatenation of strings
Function sha_string Undocumented
Function fingerprint_file Undocumented
Function compare_files Returns true if equal in contents
Function local_time_offset Return offset of local zone from GMT, either at present or at time t.
Function format_date Return a formatted date string.
Function format_date_with_offset_in_original_timezone Return a formatted date string in the original timezone.
Function format_local_date Return a unicode date string formatted according to the current locale.
Function compact_date Undocumented
Function format_delta Get a nice looking string for a time delta.
Function filesize Return size of given open file.
Function rand_bytes Undocumented
Function rand_chars Return a random string of num alphanumeric characters
Function splitpath Turn string into list of parts.
Function joinpath Undocumented
Function parent_directories Return the list of parent directories, deepest first.
Function failed_to_load_extension Handle failing to load a binary extension.
Function report_extension_load_failures Undocumented
Function split_lines Split s into lines, but without removing the newline characters.
Function hardlinks_good Undocumented
Function link_or_copy Hardlink a file, or copy it if it can't be hardlinked.
Function delete_any Delete a file, symlink or directory.
Function has_symlinks Undocumented
Function has_hardlinks Undocumented
Function host_os_dereferences_symlinks Undocumented
Function readlink Return a string representing the path to which the symbolic link points.
Function contains_whitespace True if there are any whitespace characters in s.
Function contains_linebreaks True if there is any vertical whitespace in s.
Function relpath Return path relative to base, or raise exception.
Function canonical_relpaths Create an iterable to canonicalize a sequence of relative paths.
Function safe_unicode Coerce unicode_or_utf8_string into unicode.
Function safe_utf8 Coerce unicode_or_utf8_string to a utf8 string.
Function safe_revision_id Revision ids should now be utf8, but at one point they were unicode.
Function safe_file_id File ids should now be utf8, but at one point they were unicode.
Function normalizes_filenames Return True if this platform normalizes unicode filenames.
Function set_signal_handler A wrapper for signal.signal that also calls siginterrupt(signum, False)
Function terminal_width Return terminal width.
Function watch_sigwinch Register for SIGWINCH, once and only once.
Function supports_executable Undocumented
Function supports_posix_readonly Return True if 'readonly' has POSIX semantics, False otherwise.
Function set_or_unset_env Modify the environment, setting or removing the env_variable.
Function check_legal_path Check whether the supplied path is legal.
Function walkdirs Yield data about all the directories in a tree.
Class DirReader An interface for reading directories.
Class UnicodeDirReader A dir reader for non-utf8 file systems, which transcodes.
Function copy_tree Copy all of the entries in from_path into to_path.
Function path_prefix_key Generate a prefix-order path key for path.
Function compare_paths_prefix_order Compare path_a and path_b to generate the same order walkdirs uses.
Function get_user_encoding Find out what the preferred user encoding is.
Function get_host_name Return the current unicode host name.
Function recv_all Receive an exact number of bytes.
Function send_all Send all bytes on a socket.
Function dereference_path Determine the real path to a file.
Function supports_mapi Return True if we can use MAPI to launch a mail client.
Function resource_string Load a resource from a package and return it as a string.
Function file_kind_from_stat_mode_thunk Undocumented
Function file_kind Undocumented
Function until_no_eintr Run f(*a, **kw), retrying if an EINTR error occurs.
Function re_compile_checked Return a compiled re, or raise a sensible error.
Function getchar 0 Undocumented
Function getchar Undocumented
Function local_concurrency Return how many processes can be run concurrently.
Class UnicodeOrBytesToBytesWriter A stream writer that doesn't decode str arguments.
Function open_file This function is used to override the open builtin.
Function _posix_abspath Undocumented
Function _posix_realpath Undocumented
Function _win32_fixdrive Force drive letters to be consistent.
Function _win32_abspath Undocumented
Function _win98_abspath Return the absolute version of a path.
Function _win32_realpath Undocumented
Function _win32_pathjoin Undocumented
Function _win32_normpath Undocumented
Function _win32_getcwd Undocumented
Function _win32_mkdtemp Undocumented
Function _win32_rename We expect to be able to atomically replace 'new' with old.
Function _mac_getcwd Undocumented
Function _win32_delete_readonly Error handler for shutil.rmtree function [for win32]
Function _format_date Undocumented
Function _split_lines Split s into lines, but without removing the newline characters.
Function _delete_file_or_dir Undocumented
Function _cicp_canonical_relpath Return the canonical path relative to base.
Function _accessible_normalized_filename Get the unicode normalized path, and if you can access the file.
Function _inaccessible_normalized_filename Undocumented
Function _win32_terminal_size Undocumented
Function _ioctl_terminal_size Undocumented
Function _terminal_size_changed Set COLUMNS upon receiving a SIGnal for WINdow size CHange.
Function _is_error_enotdir Check if this exception represents ENOTDIR.
Function _walkdirs_utf8 Yield data about all the directories in a tree.
Function _local_concurrency 0 Undocumented
Function _local_concurrency 1 Undocumented
Function _local_concurrency 2 Undocumented
Function _local_concurrency 3 Undocumented
Function _local_concurrency 4 Undocumented
Function _local_concurrency Undocumented
def get_unicode_argv():
Undocumented
def make_readonly(filename):
Make a filename read-only.
def make_writable(filename):
Undocumented
def minimum_path_selection(paths):
Return the smallest subset of paths which are outside paths.
Parameters:
  paths: A container (and hence not None) of paths.
Returns: A set of paths sufficient to include everything in paths via is_inside, drawn from the paths parameter.
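
The selection described above can be sketched as follows. This is a minimal Python 3 illustration, not the bzrlib implementation; the `is_inside` helper here is a simplified stand-in that assumes normalized, slash-separated paths.

```python
def is_inside(dir, fname):
    """Simplified stand-in: True if fname equals dir or lies beneath it."""
    if dir == fname or dir == "":
        return True
    if not dir.endswith("/"):
        dir += "/"
    return fname.startswith(dir)


def minimum_path_selection(paths):
    """Keep only the paths that are not contained in another given path."""
    paths = set(paths)
    return {
        p for p in paths
        if not any(p != other and is_inside(other, p) for other in paths)
    }


print(sorted(minimum_path_selection(["a", "a/b", "a/b/c", "d/e", "d/e/f", "g"])))
```

Every input path is either in the result or inside one of the result's members, which is the property the Returns description asks for.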
def quotefn(f):
Return a quoted filename.

This previously used backslash quoting, but that works poorly on Windows.

def get_umask():
Return the current umask
def kind_marker(kind):
Undocumented
def lexists(f):
Undocumented
def fancy_rename(old, new, rename_func, unlink_func):
A fancy rename, when you don't have atomic rename.
Parameters:
  old: The old path, to rename from.
  new: The new path, to rename to.
  rename_func: The potentially non-atomic rename function.
  unlink_func: A way to delete the target file if the full rename succeeds.
def _posix_abspath(path):
Undocumented
def _posix_realpath(path):
Undocumented
def _win32_fixdrive(path):
Force drive letters to be consistent.

win32 is inconsistent about whether it returns lower or upper case, and even if it were consistent the user might type the other, so we force it to uppercase. Running python.exe under cmd.exe returns capital C:, while running win32 python inside a cygwin shell returns lowercase c:.
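
A minimal sketch of the idea, assuming only single-letter drive prefixes need fixing. The name `fix_drive` is hypothetical and this is not the bzrlib code; `ntpath` is used so the example behaves the same on any platform.

```python
import ntpath


def fix_drive(path):
    """Upper-case a single-letter drive prefix, leaving the rest untouched."""
    drive, rest = ntpath.splitdrive(path)
    # Only touch plain "c:" style drives; UNC prefixes are left as they are.
    if len(drive) == 2 and drive[1] == ":":
        drive = drive.upper()
    return drive + rest


print(fix_drive("c:/Users/me"))
print(fix_drive("C:/Users/me"))
```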

def _win32_abspath(path):
Undocumented
def _win98_abspath(path):
Return the absolute version of a path. Windows 98 safe implementation (python reimplementation of Win32 API function GetFullPathNameW)
def _win32_realpath(path):
Undocumented
def _win32_pathjoin(*args):
Undocumented
def _win32_normpath(path):
Undocumented
def _win32_getcwd():
Undocumented
def _win32_mkdtemp(*args, **kwargs):
Undocumented
def _win32_rename(old, new):
We expect to be able to atomically replace 'new' with old.

On win32, if new exists, it must be moved out of the way first, and then deleted.

def _mac_getcwd():
Undocumented
def _win32_delete_readonly(function, path, excinfo):
Error handler for shutil.rmtree function [for win32]. Helps to remove files and dirs marked as read-only.
def rmtree(path, ignore_errors=False, onerror=_win32_delete_readonly):
Replacer for shutil.rmtree: could remove readonly dirs/files
def get_terminal_encoding():
Find the best encoding for printing to the screen.

This attempts to check both sys.stdout and sys.stdin to see what encoding they are in, and if that fails it falls back to osutils.get_user_encoding(). The problem is that on Windows, locale.getpreferredencoding() is not the same encoding as that used by the console: http://mail.python.org/pipermail/python-list/2003-May/162357.html

On my standard US Windows XP, the preferred encoding is cp1252, but the console is cp437

def normalizepath(f):
Undocumented
def isdir(f):
True if f is an accessible directory.
def isfile(f):
True if f is a regular file.
def islink(f):
True if f is a symlink.
def is_inside(dir, fname):
True if fname is inside dir.

The parameters should typically be passed to osutils.normpath first, so that . and .. and repeated slashes are eliminated, and the separators are canonical for the platform.

The empty string as a dir name is taken as top-of-tree and matches everything.
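
The containment rule can be illustrated with a short sketch. It assumes both arguments are already normalized and use "/" as the separator; it is an illustration rather than the library source.

```python
def is_inside(dir, fname):
    """True if fname is dir itself or a path beneath dir."""
    if dir == fname:
        return True
    if dir == "":
        # The empty string means top-of-tree and matches everything.
        return True
    if not dir.endswith("/"):
        dir += "/"
    # Comparing against "dir/" avoids treating "srccontrol" as inside "src".
    return fname.startswith(dir)


print(is_inside("src", "src/foo.c"))
print(is_inside("src", "srccontrol"))
print(is_inside("", "any/path"))
```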

def is_inside_any(dir_list, fname):
True if fname is inside any of given dirs.
def is_inside_or_parent_of_any(dir_list, fname):
True if fname is a child or a parent of any of the given files.
def pumpfile(from_file, to_file, read_length=-1, buff_size=32768, report_activity=None, direction='read'):
Copy contents of one file to another.

The read_length can either be -1 to read to end-of-file (EOF) or it can specify the maximum number of bytes to read.

The buff_size represents the maximum size for each read operation performed on from_file.

Parameters:
  report_activity: Call this as bytes are read, see Transport._report_activity.
  direction: Will be passed to report_activity.
Returns: The number of bytes copied.
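
A sketch of the copy loop, following the parameter semantics described above. It is written for Python 3 file-like objects; the way `report_activity` is invoked (byte count, then direction) is an assumption.

```python
import io


def pumpfile(from_file, to_file, read_length=-1, buff_size=32768,
             report_activity=None, direction="read"):
    """Copy from_file to to_file in chunks of at most buff_size bytes."""
    copied = 0
    while read_length != 0:
        # With a finite read_length, never ask for more than what remains.
        want = buff_size if read_length < 0 else min(read_length, buff_size)
        chunk = from_file.read(want)
        if not chunk:
            break  # EOF reached before read_length was exhausted
        if report_activity is not None:
            report_activity(len(chunk), direction)
        to_file.write(chunk)
        copied += len(chunk)
        if read_length > 0:
            read_length -= len(chunk)
    return copied


source = io.BytesIO(b"x" * 100000)
target = io.BytesIO()
print(pumpfile(source, target))
```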
def pump_string_file(bytes, file_handle, segment_size=None):
Write bytes to file_handle in many smaller writes.
Parameters:
  bytes: The string to write.
  file_handle: The file to write to.
def file_iterator(input_file, readsize=32768):
Undocumented
def sha_file(f):
Calculate the hexdigest of an open file.

The file cursor should be already at the start.
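
A minimal sketch of hashing an open file in chunks, assuming SHA-1 (as the neighbouring sha_* functions suggest) and a binary-mode file object. The `chunk_size` parameter is hypothetical.

```python
import hashlib
import io


def sha_file(f, chunk_size=65536):
    """Return the SHA-1 hexdigest of everything from the cursor to EOF."""
    digest = hashlib.sha1()
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
    return digest.hexdigest()


print(sha_file(io.BytesIO(b"hello")))
```

Because reading starts at the current cursor position, a file that has already been partly read would produce the digest of the remainder only, which is why the cursor should be at the start.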

def size_sha_file(f):
Calculate the size and hexdigest of an open file.

The file cursor should be already at the start and the caller is responsible for closing the file afterwards.

def sha_file_by_name(fname):
Calculate the SHA1 of a file by reading the full text
def sha_strings(strings, _factory=sha):
Return the sha-1 of concatenation of strings
def sha_string(f, _factory=sha):
Undocumented
def fingerprint_file(f):
Undocumented
def compare_files(a, b):
Returns true if equal in contents
def local_time_offset(t=None):
Return offset of local zone from GMT, either at present or at time t.
def format_date(t, offset=0, timezone='original', date_fmt=None, show_offset=True):
Return a formatted date string.
Parameters:
  t: Seconds since the epoch.
  offset: Timezone offset in seconds east of utc.
  timezone: How to display the time: 'utc', 'original' for the timezone specified by offset, or 'local' for the process's current timezone.
  date_fmt: strftime format.
  show_offset: Whether to append the timezone.
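
The three timezone modes can be sketched as below. This is an illustration only: the default `date_fmt` and the "+HHMM" suffix format are assumptions, not values taken from bzrlib.

```python
import calendar
import time


def format_date(t, offset=0, timezone="original",
                date_fmt="%Y-%m-%d %H:%M:%S", show_offset=True):
    """Format t (seconds since the epoch) in the requested timezone."""
    if timezone == "utc":
        tt = time.gmtime(t)
        offset = 0
    elif timezone == "original":
        # Shift by the stored offset, then format as if it were UTC.
        tt = time.gmtime(t + offset)
    elif timezone == "local":
        tt = time.localtime(t)
        # Local offset east of UTC, derived from the local broken-down time.
        offset = calendar.timegm(tt) - int(t)
    else:
        raise ValueError("unsupported timezone: %r" % (timezone,))
    text = time.strftime(date_fmt, tt)
    if show_offset:
        sign = "+" if offset >= 0 else "-"
        minutes = abs(offset) // 60
        text += " %s%02d%02d" % (sign, minutes // 60, minutes % 60)
    return text


print(format_date(0, offset=3600))
print(format_date(0, offset=3600, timezone="utc"))
```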
def format_date_with_offset_in_original_timezone(t, offset=0, _cache=_offset_cache):
Return a formatted date string in the original timezone.

This routine may be faster than format_date.

Parameters:
  t: Seconds since the epoch.
  offset: Timezone offset in seconds east of utc.
def format_local_date(t, offset=0, timezone='original', date_fmt=None, show_offset=True):
Return a unicode date string formatted according to the current locale.
Parameters:
  t: Seconds since the epoch.
  offset: Timezone offset in seconds east of utc.
  timezone: How to display the time: 'utc', 'original' for the timezone specified by offset, or 'local' for the process's current timezone.
  date_fmt: strftime format.
  show_offset: Whether to append the timezone.
def _format_date(t, offset, timezone, date_fmt, show_offset):
Undocumented
def compact_date(when):
Undocumented
def format_delta(delta):
Get a nice looking string for a time delta.
Parameters:
  delta: The time difference in seconds, can be positive or negative. Positive indicates time in the past, negative indicates time in the future (usually time.time() - stored_time).
Returns: String formatted to show approximate resolution.
def filesize(f):
Return size of given open file.
def rand_bytes(n):
Undocumented
def rand_chars(num):
Return a random string of num alphanumeric characters

The result only contains lowercase chars because it may be used on case-insensitive filesystems.
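
A short sketch of generating such a string, assuming an alphabet of lowercase ASCII letters and digits and using the operating system's random source. The exact alphabet and random source used by bzrlib are not stated here.

```python
import random
import string

# Lowercase only, so two results never differ just by case.
ALPHABET = string.ascii_lowercase + string.digits


def rand_chars(num):
    """Return num random characters drawn from ALPHABET."""
    rng = random.SystemRandom()
    return "".join(rng.choice(ALPHABET) for _ in range(num))


print(rand_chars(12))
```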

def splitpath(p):
Turn string into list of parts.
def joinpath(p):
Undocumented
def parent_directories(filename):
Return the list of parent directories, deepest first.

For example, parent_directories("a/b/c") -> ["a/b", "a"].
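
The example above can be reproduced with a small sketch, assuming relative, slash-separated paths (absolute paths would need extra handling).

```python
def parent_directories(filename):
    """Return the parents of a relative slash-separated path, deepest first."""
    parents = []
    parts = filename.split("/")
    while len(parts) > 1:
        parts.pop()
        parents.append("/".join(parts))
    return parents


print(parent_directories("a/b/c"))
```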

def failed_to_load_extension(exception):

Handle failing to load a binary extension.

This should be called from the ImportError block guarding the attempt to import the native extension. If this function returns, the pure-Python implementation should be loaded instead:

>>> try:
>>>     import bzrlib._fictional_extension_pyx
>>> except ImportError, e:
>>>     bzrlib.osutils.failed_to_load_extension(e)
>>>     import bzrlib._fictional_extension_py
def report_extension_load_failures():
Undocumented
def split_lines(s):
Split s into lines, but without removing the newline characters.
def _split_lines(s):
Split s into lines, but without removing the newline characters.

This supports Unicode or plain string objects.
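
A minimal Python 3 sketch of newline-preserving splitting. It assumes that only "\n" terminates a line, which differs from `str.splitlines(True)`, since that also splits on characters such as "\r" and "\x0b".

```python
def split_lines(s):
    """Split s after each newline, keeping the newline on every line."""
    newline = "\n" if isinstance(s, str) else b"\n"
    parts = s.split(newline)
    lines = [part + newline for part in parts[:-1]]
    if parts[-1]:
        # Trailing text without a final newline is kept as the last line.
        lines.append(parts[-1])
    return lines


print(split_lines("a\nb\nc"))
print(split_lines(b"x\r\ny\n"))
```

Joining the result always reproduces the original input, which is the point of not removing the newline characters.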

def hardlinks_good():
Undocumented
def link_or_copy(src, dest):
Hardlink a file, or copy it if it can't be hardlinked.
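
The fallback can be sketched as below. Which errors trigger the copy is an assumption: this version falls back on any `OSError` from `os.link`, which may be broader than what bzrlib does.

```python
import os
import shutil
import tempfile


def link_or_copy(src, dest):
    """Hardlink src to dest, copying the file if linking is not possible."""
    try:
        os.link(src, dest)
    except (OSError, NotImplementedError, AttributeError):
        # For example: cross-device link, or a platform without hardlinks.
        shutil.copyfile(src, dest)


workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "source.txt")
with open(source, "w") as f:
    f.write("data")
link_or_copy(source, os.path.join(workdir, "dest.txt"))
print(sorted(os.listdir(workdir)))
```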
def delete_any(path):
Delete a file, symlink or directory.

Will delete even if readonly.

def _delete_file_or_dir(path):
Undocumented
def has_symlinks():
Undocumented
def has_hardlinks():
Undocumented
def host_os_dereferences_symlinks():
Undocumented
def readlink(abspath):
Return a string representing the path to which the symbolic link points.

This is guaranteed to return the symbolic link in unicode in all python versions.

Parameters:
  abspath: The link absolute unicode path.
def contains_whitespace(s):
True if there are any whitespace characters in s.
def contains_linebreaks(s):
True if there is any vertical whitespace in s.
def relpath(base, path):
Return path relative to base, or raise exception.

The path may be either an absolute path or a path relative to the current working directory.

os.path.commonprefix (python2.4) has a bad bug that it works just on string prefixes, assuming that '/u' is a prefix of '/u2'. This avoids that problem.
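
The prefix problem is easy to demonstrate, and comparing whole path components avoids it. The `relpath` below is a simplified sketch for slash-separated absolute paths; it raises `ValueError` as a stand-in for whatever exception the library actually raises.

```python
import os.path

# commonprefix compares character by character, not component by component.
print(os.path.commonprefix(["/u", "/u2"]))


def relpath(base, path):
    """Return path relative to base, comparing whole components."""
    base_parts = [p for p in base.split("/") if p]
    path_parts = [p for p in path.split("/") if p]
    if path_parts[:len(base_parts)] != base_parts:
        raise ValueError("%r is not a child of %r" % (path, base))
    return "/".join(path_parts[len(base_parts):])


print(relpath("/u", "/u/src/main.c"))
```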

def _cicp_canonical_relpath(base, path):
Return the canonical path relative to base.

Like relpath, but on case-insensitive-case-preserving file-systems, this will return the relpath as stored on the file-system rather than in the case specified in the input string, for all existing portions of the path.

This will cause O(N) behaviour if called for every path in a tree; if you have a number of paths to convert, you should use canonical_relpaths().

def canonical_relpaths(base, paths):
Create an iterable to canonicalize a sequence of relative paths.

The intent is for this implementation to use a cache, vastly speeding up multiple transformations in the same directory.

def safe_unicode(unicode_or_utf8_string):
Coerce unicode_or_utf8_string into unicode.

If it is unicode, it is returned. Otherwise it is decoded from utf-8. If decoding fails, the exception is wrapped in a BzrBadParameterNotUnicode exception.

def safe_utf8(unicode_or_utf8_string):
Coerce unicode_or_utf8_string to a utf8 string.

If it is a str, it is returned. If it is Unicode, it is encoded into a utf-8 string.
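
The two coercions (safe_unicode and safe_utf8) can be sketched in Python 3 terms, where `str` plays the role of Python 2's unicode and `bytes` the role of Python 2's str. `NotUnicodeError` is a hypothetical stand-in for BzrBadParameterNotUnicode.

```python
class NotUnicodeError(Exception):
    """Hypothetical stand-in for BzrBadParameterNotUnicode."""


def safe_unicode(value):
    """Return value as text, decoding UTF-8 bytes if necessary."""
    if isinstance(value, str):
        return value
    try:
        return value.decode("utf-8")
    except UnicodeDecodeError as e:
        raise NotUnicodeError(repr(value)) from e


def safe_utf8(value):
    """Return value as UTF-8 bytes, encoding text if necessary."""
    if isinstance(value, bytes):
        return value
    return value.encode("utf-8")


print(safe_unicode(b"caf\xc3\xa9"))
print(safe_utf8("caf\u00e9"))
```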

def safe_revision_id(unicode_or_utf8_string, warn=True):
Revision ids should now be utf8, but at one point they were unicode.
Parameters:
  unicode_or_utf8_string: A possibly Unicode revision_id (can also be utf8 or None).
  warn: Functions that are sanitizing user data can set warn=False.
Returns: None or a utf8 revision id.
def safe_file_id(unicode_or_utf8_string, warn=True):
File ids should now be utf8, but at one point they were unicode.

This is the same as safe_utf8, except it uses the cached encode functions to save a little bit of performance.

Parameters:
  unicode_or_utf8_string: A possibly Unicode file_id (can also be utf8 or None).
  warn: Functions that are sanitizing user data can set warn=False.
Returns: None or a utf8 file id.
def normalizes_filenames():
Return True if this platform normalizes unicode filenames.

Mac OSX does, Windows/Linux do not.

def _accessible_normalized_filename(path):
Get the unicode normalized path, and if you can access the file.

On platforms where the system normalizes filenames (Mac OSX), you can access a file by any path which will normalize correctly. On platforms where the system does not normalize filenames (Windows, Linux), you have to access a file by its exact path.

Internally, bzr only supports NFC normalization, since that is the standard for XML documents.

So return the normalized path, and a flag indicating if the file can be accessed by that path.

def _inaccessible_normalized_filename(path):
Undocumented
def set_signal_handler(signum, handler, restart_syscall=True):
A wrapper for signal.signal that also calls siginterrupt(signum, False) on platforms that support that.
Parameters:
  restart_syscall: If set, allow syscalls interrupted by a signal to automatically restart (by calling signal.siginterrupt(signum, False)). May be ignored if the feature is not available on this platform or Python version.
def terminal_width():

Return terminal width.

None is returned if the width can't be established precisely.

The rules are:

  • if BZR_COLUMNS is set, returns its value,
  • if there is no controlling terminal, returns None,
  • if COLUMNS is set, returns its value.

From there, we need to query the OS to get the size of the controlling terminal.

Unices:

  • get termios.TIOCGWINSZ,
  • if an error occurs or a negative value is obtained, returns None.

Windows:

  • win32utils.get_console_size() decides,
  • returns None on error (provided default value).
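
The order of the rules above can be sketched as follows. Two substitutions are assumptions of this sketch: `sys.stdout.isatty()` approximates the "controlling terminal" check, and Python 3's `os.get_terminal_size()` stands in for both the TIOCGWINSZ ioctl and win32utils.get_console_size().

```python
import os
import sys


def _int_from_env(name):
    """Return the integer value of an environment variable, or None."""
    try:
        return int(os.environ[name])
    except (KeyError, ValueError):
        return None


def terminal_width():
    """Return the terminal width, or None if it cannot be established."""
    width = _int_from_env("BZR_COLUMNS")
    if width is not None:
        return width
    isatty = getattr(sys.stdout, "isatty", None)
    if isatty is None or not isatty():
        return None
    width = _int_from_env("COLUMNS")
    if width is not None:
        return width
    try:
        width = os.get_terminal_size(sys.stdout.fileno()).columns
    except (OSError, ValueError, AttributeError):
        return None
    return width if width > 0 else None


print(terminal_width())
```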
def _win32_terminal_size(width, height):
Undocumented
def _ioctl_terminal_size(width, height):
Undocumented
def _terminal_size_changed(signum, frame):
Set COLUMNS upon receiving a SIGnal for WINdow size CHange.
def watch_sigwinch():
Register for SIGWINCH, once and only once.
def supports_executable():
Undocumented
def supports_posix_readonly():
Return True if 'readonly' has POSIX semantics, False otherwise.

Notably, a win32 readonly file cannot be deleted, unlike POSIX where the directory controls creation/deletion, etc.

And under win32, readonly means that the directory itself cannot be deleted. The contents of a readonly directory can be changed, unlike POSIX where files in readonly directories cannot be added, deleted or renamed.

def set_or_unset_env(env_variable, value):
Modify the environment, setting or removing the env_variable.
Parameters:
  env_variable: The environment variable in question.
  value: The value to set the environment to. If None, then the variable will be removed.
Returns: The original value of the environment variable.
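
A minimal sketch of the set-or-remove behaviour, assuming string values (the original may additionally handle encoding of unicode values). The variable name in the usage lines is hypothetical.

```python
import os


def set_or_unset_env(env_variable, value):
    """Set env_variable to value, or remove it when value is None."""
    original = os.environ.get(env_variable)
    if value is None:
        os.environ.pop(env_variable, None)
    else:
        os.environ[env_variable] = value
    return original


# Returning the old value makes it easy to restore the environment later.
old = set_or_unset_env("EXAMPLE_SKETCH_VAR", "temporary")
set_or_unset_env("EXAMPLE_SKETCH_VAR", old)
```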
def check_legal_path(path):
Check whether the supplied path is legal. This is only required on Windows, so we don't test on other platforms right now.
def _is_error_enotdir(e):
Check if this exception represents ENOTDIR.

Unfortunately, python is very inconsistent about the exception
here. The cases are:
  1) Linux, Mac OSX all versions seem to set errno == ENOTDIR
  2) Windows, Python2.4, uses errno == ERROR_DIRECTORY (267)
     which is the windows error code.
  3) Windows, Python2.5 uses errno == EINVAL and
     winerror == ERROR_DIRECTORY

:param e: An Exception object (expected to be OSError with an errno
    attribute, but we should be able to cope with anything)
:return: True if this represents an ENOTDIR error. False otherwise.
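
The three cases listed above can be sketched as one predicate. This is an illustration, not the library source; restricting the Windows-specific checks to `sys.platform == "win32"` is an assumption of this sketch.

```python
import errno
import sys

ERROR_DIRECTORY = 267  # Windows error code named in the cases above


def is_error_enotdir(e):
    """True if exception e represents a 'not a directory' failure."""
    en = getattr(e, "errno", None)
    if en == errno.ENOTDIR:
        return True
    if sys.platform == "win32":
        if en == ERROR_DIRECTORY:
            return True
        if en == errno.EINVAL and getattr(e, "winerror", None) == ERROR_DIRECTORY:
            return True
    return False


print(is_error_enotdir(OSError(errno.ENOTDIR, "not a directory")))
print(is_error_enotdir(ValueError("no errno attribute at all")))
```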
def walkdirs(top, prefix=''):
Yield data about all the directories in a tree.

This yields all the data about the contents of a directory at a time.
After each directory has been yielded, if the caller has mutated the list
to exclude some directories, they are then not descended into.

The data yielded is of the form:
((directory-relpath, directory-path-from-top),
[(relpath, basename, kind, lstat, path-from-top), ...]),
 - directory-relpath is the relative path of the directory being returned
   with respect to top. prefix is prepended to this.
 - directory-path-from-top is the path including top for this directory.
   It is suitable for use with os functions.
 - relpath is the relative path within the subtree being walked.
 - basename is the basename of the path
 - kind is the kind of the file now. If unknown then the file is not
   present within the tree - but it may be recorded as versioned. See
   versioned_kind.
 - lstat is the stat data *if* the file was statted.
 - planned, not implemented:
   path_from_tree_root is the path from the root of the tree.

:param prefix: Prefix the relpaths that are yielded with 'prefix'. This
    allows one to walk a subtree but get paths that are relative to a tree
    rooted higher up.
:return: an iterator over the dirs.
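
A simplified sketch of a walker with the yielded shape described above, showing how mutating the yielded list prunes the walk. The traversal order and the `kind_from_mode` helper are assumptions of this sketch, not the bzrlib implementation.

```python
import os
import stat
import tempfile


def kind_from_mode(mode):
    """Map an lstat mode to a kind string (hypothetical helper)."""
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISLNK(mode):
        return "symlink"
    if stat.S_ISREG(mode):
        return "file"
    return "unknown"


def walkdirs(top, prefix=""):
    """Yield ((relpath, path), [(relpath, name, kind, lstat, path), ...])."""
    pending = [(prefix, top)]
    while pending:
        relroot, path = pending.pop()
        entries = []
        for name in sorted(os.listdir(path)):
            full = os.path.join(path, name)
            st = os.lstat(full)
            rel = relroot + "/" + name if relroot else name
            entries.append((rel, name, kind_from_mode(st.st_mode), st, full))
        yield (relroot, path), entries
        # Descend only into directories the caller left in the list.
        for rel, name, kind, st, full in reversed(entries):
            if kind == "directory":
                pending.append((rel, full))


# Usage: skip a directory by removing it from the yielded list in place.
top = tempfile.mkdtemp()
os.makedirs(os.path.join(top, "keep", "sub"))
os.makedirs(os.path.join(top, "skip", "hidden"))
visited = []
for (relpath, path), entries in walkdirs(top):
    visited.append(relpath)
    entries[:] = [entry for entry in entries if entry[1] != "skip"]
print(visited)
```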
def _walkdirs_utf8(top, prefix=''):
Yield data about all the directories in a tree.

This yields the same information as walkdirs() only each entry is yielded in utf-8. On platforms which have a filesystem encoding of utf8 the paths are returned as exact byte-strings.

Returns: yields a tuple of (dir_info, [file_info]).
  dir_info is (utf8_relpath, path-from-top).
  file_info is (utf8_relpath, utf8_name, kind, lstat, path-from-top).
  If top is an absolute path, path-from-top is also an absolute path. path-from-top might be unicode or utf8, but it is the correct path to pass to os functions to affect the file in question (such as os.lstat).
def copy_tree(from_path, to_path, handlers={}):
Copy all of the entries in from_path into to_path.
Parameters:
  from_path: The base directory to copy.
  to_path: The target directory. If it does not exist, it will be created.
  handlers: A dictionary of functions, which takes a source and destinations for files, directories, etc. It is keyed on the file kind, such as 'directory', 'symlink', or 'file'. 'file', 'directory', and 'symlink' should always exist. If they are missing, they will be replaced with 'os.mkdir()', 'os.readlink() + os.symlink()', and 'shutil.copy2()', respectively.
def path_prefix_key(path):
Generate a prefix-order path key for path.

This can be used to sort paths in the same way that walkdirs does.
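
One way to build such a key is to pair each path with its parent directory, so that all entries of a directory sort together. This is a sketch under that assumption and may not match the library's exact key.

```python
import posixpath


def path_prefix_key(path):
    """Sort key that groups a path with the other entries of its directory."""
    return (posixpath.dirname(path), path)


paths = ["a/b", "a", "b", "a/c/d", "a/c"]
print(sorted(paths, key=path_prefix_key))
```

With this key, the top-level entries come first, then the contents of "a", then the contents of "a/c", which is a directory-at-a-time ordering rather than plain lexicographic order.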

def compare_paths_prefix_order(path_a, path_b):
Compare path_a and path_b to generate the same order walkdirs uses.
def get_user_encoding(use_cache=True):
Find out what the preferred user encoding is.

This is generally the encoding that is used for command line parameters and file contents. This may be different from the terminal encoding or the filesystem encoding.

Parameters:
  use_cache: Enable cache for detected encoding. (This parameter is turned on by default, and required only for selftesting.)
Returns: A string defining the preferred user encoding.
def get_host_name():
Return the current unicode host name.

This is meant to be used in place of socket.gethostname() because that behaves inconsistently on different platforms.

def recv_all(socket, bytes):
Receive an exact number of bytes.

Regular Socket.recv() may return less than the requested number of bytes, depending on what's in the OS buffer. MSG_WAITALL is not available on all platforms, but this should work everywhere. This will return less than the requested amount if the remote end closes.

This isn't optimized and is intended mostly for use in testing.
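
The receive loop can be sketched as below, using Python 3 bytes. The parameter names are changed from the signature above to avoid shadowing the `socket` module and the `bytes` type.

```python
import socket


def recv_all(sock, count):
    """Receive exactly count bytes, or fewer if the peer closes first."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            break  # remote end closed the connection
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


left, right = socket.socketpair()
left.sendall(b"hello world")
left.close()
print(recv_all(right, 5))
print(recv_all(right, 100))
right.close()
```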

def send_all(socket, bytes, report_activity=None):
Send all bytes on a socket.

Regular socket.sendall() can give socket error 10053 on Windows. This implementation sends no more than 64k at a time, which avoids this problem.

Parameters:
  report_activity: Call this as bytes are read, see Transport._report_activity.
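
A sketch of chunked sending with the 64k limit described above. The `chunk_size` parameter and the arguments passed to `report_activity` (byte count and the string "write") are assumptions of this sketch.

```python
import socket


def send_all(sock, data, report_activity=None, chunk_size=64 * 1024):
    """Send all of data, handing at most chunk_size bytes to each send()."""
    view = memoryview(data)
    sent_total = 0
    while sent_total < len(data):
        sent = sock.send(view[sent_total:sent_total + chunk_size])
        sent_total += sent
        if report_activity is not None:
            report_activity(sent, "write")


left, right = socket.socketpair()
send_all(left, b"x" * 1000, chunk_size=256)
left.close()
print(len(right.recv(4096)))
right.close()
```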
def dereference_path(path):
Determine the real path to a file.

All parent elements are dereferenced, but the file itself is not dereferenced.

:param path: The original path. May be absolute or relative.
:return: the real path to the file

def supports_mapi():
Return True if we can use MAPI to launch a mail client.
def resource_string(package, resource_name):
Load a resource from a package and return it as a string.

Note: Only packages that start with bzrlib are currently supported.

This is designed to be a lightweight implementation of resource loading in a way which is API compatible with the same API from pkg_resources. See http://peak.telecommunity.com/DevCenter/PkgResources#basic-resource-access. If and when pkg_resources becomes a standard library, this routine can delegate to it.

def file_kind_from_stat_mode_thunk(mode):
Undocumented
def file_kind(f, _lstat=os.lstat):
Undocumented
def until_no_eintr(f, *a, **kw):
Run f(*a, **kw), retrying if an EINTR error occurs.
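
The retry loop can be sketched as follows. Note that since Python 3.5 most standard library calls already retry on EINTR themselves, so a wrapper like this matters mainly on older Pythons.

```python
import errno


def until_no_eintr(f, *a, **kw):
    """Call f(*a, **kw), retrying for as long as it fails with EINTR."""
    while True:
        try:
            return f(*a, **kw)
        except (IOError, OSError) as e:
            if e.errno == errno.EINTR:
                continue
            raise


attempts = {"count": 0}


def flaky():
    """Fail with EINTR twice, then succeed."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise InterruptedError(errno.EINTR, "interrupted system call")
    return "done"


print(until_no_eintr(flaky))
```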
def re_compile_checked(re_string, flags=0, where=''):
Return a compiled re, or raise a sensible error.

This should only be used when compiling user-supplied REs.

Parameters:
  re_string: Text form of regular expression.
  flags: e.g. re.IGNORECASE
  where: Message explaining to the user the context where it occurred, e.g. 'log search filter'.
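
A minimal sketch of wrapping `re.compile` so that a bad user-supplied pattern produces a readable message. `ValueError` and the message wording are stand-ins; the exception the library actually raises is not stated here.

```python
import re


def re_compile_checked(re_string, flags=0, where=""):
    """Compile re_string, turning re.error into a user-facing error."""
    try:
        return re.compile(re_string, flags)
    except re.error as e:
        context = " in " + where if where else ""
        raise ValueError(
            "Invalid regular expression%s: %r: %s" % (context, re_string, e)
        ) from e


pattern = re_compile_checked("fix(es)?", re.IGNORECASE, "log search filter")
print(bool(pattern.search("Fixes bug")))
```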
def getchar 0():
Undocumented
def getchar():
Undocumented
def _local_concurrency 0():
Undocumented
def _local_concurrency 1():
Undocumented
def _local_concurrency 2():
Undocumented
def _local_concurrency 3():
Undocumented
def _local_concurrency 4():
Undocumented
def _local_concurrency():
Undocumented
def local_concurrency(use_cache=True):
Return how many processes can be run concurrently.

Rely on platform specific implementations and default to 1 (one) if anything goes wrong.

def open_file(filename, mode='r', bufsize=-1):
This function is used to override the open builtin.

It uses the O_NOINHERIT flag so that the file handle is not inherited by child processes. As a result, deleting or renaming a closed file that was opened with this function is not blocked by child processes.

API Documentation for Bazaar, generated by pydoctor at 2010-03-19 00:10:14.