The reference implementation

The u1db reference implementation is written in Python, with a SQLite back end. It can be used as a real working implementation by Python code. It is also used to document and test how u1db should work; it has a comprehensive test suite. Implementation authors should port the u1db reference test suite in order to test that their implementation is correct; in particular, sync conformance is defined as being able to sync with the reference implementation.

Fetch the source with bzr branch lp:u1db, or download it from Launchpad.

To open a new database, use u1db.open:

u1db.open(path, create)

Open a database at the given location.

Raises u1db.errors.DatabaseDoesNotExist if create is False and the database does not already exist.

Parameters:
  • path – The filesystem path for the database to open.
  • create – If True, create the database when it does not already exist.
Returns:

An instance of Database.
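
The create flag can be illustrated without u1db installed. The following is only a sketch of the contract described above, using the standard sqlite3 module: open_database and the local DatabaseDoesNotExist class are hypothetical stand-ins, not the reference implementation's code.

```python
import os
import sqlite3
import tempfile


class DatabaseDoesNotExist(Exception):
    """Local stand-in for u1db.errors.DatabaseDoesNotExist (illustration only)."""


def open_database(path, create):
    """Hypothetical helper mirroring the documented open(path, create) contract."""
    if not os.path.exists(path) and not create:
        raise DatabaseDoesNotExist(path)
    conn = sqlite3.connect(path)
    # Write something so the file is certain to exist on disk afterwards.
    conn.execute("CREATE TABLE IF NOT EXISTS marker (id INTEGER)")
    conn.commit()
    return conn


path = os.path.join(tempfile.mkdtemp(), "example.db")

# Opening a missing database with create=False is an error.
try:
    open_database(path, create=False)
    missing_raised = False
except DatabaseDoesNotExist:
    missing_raised = True

# With create=True the database is created; after that, create=False works.
open_database(path, create=True).close()
open_database(path, create=False).close()
```

With the real library, the equivalent calls would be u1db.open(path, create=True) and u1db.open(path, create=False).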

Opening returns a Database object:

class u1db.Database

A JSON Document data store.

This data store can be synchronized with other u1db.Database instances.

close()

Release any resources associated with this database.

create_doc(content, doc_id=None)

Create a new document.

You can optionally specify the document identifier, but a document with that identifier must not already exist. See put_doc if you want to overwrite an existing document.

Parameters:
  • content – The JSON document string.
  • doc_id – An optional identifier specifying the document id.
Returns:

A Document.

create_index(index_name, index_expression)

Create a named index, which can then be queried for future lookups. Creating an index which already exists is not an error, and is cheap. Creating an index whose expressions do not match those of the existing index with the same name is an error. Creating an index will block until the expressions have been evaluated and the index generated.

Parameters:
  • index_name – A unique name which can be used as a key prefix.
  • index_expression – A list of index expressions defining the index information. Examples: ["field"] to index alphabetically sorted on field, or ["number(field, bits)", "lower(field)", "field.subfield"].
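
To make the expression syntax concrete, here is a rough sketch of how a single expression might be evaluated against a document. The evaluate_expression helper is hypothetical and handles only plain fields, dotted subfields and lower(...). It leaves out number(field, bits), and how the reference implementation treats missing or non-string values is an assumption here.

```python
import json


def evaluate_expression(expression, doc):
    """Return the index key for one expression, or None if it cannot be computed.

    Hypothetical helper: supports "field", "field.subfield" and "lower(field)".
    """
    transform = None
    if expression.startswith("lower(") and expression.endswith(")"):
        expression = expression[len("lower("):-1]
        transform = str.lower
    value = doc
    # Walk dotted paths such as "address.city" one level at a time.
    for part in expression.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    if not isinstance(value, str):
        # Assumption: only string values are indexed in this sketch.
        return None
    return transform(value) if transform else value


doc = json.loads('{"name": "Alice", "address": {"city": "Lisbon"}}')
keys = [evaluate_expression(e, doc) for e in ("name", "lower(name)", "address.city")]
print(keys)  # ['Alice', 'alice', 'Lisbon']
```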

delete_doc(doc)

Mark a document as deleted. Will abort if the current revision doesn’t match doc.rev.

delete_index(index_name)

Remove a named index.

Parameter: index_name – The name of the index we are removing.
Returns: None

get_doc(doc_id)

Get the document with the given identifier.

Parameter: doc_id – The unique document identifier.
Returns: A Document object.

get_doc_conflicts(doc_id)

Get the list of conflict texts for the given document.

The order of the conflicts is such that the first entry is the value that would be returned by get_doc.

Returns: [(doc_rev, doc)], a list of tuples of the revision for the content and the JSON string of the content.

get_docs(doc_ids, check_for_conflicts=True)

Get the JSON content for many documents.

Parameters:
  • doc_ids – A list of document identifiers.
  • check_for_conflicts – If set to False, the conflict check will be skipped, and None will be returned for the conflict status instead of True/False.
Returns:

A list of Document objects, one for each document id, in the same order as doc_ids.

get_from_index(index_name, key_values)

Return documents that match the keys supplied.

You must supply exactly as many values as the index has fields. It is possible to do a prefix match by using '*' to indicate a wildcard match. You can only supply '*' to trailing entries (e.g. [('val', '*', '*')] is allowed, but [('*', 'val', 'val')] is not). It is also possible to append a '*' to the last supplied value (e.g. [('val*', '*', '*')] or [('val', 'val*', '*')], but not [('val*', 'val', '*')]).

Parameters:
  • index_name – The index to query.
  • key_values – A list of tuples of values to match. E.g., if you have an index with 3 fields, then you would have: [(x-val1, x-val2, x-val3), (y-val1, y-val2, y-val3), ...].
Returns:

A list of Document objects.
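
The wildcard rules are easy to get wrong, so here is a small sketch that encodes them. Both is_valid_key and matches are hypothetical helpers written for this illustration; they are not part of u1db and only model the rules as stated in this section.

```python
def is_valid_key(key, num_fields):
    """Check one key tuple against the wildcard rules described above."""
    if len(key) != num_fields:
        return False
    seen_wildcard = False  # True once a '*' or 'val*' entry has been seen
    for value in key:
        if value == "*":
            seen_wildcard = True
        elif seen_wildcard:
            # An exact or prefix value may not follow a wildcard or a prefix.
            return False
        elif value.endswith("*"):
            seen_wildcard = True
    return True


def matches(key, indexed_values):
    """Return True if the indexed values of a document satisfy the key tuple."""
    for pattern, actual in zip(key, indexed_values):
        if pattern == "*":
            continue
        if pattern.endswith("*"):
            if not actual.startswith(pattern[:-1]):
                return False
        elif pattern != actual:
            return False
    return True


print(is_valid_key(("val", "val*", "*"), 3))   # True
print(is_valid_key(("val*", "val", "*"), 3))   # False
print(matches(("lis*", "*"), ("lisbon", "pt")))  # True
```
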
get_sync_generation(other_replica_uid)

Return the last known database generation of the other db replica.

When you do a synchronization with another replica, the Database keeps track of what generation the other database replica was at. This way we only have to request data that is newer.

Parameter: other_replica_uid – The identifier for the other replica.
Returns: The generation we encountered during synchronization. If we’ve never synchronized with the replica, this is 0.
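
The bookkeeping behind this call can be pictured as a per-replica mapping. The sketch below is a toy in-memory model: the SyncGenerationLog class is hypothetical, and the reference implementation stores this information in its SQLite back end instead. It only illustrates the documented behaviour, including the default of 0 for a replica never synchronized with.

```python
class SyncGenerationLog:
    """Toy model of the last-known generation of each other replica."""

    def __init__(self):
        self._generations = {}  # other_replica_uid -> last known generation

    def get_sync_generation(self, other_replica_uid):
        # 0 means "never synchronized with this replica".
        return self._generations.get(other_replica_uid, 0)

    def set_sync_generation(self, other_replica_uid, other_generation):
        self._generations[other_replica_uid] = other_generation


log = SyncGenerationLog()
first = log.get_sync_generation("replica-b")   # never synced, so 0
log.set_sync_generation("replica-b", 7)        # recorded after a sync
second = log.get_sync_generation("replica-b")  # next sync asks only for changes after 7
print(first, second)  # 0 7
```
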
get_sync_target()

Return a SyncTarget object, for another u1db to synchronize with.

Returns: An instance of SyncTarget.

list_indexes()

List the definitions of all known indexes.

Returns: A list of [('index-name', ['field', 'field2'])] definitions.

put_doc(doc)

Update a document. If the document currently has conflicts, put will fail.

Parameter: doc – A Document with new content.
Returns: new_doc_rev – The new revision identifier for the document. The Document object will also be updated.

put_doc_if_newer(doc, save_conflict, replica_uid=None, replica_gen=None)

Insert/update document into the database with a given revision.

This API is used during synchronization operations.

If a document would conflict and save_conflict is set to True, the content will be selected as the ‘current’ content for doc.doc_id, even though doc.rev doesn’t supersede the currently stored revision. The currently stored document will be added to the list of conflict alternatives for the given doc_id.

This forces the new content to be ‘current’ so that we get convergence after synchronizing, even if people don’t resolve conflicts. Users can then notice that their content is out of date, update it, and synchronize again. (The alternative is that users could synchronize and think the data has propagated, but their local copy looks fine, and the remote copy is never updated again.)

Parameters:
  • doc – A Document object
  • save_conflict – If this document conflicts with the stored one, whether to save it as a conflict (True) or ignore it (False).
  • replica_uid – A unique replica identifier.
  • replica_gen – The generation of the replica corresponding to this document. The replica arguments are optional, but are used during synchronization.
Returns:

state – If we don’t have doc_id already, or if doc_rev supersedes the existing document revision, then the content will be inserted, and state is 'inserted'. If doc_rev is older than the existing revision, the put is ignored and state is 'superseded'; if doc_rev is equal to the existing revision, the put is ignored and state is 'converged'. If doc_rev neither supersedes nor is superseded by the existing revision, then state is 'conflicted', and the document will not be inserted if save_conflict is False.
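
The four states can be sketched as a comparison of two revisions. This sketch assumes revisions are vector clocks written as 'replica_uid:counter' pairs joined by '|'; that encoding and the parse_rev, is_newer and put_state helpers are assumptions made for illustration. The effect of save_conflict on what gets stored is not modelled.

```python
def parse_rev(rev):
    """Parse 'uid:counter|uid:counter' into a dict (assumed encoding)."""
    clock = {}
    for part in rev.split("|"):
        uid, counter = part.split(":")
        clock[uid] = int(counter)
    return clock


def is_newer(a, b):
    """True if clock a has seen everything in b and differs from it."""
    return a != b and all(a.get(uid, 0) >= counter for uid, counter in b.items())


def put_state(new_rev, stored_rev):
    """Return the state string for a new revision versus the stored one."""
    if stored_rev is None:
        return "inserted"      # doc_id not present yet
    new, stored = parse_rev(new_rev), parse_rev(stored_rev)
    if new == stored:
        return "converged"     # same revision on both sides
    if is_newer(new, stored):
        return "inserted"      # new revision supersedes the stored one
    if is_newer(stored, new):
        return "superseded"    # stored revision is already newer
    return "conflicted"        # neither revision supersedes the other


print(put_state("alice:2", "alice:1"))        # inserted
print(put_state("alice:1", "alice:2"))        # superseded
print(put_state("alice:1|bob:1", "alice:2"))  # conflicted
```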

resolve_doc(doc, conflicted_doc_revs)

Mark a document as no longer conflicted.

We take the list of revisions that the client knows about that it is superseding. This may be a different list from the actual current conflicts, in which case only the supplied revisions are removed from the conflict list. This may fail if the conflict list is significantly different from the supplied information. (A sync could have happened in the background between the time you call get_doc_conflicts and the point where you call resolve_doc.)

Parameters:
  • doc – A Document with the new content to be inserted.
  • conflicted_doc_revs – A list of revisions that the new content supersedes.
Returns:

None. doc will be updated with the new revision and has_conflict flags.

set_sync_generation(other_replica_uid, other_generation)

Set the last-known generation for the other database replica.

We have just performed some synchronization, and we want to track what generation the other replica was at. See also get_sync_generation.

Parameters:
  • other_replica_uid – The U1DB identifier for the other replica.
  • other_generation – The generation number for the other replica.
Returns:

None

sync(url)

Synchronize documents with remote replica exposed at url.

whats_changed(old_generation)

Return a list of documents that have changed since old_generation. This allows apps to store only a db generation before going 'offline'; when coming back online they can use this data to update whatever extra data they are storing.

Parameter: old_generation – The generation of the database in the old state.
Returns: (cur_generation, [(doc_id, generation), ...]) The current generation of the database, and a list of the documents changed since old_generation. Each document is represented by a tuple of its doc_id and the generation of its most recent change, and the list is sorted by generation.
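
The shape of the return value may be easier to see with a toy model. The ChangeLog class and its record_change method below are hypothetical and kept in memory. The sketch assumes that every change bumps the database generation by one and that only the latest change of each document is reported, which is one reading of the description above.

```python
class ChangeLog:
    """Toy model of a database generation counter and per-document changes."""

    def __init__(self):
        self._generation = 0
        self._last_change = {}  # doc_id -> generation of its latest change

    def record_change(self, doc_id):
        # Assumption: each create/update/delete advances the generation by one.
        self._generation += 1
        self._last_change[doc_id] = self._generation

    def whats_changed(self, old_generation):
        changes = sorted(
            ((doc_id, gen) for doc_id, gen in self._last_change.items()
             if gen > old_generation),
            key=lambda item: item[1],
        )
        return self._generation, changes


db = ChangeLog()
db.record_change("doc-a")   # generation 1
db.record_change("doc-b")   # generation 2
db.record_change("doc-a")   # generation 3, replaces doc-a's earlier entry

print(db.whats_changed(0))  # (3, [('doc-b', 2), ('doc-a', 3)])
print(db.whats_changed(2))  # (3, [('doc-a', 3)])
```

An app that stored generation 2 before going offline would, in this model, learn on return that only doc-a needs refreshing.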
