l.s.d.postgresql : module documentation

Part of lp.services.database

PostgreSQL specific helper functions, such as database introspection and table manipulation
Function listReferences Return a list of all foreign key references to the given table column
Function listUniques Return a list of unique indexes on table that include the column
Function listSequences Return a list of (schema, sequence, table, column) tuples.
Function check_indirect_references Undocumented
Function generateResetSequencesSQL Return SQL that will reset table sequences to match the data in them.
Function resetSequences Reset table sequences to match the data in them.
Function estimateRowCount Ask the PostgreSQL query optimizer for an estimated rowcount.
Function have_table Is there a table of the given name?
Function table_has_column Does a table of the given name exist and have the given column?
Function drop_tables Drop given tables (a list, one name, or None), if they exist.
Function allow_sequential_scans Allow database to ignore indexes and scan sequentially when it wants?
Function all_tables_in_schema Return a set of all tables in the given schema.
Function all_sequences_in_schema Return a set of all sequences in the given schema.
Function fqn Return the fully qualified name by combining the namespace and name.
Class ConnectionString A libpq connection string.
def listReferences(cur, table, column, indirect=True, _state=None):

Return a list of all foreign key references to the given table column

table and column are both case sensitive strings (so they should usually be lowercase strings as per PostgreSQL default behaviour).

cur is an open DB-API cursor

returns [(from_table, from_column, to_table, to_column, update, delete)]

from entries refer to the to entries. If indirect is True, the this method is recursive - not only does it return all references to the given table column, but also all references to those references etc. ie (indirect references).

update is the update clause (eg. on update cascade) delete is the delete clause (eg. on delete cascade)

Entries are returned in order traversed, so with care this can be used to change keys.

>>> for r in listReferences(cur, 'a', 'aid'):
...     print repr(r)
(u'a', u'selfref', u'a', u'aid', u'a', u'a')
(u'b', u'aid', u'a', u'aid', u'c', u'c')
(u'c', u'aid', u'b', u'aid', u'a', u'a')
(u'd', u'aid', u'b', u'aid', u'a', u'a')

Of course, there might not be any references

>>> listReferences(cur, 'a', 'selfref')
[]
def listUniques(cur, table, column):

Return a list of unique indexes on table that include the column

cur must be an open DB-API cursor.

Returns [ (column, [...]) ]. The column passed in will always be included in the tuple.

Simple UNIQUE index

>>> listUniques(cur, 'b', 'aid')
[(u'aid',)]

Primary keys are UNIQUE indexes too

>>> listUniques(cur, 'a', 'aid')
[(u'aid',)]

Compound indexes

>>> listUniques(cur, 'c', 'aid')
[(u'aid', u'bid')]
>>> listUniques(cur, 'c', 'bid')
[(u'aid', u'bid')]

And any combination

>>> l = listUniques(cur, 'd', 'aid')
>>> l.sort()
>>> l
[(u'aid',), (u'aid', u'bid')]

If there are no UNIQUE indexes using the secified column

>>> listUniques(cur, 'a', 'selfref')
[]
def listSequences(cur):

Return a list of (schema, sequence, table, column) tuples.

table and column refer to the column that appears to be automatically populated from the sequence. They will be None if this sequence is standalone.

>>> for r in listSequences(cur):
...     print repr(r)
(u'public', u'a_aid_seq', u'a', u'aid')
(u'public', u'standalone', None, None)
def check_indirect_references(references):
Undocumented
def generateResetSequencesSQL(cur):
Return SQL that will reset table sequences to match the data in them.
def resetSequences(cur):

Reset table sequences to match the data in them.

Goes through the database resetting the values of sequences to match what is in their corresponding tables, where corresponding tables are known.

>>> cur.execute("SELECT nextval('a_aid_seq')")
>>> int(cur.fetchone()[0])
1
>>> cur.execute("SELECT nextval('a_aid_seq')")
>>> cur.execute("SELECT nextval('a_aid_seq')")
>>> resetSequences(cur)
>>> cur.execute("SELECT nextval('a_aid_seq')")
>>> int(cur.fetchone()[0])
1
def estimateRowCount(cur, query):

Ask the PostgreSQL query optimizer for an estimated rowcount.

Stats will only be acurate if the table has been ANALYZEd recently. With standard Ubuntu installs, the autovacuum daemon does this.

>>> cur.execute("INSERT INTO A (selfref) VALUES (NULL)")
>>> cur.execute("ANALYZE A")
>>> estimateRowCount(cur, "SELECT * FROM A")
1
>>> cur.executemany(
...     "INSERT INTO A (selfref) VALUES (NULL)",
...     [(i,) for i in range(100)]
...     )
>>> cur.execute("ANALYZE A")
>>> estimateRowCount(cur, "SELECT * FROM A")
101
def have_table(cur, table):

Is there a table of the given name?

Returns boolean answer.

>>> have_table(cur, 'thistabledoesnotexist_i_hope')
False
>>> cur.execute("CREATE TEMP TABLE atesttable (x integer)")
>>> have_table(cur, 'atesttable')
True
>>> drop_tables(cur, 'atesttable')
>>> have_table(cur, 'atesttable')
False
def table_has_column(cur, table, column):

Does a table of the given name exist and have the given column?

Returns boolean answer.

>>> cur.execute("CREATE TEMP TABLE atesttable (x integer)")
>>> table_has_column(cur, 'atesttable', 'x')
True
>>> table_has_column(cur, 'atesttable', 'z')
False
>>> table_has_column(cur, 'thistabledoesnotexist_i_hope', 'pphwt')
False
>>> drop_tables(cur, 'atesttable')
>>> table_has_column(cur, 'atesttable', 'x')
False
def drop_tables(cur, tables):

Drop given tables (a list, one name, or None), if they exist.

>>> cur.execute("CREATE TEMP TABLE foo (a integer)")
>>> have_table(cur, 'foo')
True
>>> table_has_column(cur, 'foo', 'a')
True
>>> cur.execute("CREATE TEMP TABLE bar (b varchar)")
>>> have_table(cur, 'bar')
True
>>> cur.execute("INSERT INTO foo values (1)")
>>> cur.execute("INSERT INTO bar values ('hi mom')")
>>> drop_tables(cur, ['thistabledoesnotexist_i_hope', 'foo', 'bar'])
>>> have_table(cur, 'foo')
False
>>> have_table(cur, 'bar')
False
>>> drop_tables(cur, [])    # No explosion
>>> drop_tables(cur, None)  # No wailing sirens
def allow_sequential_scans(cur, permission):

Allow database to ignore indexes and scan sequentially when it wants?

DO NOT USE THIS WITHOUT REVIEW BY A DBA. When you find yourself wanting this function, chances are you're really hiding a bug in your code.

This is an unfortunate hack. In some cases we have found that postgres will resort to costly sequential scans when a perfectly good index is available. Specifically, this happened when we deleted one-third or so of a table's rows without an ANALYZE (as done by autovacuum) on the indexed column(s). Telling the database to regenerate its statistics for one primary-key indexed column costs almost nothing, but it will block for an autovacuum to complete. Autovacuums can take a long time, and currently cannot be disabled temporarily or selectively.

Instead, this function lets us tell the database to ignore the index degradation, and rely on autovacuum to restore it periodically. Pass a True or a False to change the setting for the ongoing database session. Default in PostgreSQL is False, though we seem to have it set to True in some of our databases.

>>> allow_sequential_scans(cur, True)
>>> cur.execute("SHOW enable_seqscan")
>>> print cur.fetchall()[0][0]
on
>>> allow_sequential_scans(cur, False)
>>> cur.execute("SHOW enable_seqscan")
>>> print cur.fetchall()[0][0]
off
def all_tables_in_schema(cur, schema):
Return a set of all tables in the given schema.
ReturnsA set of quoted, fully qualified table names.
def all_sequences_in_schema(cur, schema):
Return a set of all sequences in the given schema.
ReturnsA set of quoted, fully qualified table names.
def fqn(namespace, name):

Return the fully qualified name by combining the namespace and name.

Quoting is done for the non trivial cases.

>>> print fqn('public', 'foo')
public.foo
>>> print fqn(' foo ', '$bar')
" foo "."$bar"
API Documentation for Launchpad, generated by pydoctor at 2022-06-16 00:00:12.