Project

General

Profile

Model » History » Version 9

« Previous - Version 9/28 (diff) - Next » - Current version
Elmer de Looff, 2011-09-21 18:17


Database abstraction model

Goal of this component

The µWeb framework provides a model module with the intention of simplifying database access. The design goal is to provide a rich abstraction that
  • takes away the tedious work of retrieving, creating and deleting records
  • can load its parent objects automatically if so required
  • does not get in the way of the developer

Making database interaction easier without restricting the abilities of the developer is our main goal. Some default mechanisms make assumptions on the way the database is organised, but these are well-documented, and it's entirely possible to change the behavior of these mechanisms.

Using the Record

The basic idea of the Record class is that it is a container for your database records, with related records automatically loaded as needed, and custom methods that provide more info, child objects, etc. Outlined below are the default features available, with minimal configuration requirements.

Basic Record usage

There are a few ways to use the Record class. The direct way to create a Record is to initiate it with a connection, and a dictionary of field -> value information. The Record is a dictionary subclass that largely copies all the functionality of a dictionary. Retrieving values for keys works exactly as you'd expect.

Creating your own Record

To create your own Record subclass, nothing is required beyond the class name. The following example substitutes a complete working example:

from underdark.libs.uweb import model
class Message(model.Record):
  """Abstraction class for messages stored in the database.""" 

Primary field definition

The Record requires that a table has a single-field unique column. It's advisable for this to be a PRIMARY index in the database, though this is not required. This field is used to automatically look up a record if it is referenced and requested elsewhere.

By default, this primary key field is assumed to be ID. If this is not the case for your table, you can easily change this by defining the _PRIMARY_KEY class constant:

from underdark.libs.uweb import model
class Country(model.Record):
  """Abstraction class for a country table.

  This class uses the ISO-3166-1 alpha2 country code as primary key.
  """ 
  _PRIMARY_KEY = 'alpha2'

Class and table relation

By default, the assumption is made that the table name is the same as the class name, with the first letter lowercase. The table related to the class Message would be message. To change this behavior, assign the correct table name to the _TABLE class constant. This new table name will then be used in all built-in Record methods:

from underdark.libs.uweb import model
class Message(model.Record):
  """Abstraction class for messages stored in the database.""" 
  _TABLE = 'MyMessage'

Record initialization

Initializing a Record object requires a database connection as first argument, and a dictionary with the record's data as second argument. This second argument can, alternatively, be an iterator of key+value tuples.

from underdark.libs.uweb import model
class Message(model.Record):
  """Abstraction class for messages stored in the database.""" 

# Caller side:
>>> record = {'ID': 1, 'message': 'First message!', 'author': 'Elmer'}
>>> message = Message(db_conn, record)
>>> print message
Message({'message': 'First message!', 'ID': 1, 'author': 'Elmer'})

This basic construction is rarely needed in code using the Record objects, but is important for alternative initializers, of which one is provided by default:

Alternative initializer: create Record from primary key

On the caller side, it's impractical to first query the database, and then instantiate a Record subclass from that. Alternative initializers provide a solution without requiring module-level functions that have poor cohesion to the relevant class. Alternative initializers are classmethods, working not on instance, but aiming to create and return one.

There is one such alternative initializer provided: FromKey, which loads a record from the database based on its primary key. Required for this to function are two arguments: A database connection, and the value for the primary key field:

from underdark.libs.uweb import model
class Message(model.Record):
  """Abstraction class for messages stored in the database.""" 

# Caller side:
>>> message = Message.FromKey(db_conn, 1)
>>> print message
Message({'message': u'First message!', 'ID': 1L, 'author': 'Elmer'})
# Unicode and long integer are side effects from the database read, not the Record class

On-demand loading of referenced records.

In databases that are more complex than a single table, information is often normalized. That is, the author information in our previously demonstrated message table will be stored in a separate author table. The author field on message records will be a reference to a record in the author table.

Consider the following tables in your database:

-- TABLE `message`
+----+--------+--------------------------------------------------+
| ID | author | message                                          |
+----+--------+--------------------------------------------------+
|  1 |      1 | First message!                                   |
|  2 |      2 | Robert'); DROP TABLE Students;--                 |
|  3 |      1 | You didn't think it would be this easy, did you? |
+----+--------+--------------------------------------------------+

-- TABLE `author`
+----+-------+--------------------+
| ID | name  | emailAddress       |
+----+-------+--------------------+
|  1 | Elmer | elmer@underdark.nl |
|  2 | Bobby | bobby@tables.com   |
+----+-------+--------------------+

And the following class definitions in Python:

from underdark.libs.uweb import model
class Author(model.Record):
  """Abstraction class for author records.""" 

class Message(model.Record):
  """Abstraction class for messages records.""" 

This makes it possible to retrieve a message, and from that Message object, retrieve the author information. This is done when the information is requested, and not pre-loaded beforehand. This means that retrieving a thousand Message objects will not trigger an additional 1000 queries to retrieve the author information, if that information might not be used at all.

>>> message = Message.FromKey(db_connection, 1)
>>> message
Message({'message': u'First message!', 'ID': 1L, 'author': 1})
# This is the same message we saw before, without author information.
# However, retrieving the author field specifically, provides its record:
>>> message['author']
Author({'emailAddress': u'elmer@underdark.nl', 'ID': 1, 'name': u'Elmer'})
>>> message
Message({'message': u'First message!', 'ID': 1L,
         'author': Author({'emailAddress': u'elmer@underdark.nl', 'ID': 1, 'name': u'Elmer'})})

This works on the assumption that any field name, that is also the table name of another Record class, is a reference to that record. In the case of the example above: The message table contains a field author. There exists a Record subclass for that table (namely Author, table 'author'). The value of message['author'] (1), is now used to load an Author record using the FromKey alternative initializer, with 1 as the primary key value.

  1. message['author'] uses the author field
  2. author table is abstracted by Author class
  3. message['author'] is replaced by Author.FromKey(db_connection, message['author']

This behavior can be modified using the _FOREIGN_RELATIONS class constant. This provides a mapping that specifies (and overrides) which Record classes should be used to resolve references from fields. The key for the mapping is a field name (string), and the corresponding value is a class or None. None specifies that the field does not represent a reference, and should be used as-is. Classes may be given as string because at the time of evaluation, not all classes exist, and attempting using a class directly might result in a NameError. Without this provision, the order of classes would be dictated by the model, and cross-references would not be possible at all.

An example case for a situation where the table names are plural, but the field names are singular:

from underdark.libs.uweb import model
class Author(model.Record):
  """Abstraction class for author records.""" 
  _TABLE = 'authors'

class Message(model.Record):
  """Abstraction class for messages records.""" 
  _TABLE = 'messages'
  _FOREIGN_RELATIONS = {'author': Author}

Loading child objects (1-to-n relations)

The model provides a generic method to retrieve child records (that is, 1 to n relations) of a Record. The desired relations should have an associated Record class. The method to use is _GetChildren, which is a private method of any Record class. As its argument, it needs the name of a child class. Returned is an iterator that yields instances of the given Record subclass.

Given its name and usage, the suggested usage of this is to wrap a more descriptive method around this:

from underdark.libs.uweb import model
class Author(model.Record):
  """Abstraction class for author records.""" 
  def Messages(self):
    """Returns an iterator for all messages written by this author.""" 
    return self._GetChildren(Message)

class Message(model.Record):
  """Abstraction class for messages records.""" 

# Caller code
>>> elmer = Author.FromKey(db_connection, 1)
>>> for message in elmer.Messages():
...   print message
Message({'message': u'First message!', 'ID': 1L,
         'author': Author({'emailAddress': u'elmer@underdark.nl', 'ID': 1, 'name': u'Elmer'})})
Message({'message': u"You didn't think it would be this easy, did you?", 'ID': 3L,
         'author': Author({'emailAddress': u'elmer@underdark.nl', 'ID': 1, 'name': u'Elmer'})})
# Reflowing to keep things legible

What you can see here is that all messages written by the given author are retrieved from the database, and presented. This is done with a single database query, where the child Record's table is searched for rows where the relation_field is equal to the parent Record's primary key value. This relation_field is an optional argument to the _GetChildren method, and defaults to the class' table name.
N.B. print and the methods (iter)items, (iter)values all cause the object's foreign relations to be retrieved.

An example with pluralized table names:

class Author(model.Record):
  """Abstraction class for author records.""" 
  _TABLE = 'authors'

  def Messages(self):
    """Returns an iterator for all messages written by this author.""" 
    return self._GetChildren(Message, relation_field='author')

class Message(model.Record):
  """Abstraction class for messages records.""" 
  _TABLE = 'messages'
  _FOREIGN_RELATIONS = {'author': Author}