Version 43 (Elmer de Looff, 2012-06-01 13:27) → Version 44/56 (Elmer de Looff, 2012-06-01 13:28)

h1. TemplateParser

{{>toc}}

The µWeb TemplateParser is an in-house developed templating engine that provides tag replacement, tag functions and template control functions. This document describes the following:
* *[[TemplateParser#using|Using TemplateParser]]* inside a µWeb PageMaker
* The *[[TemplateParser#template|Template class]]*, used to parse the templating language
* The *[[TemplateParser#parser|Parser class]]*, which provides template loading and caching
* *[[TemplateParser#syntax|Template syntax]]*, an overview of the language's constructs and behaviors

First though, to help with understanding the TemplateParser, here is a minimal template document:

<pre><code class="html">
Hello [title] [name]
</code></pre>

The above document contains two simple template tags. These tags are delimited by square brackets, and they will be replaced by the named argument provided during parsing. If this name is not present, then the literal presentation of the tag will remain in the output.
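
To make the replacement rule concrete, here is a minimal sketch in plain Python. It is not the TemplateParser implementation: @replace_tags@ and the @TAG@ pattern are hypothetical names chosen for this illustration, and the sketch deliberately leaves out the HTML escaping that TemplateParser applies by default.

<pre><code class="python">
import re

# Assumption for this sketch: a tag is a run of word characters in square brackets.
TAG = re.compile(r'\[(\w+)\]')


def replace_tags(template, **replacements):
    """Replaces known tags; tags without a replacement stay in the output."""
    def substitute(match):
        name = match.group(1)
        if name in replacements:
            return str(replacements[name])
        return match.group(0)  # No replacement given: keep the literal tag.
    return TAG.sub(substitute, template)


greeting = replace_tags('Hello [title] [name]', title='sir')
print(greeting)  # Hello sir [name]
</code></pre>

Only @title@ is provided, so @[name]@ is left in the output exactly as it was written.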

h1(#using). Using TemplateParser inside µWeb

Within the default µWeb @PageMaker@, there is a @parser@ property, which provides a [[TemplateParser#parser|Parser]] object. The class constant @TEMPLATE_DIR@ provides the template search directory. The default template directory is @'templates'@. *N.B.* This path is relative to the file that contains the PageMaker class.

An example of using TemplateParser to create a complete response:
<pre><code class="python">
import uweb
import time

class PageMaker(uweb.PageMaker):
  def VersionPage(self):
    return self.parser.Parse(
        'version.utp', year=time.strftime('%Y'), version=uweb.__version__)
</code></pre>

The example template for the above file could look something like this:

<pre><code class="html">
<!DOCTYPE html>
<html>
<head>
<title>µWeb version info</title>
</head>
<body>
<p>µWeb version [version] - Copyright 2010-[year] Underdark</p>
</body>
</html>
</code></pre>

And would result in the following output:

<pre><code class="html">
<!DOCTYPE html>
<html>
<head>
<title>µWeb version info</title>
</head>
<body>
<p>µWeb version 0.12 - Copyright 2010-2012 Underdark</p>
</body>
</html>
</code></pre>

With these initial small demonstrations behind us, let's explore the @TemplateParser@ further.

h1(#template). Template class

The @Template@ class provides the interface for pre-parsing templates, loading them from files and parsing single templates to completion. During pre-parsing, constructs such as loops and conditional statements are converted to @TemplateLoop@ and @TemplateConditional@ objects, and their scopes nested appropriately in the @Template@. Tags are replaced by @TemplateTag@ instances, and text is captured in @TemplateText@. All of these provide @Parse@ methods, which together result in the combined parsed template output.

h2. Creating a template

A template is created simply by providing a string input to the @Template@'s constructor. This will return a valid Template instance (or raise an error if there is a problem with the [[TemplateParser#syntax|syntax]]):

<pre><code class="python">
>>> import templateparser
>>> template = templateparser.Template('Hello [title] [name]')
>>> template
Template([TemplateText('Hello '), TemplateTag('[title]'), TemplateText(' '), TemplateTag('[name]')])
</code></pre>

The representation above shows the various parts of the template, which are combined into the output once parsed.

h2. Loading a template from file

The @Template@ class provides a @classmethod@ called @FromFile@, which loads the template at the given path.

Loading a template named @example.utp@ from the current working directory:

<pre><code class="python">
>>> import templateparser
>>> template = templateparser.Template.FromFile('example.utp')
>>> template
Template([TemplateText('Hello '), TemplateTag('[title]'), TemplateText(' '), TemplateTag('[name]')])
</code></pre>

h2. Parsing a template

Parsing a template can be done by calling the @Template@'s @Parse@ method. The keyword arguments provided to this call will form the replacement mapping for the template. In the following example, we will provide one such keyword, and leave the other undefined to show the (basic) behavior of the @Template.Parse@ method.

<pre><code class="python">
>>> import templateparser
>>> template = templateparser.Template('Hello [title] [name]')
>>> template.Parse(title='sir')
'Hello sir [name]'
</code></pre>
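
The relationship between pre-parsing and parsing can be sketched in a few lines of plain Python. This is an illustration only, not the actual @Template@ code: @split_template@ and @render@ are hypothetical helpers, and plain tuples stand in for the @TemplateText@ and @TemplateTag@ objects.

<pre><code class="python">
import re

# The capturing group makes split() keep the tags in its result.
TAG = re.compile(r'(\[\w+\])')


def split_template(raw):
    """Pre-parse step: splits raw text into ('text', ...) and ('tag', ...) parts."""
    parts = []
    for piece in TAG.split(raw):
        if not piece:
            continue  # split() yields empty strings next to leading/trailing tags.
        kind = 'tag' if TAG.fullmatch(piece) else 'text'
        parts.append((kind, piece))
    return parts


def render(parts, **replacements):
    """Parse step: every part contributes its own piece of the output."""
    output = []
    for kind, piece in parts:
        if kind == 'tag':
            # piece[1:-1] strips the brackets; unknown tags are kept verbatim.
            piece = str(replacements.get(piece[1:-1], piece))
        output.append(piece)
    return ''.join(output)


parts = split_template('Hello [title] [name]')
print(parts)
# [('text', 'Hello '), ('tag', '[title]'), ('text', ' '), ('tag', '[name]')]
print(render(parts, title='sir'))  # Hello sir [name]
</code></pre>

Splitting once and rendering many times is what makes pre-parsed templates cheap to reuse with different replacements.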

h1(#parser). Parser class

The @Parser@ class provides simple management of multiple @Template@ objects. It is mainly used to load templates from disk. When instantiating a @Parser@, the first argument provides the search path from where templates should be loaded (the default is the current working directory). An optional second argument can be provided to preload the template cache: a mapping of names and @Template@ objects.

h2. Loading templates

Creating a parser object, and loading the 'example.utp' template from the 'templates' directory works like this:

<pre><code class="python">
>>> import templateparser
>>> # This sets the 'templates' directory as the search path for AddTemplate
>>> parser = templateparser.Parser('templates')
>>> # Loads the 'templates/example.utp' and stores it as 'example.utp':
>>> parser.AddTemplate('example.utp')
>>> parser.Parse('example.utp', title='mister', name='Bob Dobalina')
'Hello mister Bob Dobalina'
</code></pre>

The @AddTemplate@ method takes a second optional argument, which allows us to give the template a different name in the cache:

<pre><code class="python">
>>> parser = templateparser.Parser('templates')
>>> parser.AddTemplate('example.utp', name='greeting')
>>> parser.Parse('greeting', title='mister', name='Bob Dobalina')
'Hello mister Bob Dobalina'
</code></pre>

As you can see, the name of the template in the cache does not have to match the file name on disk. Often there is no need to change it, so @AddTemplate@ can be called with just one argument. Or not at all, as the following section will show.

h2. Template cache and auto-loading

The @Parser@ object behaves like a slightly modified dictionary to achieve this. Retrieving keys yields the associated template. Keys that are not present in the cache are _automatically_ retrieved from the filesystem:

<pre><code class="python">
>>> import templateparser
>>> parser = templateparser.Parser('templates')
>>> 'example.utp' in parser
False # Since we haven't loaded it, the template is not in the parser
>>> parser
Parser({}) # The parser is empty (has no cached templates)
</code></pre>

Attempting to parse a template that doesn't exist in the parser cache triggers an automatic load:

<pre><code class="python">
>>> parser['example.utp'].Parse(title='mister', name='Bob Dobalina')
'Hello mister Bob Dobalina'
>>> 'example.utp' in parser
True
>>> parser
Parser({'example.utp': Template([TemplateText('Hello '), TemplateTag('[title]'),
TemplateText(' '), TemplateTag('[name]')])})
</code></pre>

If the requested template cannot be found on the filesystem either, @TemplateReadError@ is raised:

<pre><code class="python">
>>> import templateparser
>>> parser = templateparser.Parser('templates')
>>> parser['bad_template.utp'].Parse(failure='imminent')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/var/lib/underdark/libs/uweb/templateparser.py", line 147, in __getitem__
    self.AddTemplate(template)
  File "/var/lib/underdark/libs/uweb/templateparser.py", line 171, in AddTemplate
    raise TemplateReadError('Could not load template %r' % template_path)
underdark.libs.uweb.templateparser.TemplateReadError: Could not load template 'templates/bad_template.utp'
</code></pre>
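
The auto-loading behavior can be approximated with a @dict@ subclass. The sketch below is a guess at the general technique, not the actual @Parser@ code: @AutoLoadingCache@ is a hypothetical name, @TemplateReadError@ is redefined locally as a stand-in, the cache stores raw strings instead of @Template@ objects, and the syntax targets Python 3.

<pre><code class="python">
import os
import tempfile


class TemplateReadError(Exception):
    """Local stand-in for the error named in the text."""


class AutoLoadingCache(dict):
    """Dictionary that loads missing keys from files below a search path."""

    def __init__(self, path='.'):
        super().__init__()
        self.path = path

    def __missing__(self, name):
        # dict.__getitem__ calls this hook only for keys that are not cached.
        template_path = os.path.join(self.path, name)
        try:
            with open(template_path) as template_file:
                self[name] = template_file.read()
        except IOError:
            raise TemplateReadError('Could not load template %r' % template_path)
        return self[name]


# Demonstration with a throwaway template directory.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, 'example.utp'), 'w') as handle:
    handle.write('Hello [title] [name]')

cache = AutoLoadingCache(demo_dir)
cached_before = 'example.utp' in cache  # False: membership tests never load.
loaded = cache['example.utp']           # Key access triggers the load.
cached_after = 'example.utp' in cache   # True: the template is now cached.
</code></pre>

Because @in@ uses @__contains__@ rather than @__getitem__@, checking for a template never touches the filesystem, which matches the behavior shown in the examples above.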

h2. @Parse@ and @ParseString@ methods

For convenience and consistency, the @Parser@ comes with two handy methods to provide parsing of @Template@ objects, one from its cache, one from raw template strings. It is recommended to use these over the previously shown direct key-based access:

<pre><code class="python">
>>> import templateparser
>>> parser = templateparser.Parser('templates')
>>> parser.Parse('example.utp', title='mister', name='Bob Dobalina')
'Hello mister Bob Dobalina'
>>> parser.ParseString('Hello [title] [name]', title='mister', name='Bob Dobalina')
'Hello mister Bob Dobalina'
</code></pre>

h1(#syntax). Templating language syntax

The templating syntax is relatively limited, yet it provides a flexible and rich system for creating templates. These examples cover:
* Simple tags (used in various examples above)
* Tag indexing
* Tag functions
* Template language constructs

All examples will consist of three parts:
# The example template
# The Python invocation string (the template will be named 'example.utp')
# The resulting output (as source, not as parsed HTML)

h2. Simple tags

This is an example of the most basic form of template tags. The tag is enclosed by square brackets as such: @[tag]@. Tags that match a provided argument to the Parse call get replaced. If there is no argument that matches the tag name, it is returned in the output verbatim. This is also demonstrated in the example below.

The example below repeats the earlier example of using TemplateParser inside µWeb, extended with a tag that has no matching argument, and shows the template result:

<pre><code class="html">
<!DOCTYPE html>
<html>
<head>
<title>µWeb version info</title>
</head>
<body>
<p>µWeb version [version] - Copyright 2010-[year] Underdark</p>
<p>
This [paragraph] is not replaced because there is no
paragraph argument provided to the parser.
</p>
</body>
</html>
</code></pre>

<pre><code class="python">
>>> parser.Parse('version.utp', year=time.strftime('%Y'), version=uweb.__version__)
</code></pre>

<pre><code class="html">
<!DOCTYPE html>
<html>
<head>
<title>µWeb version info</title>
</head>
<body>
<p>µWeb version 0.11 - Copyright 2010-2012 Underdark</p>
<p>
This [paragraph] is not replaced because there is no
paragraph argument provided to the parser.
</p>
</body>
</html>
</code></pre>

h3. Tag characters

Tag names are created from the same characters as valid Python variable names. This means they can contain upper and lower case letters, numbers and underscores. In regex terms, a tag should match @\w+@.

*N.B.:* Some names are illegal in Python as variable names but valid as tag names (tag names may start with a number). If you need such names, you can pass the replacements as a dictionary and unpack it into the call using @**@.
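
The dictionary unpacking mentioned above is standard Python behavior and can be shown without TemplateParser itself. In this sketch, @collect@ is a hypothetical stand-in for any function that accepts keyword arguments, such as a @Parse@ call; the syntax targets Python 3.

<pre><code class="python">
import re


def collect(**replacements):
    """Returns the keyword arguments it received, as a Parse call would see them."""
    return replacements


tag_name = '1st'

# The name matches the tag pattern \w+ ...
is_valid_tag = re.fullmatch(r'\w+', tag_name) is not None  # True
# ... but it is not a valid Python identifier.
is_valid_python_name = tag_name.isidentifier()             # False

# collect(1st='first place') would be a SyntaxError; unpacking a dict works.
replacements = collect(**{tag_name: 'first place'})
print(replacements)  # {'1st': 'first place'}
</code></pre>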

h2. Tag indexing

In addition to simple (re)placement of strings using the @TemplateParser@, you can also provide it with a @list@, @dictionary@, or other indexable object, and fetch various indices, keys or attributes from it. The separation character between the _tagname_ and the _index_ is the _colon_ (":").

h3. List/tuple index addressing

This works for lists and tuples, but also for any other object that supports indexing. That is, every object that accepts integers on its @__getitem__@ method.

<pre><code class="html">
This is [var:0] [var:1].
</code></pre>

<pre><code class="python">
>>> parser.Parse('message.utp', var=('delicious', 'spam'))
</code></pre>

<pre><code class="html">
This is delicious spam.
</code></pre>

h3. Dictionary key addressing

This works for dictionaries, but also for any other object that behaves like a key-value mapping. That is, every object that accepts strings on its @__getitem__@ method.

<pre><code class="html">
This is [var:adjective] [var:noun].
</code></pre>

<pre><code class="python">
>>> parser.Parse('message.utp', var={'adjective': 'delicious', 'noun': 'spam'})
</code></pre>

<pre><code class="html">
This is delicious spam.
</code></pre>

h3. Attribute name addressing

This works for any object that has named attributes. If the attribute is a method, it will *not* be executed automatically; the return value will simply be the (un)bound method itself.

<pre><code class="html">
This is [var:adjective] [var:noun].
</code></pre>

<pre><code class="python">
>>> class Struct(object):
...   pass
...
>>> var = Struct()
>>> var.adjective = 'delicious'
>>> var.noun = 'spam'
>>> parser.Parse('message.utp', var=var)
</code></pre>

<pre><code class="html">
This is delicious spam.
</code></pre>

h3. Lookup order

For objects and constructs that provide multiple ways of looking up information, the lookup order can be very important. For any of the first three steps, if they are successful, the retrieved value is returned, and no further attempts are made:

# If the @needle@ is parseable as integer, it will first be used as an index. This will also work for mappings with numeric keys;
# If the above fails, the @needle@ is assumed to be a string-like mapping key, and a key lookup is attempted;
# If the above fails, the @needle@ is used as an attribute name;
# If all of the above fail, *@TemplateKeyError@* is raised, as the @needle@ could not be found on the object.

h3. Nested indexes

There may be cases where the value you need is not at the top-level index of an object. This is not a problem, since TemplateParser supports arbitrary-depth nested structures in its index-lookup:

<pre><code class="html">
This is a variable from [some:levels:down:1].
</code></pre>

<pre><code class="python">
>>> class Struct(object):
...   pass
...
>>> var = Struct()
>>> var.levels = {'down': ('the sky', 'the depths')}
>>> parser.Parse('message.utp', some=var)
</code></pre>

<pre><code class="html">
This is a variable from the depths.
</code></pre>
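
The lookup order and the nested index resolution described above can be sketched as follows. This is a simplified model in Python 3, not the actual TemplateParser code: @lookup@ and @resolve@ are hypothetical helper names, and @TemplateKeyError@ is redefined locally as a stand-in.

<pre><code class="python">
class TemplateKeyError(LookupError):
    """Local stand-in for the error named in the text."""


def lookup(obj, needle):
    """Tries index, then key, then attribute access for a single needle."""
    # 1. A needle that parses as an integer is tried as an index first.
    try:
        return obj[int(needle)]
    except (ValueError, TypeError, LookupError):
        pass
    # 2. Next, the needle is tried as a string mapping key.
    try:
        return obj[needle]
    except (TypeError, LookupError):
        pass
    # 3. Finally, the needle is tried as an attribute name.
    try:
        return getattr(obj, needle)
    except AttributeError:
        raise TemplateKeyError('%r not found on %r' % (needle, obj))


def resolve(tag, **replacements):
    """Resolves a tag body such as 'some:levels:down:1' one level at a time."""
    name, *needles = tag.split(':')
    value = replacements[name]
    for needle in needles:
        value = lookup(value, needle)
    return value


class Struct(object):
    pass


var = Struct()
var.levels = {'down': ('the sky', 'the depths')}
found = resolve('some:levels:down:1', some=var)
print(found)  # the depths
</code></pre>

Each colon-separated needle is looked up on the result of the previous one, so attribute, key and index access can be mixed freely within one tag.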

h3. Valid index characters

Indexes may be constructed from upper and lower case letters, numbers, underscores and dashes. There are no restrictions on first character, only a minimum length of one. Regex-wise, they need to match @[\w-]+@.

h2. Tag functions

Once you arrive at the tag/value you want, there are often things that need to happen before the resulting template is sent to the requesting client (browser). HTML escaping is an obvious one, but URL quoting of single arguments may also be helpful, as well as uppercasing, printing the length of a list (instead of the raw list) and various other uses.

h3. Default html escaping

Using a tag function is a fairly straightforward process: add the name of the function after the tagname, separated by a pipe ( | ):

<pre><code class="html">
And he said: [message|html]
</code></pre>

<pre><code class="python">
>>> parser.Parse('message.utp', message='"Hello"')
</code></pre>

<pre><code class="html">
And he said: &quot;Hello&quot;
</code></pre>

Using the *html* tag function makes the tag value safe for printing in an HTML document. Because we believe this is _really_ important, the html escaping tag function is always applied when no other tag function is applied:

<pre><code class="html">
And he said: [message]
</code></pre>

<pre><code class="python">
>>> parser.Parse('message.utp', message='"Hello"')
</code></pre>

<pre><code class="html">
And he said: &quot;Hello&quot;
</code></pre>

Only when you use another tag function, or specifically tell @TemplateParser@ to push the _raw_ tag value into the output, are the quotes allowed through unchanged:

<pre><code class="html">
And he said: [message|raw]
</code></pre>

<pre><code class="python">
>>> parser.Parse('message.utp', message='"Hello"')
</code></pre>

<pre><code class="html">
And he said: "Hello"
</code></pre>

h3. Predefined tag functions

* *html* &ndash; This tag function escapes content to be safe for inclusion in HTML pages. This means that the ampersand ( & ), single and double quotes ( ' &nbsp;and&nbsp; " ) and the pointy brackets ( < &nbsp;and&nbsp; > ) are converted to their respective "character entity references":http://en.wikipedia.org/wiki/Character_entity_reference
* _default_ &ndash; This is the tag function that will be executed when no other tag functions have been specified for a tag. By default, this will do the same as the *html* tag function. This can be adjusted by assigning another tag function to this name.
* *raw* &ndash; This tag function passes the tag through without change. This is the function to use when you have no tag function to apply, but do not want the tag to be HTML-escaped.
* *url* &ndash; This tag function prepares the tag for use in URLs. Spaces are converted to plus-signs ( + ), and other characters that are considered unsafe for URLs are converted to "percent-notation":http://en.wikipedia.org/wiki/Percent-encoding.
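
For a feel of what the *html* and *url* functions produce, the Python 3 standard library offers close equivalents. This is a comparison under the assumption that TemplateParser's output resembles these functions; the exact entities it emits (for single quotes in particular) may differ.

<pre><code class="python">
import html
from urllib.parse import quote_plus

# HTML escaping: ampersand, angle brackets and quotes become entity references.
escaped = html.escape('"Hello" & <goodbye>', quote=True)
print(escaped)  # &quot;Hello&quot; &amp; &lt;goodbye&gt;

# URL quoting: spaces become plus-signs, unsafe characters are percent-encoded.
quoted = quote_plus('spam & eggs')
print(quoted)  # spam+%26+eggs
</code></pre>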

h3. Adding custom functions

Custom functions can be added to a @Parser@ object using the method @RegisterFunction@. This takes a name, and a single-argument function. When this function is encountered in a tag, it will be given the current tag value, and its result will be output to the template, or passed into the next function:

<pre><code class="python">
>>> from uweb import templateparser
>>> parser = templateparser.Parser()
>>> parser.RegisterFunction('len', len)
>>> template = 'The number of people in this group: [people|len].'
>>> parser.ParseString(template, people=['Eric', 'Michael', 'John', 'Terry'])
'The number of people in this group: 4.'
</code></pre>

*N.B.:* Using custom functions (or in fact any function other than _html_ or no function) will suppress HTML escaping. If your content is still user-driven, or not otherwise made safe for output, *it is strongly recommended you apply html escaping*. This can be achieved by chaining functions, as explained below.

h3. Function chaining

Multiple function calls can be chained after one another. The functions are processed left to right, and the result of each function is passed into the next, without any intermediate editing or changes.

Setting up the parser and registering our tag function:
<pre><code class="python">
>>> from uweb import templateparser
>>> parser = templateparser.Parser()
>>> parser.RegisterFunction('first', lambda x: x[0])
</code></pre>

Using just one tag function returns the first element from the list:
<pre><code class="python">
>>> template = 'The first element of list: [elements|first].'
>>> parser.ParseString(template, elements=['Eric', 'Michael', 'John', 'Terry'])
'The first element of list: Eric.'
</code></pre>

Repeating the function on the string returns the first character from that string:
<pre><code class="python">
>>> template = 'The first element of the first element of list: [elements|first|first].'
>>> parser.ParseString(template, elements=['Eric', 'Michael', 'John', 'Terry'])
'The first element of the first element of list: E.'
</code></pre>
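
The chaining rules, including the default HTML escaping when no function is given, can be modelled in a few lines of Python 3. This is a sketch of the described behavior, not the actual implementation: the @FUNCTIONS@ registry, @apply_functions@ and @render_tag@ are hypothetical names.

<pre><code class="python">
import html

# Hypothetical registry, comparable to what RegisterFunction maintains.
FUNCTIONS = {
    'html': lambda value: html.escape(str(value)),
    'raw': lambda value: value,
    'first': lambda value: value[0],
    'len': len,
}


def apply_functions(value, names):
    """Applies the named functions left to right, each on the previous result."""
    if not names:
        names = ['html']  # No function given: fall back to HTML escaping.
    for name in names:
        value = FUNCTIONS[name](value)
    return value


def render_tag(tag, **replacements):
    """Renders one tag body such as 'elements|first|first'."""
    name, *function_names = tag.split('|')
    return str(apply_functions(replacements[name], function_names))


elements = ['Eric', 'Michael', 'John', 'Terry']
print(render_tag('elements|first|first', elements=elements))  # E
print(render_tag('elements|len', elements=elements))          # 4
print(render_tag('message', message='"Hello"'))               # &quot;Hello&quot;
print(render_tag('message|raw', message='"Hello"'))           # "Hello"
</code></pre>

Note that @'elements|first'@ skips escaping entirely in this model; appending @|html@ to the chain restores it, which mirrors the recommendation for user-driven content.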

h3. Valid function name characters

Tag function names may be constructed from upper and lower case letters, numbers, underscores and dashes. There are no restrictions on first character, only a minimum length of one. Regex-wise, they need to match @[\w-]+@.


h2. TemplateLoop

As a language construct, TemplateParser has an understanding of iteration. The @TemplateLoop@ can be compared to the Python @for@-loop, or the @foreach@ construct in other languages (lazy iteration over the values of an iterable).

h3. Syntax and properties

*Syntax: @{{ for local_var in [collection] }}@*
* The double curly braces indicate the beginning and end of the construct;
* The @for@ keyword indicates the structure to execute;
* @local_var@ is the name which references the loop variable;
* @[collection]@ is the tag that provides the iterable.

*Properties*
* The local name is stated without brackets (as it is not a tag itself).
* When it needs to be placed in the output, the local name should have brackets (like any other tag).
* *N.B.* The local variable does _not_ bleed into the outer scope after the loop has completed.
It is therefore possible (though not recommended) to name the loop variable after the iterable: @{{ for collection in [collection] }}@.

h3. Example of a @TemplateLoop@

<pre><code class="html">
<html>
<body>
<ul>
{{ for name in [presidents] }}
<li>President [name]</li>
{{ endfor }}
</ul>
</body>
</html>
</code></pre>

<pre><code class="python">
>>> parser.Parse('rushmore.utp', presidents=['Washington', 'Jefferson', 'Roosevelt', 'Lincoln'])
</code></pre>

<pre><code class="html">
<html>
<body>
<ul>
<li>President Washington</li>
<li>President Jefferson</li>
<li>President Roosevelt</li>
<li>President Lincoln</li>
</ul>
</body>
</html>
</code></pre>
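
The scoping rule for the loop variable can be illustrated with a small model. This is a sketch under simplifying assumptions, not the @TemplateLoop@ implementation: @render_loop@ is a hypothetical helper, and tags are replaced with plain string substitution, without escaping or indexing.

<pre><code class="python">
def render_loop(body, local_name, collection, scope):
    """Renders body once per item, binding each item to local_name."""
    rendered = []
    for item in collection:
        # Working on a copy keeps the loop variable out of the outer scope.
        local_scope = dict(scope)
        local_scope[local_name] = item
        text = body
        for name, value in local_scope.items():
            text = text.replace('[%s]' % name, str(value))
        rendered.append(text)
    return ''.join(rendered)


scope = {'presidents': ['Washington', 'Jefferson'], 'title': 'President'}
output = render_loop('<li>[title] [name]</li>\n', 'name', scope['presidents'], scope)
print(output)
# <li>President Washington</li>
# <li>President Jefferson</li>
print('name' in scope)  # False: the loop variable did not leak.
</code></pre>

Tags from the outer scope, such as @[title]@ here, remain available inside the loop body, while the loop variable exists only for the duration of each iteration.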

h2. Inlining templates

Often, there will be snippets of a template that will see a lot of reuse. Page headers and footers are often the same on many pages, and having several redundant copies means that changes will have to be replicated to each of these occurrences. To reduce the need for this, TemplateParser has an @inline@ statement. Using this you can specify a template that is available in the [[TemplateParser#parser|Parser]] instance and the statement will be replaced by the template.

Of course, if the inlined template is not already in the @Parser@ instance, the autoloading mechanism will trigger, and the named template will be searched for in the @Parser@'s template directory.

First, we will define our inline template, @'inline_hello.utp'@:

<pre><code class="html">
<p>Hello [name]</p>
</code></pre>

Secondly, our main template, @'hello.utp'@:

<pre><code class="html">
<h1>Greetings</h1>
{{ inline inline_hello.utp }}
</code></pre>

Then we parse the template:

<pre><code class="python">
>>> parser.Parse('hello.utp', name='Dr John')
</code></pre>

<pre><code class="html">
<h1>Greetings</h1>
<p>Hello Dr John</p>
</code></pre>
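
Conceptually, inlining is a substitution of the statement by the named template's content before tags are replaced. The sketch below models that idea only; it is not how TemplateParser implements it. @expand_inlines@ and the @INLINE@ pattern are hypothetical, and a plain dictionary of strings stands in for the @Parser@ cache.

<pre><code class="python">
import re

# Assumed statement shape for this sketch: {{ inline <template name> }}
INLINE = re.compile(r'\{\{\s*inline\s+(\S+)\s*\}\}')


def expand_inlines(text, templates):
    """Replaces every inline statement by the (expanded) named template."""
    def load(match):
        # Expanding recursively lets inlined templates inline others in turn.
        # A template that inlines itself would recurse forever in this sketch.
        return expand_inlines(templates[match.group(1)], templates)
    return INLINE.sub(load, text)


templates = {
    'inline_hello.utp': '<p>Hello [name]</p>',
    'hello.utp': '<h1>Greetings</h1>\n{{ inline inline_hello.utp }}',
}
expanded = expand_inlines(templates['hello.utp'], templates)
print(expanded)
# <h1>Greetings</h1>
# <p>Hello [name]</p>
</code></pre>

After expansion, the @[name]@ tag is part of the main template and receives its value from the same replacements as every other tag.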

h2. Conditional statements

Often, you'll want the output of your template to depend on the value, presence, or boolean value of another tag. For instance, we may want to print a list of attendees to a party. We start the @if@ conditional by checking whether the @attendees@ list holds more than one entry. If it does, we print the attendee names as a list; if it contains only a single entry or is empty, we tell the user in more intelligent ways than giving them a list with one or zero entries:

<pre><code class="html">
<h1>Party attendees</h1>
{{ if len([attendees]) > 1 }}
<ol>
{{ for attendee in [attendees] }}
<li>[attendee:name]</li>
{{ endfor }}
</ol>
{{ elif [attendees] }}
<p>Only [attendees:0:name] is attending.</p>
{{ else }}
<p>There are no registered attendees yet.</p>
{{ endif }}
</code></pre>

For the case where there are several attendees:

<pre><code class="python">
>>> parser.Parse('party.utp', attendees=[
... {'name': 'Livingstone'},
... {'name': 'Cook'},
... {'name': 'Drake'}])
</code></pre>

<pre><code class="html">
<h1>Party attendees</h1>
<ol>
<li>Livingstone</li>
<li>Cook</li>
<li>Drake</li>
</ol>
</code></pre>

For the case where there is one attendee:

<pre><code class="python">
>>> parser.Parse('party.utp', attendees=[{'name': 'Johnny'}])
</code></pre>

<pre><code class="html">
<h1>Party attendees</h1>
<p>Only Johnny is attending.</p>
</code></pre>

And in the case where there are no attendees:

<pre><code class="python">
>>> parser.Parse('party.utp', attendees=[])
</code></pre>

<pre><code class="html">
<h1>Party attendees</h1>
<p>There are no registered attendees yet.</p>
</code></pre>

h3. Properties of conditional statements

* *All template keys must be referenced as proper tags*
This is to prevent mixing of the template variables with the functions and reserved names of Python itself. Conditional expressions are evaluated using @eval()@, and proper tags are replaced by temporary names, the values of which are stored in a retrieve-on-demand dictionary. This makes them perfectly safe with regard to the value of template replacements, but some care should be taken with the writing of the conditional expressions.
* *It is possible to index tags in conditional statements*
This allows for decisions based on the values in those indexes/keys. For instance, @Person@ objects can be checked for gender, so that the correct gender-based icon can be displayed next to them.
* *Referencing a tag or index that doesn't exist raises @TemplateNameError@*
Unlike in regular template text, there is no suitable fallback value for a tag or index that cannot be retrieved. However, in most cases this can be prevented by making use of the following property:
* *Statement evaluation is lazy*
Template conditions are processed left to right, and short-circuited where possible. If the first member of an @or@ group succeeds, the return value is already known. Similarly, if the first member of an @and@ group fails, the second part need not be evaluated. This way @TemplateNameErrors@ can often be prevented, as in most cases, presence of indexes can be confirmed before accessing.
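
The combination of temporary names, on-demand retrieval and short-circuit evaluation can be sketched as follows. This models the description above rather than the actual implementation: @LazyTags@ and @evaluate@ are hypothetical names, @TemplateNameError@ is redefined locally as a stand-in, and only simple tags without indexes are supported.

<pre><code class="python">
import re


class TemplateNameError(Exception):
    """Local stand-in for the error named in the text."""


class LazyTags(object):
    """Mapping that resolves temporary names only when eval() asks for them."""

    def __init__(self, names, replacements):
        self.names = names                # temporary name -> tag name
        self.replacements = replacements
        self.requested = []               # tags that were actually retrieved

    def __getitem__(self, temp_name):
        if temp_name not in self.names:
            # KeyError makes eval() fall back to globals and builtins (e.g. len).
            raise KeyError(temp_name)
        tag = self.names[temp_name]
        self.requested.append(tag)
        if tag not in self.replacements:
            raise TemplateNameError('No replacement for tag [%s]' % tag)
        return self.replacements[tag]


def evaluate(expression, **replacements):
    """Evaluates a condition; returns the result and the tags it retrieved."""
    names = {}

    def to_temp_name(match):
        temp_name = '_tag_%d' % len(names)
        names[temp_name] = match.group(1)
        return temp_name

    python_expression = re.sub(r'\[(\w+)\]', to_temp_name, expression)
    tags = LazyTags(names, replacements)
    return eval(python_expression, {}, tags), tags.requested


# The second tag is never retrieved, so its absence goes unnoticed.
result, requested = evaluate('[present] or [missing]', present=True)
print(result, requested)  # True ['present']

# Reversing the order forces the lookup of the missing tag.
try:
    evaluate('[missing] or [present]', present=True)
except TemplateNameError as error:
    message = str(error)
    print(message)  # No replacement for tag [missing]
</code></pre>

Tag values never become part of the evaluated source text in this model, which is why the replacement values themselves cannot inject code; the condition written in the template is still executed as Python, so it deserves the care the text asks for.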

h2. Template unicode handling

Any @unicode@ object found while parsing will automatically be encoded to UTF-8:

<pre><code class="python">
>>> template = 'Underdark [love] [app]'
>>> output = parser.ParseString(template, love=u'\u2665', app=u'\N{micro sign}Web')
>>> output
'Underdark \xe2\x99\xa5 \xc2\xb5Web' # The output in its raw UTF-8 representation
>>> output.decode('UTF8')
u'Underdark \u2665 \xb5Web' # The output converted to a Unicode object
>>> print output
Underdark ♥ µWeb # And the printed UTF-8 as we desired it.
</code></pre>
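
The example above uses Python 2, where @unicode@ and byte strings are separate types with implicit conversions. For readers more familiar with Python 3, the same encoding step can be reproduced with the standard @str.encode@ method; this sketch does not involve TemplateParser and only shows the byte values that the encoding produces.

<pre><code class="python">
# In Python 3 every str is Unicode; encode() yields the UTF-8 bytes.
text = 'Underdark \u2665 \N{MICRO SIGN}Web'
encoded = text.encode('utf-8')
print(encoded)  # b'Underdark \xe2\x99\xa5 \xc2\xb5Web'

# Decoding the bytes gives back the original text.
decoded = encoded.decode('utf-8')
print(decoded == text)  # True
</code></pre>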