Elmer de Looff, 2012-05-01 17:31
Environment!


Request

The Request object is an abstraction of the incoming HTTP request. It provides a single, simple interface that is independent of the underlying server that µWeb runs on (either Standalone using BaseHTTPServer, or Apache mode on mod_python).

From PageMaker methods, the request object is accessible as the self.req member. The request object contains all the information about the incoming request: query arguments, post data, cookies and environment data. It is also the object where you define cookies that need to be provided to the client.

Query arguments

All query arguments provided by the client are present on the request object. They are also accessible directly on the PageMaker object. The following code demonstrates both ways to access a query argument:

...
<form>
  <label for="name">Name: </label><input id="name" name="name" />
  <input type="submit" value="Tell us your name" />
</form>
...
def NameFromQuery(self):
  # Retrieves the 'name' argument from the request object:
  name = self.req.vars['get'].getfirst('name')

  # Retrieves the 'name' argument directly from the PageMaker instance (linked to the request):
  name = self.get.getfirst('name')
  return name

Using the getfirst method, you get a single string returned from the query argument mapping, or None if no such value exists. Much like a dictionary's get method, it accepts a second argument, which is returned as the default instead of None.

Now, HTTP allows the client to provide the same query argument multiple times. Using getfirst you would only get the very first defined argument. So a request that looks like http://example.org/group?name=Bob&name=Mark&name=Jenny would only return 'Bob' in the previous example. To get all their names printed, you can use the following:

...
<form action="/group">
  <h2>Names in this group</h2>
  <!-- These would likely be generated with Javascript, but written here for demonstrative purposes -->
  <label for="name_1">Name: </label><input id="name_1" name="name" />
  <label for="name_2">Name: </label><input id="name_2" name="name" />
  <label for="name_3">Name: </label><input id="name_3" name="name" />
  <input type="submit" value="Send these names" />
</form>
...
def MemberNames(self):
  names = self.get.getlist('name')
  return ', '.join(names)

This returns a neat comma-separated string with all the provided names. The getlist method does not take a default, but will instead return an empty list when there are no values for the requested argument name.
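The difference between getfirst and getlist can be tried outside of µWeb with a small stand-in. The sketch below is written for Python 3 and builds on the standard library's urllib.parse.parse_qs. The QueryArgs class is hypothetical and only mimics the behaviour described above; it is not µWeb's own implementation.

```python
from urllib.parse import parse_qs


class QueryArgs(object):
  """Hypothetical stand-in for the query argument mapping (not µWeb code)."""

  def __init__(self, query_string):
    # parse_qs maps every argument name to a list of all its values.
    self._values = parse_qs(query_string)

  def getfirst(self, name, default=None):
    """Returns the first value for `name`, or `default` if there is none."""
    values = self._values.get(name)
    return values[0] if values else default

  def getlist(self, name):
    """Returns all values for `name`, or an empty list if there are none."""
    return list(self._values.get(name, []))


# The query string from the example URL above.
args = QueryArgs('name=Bob&name=Mark&name=Jenny')
first_name = args.getfirst('name')                  # only the first value
all_names = ', '.join(args.getlist('name'))         # every value, in order
nickname = args.getfirst('nickname', 'anonymous')   # missing, so the default
no_pets = args.getlist('pet')                       # missing, so an empty list
```

Because getlist always returns a list, the result can be joined or iterated without first checking whether the argument was present.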

Post data

Submitted form data is available on the request object as well. The interface is similar to that of the query arguments, and to the FieldStorage class from the standard cgi module. If we take our initial example form handler, but now receive the data through HTTP POST, the code would look like this:

...
<form method="post">
  <label for="name">Name: </label><input id="name" name="name" />
  <input type="submit" value="Tell us your name" />
</form>
...
def NameFromPost(self):
  # Retrieves the 'name' value from the request object:
  name = self.req.vars['post'].getfirst('name')

  # Retrieves the 'name' value directly from the PageMaker instance (linked to the request):
  name = self.post.getfirst('name')
  return name

Like with the query arguments, getfirst accepts a second argument that provides a default other than None.

Multiple values are again possible in the FieldStorage, and these work much as they do for query arguments:

...
<form action="/group" method="post">
  <h2>Names in this group</h2>
  <!-- These would likely be generated with Javascript, but written here for demonstrative purposes -->
  <label for="name_1">Name: </label><input id="name_1" name="name" />
  <label for="name_2">Name: </label><input id="name_2" name="name" />
  <label for="name_3">Name: </label><input id="name_3" name="name" />
  <input type="submit" value="Send these names" />
</form>
...
def MemberNames(self):
  names = self.post.getlist('name')
  return ', '.join(names)

Uploading files

Processing an uploaded file is done using the same FieldStorage system as the rest of the POST data, and roughly looks like the following. When performing file uploads, be sure to set the enctype of your form to multipart/form-data, or the uploaded file will have no contents.

...
<form method="post" enctype="multipart/form-data">
  <label for="avatar">Avatar: </label><input id="avatar" name="avatar" type="file" />
  <input type="submit" value="submit!" />
</form>
...
def UpdateAvatar(self):
  # Retrieve the currently logged-in user
  user = self.GetCurrentUser()

  # This gets the name of the file that was uploaded
  avatar_name = self.post['avatar'].filename

  # This retrieves the content of the uploaded file
  avatar_data = self.post['avatar'].value

  self.SaveAvatar(user, avatar_data)
  return 'Your avatar has been replaced by %r' % avatar_name

Structured data using POST

One of the things that has been extended on the basic FieldStorage in µWeb is the way it treats square brackets ( [ and ] ) in POST data. A form field with the name person[name] will result in a dictionary person being created in the resulting FieldStorage:

...
<form method="post">
  <label for="name">Name: </label><input id="name" name="person[name]" />
  <label for="age">Age: </label><input id="age" name="person[age]" />
  <label for="job">Job: </label><input id="job" name="person[job]" />
  <input type="submit" value="Update your profile" />
</form>
...
# Requires 'import json' and 'import uweb' at the top of the module.
def PersonalData(self):
  person = self.post.getfirst('person')
  return uweb.Response(json.dumps(person), content_type="application/json")

In the code above, the person variable is a dictionary retrieved from the POST data, which is then presented to the client as JSON by using a custom response.

Note that the 'numeric' age value is a string. This is of course because everything submitted in forms is in the form of a string. Conversion to appropriate types will have to be handled by the PageMaker. The person dictionary itself looks like this:

{'age': '28', 'job': 'Engineer', 'name': 'Elmer'}
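A minimal sketch of such a conversion is shown below. The ParseAge helper is hypothetical (it is not part of µWeb) and simply falls back to a default when the value is missing or not a number.

```python
def ParseAge(person, default=None):
  """Returns the 'age' entry of `person` as an int, or `default` if invalid."""
  try:
    return int(person['age'])
  except (KeyError, TypeError, ValueError):
    # Missing key, non-string value, or text that is not a whole number.
    return default


person = {'age': '28', 'job': 'Engineer', 'name': 'Elmer'}
age = ParseAge(person)                                   # the integer 28
bad_age = ParseAge({'age': 'twenty-eight'}, default=0)   # falls back to 0
```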

N.B.: When using structured form data, you still need to use the getfirst method, because there might be separate (non-dictionary) values for the same form name. There will never be more than one dictionary in the form values; if a single key is set more than once, the last-set value will be the one present in the dictionary.
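To illustrate the idea of folding bracketed field names into a dictionary, here is a rough Python 3 sketch that works on a urlencoded request body. It is an assumption about how such folding could be done, not µWeb's actual parser; FoldStructuredFields and STRUCTURED_NAME are hypothetical names, and only one level of brackets is handled.

```python
import re
from urllib.parse import parse_qsl

# Matches names of the form 'person[name]': an outer name and a bracketed key.
STRUCTURED_NAME = re.compile(r'^([^\[\]]+)\[([^\[\]]+)\]$')


def FoldStructuredFields(body):
  """Maps each field name to a list of values, folding 'outer[key]' names.

  Plain fields contribute strings to the list; bracketed fields share a
  single dictionary per outer name, stored in that same list.
  """
  fields = {}
  dictionaries = {}
  for name, value in parse_qsl(body):
    match = STRUCTURED_NAME.match(name)
    if match is None:
      fields.setdefault(name, []).append(value)
      continue
    outer, key = match.groups()
    if outer not in dictionaries:
      # At most one dictionary per field name; it sits among the plain values.
      dictionaries[outer] = {}
      fields.setdefault(outer, []).append(dictionaries[outer])
    # A key that is set more than once keeps the last value.
    dictionaries[outer][key] = value
  return fields


# 'person[age]' is deliberately sent twice; %5B and %5D are '[' and ']'.
body = 'person%5Bname%5D=Elmer&person%5Bage%5D=28&person%5Bage%5D=29&token=abc'
fields = FoldStructuredFields(body)
person = fields['person'][0]
```

Storing the dictionary inside a list of values mirrors why getfirst is still needed: the same name may also carry plain string values.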

Cookies

Reading cookies

Cookies provided by the client will also end up in the request object. From within a PageMaker method they can be reached either on the request itself, as self.req.vars['cookies'], or directly on the PageMaker instance, as self.cookies.

The cookie storage itself is a plain Python dictionary, which makes for particularly easy access.

def CookieInfo(self):
  sample = self.cookies['sample']
  return 'The sample cookie is set to %r' % sample

Cookies cannot be set through this dictionary, though. For that, the AddCookie method described in the next section is required.

Setting cookies

Response cookies are set using the request object. The method to use for this is AddCookie, the easiest use of which looks like this:

def SetCookie(self):
  self.req.AddCookie('example', 'this is an example cookie value set by µWeb')
  return 'A cookie named "example" was set.'

This creates a cookie that does not expire, will be provided with every request to the originating domain, and can be read from Javascript. To change these default behaviors, there are a number of optional arguments that can be provided, as detailed below. Of course, while the examples show one argument used at a time, they can all be combined:

def ShortLivedCookie(self):
  """Sets an expiry time of the cookie, in this case 10 seconds.""" 
  self.req.AddCookie('quick', 'I will be gone soon', max_age=10)

def SecureCookie(self):
  """Sets a cookie with the 'secure' flag enabled.

  This means the cookie will only be provided with requests that the browser
  considers secure. This typically means they will only be present in requests
  that use SSL (https://).
  """ 
  self.req.AddCookie('secret', 'This server adores you', secure=True)

def HttpOnlyCookie(self):
  """Sets a cookie that is only transferred in HTTP requests.

  The cookie will not be readable from Javascript. This defaults to False.
  """ 
  self.req.AddCookie('secret', 'Please no Javascript', httponly=True)

def PathBoundCookie(self):
  """Sets a cookie that is only valid for the path '/admin'.

  This means that the client (browser) will only provide it for requests
  that go to '/admin' or a deeper nested path (such as '/admin/users'),
  but will not provide it for requests that go to '/blog'.
  """ 
  self.req.AddCookie('user', 'bobbytables', path='/admin')

def DomainBoundCookie(self):
  """Sets a cookie that is only valid for the specified domain.

  By default, if a cookie is set for 'www.example.com' it will not be provided
  for requests that go to 'example.com' itself. If we set the cookie to be valid
  for '.example.com', it will be valid for example.com and all sub-domains.

  Explicitly specified domains MUST begin with a dot, or they will be rejected
  as per RFC 2109. Additionally, cookies set by 'x.y.example.com' MAY NOT set
  their valid domain to be '.example.com' or they will be rejected.

  If the 'domain' is not specified, the cookie will be valid for the domain that
  set the cookie (as per HTTP_HOST from the environment)
  """ 
  self.req.AddCookie('session', 'SMqfUYLk3vCjkWL6', domain='.example.com')
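To make the effect of these options more concrete, the Python 3 sketch below uses the standard library's http.cookies module to build the kind of Set-Cookie header value that such options correspond to. The BuildSetCookie helper is hypothetical; how µWeb itself serialises cookies is not shown in this document and may differ.

```python
from http.cookies import SimpleCookie


def BuildSetCookie(name, value, **attributes):
  """Returns a Set-Cookie header value for a single cookie.

  Keyword names follow the AddCookie examples above; this helper itself is
  hypothetical and is not part of µWeb.
  """
  cookie = SimpleCookie()
  cookie[name] = value
  for attribute, setting in attributes.items():
    # The standard library spells the max_age attribute as 'max-age'.
    cookie[name][attribute.replace('_', '-')] = setting
  return cookie[name].OutputString()


header = BuildSetCookie(
    'session', 'SMqfUYLk3vCjkWL6',
    max_age=10, path='/admin', domain='.example.com',
    secure=True, httponly=True)
```

Every option ends up as one attribute of the same header value, which is why the options can be combined freely in one call.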

Headers

Incoming headers

Request headers are made available in the headers member of the request object. This works like a regular dictionary (though writing to this dictionary is not guaranteed to be successful), where all the keys are in lower-case. The get method works to retrieve the header, with an optional default if the header wasn't provided by the client.

def Headers(self):
  # Hostname that the client (browser) requested:
  host = self.req.headers['host']

  # Retrieves the user-agent from the request
  user_agent = self.req.headers.get('user-agent', 'unknown')

  return 'The host %r was visited by the user-agent identified as %r.' % (host, user_agent)

Outgoing headers

While it is possible to provide response headers via the Request object, it is strongly advised to provide them using the Response object. This generally leads to clearer code, and has fewer caveats than using the methods laid out below.

Adding headers to outgoing responses can be done using the AddHeader method of the request object. Please note that cookies can be set more easily (as described above), and the same goes for redirects. Responding with a custom content-type and HTTP status code will be explained below. Setting your own headers, for example to provide ETags, is done like this:

# Requires 'import hashlib' at the top of the module.
def TaggedResponse(self, content):
  self.req.AddHeader('ETag', hashlib.sha1(content).hexdigest())
  return content

The above example returns a simple ETag based on the SHA-1 hash of the content returned.
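The hashing step can be tried on its own, without a request object. The sketch below is written for Python 3, where hashlib only accepts bytes, so text is encoded first. ComputeEtag is a hypothetical helper and the choice of UTF-8 is an assumption.

```python
import hashlib


def ComputeEtag(content):
  """Returns the SHA-1 hex digest of `content` for use as an ETag value."""
  if not isinstance(content, bytes):
    # hashlib works on bytes, so text is encoded first (UTF-8 is an assumption).
    content = content.encode('utf-8')
  return hashlib.sha1(content).hexdigest()


etag = ComputeEtag('Hello, world')
same_etag = ComputeEtag(b'Hello, world')     # same bytes, same tag
other_etag = ComputeEtag('Hello, world!')    # different content, different tag
```

Identical content always yields the same tag, which is what makes the digest usable as an ETag.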

Content-type

The content-type of the reply would usually be configured by returning a custom Response object. When it is not desirable to use this, the content-type can be set using the SetContentType method of the request object:

def CustomContent(self):
  # Open the image in binary mode so the bytes are returned unaltered.
  with open('lolcat.jpg', 'rb') as image:
    self.req.SetContentType('image/jpeg')
    return image.read()

N.B.: Returning a Response object will override the content-type set on the request object. That is, returning the image in the example above using Response (without the content-type defined there) will create a response with the default text/html content-type.

HTTP Response code

The HTTP response code on the webserver reply can also be set directly on the request object when a full Response object is for any reason not desirable:

def FourOhFour(self, path):
  self.req.SetHttpCode(404)
  return "Sorry, we don't have a page that looks like %r" % path

N.B.: Returning a Response object will override the HTTP status code set on the request object. That is, using a Response object in the example above (without providing the httpcode argument) will return an HTTP 200 OK response.

Environment

The request object also has available a number of environment variables. This is a dictionary that exists as the env member of the object. By default a number of commonly used (or useful) variables will be available, as well as all the HTTP request headers. These latter ones will be prefixed 'HTTP_', and all dashes will be converted to underscores, such that Content-type becomes HTTP_CONTENT_TYPE. The other variables are explained below:

Variable         Meaning
CONTENT_LENGTH   The length of the HTTP POST data. Defaults to 0 if no POST has been performed.
CONTENT_TYPE     The content-type of the HTTP POST data, or an empty string.
HTTP_HOST        The host that the client is requesting (e.g. underdark.nl).
HTTP_REFERER     The URL from which the client was linked to this page, where provided by the client.
HTTP_USER_AGENT  Browser identification string of the client.
PATH_INFO        Path portion of the request URL (e.g. /admin/login).
QUERY_STRING     The raw string of query arguments as received by the server (before parsing; refer to query arguments for an easier interface).
REMOTE_ADDR      The client's network address (IPv4 or IPv6, whichever is provided by the underlying system).
REQUEST_METHOD   The request method. Typically one of GET, HEAD or POST.
UWEB_MODE        The operational mode of µWeb, which is either STANDALONE or MOD_PYTHON.
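The naming rule for request headers described above (an 'HTTP_' prefix, with dashes converted to underscores) can be sketched as follows. EnvironmentKey is a hypothetical helper written for illustration, and the header values are made up; µWeb performs this conversion itself when it builds the env dictionary.

```python
def EnvironmentKey(header_name):
  """Returns the environment key that corresponds to a request header name."""
  return 'HTTP_' + header_name.upper().replace('-', '_')


# Illustrative headers only; real values come from the client's request.
headers = {'Content-type': 'text/html', 'User-Agent': 'ExampleBrowser/1.0'}
env = dict((EnvironmentKey(name), value) for name, value in headers.items())
content_type_key = EnvironmentKey('Content-type')
```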

Extended environment

If more information is required, the environment dictionary can be extended with a number of additional values. This is done by calling the ExtendedEnvironment method, which expands the environment dictionary and, for the benefit of the caller, returns it as well. Note that calling this routinely might significantly reduce performance, as extending the environment will, among other things, perform reverse-DNS lookups.

The additional values are the following:

Variable           Meaning
DOCUMENT_ROOT      Working directory of the web server (for the given VirtualHost if running on Apache).
RAW_REQUEST        Raw HTTP request as received by the server, before any parsing.
REMOTE_HOST        Fully Qualified Domain Name of the client, checked using DNS.
SERVER_NAME        Local machine name of the computer running the web server.
SERVER_PORT        Listening port of the web server.
SERVER_LOCAL_NAME  The FQDN of the server as known to the web server (Apache VirtualHost affects this).
SERVER_LOCAL_IP    The local IP that the server runs on. If the server has multiple IPs configured, the one used to connect to the client will be present here.
SERVER_PROTOCOL    The protocol used for the current connection. Typically HTTP/1.0 or HTTP/1.1.

Apache-only extended environment

Requesting ExtendedEnvironment on Apache will also add the following keys to the env dictionary:

Variable               Meaning
AUTH_TYPE              Authentication that was used. Typically one of basic or digest.
CONNECTION_ID          Connection ID as provided by Apache.
MODPYTHON_HANDLER      Module name of the Request Router used for the site.
MODPYTHON_INTERPRETER  Name of the VirtualHost of the requested site.
MODPYTHON_PHASE        Phase that mod_python was in when ExtendedEnvironment was called. This typically is PythonHandler.
REMOTE_USER            Username of the authenticated user (available only when using HTTP Authentication).