Project Home

The Problem is Binding URL Paths to Code and Data

Author: Martin Blais <blais@furius.ca>
Date: 2005-04-24

Abstract

Note about my understanding of RESTfulness and motivation.

Introduction

Today I have realized something important about web applications while designing a security mechanism for my framework. I think I have understood the real value of making a system RESTful. Even after reading the design of Zope3, which is making more and more sense now, I had not quite grasped the importance of this. This document describes my understanding of the issue, with a specific example that should motivate the desire for RESTfulness.

Stateful Design

The way my system is currently designed ties the notion of "current user" with the meaning of a URL path on my server. For example, when I access the "/profile" URL, the results are different depending on which is the currently logged in user.

An important problem with this approach lies in the definition of a flexible system for permissions. If, for example, I want to define a permission group for an administrator to be able to access any user's /profile page, I would have to create some special codes for the administrator page to access that functionality.

It is obvious that a better system would be for each URL to represent a unique resource, and for a permission system to simply restrict which resources are available depending on the credentials of the currently logged in user. That is, to make the system RESTful.

RESTfulness is important

REST stands for REpresentational State Tranfer. The basic idea is that each URL uniquely represents a resource in the system. Basically, given a specific URL (including its GET or POST arguments), you should obtain the same resource, and/or trigger a specific state transfer that does not depend on context/session.

You want to be able to separate authentication from session data. Basically, the less you can do with session-specific data, the better.

A Specific Example

In my case, the way I should serve user-specific pages should be:

/users/<username>/profile

This way, even if a user other than <username> is logged in, and has permissions to access this page, he can. This is how administrator users should be able to access user profiles. They use the very same URLs/codes, and their credentials allow them access to those resources. Users != <username> simply don't have the credentials to access those pages, but technically, if they could, the resource URL would be the same.

This allows the creation of very flexible access policies. These policies could be created either by writing special code to accept or reject access to specific resources, or by using a special regexp syntax (a special case of the former).

Another interesting consequence of defining access policies this way is that we can decouple the implementation of the resources with the implementation of the access mechanisms. This makes the system much more flexible.

Requests are Function Calls

Essentially, each request maps onto a function call within our system. Let's examine the anatomy of a function call, as it is often represented in most computer languages, in this case we will choose Python:

def foo( bar1, bar2, bar3=def1, bar4=def2, *kwds )

A function has:

The difference between systems which implement the various RPC protocols and a URL is simple: a URL is simply an encoding of the function call parameters.

How we map URLs to these functions/arguments is a centrally important feature of a web framework.

Any combination of the informations above, availalble in the URL itself, or the POST values, are candidates for mapping into a function and into arguments. Also, parts of the URL path can be used as arguments, and arguments could map into a function name or influence it.

Objects

If we think of "users" as objects in our system, we can think of including their reference in the path itself, e.g.:

/users/<username>/edit

This maps naturally on an hierarchical data structure. This is the approach that Zope takes: "folders" are containers for objects, which map directly onto the ZoDB, which have methods associated to them (via the ugly ZCML), to perform a specific method call on these objects.

Databases and Flexibility

Unfortunately, most efficient and well-tested database systems today exist in the form of flat SQL databases, with indexes between their tables. This is a reality that must be dealt with, and requiring the user to use a specific kind of storage is not a very flexible approach. Often an interface needs to be build around existing data storage systems.

We would like to design a more flexible system of mapping URLs to resources.

Notes

In draco, currently:

def foo( self, path, args ):

'path' is NEVER used in my web application. I don't see it becoming used in the future.

Mapping URLs to Resources

One problem that occurs is when you want to include part of the parameters in the request path itself. For example, considering that we would want to make our system RESTful, a natural approach would be to prefer:

/users/<username>/profile

to

/profile?user=username

So the question is, how do we map URLs to function calls? We need to create a scheme to be able to do that efficiently and flexibly.

Also note the possibility of mixed schemes:

/users/<username>/profile?lang=<lang>

Global Arguments

During the lookup, we should be able to recursively (?) process and remove some input arguments. For example, if we are to make the system purely RESTful, even with regards to the current language being used, all the pages on a website should be able to accept the 'lang' argument. The effect of this argument is to set a global variable for all string lookups, by changing the very definition of the _() gettext function itself. None of the functions called will want to process the 'lang' input variable.

Kinds of Input

Let's examine what we get as inputs to our system:

  • a URL path, with /-separated components;
  • query arguments (GET form)
  • POST variables (POST form)

Our scheme must consider all of these when mapping to a specific function call.

Abstracting the Internal Organization

Another aspect of the mapping is that we may want to hide the internal organization of our program codes. Also, by providing a more flexible approach we may open the door to future reorganization of the code, providing space for growth as well as backwards compatibility.

Serving Files

How do we serve files? We should be able to serve files normally.

Automated Argument Parsing

Wouldn't it be great if the argument parsing was performed automatically?

For example, each resource could be represented as a class, with the following interface:

class Profile:

    form = ... <form definition>

    def validate( self, args ):
        ... returns an error, redirect, or success

    def handle( self, arg1, arg2, arg3 ):
        ... at this point we can expect the arguments to be legal

Possible Solutions

Recursive tree lookup

We could create a tree of resources, like is done in Twisted, where the process of parsing the URL becomes a chain-of-responsibility by searching down the tree.

+ we can alter per-component behaviour of the lookup
- the hierarchy of URL components must match the hierarchy of objects/code

? we cannot combine query arguments in the code lookup, well, we could, if
we provide them in the lookup function, e.g.

def getChild( prepath, postpath, query_args )

Regular expressions

We could create a special-purpose set of rules to lookup code by using regular expressions on the request URL combined with conditions on the query arguments.

+ very flexible, we can combine query arguments in the code lookup
+ the hierarchy of components is entirely disconnected from the code
- it might be a little slow due to the linear list of regular expressions to
search

Privileges Mechanism

Can we do like in Zope3? e.g.

role.grant(resource)

Questions

Question:How do we identity a permission? In other words, how do we bind a permission to a resource?
Question:How do we specify permissions for a particular user?
Question:With the resource tree mapping URL/query to a function, how do we simply serve files?