jcalderone

Welcome to the 14th installment of "Twisted Web in 60 seconds". In many of the previous installments, I've demonstrated how to serve content by using existing resource classes or implementing new ones. In this installment, I'll demonstrate how you can use Twisted Web's basic or digest HTTP authentication to control access to these resources.

Guard, the Twisted Web module which provides most of the APIs which will be used in this example, helps you to add authentication and authorization to a resource hierarchy. It does this by providing a resource which implements getChild to return a dynamically selected resource. The selection is based on the authentication headers in the request. If those headers indicate the request is made on behalf of Alice, then Alice's resource will be returned. If they indicate it was made on behalf of Bob, his will be returned. If the headers contain invalid credentials, an error resource is returned. Whatever happens, once this resource is returned, URL traversal continues as normal from that resource.
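The selection step can be pictured with a small standard-library sketch. This is not Guard's implementation: `Page`, `AuthDispatcher`, `get_child` and `basic_header` are hypothetical names invented for illustration, only basic authentication is handled, and the code is written for Python 3.

```python
import base64


class Page:
    """Hypothetical stand-in for a resource; not Twisted's Resource class."""

    def __init__(self, body):
        self.body = body


class AuthDispatcher:
    """Toy wrapper that picks a per-user page from the Authorization header."""

    def __init__(self, passwords, pages):
        self.passwords = passwords  # username -> password
        self.pages = pages          # username -> Page

    def get_child(self, headers):
        value = headers.get("authorization", "")
        scheme, _, encoded = value.partition(" ")
        if scheme.lower() != "basic":
            return Page("401 Unauthorized")
        try:
            decoded = base64.b64decode(encoded).decode("utf-8")
        except ValueError:
            # Covers malformed base64 as well as undecodable bytes.
            return Page("401 Unauthorized")
        username, _, password = decoded.partition(":")
        if self.passwords.get(username) != password:
            return Page("401 Unauthorized")
        # From here on, traversal would continue inside the user's own page.
        return self.pages[username]


def basic_header(username, password):
    """Build the headers a client would send for basic authentication."""
    raw = ("%s:%s" % (username, password)).encode("utf-8")
    return {"authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


dispatcher = AuthDispatcher(
    {"alice": "secret", "bob": "hunter2"},
    {"alice": Page("alice's files"), "bob": Page("bob's files")},
)
```

With this toy, `dispatcher.get_child(basic_header("alice", "secret"))` hands back Alice's page, while a wrong password or a missing header yields the error page.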

The resource which implements this is HTTPAuthSessionWrapper, though it is directly responsible for very little of the process. It will extract headers from the request and hand them off to a credentials factory to parse them according to the appropriate standards (eg HTTP Authentication: Basic and Digest Access Authentication) and then it hands the resulting credentials object off to a portal, the core of Twisted Cred, a system for uniform handling of authentication and authorization. I am not going to discuss Twisted Cred in much depth here. To make use of it with Twisted Web, the only thing you really need to know is how to implement a realm.

You need to implement a realm because the realm is the object which actually decides which resources are used for which users. This can be as complex or as simple as is suitable for your application. For this example, I'll keep it very simple: each user will have a resource which is a static file listing of the public_html directory in their UNIX home directory. First, I need to import implements from zope.interface and IRealm from twisted.cred.portal. Together these will let me mark this class as a realm (this is mostly - but not entirely - a documentation thing). I'll also need File for the actual implementation later.

  from zope.interface import implements

  from twisted.cred.portal import IRealm
  from twisted.web.resource import IResource  # used by requestAvatar below
  from twisted.web.static import File

  class PublicHTMLRealm(object):
      implements(IRealm)

A realm only needs to implement one method, requestAvatar. This is called after any successful authentication attempt (ie, Alice supplied the right password). Its job is to return the avatar for the user who succeeded in authenticating. An avatar is just an object that represents a user. In this case, it will be a File. In general, with Guard, the avatar must be a resource of some sort.

      def requestAvatar(self, avatarId, mind, *interfaces):
          if IResource in interfaces:
              return (IResource, File("/home/%s/public_html" % (avatarId,)), lambda: None)
          raise NotImplementedError()

A few notes on this method:

  • The avatarId parameter is essentially the username. It's the job of some other code to extract the username from the request headers and make sure it gets passed here.
  • The mind is always None when writing a realm to be used with Guard. You can ignore it until you want to write a realm for something else.
  • Guard always passes IResource for the interfaces parameter. If interfaces only contains interfaces your code doesn't understand, raising NotImplementedError is the thing to do, as above. You'll only need to worry about getting a different interface when you write a realm for something other than Guard.
  • If you want to track when a user logs out, that's what the last element of the returned tuple is for. It will be called when this avatar logs out. lambda: None is the idiomatic no-op logout function.
  • Notice that I have written the path handling code in this example very poorly. This example may be vulnerable to certain unintentional information disclosure attacks. This sort of problem is exactly the reason FilePath exists. However, that's an example for another day...
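To make the path-handling concern in the last note concrete, here is a standard-library sketch of one defensive approach. It is not FilePath and not what the example realm does: `public_html_for` and its `home_root` parameter are hypothetical, and it simply refuses any avatar id that is not a single plain path segment.

```python
import posixpath


def public_html_for(avatar_id, home_root="/home"):
    """Return the public_html path for avatar_id, or raise ValueError."""
    # Refuse anything that is not exactly one ordinary path segment.
    if (not avatar_id or avatar_id in (".", "..")
            or "/" in avatar_id or "\\" in avatar_id or "\x00" in avatar_id):
        raise ValueError("unsafe avatar id: %r" % (avatar_id,))
    root = posixpath.normpath(home_root)
    path = posixpath.normpath(posixpath.join(root, avatar_id, "public_html"))
    # Second check: the normalized result must still live under home_root.
    if posixpath.commonpath([root, path]) != root:
        raise ValueError("path escapes %s: %r" % (root, path))
    return path
```

A realm could call such a helper before constructing its File, and treat a ValueError as a failed login rather than serving anything.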

We're almost ready to set up the resource for this example. To create an HTTPAuthSessionWrapper, though, we need two things. First, a portal, which requires the realm above, plus at least one credentials checker:

  from twisted.cred.portal import Portal
  from twisted.cred.checkers import FilePasswordDB

  portal = Portal(PublicHTMLRealm(), [FilePasswordDB('httpd.password')])

FilePasswordDB is that credentials checker I mentioned. It knows how to read passwd(5)-style (loosely) files to check credentials against. It is responsible for the authentication work after HTTPAuthSessionWrapper extracts the credentials from the request.
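For a rough idea of what such a file looks like, the sketch below parses one in plain Python. It assumes what I understand to be FilePasswordDB's defaults (a colon delimiter, username in the first field, password in the second); `parse_password_lines` and the sample accounts are hypothetical and this is not the checker's own code.

```python
def parse_password_lines(lines, delim=":", username_field=0, password_field=1):
    """Build a username -> password mapping from passwd-style lines."""
    users = {}
    for line in lines:
        fields = line.rstrip().split(delim)
        # Lines without enough fields (blank lines, for instance) are skipped.
        if username_field >= len(fields) or password_field >= len(fields):
            continue
        users[fields[username_field]] = fields[password_field]
    return users


# Hypothetical contents of httpd.password: one account per line.
sample_lines = ["alice:wonderland\n", "\n", "bob:builder\n"]
users = parse_password_lines(sample_lines)
```

As far as I know, digest authentication needs the checker to see the stored password in plaintext, so a file like this one should be kept readable only by the server.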

Next we need either BasicCredentialFactory or DigestCredentialFactory. The former knows how to challenge HTTP clients to do basic authentication; the latter, digest authentication. I'll use digest here:

  from twisted.web.guard import DigestCredentialFactory

  credentialFactory = DigestCredentialFactory("md5", "example.org")

The two parameters to this constructor are the hash algorithm and the HTTP authentication realm which will be used. The only other valid hash algorithm is "sha" (but be careful, MD5 is more widely supported than SHA). The HTTP authentication realm is mostly just a string that is presented to the user to let them know why they're authenticating (you can read more about this in the RFC).
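To see where the hash algorithm and the realm string enter the picture, here is the MD5 digest calculation from RFC 2617 (the qop=auth variant) written with hashlib. This is a sketch of what a client computes and a server re-computes, not Twisted's code; `md5_hex`, `digest_response` and the sample values are illustrative.

```python
import hashlib


def md5_hex(text):
    """Hex MD5 of a text string, as used throughout RFC 2617."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri,
                    nonce, nc, cnonce, qop="auth"):
    """Compute the 'response' value for digest authentication with qop=auth."""
    ha1 = md5_hex("%s:%s:%s" % (username, realm, password))  # who, in which realm
    ha2 = md5_hex("%s:%s" % (method, uri))                   # what is being requested
    return md5_hex(":".join([ha1, nonce, nc, cnonce, qop, ha2]))


# Illustrative values; the nonce would come from the server's challenge.
response = digest_response("alice", "example.org", "wonderland",
                           "GET", "/index.html",
                           nonce="abc123", nc="00000001", cnonce="xyz789")
```

The password never crosses the wire; only the final hash does. Because the realm is mixed into the first hash, changing the realm string changes every valid response.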

With those things created, we can finally instantiate HTTPAuthSessionWrapper:

  from twisted.web.guard import HTTPAuthSessionWrapper

  resource = HTTPAuthSessionWrapper(portal, [credentialFactory])

There's just one last thing that needs to be done here. When I introduced rpy scripts, I mentioned that they're evaluated in an unusual context. This is the first example which actually needs to take this into account. It so happens that DigestCredentialFactory instances are actually stateful. Authentication will only succeed if the same instance is used to generate challenges and examine the responses to those challenges. However, the normal mode of operation for an rpy script is for it to be re-executed for every request. This leads to a new DigestCredentialFactory being created for every request, preventing any authentication attempt from ever succeeding.

There are two ways to deal with this. First, the better of the two ways, I could move almost all of the code into a real Python module, including the code which instantiates the DigestCredentialFactory. This would ensure the same instance was used for every request. Second, the easier of the two ways, I could add a call to cache to the beginning of the rpy script:

  cache()

cache is part of the globals of any rpy script, so you don't need to import it (it's okay to be cringing at this point). Calling cache makes Twisted re-use the result of the first evaluation of the rpy script for subsequent requests too. Just what I want in this case.
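The effect of per-instance state can be reproduced without Twisted. The toy below is only an analogy for the real factory: `ToyChallengeFactory` is hypothetical, and it signs each challenge with a secret generated when the instance is created, so a second instance cannot validate a challenge issued by the first.

```python
import hashlib
import hmac
import os


class ToyChallengeFactory:
    """Hypothetical stand-in for a stateful credential factory."""

    def __init__(self):
        self._secret = os.urandom(16)  # state that lives and dies with the instance

    def _sign(self, nonce):
        return hmac.new(self._secret, nonce.encode("ascii"),
                        hashlib.sha256).hexdigest()

    def get_challenge(self):
        nonce = os.urandom(8).hex()
        return nonce, self._sign(nonce)

    def verify(self, nonce, opaque):
        return hmac.compare_digest(self._sign(nonce), opaque)


def without_cache():
    """Script re-executed per request: a new factory checks the old challenge."""
    nonce, opaque = ToyChallengeFactory().get_challenge()  # first request
    return ToyChallengeFactory().verify(nonce, opaque)     # second request


def with_cache():
    """Script evaluated once: the same factory issues and checks."""
    shared = ToyChallengeFactory()
    nonce, opaque = shared.get_challenge()
    return shared.verify(nonce, opaque)
```

Here `without_cache()` returns False and `with_cache()` returns True, which is the difference the call to cache makes for the rpy script.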

Here's the complete example (with imports re-arranged to the more conventional style):

cache()

from zope.interface import implements

from twisted.cred.portal import IRealm, Portal
from twisted.cred.checkers import FilePasswordDB
from twisted.web.static import File
from twisted.web.resource import IResource
from twisted.web.guard import HTTPAuthSessionWrapper, DigestCredentialFactory

class PublicHTMLRealm(object):
   implements(IRealm)

   def requestAvatar(self, avatarId, mind, *interfaces):
       if IResource in interfaces:
           return (IResource, File("/home/%s/public_html" % (avatarId,)), lambda: None)
       raise NotImplementedError()

portal = Portal(PublicHTMLRealm(), [FilePasswordDB('httpd.password')])

credentialFactory = DigestCredentialFactory("md5", "localhost:8080")
resource = HTTPAuthSessionWrapper(portal, [credentialFactory])

And voila, a password-protected per-user Twisted Web server.

I've gotten several requests to write something about sessions, so there's a good chance that's what you'll find in the next installment.


jcalderone

As of Monday, the 9th, I will be considering opportunities for short term consulting and contract work. Please feel free to contact me if you have a software challenge to tackle, particularly if it involves one or more of Python, Twisted, networking, event-driven architectures, massive scaling, or open source software.

Immediately following the demise of Divmod this summer, I took a job at a major international corporation. A number of factors conspired to make this decision non-viable in the long term. Today I gave notice. I'm excited to be able to get back to doing what I love - solving challenging, interesting problems in a flexible, open environment.

One of the other things I've been unable to do since even before Divmod's end is commit serious time to Twisted development and maintenance. This is something else I'm looking forward to re-engaging in. I made a sizable dent in Twisted's open ticket count last fall and winter, thanks to funding from the Twisted Software Foundation (in turn thanks to all of the Twisted founding sponsors). I'll be able to continue this work thanks to this year's sponsors (visible on the front page of the Twisted site), though perhaps not to the same extent. If you'd like to help out in this regard, become a sponsor! All donations are useful and appreciated!


jcalderone

  • The Player of Games. Iain Banks.

  • The State of the Art. Iain Banks.

  • Use of Weapons. Iain Banks.

  • Excession. Iain Banks.

  • Inversions. Iain Banks.

  • The Planck Dive. Greg Egan.

  • The Name of the Wind. Patrick Rothfuss.

  • Red Seas Under Red Skies. Scott Lynch.


jcalderone
A bunch of us carved pumpkins for Halloween, and Ying was thoughtful enough to bring along her very nice camera and got some nice shots.


jcalderone

Welcome to the 13th installment of Twisted Web in 60 seconds. For a while, I've been writing about how you can implement pages by working with the Twisted Web resource model. The very first example I showed you used an existing Resource subclass to serve static content from the filesystem. In this installment, I'll show you how to use WSGIResource, another existing Resource subclass which lets you serve WSGI applications in a Twisted Web server.

First, a few things about WSGIResource. It is a multithreaded WSGI container. Like any other WSGI container, you can't do anything asynchronous in your WSGI applications, even though this is a Twisted WSGI container. In the latest release of Twisted as of this post, 8.2, WSGIResource also has a few significant bugs. These are fixed in trunk (and the fixes will be included in 9.0), so if you want to play around with WSGI in any significant way, you probably want trunk for now.

The first new thing in this example is the import of WSGIResource:

  from twisted.web.wsgi import WSGIResource

Nothing too surprising there. We still need one of the other usual suspects, too:

  from twisted.internet import reactor

You'll see why in a minute. Next, we need a WSGI application. Here's a really simple one just to get things going:

  def application(environ, start_response):
      start_response('200 OK', [('Content-type', 'text/plain')])
      return ['Hello, world!']

If this doesn't make sense to you, take a look at one of these fine tutorials. Otherwise, or once you're done with that, the next step is to create a WSGIResource instance - as this is going to be another rpy script example.

  resource = WSGIResource(reactor, reactor.getThreadPool(), application)

I need to dwell on this line for a minute. The first parameter passed to WSGIResource is the reactor. Despite the fact that the reactor is global and any code that wants it can always just import it (as, in fact, this rpy script simply does itself), passing it around as a parameter leaves the door open for certain future possibilities. For example, having more than one reactor. There are also testing implications. Consider how much easier it is to unit test a function that accepts a reactor - perhaps a mock reactor specially constructed to make your tests easy to write ;) - rather than importing the real global reactor. Anyhow, that's why WSGIResource requires you to pass the reactor to it.
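The testing point can be shown in a few lines. `FakeReactor` and `greet_later` below are hypothetical; the fake offers only a callLater method and an `advance` helper, which is enough to test code that schedules work without running a real event loop. Twisted itself ships twisted.internet.task.Clock for this kind of test, which is not used here so that the sketch stays standard-library only.

```python
class FakeReactor:
    """Hypothetical test double offering only callLater."""

    def __init__(self):
        self.calls = []

    def callLater(self, delay, function, *args):
        # Record the call instead of scheduling it.
        self.calls.append((delay, function, args))

    def advance(self):
        """Run everything scheduled so far, ignoring the delays."""
        pending, self.calls = self.calls, []
        for _, function, args in pending:
            function(*args)


def greet_later(reactor, delay, log):
    """Code under test: takes the reactor as a parameter rather than importing it."""
    reactor.callLater(delay, log.append, "hello")


fake = FakeReactor()
log = []
greet_later(fake, 5, log)
scheduled_delay = fake.calls[0][0]
fake.advance()
```

After `advance()`, `log` holds the greeting and no real five-second wait took place. Had `greet_later` imported the global reactor, the test would have needed to start and stop it.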

The second parameter passed to WSGIResource is a thread pool. WSGIResource uses this to actually call the application object passed in to it. To keep this example short, I'm passing in the reactor's internal threadpool here, letting me skip its creation and shutdown-time destruction. For finer control over how many WSGI requests are served in parallel, you may want to create your own thread pool to use with your WSGIResource. But for simple testing, using the reactor's is fine (although I'm cheating here a little - I apologize - getThreadPool is a new API, not present in 8.2: you need trunk for this example to work; please ask Chris Armstrong to release 9.0 already).

The final argument is the application object. This is pretty typical of how WSGI containers work.
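What a container does with the application object can be sketched by calling it by hand. `call_application` below is a hypothetical helper, not part of Twisted or of the WSGI specification. It is written for Python 3, where the body must be bytes, unlike the Python 2 string in the example application.

```python
def application(environ, start_response):
    start_response('200 OK', [('Content-type', 'text/plain')])
    return [b'Hello, world!']  # bytes, as Python 3 WSGI requires


def call_application(app, environ):
    """Call a WSGI application once and collect what it produced."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = headers
        return lambda data: None  # the legacy write() callable, unused here

    body = b''.join(app(environ, start_response))
    return captured['status'], captured['headers'], body


status, headers, body = call_application(application, {'REQUEST_METHOD': 'GET'})
```

A real container does the same thing in one of its worker threads, then writes the status, headers and body back to the client.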

The example, sans interruption:

  from twisted.web.wsgi import WSGIResource
  from twisted.internet import reactor

  def application(environ, start_response):
      start_response('200 OK', [('Content-type', 'text/plain')])
      return ['Hello, world!']

  resource = WSGIResource(reactor, reactor.getThreadPool(), application)

Up to the point where the WSGIResource instance defined here exists in the resource hierarchy, the normal resource traversal rules apply - getChild will be called to handle each segment. Once the WSGIResource is encountered, though, that process stops and all further URL handling is the responsibility of the WSGI application. Of course this application does nothing with the URL, so you won't be able to tell that.
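An application that does look at the URL makes the hand-off visible. Under the WSGI specification the part of the path left for the application arrives as PATH_INFO. In the sketch below, `show_path` and `get` are hypothetical, and wsgiref's setup_testing_defaults stands in for the environ a container would build (Python 3).

```python
from wsgiref.util import setup_testing_defaults


def show_path(environ, start_response):
    """WSGI application that reports the path it was asked to handle."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    path = environ.get('PATH_INFO', '')
    return [('You asked for %s' % (path,)).encode('utf-8')]


def get(app, path):
    """Invoke app as a container would for a GET of the given path."""
    environ = {'PATH_INFO': path}
    setup_testing_defaults(environ)  # fills in the other required keys
    statuses = []

    def start_response(status, headers, exc_info=None):
        statuses.append(status)
        return lambda data: None

    body = b''.join(app(environ, start_response))
    return statuses[0], body


status, body = get(show_path, '/some/deep/path')
```

Every segment of `/some/deep/path` reaches the application untouched; no getChild call is involved for that part of the URL.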

Oh, and as was the case with the first static file example, there's also a command line option you can use to avoid a lot of this. If you just put the above application function, without all of the WSGIResource stuff, into a file, say, foo.py, then you can launch a roughly equivalent server like this:

  $ twistd -n web --wsgi foo.application

Tune in next time, when I'll discuss HTTP authentication.

