EuroCUP 2008 presentation
The following is text to accompany my presentation for EuroCUP 2008. I do not have a license for OEChem on my public-facing web server machine, so I cannot have a live demo for any of the code examples.
AJAX and the OpenEye Tools
My name is Andrew Dalke. I'm an independent software consultant and instructor based in Göteborg (Gothenburg), Sweden. I mostly focus on developing computational chemistry tools and helping scientists become more capable in using computers to do their research.
Suppose you want a web page that shows a graphical 2D depiction of a compound given its SMILES. One very traditional way to do this - the Daylight libraries have supported it for over 10 years - is with a CGI script serving images based on the GET query parameters. The HTML might look like
<html> ... <img src="/depict.cgi?smiles=CC(=O)Oc1ccccc1C(=O)O" /> ... </html>
The web browser gets the HTML, figures out it needs an image, and makes an HTTP request to the src URL. The web server, which is usually Apache, gets the request, converts it into a CGI request, and runs the program named "depict.cgi". This program uses the CGI parameter to create the requested depiction. In real life the CGI script may in turn call another program to do the actual depiction.
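As a rough sketch of the server side of that exchange, the CGI program mostly needs to pull the "smiles" value out of the query string. The following uses only the Python standard library. The function name smiles_from_query is made up for illustration, and the depiction step itself is left out.

```python
from urllib.parse import parse_qs


def smiles_from_query(query_string):
    """Return the 'smiles' GET parameter, or None if it is missing.

    `query_string` is the part of the URL after the '?', which a CGI
    program receives in the QUERY_STRING environment variable.
    """
    # parse_qs undoes any %-escaping and returns a dict of lists.
    params = parse_qs(query_string, keep_blank_values=True)
    values = params.get("smiles")
    if not values:
        return None
    return values[0]


if __name__ == "__main__":
    print(smiles_from_query("smiles=CC(=O)Oc1ccccc1C(=O)O"))
```

Only the first "=" in each parameter separates the name from the value, which is why the "=" characters inside the aspirin SMILES survive the trip.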
This interface was developed about 15 years ago and is still a valid way to write web applications. There are many other ways to handle the interface between the outside world and the actual work which needs to be done. The modern term for the different layers, which can include database access, session maintenance, and output templates, is the "web application stack." Ruby on Rails is a popular "full stack" system developed over the last 4 years, and Django and TurboGears are roughly similar systems for Python. All my examples are based on TurboGears.
The web server implementation should not affect how the web interface works. That is, there should be no reason to change any of the URLs or get different HTML back from the server. In practice, though, a few things do change. For example, using the extension ".cgi" in the URL is a bit of a cheat. It's there because that's one way Apache can tell if a file is a data file or an executable CGI script. In use it's a "leaky abstraction" because it lets some of the internal implementation decisions leak into the public interface. This can make it harder to port to other systems.
In my case I'm using TurboGears, which by default doesn't do well with periods in the URL, so for my examples I'll remove the ".cgi" from the URL.
The TurboGears code is structured very similarly to the Apache code. An HTTP request comes in, TurboGears converts that into a Python function call (instead of a CGI request), and calls the function that handles the request. In this case that Python function doesn't know anything about chemistry. It leaves the details up to OpenEye's Ogham toolkit for 2D structure depiction.
The biggest architectural difference is that everything is done through Python and Python libraries, and everything occurs in the same process space. I don't have to start up a new program for every request.
By the way, if you're curious about how I get Ogham to generate PNG output as a string, rather than as a GIF or other non-PNG file, see my earlier essay on "OE8BitImage to PNG." It was a fun bit of reverse engineering.
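To make the "request becomes a function call, in one process" idea concrete, here is a minimal sketch using plain WSGI from the Python standard library. It is a stand-in for the idea, not TurboGears code. The smiles_to_png function is a hypothetical placeholder for the depiction toolkit and returns dummy bytes so the sketch runs without any chemistry library.

```python
from urllib.parse import parse_qs
from wsgiref.util import setup_testing_defaults


def smiles_to_png(smiles):
    """Hypothetical stand-in for the depiction toolkit call.

    A real version would return PNG bytes for the structure; this one
    returns placeholder bytes so the sketch runs without a toolkit.
    """
    return b"PNG placeholder for " + smiles.encode("utf-8")


def depict(smiles):
    """Request handler: an ordinary Python function, no chemistry here."""
    return "image/png", smiles_to_png(smiles)


# URL path -> handler function, all living in one long-running process.
HANDLERS = {"/depict": depict}


def application(environ, start_response):
    """Minimal WSGI app that turns an HTTP request into a function call."""
    handler = HANDLERS.get(environ.get("PATH_INFO", ""))
    if handler is None:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]
    params = parse_qs(environ.get("QUERY_STRING", ""))
    # Pass each query parameter as a keyword argument (first value only).
    kwargs = {name: values[0] for name, values in params.items()}
    try:
        content_type, body = handler(**kwargs)
    except TypeError:
        # Missing or unexpected parameters for this handler.
        start_response("400 Bad Request", [("Content-Type", "text/plain")])
        return [b"bad parameters"]
    start_response("200 OK", [("Content-Type", content_type)])
    return [body]


if __name__ == "__main__":
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = "/depict"
    environ["QUERY_STRING"] = "smiles=CCO"
    print(application(environ, lambda status, headers: print(status)))
```

Because the handler table and the toolkit stay loaded between requests, no new program has to be started for each image.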
My web page example had a single hard-coded SMILES. What if I want something more interactive, where the user can input a SMILES and see the depiction image? I'll do this with an HTML form, which sends the "smiles" parameter to the "/depict" service on the web server. This is the same service I used for the HTML image.
Viewing just the image is very static. The image just sits there. I would rather see the structure I submitted and also have a form for submitting a new SMILES to depict. In this case I'll submit the form to a new "/show_depict" handler, which will respond with HTML that includes an img element for the requested SMILES and includes the form for doing a new "/show_depict" depiction. Note that this requires two requests to the server: the first to "/show_depict" for the HTML and the second to "/depict" to get the depiction image.
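A minimal sketch of what a "/show_depict" handler might return, written as a plain Python function rather than a TurboGears template. The function name and the exact HTML layout are assumptions for illustration.

```python
from html import escape
from urllib.parse import quote


def show_depict_page(smiles=""):
    """Build the HTML for "/show_depict": an optional image plus the form."""
    parts = ["<html><body>"]
    if smiles:
        # The browser makes a second request, to "/depict", for this image.
        img_url = "/depict?smiles=" + quote(smiles, safe="")
        parts.append('<img src="%s" />' % escape(img_url, quote=True))
    parts.append(
        '<form action="/show_depict" method="GET">'
        'SMILES: <input type="text" name="smiles" value="%s" />'
        '<input type="submit" value="Depict" />'
        '</form>' % escape(smiles, quote=True)
    )
    parts.append("</body></html>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(show_depict_page("CC(=O)Oc1ccccc1C(=O)O"))
```

The SMILES is escaped twice for two different reasons: once as a URL parameter for the img element, and once as HTML attribute text when it is echoed back into the form field.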
[pages 9 and 10]
This is out-dated!
Here's the same form rewritten for use with jQuery. You see at the top I include the jQuery code, which is available as a single file from this URL. I then have a script block that sets up the interactive page. What this is saying is:
When the document is fully loaded (that is, all the HTML has been parsed),
  find the elements with tag name "form" (there is only one).
    When its submit button is pressed ...
      call this anonymous function. ("anonymous" means "does not have a name")
The anonymous submit function does the following:
Select the "#smiles" element (that's the element with id 'smiles').
Get its value with "val", which in this case is the input text for that field.
Escape it to make a depiction URL.
Assign the URL to the "src" attribute of the "#depiction" element (the element with id 'depiction').
Finally, "return false" to tell the browser it does not need to send the form.
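The escaping step matters more for SMILES than for most text, because characters such as "#" (triple bond), "+" (charge) and "/" (bond direction) all have special meanings in a URL. Below is a Python rendering of that one step, not the JavaScript from the slides. It uses only the standard library, and the depiction_url name is made up for illustration.

```python
from urllib.parse import parse_qs, quote


def depiction_url(smiles):
    """Build the "/depict" image URL for a SMILES string."""
    # safe="" escapes every reserved character, including "/" and "#".
    return "/depict?smiles=" + quote(smiles, safe="")


if __name__ == "__main__":
    for smiles in ["CC#N", "[NH4+]", "C/C=C/C"]:
        url = depiction_url(smiles)
        # Check that the server would recover exactly what was typed.
        recovered = parse_qs(url.split("?", 1)[1])["smiles"][0]
        print(smiles, url, recovered == smiles)
```

Without the escaping, an unescaped "#" would be treated as the start of a URL fragment and never reach the server, and an unescaped "+" would be decoded as a space.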
[pages 13 and 14]
It's also complicated because things like "control-v" for "paste", and "home" for "go to start of the input", and the backspace key are also handled as key input, but aren't simple changes to the text field. The easiest solution I found was to wait until after the event happens, let the browser do whatever is appropriate to the key input, and only then examine the contents of the text field.
Don't be put off by seeing that MochiKit's last release was in 2006. It's a stable, well-developed and mature library.
The "update_image" function should be very familiar. It's the code that extracts the text value from the "#smiles" element, constructs the image URL, and assigns it to the "#depiction" element's "src" field.
One of the many nice things about the OpenEye toolkit is it will handle partial SMILES strings as input. OEParseSmiles parses as much as it can understand and returns True on success. If it returns False then the SMILES was not correct or was incomplete, but the molecule object will contain as much of the molecule as it was able to parse. It's a valid molecule object, and the depiction code has no problems laying it out.
The previous example depicts the molecule while I type in the SMILES string. I'm going to change it a bit and also display the IUPAC name for the SMILES string using OpenEye's naming code on the server. Again, this will be a highly interactive service where I can see the name while I am typing the SMILES.
This is a bit more complex than the image example because I need data from the server. I want to know if the SMILES string is a valid SMILES string (it could be an incomplete input) and the IUPAC name for the molecule, or at least as much of the input as OEParseSmiles could understand.
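A minimal sketch of the kind of JSON response such a service could send back. The "name" key matches the field the page reads; the "valid" key name is an assumption. The toolkit call is replaced by a hypothetical callable, and the lookup table contains made-up values that are not real toolkit output.

```python
import json


def smi2name_response(smiles, parse_and_name):
    """Return the JSON text for a "smi2name" request.

    `parse_and_name` is a hypothetical callable standing in for the
    toolkit: it takes a SMILES string and returns (ok, name), where ok
    is False for incorrect or incomplete input and name describes as
    much of the molecule as could be parsed.
    """
    ok, name = parse_and_name(smiles)
    return json.dumps({"valid": bool(ok), "name": name})


def fake_parse_and_name(smiles):
    """Toy lookup table used only to make this sketch runnable."""
    table = {"CCO": (True, "ethanol"), "CCO(": (False, "ethanol")}
    return table.get(smiles, (False, ""))


if __name__ == "__main__":
    print(smi2name_response("CCO", fake_parse_and_name))
    print(smi2name_response("CCO(", fake_parse_and_name))
```

With a payload like this, the page has both pieces of information it needs: the text to show in the name field and a flag it can use to pick the red or black status color.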
My one change to the HTML is to include a "Name: " field below the image, which is where the IUPAC name will go. That's a label and an empty text span element, with the id "compound_name."
The last line of real code shows jQuery's function call chaining. The '$("#compound_name")' selects the element with id "compound_name", which is the text span. The ".text(smi2name_result.name)" gets the "name" from the results dictionary and assigns it to the text content of the span element. This is what displays the name to the user.
The result of calling ".text(...)" is the same query object. I can use it to change other properties of my selection. So I'll change the CSS "color" property so it shows the red or black status value.
In case you're curious, here's most of the code on the server to implement "smi2name" using TurboGears. I left out only the scaffolding code that TurboGears writes for you and the lines to import the right OpenEye libraries into the Python module.
The hardest part to get working was the mouseover support for the depiction. I ended up making extensive use of CSS, which tells the web page how to lay out a page. I used 4 layers on top of each other to get things working. The bottom layer is the Ogham depiction, and is the PNG image you've seen elsewhere. This is generated on the web server but only needs to change if the SMILES or the image size changes.
On top of that, the third layer is a semi-transparent image showing which atoms have been selected, either from mouse selection or from the SMARTS/atom index selection. This must occur on the server because that's what understands SMARTS, and must be recreated if the size or SMILES changes.
The top two layers are for mouseover support. The top layer is a transparent image containing only an image map. Each hotspot on the map is a circle, centered on the center of an atom. I use this to tell if the mouse is over an atom. If the image size changes then I make a JSON request to the server to get the new atom locations and scaled atom radius.
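As an illustration of the hotspot layer, the sketch below turns a JSON payload of atom positions into HTML image map markup with one circular area per atom. The payload layout (a "radius" number and an "atoms" list of [x, y] pixel pairs), the function name, and the id scheme are all assumptions. In the page described here this step would more likely happen in the browser; Python is used only to show the data flow.

```python
import json


def build_image_map(payload_json, map_name="atom_map"):
    """Return <map> HTML with one circular hotspot per atom.

    `payload_json` is assumed to look like
    {"radius": 8, "atoms": [[x0, y0], [x1, y1], ...]} in pixel units.
    """
    payload = json.loads(payload_json)
    radius = round(payload["radius"])
    lines = ['<map name="%s">' % map_name]
    for index, (x, y) in enumerate(payload["atoms"]):
        # Image map circles are given as "center_x,center_y,radius".
        lines.append(
            '<area shape="circle" coords="%d,%d,%d" href="#" id="atom%d" />'
            % (round(x), round(y), radius, index)
        )
    lines.append("</map>")
    return "\n".join(lines)


if __name__ == "__main__":
    payload = json.dumps({"radius": 8, "atoms": [[20, 30], [55.2, 41.7]]})
    print(build_image_map(payload))
```

Keeping the atom index in each area's id gives the mouseover code a direct way to map a hotspot back to an atom.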
The four layers are aligned so that, to the user, it looks like one coherent view, despite the implementation complexity.
Andrew Dalke is an independent consultant focusing on software development for computational chemistry and biology. Need contract programming, help, or training? Contact me
Copyright © 2001-2010 Dalke Scientific Software, LLC.