Dalke Scientific Software: More science. Less time.
/home/writings/diary/archive/2005/04/24/html_templates

HTML Templates

In my previous essay I showed how to generate a web page using Python statements to print text to a file. It was rather messy because it mixed Python statements and HTML text; the result was neither one nor the other. It's hard to edit and maintain because two languages usually don't mix well.

The two languages here do different things. Python is used to make content and HTML is used to present content. The standard solution is to separate the two almost entirely, most frequently with an HTML template. There are many different approaches. The one I prefer is Zope's Page Template language. The template language is HTML with a few new attributes. Because it's HTML you can look at and edit the template file using standard HTML tools. The new attributes are commands for things like "insert text here" and "for each item in ...". The data comes from Python code outside of the template.

Zope is a large package for doing web-based application development in Python. Parts of Zope, like its page template language and its object database, are available outside of Zope. simpleTAL is another implementation of the page template language for Python. I've found the two to be comparable and tend to use ZPT.

The template can contain variable names and expressions based on the names. When ZPT evaluates the template it needs data for those variables. This is called the context and is a dictionary built by Python code. The variables can be strings and numbers, containers like a list or dictionary, or more complex data structures. Page template expressions describe how to get to the needed data in the variable. For details you should read the documentation.

Looking at the existing scatter plot creation script shows which data is needed:

The image sizes are numbers. To show how the template generation works I'll make a simple context with only those four elements:

context = {"img_width": 30,
           "img_height": 35,
           "cmpd_width": 50,
           "cmpd_height": 75}
and define a Python string containing a template which demonstrates how to use those values in different ways.
template = """<html><body>
This image <a href="http://example.com/"
       tal:attributes="width img_width; height img_height" />
has width <span tal:replace="img_width">123</span>
and height <span tal:replace="img_height">456</span>.<br />

The compound images will be
<b tal:content="string:${cmpd_width}x${cmpd_height}">1x2</b>.
</body></html>
"""
While I define the template as a string here it's usually best to store the template in an HTML file and load it when needed. That way the template can be written with an HTML-aware editor, and by people who don't need to know Python.

This small template shows a few features of the template language. From the top, the first is how to define the width and height attributes of an element (here the a tag). There can only be one tal:attributes term in an element so the ";" is used to join two definitions into one single string. The next two examples show the tal:replace command, which removes the element with its contents and replaces it with the value of the expression. Even though they will be removed I put example numbers (the 123 and 456) in the element's content to make it easier to understand the template.
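
To make the shape of a tal:attributes value concrete, here is a rough sketch in plain Python of how such a string breaks down into attribute/expression pairs. This is only an illustration and not ZPT's actual parser: parse_attributes is a made-up name, the sketch only handles simple variable names as expressions, and it ignores details of the real language such as the ";;" escape for a literal semicolon.

```python
def parse_attributes(statement):
    """Split a tal:attributes value into (attribute, expression) pairs.

    Hypothetical helper for illustration; not part of ZPT.
    """
    pairs = []
    for part in statement.split(";"):
        # The first word is the attribute name, the rest is the expression
        name, expression = part.split(None, 1)
        pairs.append((name, expression.strip()))
    return pairs

context = {"img_width": 30, "img_height": 35}

pairs = parse_attributes("width img_width; height img_height")
# Look up each expression as a plain variable name in the context
rendered = " ".join('%s="%s"' % (name, context[expression])
                    for (name, expression) in pairs)
print(rendered)
```

The printed text is the pair of attributes that end up on the element in the generated page.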

The tal:content statement is similar to the tal:replace statement. It removes the content of the element but leaves the start and end tags unchanged. In this case I'm replacing the text with the result of a string expression. There are several ways to get data from the context: a path expression (the default), a string template, or even a bit of Python code.
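
The ${name} syntax in a string expression will look familiar if you've used string.Template from Python's standard library. As a rough analogy only (this is not how ZPT evaluates templates, and string.Template handles simple names but not paths), here is the same substitution done in plain Python with the four-element context from earlier:

```python
from string import Template

# Same values as the context used for the page template
context = {"img_width": 30,
           "img_height": 35,
           "cmpd_width": 50,
           "cmpd_height": 75}

# Plain-Python analogue of tal:content="string:${cmpd_width}x${cmpd_height}"
size_text = Template("${cmpd_width}x${cmpd_height}").substitute(context)
print(size_text)
```

This prints 50x75, the same text that ends up inside the b element.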

The ZPT implementation of the Zope template language requires an unfortunate bit of complication, recommended by the ZPT home page. It defines a new method to call the page template like a function. Strictly speaking it isn't needed because the underlying pt_render() method can be called directly. But I'll use it because that's what they suggest.

import sys
from ZopePageTemplates import PageTemplate

# From the ZPT home page
class PageTemplate(PageTemplate):
    def __call__(self, context={}, *args):
        if not context.has_key('args'):
            context['args'] = args
        return self.pt_render(extra_context=context)

pt = PageTemplate()
pt.write(template)
sys.stdout.write(pt(context=context))
I wrote the results of template evaluation to stdout. It could as easily have been a file. Here's the output.
<html><body>
This image <a href="http://example.com/" width="30"
              height="35" />
has width 30
and height 35.<br />

The compound images will be
<b>50x75</b>.
</body></html>

I chose to give each property its own unique name in the context. I didn't need to do that. I could have had one entry store the two scatter plot image properties and another store the two compound image properties. Here I'll use a dictionary but I could have used one of several approaches.

context = {"img": {"width": 30, "height": 35},
           "cmpd": {"width": 50, "height": 75}}
The template must be changed to handle this new data structure. Instead of using just the name I need to use a path expression to get the correct value. For example, img/width refers to the width term of the img dictionary, and is 30.
template = """<html><body>
This image <a href="http://example.com/"
       tal:attributes="width img/width; height img/height" />
has width <span tal:replace="img/width">123</span>
and height <span tal:replace="img/height">456</span>.<br />

The compound images will be
<b tal:content="string:${cmpd/width}x${cmpd/height}">1x2</b>.
</body></html>
"""
When evaluated, this produces output identical to the previous example.
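
If path expressions seem mysterious, it may help to see the idea in plain Python. The following is a minimal sketch, assuming the context only contains nested dictionaries; resolve_path is a hypothetical name of my own and this is not how ZPT is implemented, since the real path expressions can also walk through other kinds of objects.

```python
def resolve_path(context, path):
    """Follow a '/'-separated path through nested dictionaries.

    A simplified, hypothetical stand-in for a path expression;
    it only knows about dictionary keys.
    """
    value = context
    for name in path.split("/"):
        value = value[name]
    return value

context = {"img": {"width": 30, "height": 35},
           "cmpd": {"width": 50, "height": 75}}

print(resolve_path(context, "img/width"))
print(resolve_path(context, "cmpd/height"))
```

The two lookups give 30 and 75, matching the values the template inserts.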

I'll stick with the old way of defining each image size with its own variable name. I showed that alternative to make it easier to understand how to deal with the scatter plot point data. Each point has a name, location, and URL. In my previous essay I stored the point properties in four lists, one for each property. In the Zope template language it's easier to iterate over one list so my example will take that approach.

context = {"points":
  [ {"name": "A", "x": 1.2, "y": 2.3, "url": "imgs/A.gif"},
    {"name": "B", "x": 9.8, "y": 8.7, "url": "imgs/B.gif"},
  ]}
As you can see, points refers to a list of dictionaries.

The tal:repeat statement is the template equivalent of a for-loop. It repeats the given element once for each element in the list, and defines a new local variable name used to refer to the current element. Here's a template using the previously defined context:

template = """<html><body>
<ul>
 <li tal:repeat="point points">
   <a tal:attributes="href point/url" tal:content="point/name"></a> is at
   position <span tal:replace="string:${point/x}, ${point/y}" /> </li>
</ul></body></html>
"""
and here's the output:
<html><body>
<ul>
 <li>
   <a href="imgs/A.gif">A</a> is at
   position 1.2, 2.3 </li>
 <li>
   <a href="imgs/B.gif">B</a> is at
   position 9.8, 8.7 </li>
</ul></body></html>
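
For comparison, here is roughly what that tal:repeat is doing, written as an ordinary Python for-loop over the same list of dictionaries. It's a sketch to show the loop structure only; it brings back exactly the mix of Python and HTML that the template is meant to avoid, and the whitespace differs from what ZPT produces.

```python
points = [{"name": "A", "x": 1.2, "y": 2.3, "url": "imgs/A.gif"},
          {"name": "B", "x": 9.8, "y": 8.7, "url": "imgs/B.gif"}]

items = []
# "point" plays the role of the local variable named in tal:repeat
for point in points:
    items.append('<li><a href="%s">%s</a> is at position %s, %s</li>'
                 % (point["url"], point["name"], point["x"], point["y"]))

html = "<ul>\n" + "\n".join(items) + "\n</ul>"
print(html)
```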

Only the tal:repeat command is new here so the rest should be pretty understandable. One word of caution. The ZPT parser is a stickler for balanced tags and if it doesn't find one it expects then it gives the opaque error message

ZopePageTemplates.PTRuntimeError: Page Template (unknown) has errors.
I got this message in the previous template because I forgot to close the li tag. The close tag is correctly required, but most browsers don't need it so I tend to forget to write it. If you get that error message there are a few solutions:
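
One approach that needs nothing beyond Python's standard library is to check the tag balance yourself before handing the template to ZPT. Here is a minimal sketch using html.parser. TagBalanceChecker and find_unbalanced are names I made up for illustration, and the sketch assumes every element other than a short list of empty ones (like br and img) must be explicitly closed. That matches the strictness described above but is not necessarily the exact rule ZPT applies.

```python
from html.parser import HTMLParser

# Elements that never take an end tag in HTML (a partial list)
VOID_TAGS = {"area", "br", "hr", "img", "input", "link", "meta"}

class TagBalanceChecker(HTMLParser):
    """Hypothetical helper: record start tags that are never closed
    and end tags that have no matching start tag."""
    def __init__(self):
        super().__init__()
        self.stack = []      # (tag name, line number) of open elements
        self.problems = []   # human-readable messages

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append((tag, self.getpos()[0]))

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        open_names = [name for (name, lineno) in self.stack]
        if tag not in open_names:
            self.problems.append("line %d: </%s> has no matching start tag"
                                 % (self.getpos()[0], tag))
            return
        # Anything opened after the matching start tag was never closed
        while self.stack:
            name, lineno = self.stack.pop()
            if name == tag:
                break
            self.problems.append("line %d: <%s> is never closed"
                                 % (lineno, name))

def find_unbalanced(template):
    checker = TagBalanceChecker()
    checker.feed(template)
    checker.close()
    # Whatever is still open at the end was never closed either
    for name, lineno in checker.stack:
        checker.problems.append("line %d: <%s> is never closed"
                                % (lineno, name))
    return checker.problems

# A template with the kind of mistake described above: no </li>
template = """<html><body>
<ul>
 <li tal:repeat="point points">
   <a tal:attributes="href point/url" tal:content="point/name"></a> is at
   position <span tal:replace="string:${point/x}, ${point/y}" />
</ul></body></html>
"""

problems = find_unbalanced(template)
for problem in problems:
    print(problem)
```

A message that names the tag and the line is a lot easier to act on than "has errors".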

Now back to the goal, which is to modify the scatter plot HTML page generation to use a template. The template needs a context so I'll start with that. This code will replace the current page generation code at the end of the main() function. At this point the image sizes are available in the variables cmpd_{width,height} and img_{width,height}, the x coordinates are in the list xcoords, the y coordinates in ycoords, the compound identifiers in cids, and the URLs in the list imgnames. I'll turn the four parallel lists into a list of dictionaries, one dictionary per point.

    context = {"cmpd_width": cmpd_width,
               "cmpd_height": cmpd_height,
               "img_width": img_width,
               "img_height": img_height,
               "points": [dict(x=x, y=img_height-y, cid=cid, url=url)
                            for (x, y, cid, url) in
                              zip(xcoords, ycoords, cids, imgnames)],
               }
This uses the relatively new keyword constructor for dictionaries. Here's an example of it:
>>> dict(x=1.23, y=2.34, cid="ABC00001", url="imgs/ABC00001.gif")
{'url': 'imgs/ABC00001.gif', 'y': 2.3399999999999999, 'cid': 'ABC00001', 'x': 1.23}
>>> 
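
The other two pieces of that expression are zip, which walks the parallel lists in step, and img_height-y. My reading of the subtraction is that the plot coordinates count y up from the bottom of the image while an HTML image map counts down from the top. Here is a small stand-alone sketch of the same list comprehension; all of the numbers and names are made up so it can run without the rest of the script.

```python
# Made-up values standing in for the real lists in main()
img_height = 320
xcoords = [10.0, 200.0]
ycoords = [20.0, 300.0]
cids = ["ABC00001", "ABC00002"]
imgnames = ["imgs/ABC00001.gif", "imgs/ABC00002.gif"]

# One dictionary per point, with y flipped to count from the top
points = [dict(x=x, y=img_height-y, cid=cid, url=url)
            for (x, y, cid, url) in
              zip(xcoords, ycoords, cids, imgnames)]

for point in points:
    print("%(cid)s at %(x)s,%(y)s -> %(url)s" % point)
```

The first point, 20 pixels above the bottom of the plot, ends up 300 pixels below the top of the image.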

Next is the template. It needs to insert the correct image attributes and the <AREA> elements for each point. I'll save this in the file named scatter_template.html.

<HTML><HEAD>
 <TITLE>MW vs. XLogP</TITLE>
</HEAD>
<BODY>
<SCRIPT>
function mouseover(name) {
  var cid = document.getElementById("cid");
  cid.innerHTML = name;
}
function show(filename) {
  var cmpd_img = document.getElementById("cmpd_img");
  cmpd_img.src = filename;
}

</SCRIPT>
Mouse is over: <SPAN id="cid"></SPAN><BR>
Pick a point to see the depiction<BR>
<IMG SRC="mw_v_xlogp.png" ismap usemap="#points"
  tal:attributes="WIDTH img_width; HEIGHT img_height">
<IMG ID="cmpd_img" tal:attributes="WIDTH cmpd_width; HEIGHT cmpd_height">
<MAP name="points">
 <AREA shape="circle" tal:repeat="point points"
      tal:attributes="coords string:${point/x},${point/y},5;
                      onmouseover string:javascript:mouseover('${point/cid}');
                      href string:javascript:show('${point/url}')">
</MAP>
</BODY>
</HTML>

The rest of the code is the mechanics of opening files and calling the page template. I've gone over the details already so I'll end with the newest version of the scatter plot generation code.

import subprocess, os
from itertools import *

from openeye.oechem import *

from matplotlib.figure import Figure
from matplotlib.patches import Polygon
from matplotlib.backends.backend_agg import FigureCanvasAgg
import matplotlib.numerix as nx

from ZopePageTemplates import PageTemplate

class PageTemplate(PageTemplate):
    def __call__(self, context={}, *args):
        if not context.has_key('args'):
            context['args'] = args
        return self.pt_render(extra_context=context)


def make_gif(smiles, filename, width = 200, height = 200):
    p = subprocess.Popen([os.environ["OE_DIR"] + "/bin/mol2gif",
                          "-width", str(width), "-height", str(height),
                          "-gif", "-", filename],
                         stdin = subprocess.PIPE,
                         stderr = subprocess.PIPE)
    p.stdin.write(smiles + "\n")
    p.stdin.close()
    errmsg = p.stderr.read()
    errcode = p.wait()
    if errcode:
        raise AssertionError("Could not save %r as an image to %r:\n%s" %
                             (smiles, filename, errmsg))


# True only for those molecules with an XLOGP field
def has_xlogp(mol):
    return OEHasSDData(mol, "PUBCHEM_CACTVS_XLOGP")

def get_data(mol):
    cid = OEGetSDData(mol, "PUBCHEM_COMPOUND_CID")
    weight = OEGetSDData(mol, "PUBCHEM_OPENEYE_MW")
    xlogp = OEGetSDData(mol, "PUBCHEM_CACTVS_XLOGP")
    if (cid == "" or weight == "" or xlogp == ""):
        raise AssertionError( (cid, weight, xlogp) )

    return cid, float(weight), float(xlogp)

def main():
    filename = "/Users/dalke/databases/compounds_500001_510000.sdf.gz"
    ifs = oemolistream(filename)

    imgdir = "imgs"
    if not os.path.isdir(imgdir):
        os.mkdir(imgdir)

    # Width and height for each compound image, in pixels
    cmpd_width = cmpd_height = 320

    cids = []
    weights = []
    xlogps = []
    imgnames = []
    # Get the first 100 compounds that have an XLogP field
    for mol in islice(ifilter(has_xlogp, ifs.GetOEGraphMols()),
                      0, 100):
        
        cid, weight, xlogp = get_data(mol)

        imgname = os.path.join(imgdir, "%s.gif" % (cid,))
        make_gif(OECreateCanSmiString(mol), imgname,
                 cmpd_width, cmpd_height)

        cids.append(cid)
        weights.append(weight)
        xlogps.append(xlogp)
        imgnames.append(imgname)
    
    fig = Figure(figsize=(4,4))
    ax = fig.add_subplot(111)

    sc = ax.scatter(weights, xlogps)
    
    ax.set_xlabel("Molecular weight")
    ax.set_ylabel("CACTVS XLogP")

    # Make the PNG and get the scatter plot image size
    canvas = FigureCanvasAgg(fig)
    canvas.print_figure("mw_v_xlogp.png", dpi=80)
    img_width = fig.get_figwidth() * 80
    img_height = fig.get_figheight() * 80

    # Convert the data set points into screen space coordinates
    trans = sc.get_transform()
    xcoords, ycoords = trans.seq_x_y(weights, xlogps)

    context = {"cmpd_width": cmpd_width,
               "cmpd_height": cmpd_height,
               "img_width": img_width,
               "img_height": img_height,
               "points": [dict(x=x, y=img_height-y, cid=cid, url=url)
                            for (x, y, cid, url) in
                              zip(xcoords, ycoords, cids, imgnames)],
               }
   
    pt = PageTemplate()
    template = open("scatter_template.html").read()
    pt.write(template)

    f = open("mw_v_xlogp.html", "w")
    f.write(pt(context=context))
    f.close()
   
    
if __name__ == "__main__":
    OEThrow.SetLevel(OEErrorLevel_Error)
    main()

There are many HTML template languages even when limited to those available for Python. Other popular ones include CherryTemplate, Quixote, and Nevow, and I'll probably get emails from people reminding me of the dozen more that exist. All use different approaches to the same goal: simplifying the making of web pages. The differences are mostly in whose life is made simpler: the programmer, the web page designer, or someone who does both? I've found it best to make a clear distinction between content and presentation and to have a template language which can be handled by normal HTML tools. That's why I prefer Zope's template language. You may have your own requirements which suggest using another system.


Andrew Dalke is an independent consultant focusing on software development for computational chemistry and biology.



Copyright © 2001-2020 Andrew Dalke Scientific AB