Dalke Scientific Software: More science. Less time. Products
[ previous | newer ]     /home/writings/diary/archive/2007/10/18/resolver

Resolver

[NOTE: Fuzzyman, one of the Resolver developers, responded to this article.]

Resolver is a spreadsheet/Python program hybrid. The cells of the spreadsheet correspond to data definitions in Python. Python code can define functions used in expressions in cells, change fields in the spreadsheet, and change the presentation of the spreadsheet and its cells.

The Python code contains many sections, some of which are editable and some of which are not. These are: "Import statements", "Worksheet creation", "Pre-constants user code", "Constants and Formatting", "Pre-formulae user code", "Formula code", and "Post-formulae user code". The "user code" sections are editable. Note the British use of "formulae" instead of the American use of "formulas". Hmm, and is there a capitalization inconsistency between "Import statements" and "Constants and Formatting"?

During recalculation the Resolver engine processes the sections in various steps. For example, after the worksheets creation code the "pre-constants user code" is run, which defines functions that can be used by the "constants and formatting" section that populates fields in the speadsheet. Control goes back and forth between user-defined sections and sections defined by the contents of the spreadsheet.

Resolver also does some work during the processing. For example, after the formulas (sorry, "formulae") are defined, the engine computes all of the values, including inter-cell dependencies. The post-formulae code can then operate on the computed values, for example to highlight a field which is out of bounds. Because of this, the entire spreadsheet is recomputed each time (but read this on partial recalculation).

The Resolver screencast is well done and you should watch it to get a better idea of how things work in action. Fuzzyman works for Resolver and has independently set up a very useful site called Resolver Hacks which is more user and developer oriented than the main Resolver corporate page. As far as I can tell there's no link from the corporate page to the hacks page but I found it probably because of Fuzzyman's blog. IRL he's Michael Foord.

The idea of a spreadsheet comes from the financial world, and Resolver continues that tradition. But there's nothing in either which constrains them to the financial world. Many computational chemists use Excel to hold and analyze chemistry data, and there are support tools to help them. Probably the most well-known is Accord for Excel. It adds the ability to view structures in 2D, color and align structures, compute physical properties, and more.

The Resolver people sent emails to people in non-financial communities to see if there was interest. I was one of those people, with my background writing software for chemists and biologists. After watching the screencast I sent them a set of questions, mostly answered now on the hacks page, and especially the Bugs, Features and Futures section. They replied, and I was given access to the beta. Emily and I sat down a couple of days ago to look at it, and here's the notes I took about it, fleshed out a bit.

Giving it a whirl

We installed it on her personal machine, as I have a Mac and I don't know how to get .Net code working on it. The install went without a problem. We started with the simple example spreadsheet then Emily watch the screencast while I tried a few things based on the code therein.

One of the difficulties we had was figuring out what the different blocks of Python code were for. There's nothing which explains the control flow, or that while some Python sections are editable, others are not. We tried changing a Python formula that was defined in a cell but while we could click on it we couldn't change it. That's understandable because it's very hard to keep the 1-to-1 mapping between the cells and the Python code if anyone can edit the code. That's why only some of the blocks are editable.

On the other hand, because that code is machine generated, it should be possible to figure out which cell generated the code so that a click or right-click can take you to that cell for editing. After all, if I can see the code I want to be able to change it.

While experimenting I created two cells, one on top of the other, with the same formula in each. Resolver showed an "X" through the cell. Perhaps that's the expect way to indicate a duplicated formula by experts but neither of us are Excel users. We tried to figure it out using a right-click but found nothing in the context menu or elsewhere helped out.

Want my own editor!

I use emacs. Emily uses Wing IDE and like the tooltip help common to most modern IDEs. The Resolver code editor does have syntax highlighting but doesn't have tooltips, nor even intelligent indentation. And Emily was used to using control-/ to comment out a block of code and control-R to "run" the code, rather than using F9 to recalulate. I expect that last is because Resolver mixes the spreadsheet and the programming expectations together. Emily's comment was "seems to me I'll get frustrated with this quite quickly."

These editing features could be added, but why spend time doing that when so many editors already exist? Instead we created a .py file next to the .rsl (the extension for a Resolver spreadsheet) file and imported it. This works, with caveats. The IDEs don't know about the .Net environment inside Resolver, so can't pull in those tooltips. Loaded modules are cached, so the code in Resolver needs to use the "import with reload" hack.

The PYTHONPATH is wonky. It seems to be set to the directory containing the .rsl file but when we clicked on a .rsl file through a web browser, the newly loaded spreadsheet's cwd was the same as the other spreadsheet. We didn't investigate the full behaviour. Still, there should be same way to set the PYTHONPATH to point to any locally installed Python modules. If there is, we don't know it.

Resolver, outside of the IronPython runtime, doesn't know anything about the Python files. You can't sent the .rsl and assume it works, if it references imported Python code. Instead you need to zip all the files together and exchange it that way. One solution I was thinking of was to make the .rsl file more like a project file, with the ability to add external files to some list. In that way it could watch those files for changes, and support some way to bundle everything together more automatically.

Another "spreadsheet world meetings programming world" question was how to handle version control. This is somewhat mentioned in the bugs, feature and futures page, but only from the spreadsheet developers viewpoint. I think it's mostly a matter of figuring out which of the many possible solutions is a good enough match to what's needed.

Testing

Emily brought up the question of how to test a spreadsheet and any associated Python files which depend on the Resolver runtime. Is there a way to run things "headless"? She was very happy to see the section Testing Spreadsheet Logic, which is based on unittest. She's a TextTest fan, which compares stdout, stderr and execution times to a golden reference. Making Resolver work with that does not appear hard so we didn't spend much time exploring that option.

Accessing web services

There are chemical informatics libraries for C, C++, Python, Java and Ruby but none for .Net that I know of. While it might be possible to write bindings for the C/C++ libraries, I have no experience with them. Instead, we experimented with having Resolver call a web service. The hacks site has a relevant example plotting currency changes where the data is fetched over the web.

You can see from the example that it uses a .Net library for doing the data fetch, rather than using urllib. This is because IronPython does not implement the socket library, though you can get an emulated version from FePy.

It looks like Resolver freezing while Python is doing the network I/O. That is, when Emily clicked on the .rsl file to open it, Resolver didn't seem to respond for almost a minute before finally displaying the spreadsheet. We assume it was loading the new code and doing the network fetch.

Some of the calculations we want to do might take minutes to run. In the web browser code we poll the server every few seconds to update the status (either as a reload or through an AJAX request). How would we do an asyncronous task like that in Resolver? For example, I might spawn off a polling thread. After every poll I get a status value, which I'll use to color a field and probably also write something like "10% done" ... along with a cancel button? When it's finally done I want to display the result in the cell, remove any controls and status coloring, and let any other cells which may depend on that newly added value be computed.

Limits? Console?

How big can a spreadsheet be? I think Excel maxes out at 64K, which causes a problem with some of the data sets chemists use. We didn't experiment to see what the limits are, and I can't find a page stating the Resolver limitations like this, if any.

We really do want an interactive interpreter console, in order to more quickly test things out and interact with the spreadsheet. It's coming.

Plotting (err, "charts") and graphics

The plotting in that example used a 3rd party .Net package called ZedGraph. (See also basics of ZedGraph and Resolver.) The plots it makes are colorful, and a scientist would look at them with mistrust. I'm only somewhat joking. The connection between ZedGraph and Resolver is pretty coarse. The end result of the plot is a bitmap, which Resolver displays. There's no way to adjust the scales, zoom in, pick points, etc. Apparently better plotting is coming, but plotting is a genuinely hard subject.

It is possible to use an external plotting package, like this example using gnuplot. Oh, and the spreadsheet world calls them "charts" instead of plots.

As I mentioned, I would like some way to override how a cell is displayed so I can stick a 2D (or 3D) chemical drawing in the cell, perhaps along with some text label on the bottom. I would also like some way to, say, click on a cell and have some window pop up, perhaps containing a larger 2D depiction or showing molecular properties. Best would be if I could control everything inside of that window, perhaps in its own thread. I don't know if that's possible or in the near-term plans.

Database? Running as a web page?

It's possible to hook Resolver up to a database and populate the spreadsheet that way, but I didn't find any examples of using it. We also didn't experiment with turning the spreadsheet into a web page.

Gotta talk to users

I like the idea of making things more programmable. I've been one of many advocating a Mathematica-style workbook approach, like SAGE does, but I don't like the systems I've seen so far. I do like Resolver. The next step, now that we've explored it a bit, is to talk with the chemists and learn more about what they use spreadsheets for and if/how Resolver might improve things. If that work out, then talk more with the Resolver people to perhaps advocate for certain prioritizations.

One last thing I think the Resolver team should do is to improve their "about" splash screen. It's very hard to read. :)

Read Fuzzyman response.


Andrew Dalke is an independent consultant focusing on software development for computational chemistry and biology. Need contract programming, help, or training? Contact me



Copyright © 2001-2020 Andrew Dalke Scientific AB