/home/writings/diary/archive/2020/09/03/new_chemfp_3x_licensing_model

New chemfp licensing model in chemfp 3.4

Background: chemfp is a Python package for high-performance cheminformatics fingerprint similarity search. There are two development tracks. Chemfp 1.x is the no-cost/open source version, which only supports Python 2.7, and chemfp 3.x is the more advanced and capable version which supports Python 2.7 and Python 3.6+.

I have a new license model for the chemfp 3.x development track, starting with chemfp 3.4. You can now install chemfp 3.4.1 as a pre-compiled package for (many) Linux-based OSes and use most of its functionality, at no cost, with the following:

python -m pip install chemfp -i https://chemfp.com/packages
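To confirm that the install worked, one option is a small check like the following. This is a minimal sketch using only the Python standard library, not anything chemfp-specific. It assumes Python 3.8 or later (for importlib.metadata) and that the distribution is registered under the name "chemfp". The helper name installed_version is made up for this example.

```python
from importlib import metadata, util


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent.

    Hypothetical helper for illustration; it is not part of chemfp.
    """
    # find_spec() returns None when no importable top-level module has this name.
    if util.find_spec(package_name) is None:
        return None
    try:
        # Look up the version recorded by pip in the package metadata.
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata under this name.
        return None


if __name__ == "__main__":
    version = installed_version("chemfp")
    if version is None:
        print("chemfp does not appear to be installed")
    else:
        print("chemfp version:", version)
```

If the check reports that chemfp is missing, the usual cause is that pip installed into a different Python environment than the one running the script.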
There are some restrictions, which are listed in the base license agreement. These restrictions are lifted if you have a valid license key.

Even without a license key, a lot of functionality remains. I expect many people will be interested in clustering fewer than 50,000 fingerprints, or using chemfp's "toolkit" API, or other features which I'll highlight over the next few weeks.

Or, you can install chemfp, see that it works for your project, request an evaluation license key for a more extensive check, and then purchase a binary or source code license.

No-cost licensing is also available for academics.

Why the change?

When I started the project 10 years ago, I only offered source code licensing (under the MIT license). I come out of the free/open source software (FOSS) tradition. Most FOSS projects are funded by academic or industrial R&D. I wanted to see if I could develop a self-funded FOSS project, paid for either through development/consulting contracts or through license fees.

My initial plan was to get people to pay for improvements and access to the latest version; an older version would be available at no cost. That worked, for a while, when there were enough clear improvements that one company could justify paying the full development costs.

But consider the Python 2 to 3 transition. It took about two months to develop. I had to change the chemfp API to be distinct about "strings" and "bytes", and port the C extension to support the new C API - chemfp has a lot of C code for performance reasons. (Okay, cloc reports 28kSLOC of C code, and 22kSLOC of Python code.) And I had to figure out how I was going to handle file I/O in Python 3. And I wanted to support Python 2.7 and 3.5+ at the same time, because not all of my customers or potential customers were going to migrate at the same time.
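To illustrate why the "strings" versus "bytes" distinction matters for fingerprint code, here is a minimal sketch in plain Python 3. It is not chemfp's API: the function name tanimoto and the two-byte example fingerprints are made up for this illustration, and returning 0.0 when no bits are set is a convention chosen for the sketch. In Python 2 a raw fingerprint and its hex-encoded form were both "str"; in Python 3 the raw form is "bytes" and the hex form is "str", and the two no longer mix silently.

```python
def tanimoto(fp1, fp2):
    """Tanimoto similarity between two equal-length byte fingerprints.

    Illustrative only; this is not how chemfp implements it.
    """
    if len(fp1) != len(fp2):
        raise ValueError("fingerprints must have the same length")
    # Iterating over bytes in Python 3 yields integers in the range 0..255.
    both = sum(bin(a & b).count("1") for a, b in zip(fp1, fp2))
    either = sum(bin(a | b).count("1") for a, b in zip(fp1, fp2))
    # Assumed convention for this sketch: no bits set anywhere gives 0.0.
    return both / either if either else 0.0


hex_text = "0f10"              # text (str): the hex-encoded form
fp = bytes.fromhex(hex_text)   # raw bytes: b"\x0f\x10"

# Round-tripping between the two forms is explicit in Python 3.
assert fp.hex() == hex_text

# Text and bytes never compare equal, even when they "look" the same.
assert "0f10" != b"0f10"

score = tanimoto(fp, bytes.fromhex("0f00"))
print(score)
```

An API that accepts both forms has to say which one each function takes and returns, which is the kind of change a Python 3 port forces.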

No single customer is going to pay for those two months of work, nor should they. It's a diffuse need whose cost is best shared across multiple customers.

My "FOSS source code only" model also made the sales process harder. Who will pay for a $5,000+ software package without being able to evaluate it? Few. So, if they evaluate something with a FOSS license, ... they can still use it after the evaluation is over, right? 'Cause that's what FOSS is all about, yes?

My "FOSS source code only" model also made it hard to segment the market. Academics expect no-cost or low-cost software, right? But they are the ones most likely to do exactly what a FOSS license allows, and place it on, say, a public sourcehut repository for anyone to use. Pharmaceutical companies, on the other hand, rarely distribute source code. Which means my economic risk is higher if I distribute to an academic than if I distribute to a pharmaceutical company, while my revenue is lower.

The solution over the years was to move to a more traditional model. I have pre-compiled packages which work on Python 2.7, 3.6, 3.7, and 3.8 for most Linux-based OSes. These require a license key to use their full capabilities. People can buy a time-based license key, or pay for source code access under a proprietary model.

I haven't quite given up on FOSS licensing. I still offer chemfp under the MIT license, though it's also the most expensive option.

For more details about my efforts to fund chemfp in a FOSS context, see my chemfp paper or the relevant Hacker News discussion.


Andrew Dalke is an independent consultant focusing on software development for computational chemistry and biology. Need contract programming, help, or training? Contact me



Copyright © 2001-2020 Andrew Dalke Scientific AB