Dalke Scientific Software: More science. Less time.
/home/writings/diary/archive/2020/09/15/chemfp_34_changes

Changes in chemfp 3.4

In a previous essay I talked about the new licensing model in the recent chemfp 3.4 release. In short, no-cost academic licensing is now available, and a pre-compiled version of the package, with some restrictions on use, is available at no cost for Linux-based OSes.

The 3.4 release had the unofficial title "back in action". I took time off from development to (among other things) write a paper about the chemfp project and take parental leave for our second kid.

Improved chemistry toolkit support

The world doesn't stop for me. Open Babel 3.0 was released, and all three toolkits (including RDKit and OEChem/OEGraphSim) added new structure formats and new fingerprint types since the chemfp 3.3 release. Here's a few highlights:

Performance improvements and ZStandard support

I added a number of performance improvements:

In addition, the sdf2fps program, the "text" toolkit, and chemfp's interface to RDKit's structure formats all support ZStandard input and output.

Other tool improvements

There are a number of small tool improvements, like adding a --help-formats command-line option to give more detailed information about the supported format types and options for each of the toolkits. (Previously much of this information was available from --help but that led to information overload.)

One nice change is that simsearch now accepts a structure query, given either on the command line or in a structure file, rather than requiring an FPS file. Simsearch reads the target file to get the fingerprint type, then uses that to parse the query structures correctly. For example:

% simsearch --query 'CN1C=NC2=C1C(=O)N(C(=O)N2C)C' chembl_24_1.fps.gz -k 4
#Simsearch/1
#num_bits=2048
#type=Tanimoto k=4 threshold=0.0
#software=chemfp/3.4
#targets=chembl_24_1.fps.gz
#target_source=chembl_24.fps.gz
4	Query1	CHEMBL113	1.00000	CHEMBL1232048	0.70968	CHEMBL446784	0.67742	CHEMBL1738791	0.66667 
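If you want to post-process that output in a script, here is a minimal sketch of a parser. It is not part of chemfp. It assumes the layout shown above: header lines start with "#", and each result line is tab-separated, with the hit count, then the query id, then alternating target id and score fields. The function names (parse_simsearch_line, parse_simsearch) are my own and purely illustrative.

```python
# A hypothetical helper, not a chemfp API: parse simsearch text output.
# Assumes tab-separated result lines of the form
#   count <TAB> query_id <TAB> target_id <TAB> score <TAB> ...

SAMPLE = (
    "#Simsearch/1\n"
    "#num_bits=2048\n"
    "#type=Tanimoto k=4 threshold=0.0\n"
    "#software=chemfp/3.4\n"
    "4\tQuery1\tCHEMBL113\t1.00000\tCHEMBL1232048\t0.70968"
    "\tCHEMBL446784\t0.67742\tCHEMBL1738791\t0.66667 \n"
)


def parse_simsearch_line(line):
    """Return (query_id, [(target_id, score), ...]) for one result line."""
    fields = line.strip().split("\t")
    count = int(fields[0])
    query_id = fields[1]
    pairs = fields[2:]
    if len(pairs) != 2 * count:
        raise ValueError("expected %d hits, found %d fields" % (count, len(pairs)))
    # Walk the remaining fields two at a time: id, then score.
    hits = [(pairs[i], float(pairs[i + 1])) for i in range(0, len(pairs), 2)]
    return query_id, hits


def parse_simsearch(lines):
    """Split the input into a header dict and a list of parsed result lines."""
    header = {}
    results = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            key, sep, value = line[1:].partition("=")
            if sep:
                header[key] = value
            else:
                # The first line ("#Simsearch/1") has no key=value form.
                header["format"] = key
            continue
        results.append(parse_simsearch_line(line))
    return header, results


header, results = parse_simsearch(SAMPLE.splitlines())
for query_id, hits in results:
    for target_id, score in hits:
        print(query_id, target_id, score)
```

The check on the hit count is there to catch a line that was truncated or split on the wrong delimiter, which is the usual way this kind of ad hoc parsing goes wrong.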

CHANGELOG

For the full list of changes see the What's New section of the documentation.


Andrew Dalke is an independent consultant focusing on software development for computational chemistry and biology. Need contract programming, help, or training? Contact me



Copyright © 2001-2020 Andrew Dalke Scientific AB