See Assignment #1 for the instructions on how to submit this assignment. The short version is to send me a tar or zip archive of a directory named "assignment3" with your name and answers in the README file. You will also include an image file in your submission.
ASSIGNMENT #3
Question 1
Warfarin is a blood thinner which is also a rat poison.
- How many PubChem records match warfarin?
- What is the molecular weight of warfarin?
- What is the canonical SMILES for warfarin?
- What's the cool scientific way of saying "rat poison"?
Bring up CID: 942. It's a highly toxic alkaloid.
- What's the common name for it?
- What's the IUPAC name?
CID: 445354 is part of the rhodopsin in your eye. It's similar to a vitamin compound.
- What's the name of the vitamin?
And one last one. Again, use PubChem for this; don't search the PDB directly.
- Which PDB structure ("PDB" = "Protein Data Bank") contains a gramicidin structure? The PDB code is 4 letters long.
Question 2
Here's a set of depictions. Create the SMILES string for each one. You might use the OpenEye depict demo or the Daylight depict demo to test and compare your SMILES strings.
What names does the OpenEye namer generate for each structure?
If you would like additional exercises then try the Daylight SMILES practice page. These are optional and there is no need to send me any answers.
Question 3
These questions all use SMARTS. To test your answers, use the OpenEye depict page. I found that using the "COB" (for "color on black") option instead of "BOW" (for "black on white") gives a more readable picture. You may want to toggle the "cp2txt" option, which copies the input text into the results window.
Given the SMILES string c1ncccc1OC:
- How many times does the SMARTS pattern "C" match?
- How many times does the pattern "c" match?
- How many times does "[#6]" match?
- In English describe what the "[#6]" matches.
- What SMARTS pattern matches the aromatic nitrogen?
- How many times does "Occ" match?
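If you want a rough offline cross-check of the counts for the "C" and "c" patterns, the following is a minimal sketch in Python. It is not a SMARTS matcher: it only tokenizes bracket-free SMILES and counts uppercase (aliphatic) and lowercase (aromatic) carbon symbols. The function name count_carbons is made up for this illustration, and the example uses toluene rather than the string from this question. Use the depict page for the real answers.

```python
import re

# Atom tokens for bracket-free SMILES: two-letter halogens first, then the
# one-letter organic-subset atoms (uppercase = aliphatic, lowercase = aromatic).
ATOM = re.compile(r"Cl|Br|[BCNOPSFI]|[bcnops]")


def count_carbons(smiles):
    """Return (aliphatic, aromatic) carbon counts for a simple SMILES.

    Hypothetical helper for illustration only; bracket atoms such as
    [nH] or [13C] are outside the scope of this sketch.
    """
    if "[" in smiles:
        raise ValueError("bracket atoms are not handled by this sketch")
    atoms = ATOM.findall(smiles)  # ring digits, bonds and branches are skipped
    aliphatic = atoms.count("C")  # atoms written as uppercase C
    aromatic = atoms.count("c")   # atoms written as lowercase c
    return aliphatic, aromatic


print(count_carbons("Cc1ccccc1"))  # toluene: prints (1, 6)
```

Listing "Cl" before the single-letter atoms keeps the C in chlorine from being counted as a carbon.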
Question 4
Here's a SMILES string CC(C)C=CCCCCC(=O)NCC1=CC(=C(C=C1)O)OC for a compound I like.
- Depict it using one of the available on-line programs. Include the image as a file in your submission. In your README file, tell me where I can find it. The image must be between 150 and 350 pixels on a side.
- Use PubChem to do a structure search for this compound. What are the CID and the common name for this structure?
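One way to confirm the image meets the 150 to 350 pixel requirement before you submit is to read its dimensions from the file header. The sketch below assumes the image is a PNG (other formats store their size differently), and the names png_size, size_ok and the file path in the usage comment are hypothetical.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data):
    """Return (width, height) from the first 24 bytes of a PNG file.

    A PNG starts with an 8-byte signature, followed by the IHDR chunk:
    4-byte length, the tag b"IHDR", then width and height as big-endian
    32-bit unsigned integers.
    """
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    return struct.unpack(">II", data[16:24])


def size_ok(width, height, low=150, high=350):
    """True if both sides fall within the assignment's pixel limits."""
    return low <= width <= high and low <= height <= high


# Usage, with a hypothetical file name:
# with open("assignment3/structure.png", "rb") as f:
#     print(size_ok(*png_size(f.read(24))))
```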
Copyright © 2001-2008 Dalke Scientific Software, LLC.






