Main Projects
The I571 main project is designed to let you explore an area of chemical informatics in some depth. I have listed projects in a variety of styles below. Try to pick a topic that you’ll enjoy working on and that will expand your knowledge of a subject that interests you. If you have a project idea of your own, send me an email with a brief overview, and I’ll likely be happy for you to do that as your project. Projects may be added to this list during the semester. I expect all project reports and related materials to be submitted to me by email by Friday, December 1st.
Simple Combinatorial Chemistry Enumerator tool
Write a program that takes a list of SMILES strings of chemical structures, breaks any amide C-N bonds, and reassembles the parts into a larger list of SMILES strings representing all of the structures that could be made by joining together all of the possible fragments from the original list (i.e. by allowing any broken “C” to bond to any broken “N” from the list of structures). You will probably want to use one of the 2D structure handling toolkits, such as OpenEye.
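To make the break-and-recombine idea concrete, here is a toy sketch in plain Python. It is not a substitute for a structure toolkit: it assumes each input is an unbranched SMILES string in which the amide is written literally as C(=O)N, with no ring closures spanning the cut, and it splits and rejoins the strings as text. A plain text match of this kind would also catch ureas and carbamates, which a real substructure search could exclude. The function names split_amide and enumerate_amides are made up for illustration.

```python
from itertools import product

# Toy assumption: the amide appears literally as "C(=O)N" in the SMILES text.
AMIDE_TEXT = "C(=O)N"
ACYL_END = "C(=O)"  # the acyl fragment keeps the carbonyl, the amine keeps the N


def split_amide(smiles):
    """Split a SMILES string at the first literal amide C-N bond.

    Returns (acyl_part, amine_part), or None if no amide text is found.
    """
    idx = smiles.find(AMIDE_TEXT)
    if idx == -1:
        return None
    cut = idx + len(ACYL_END)
    return smiles[:cut], smiles[cut:]


def enumerate_amides(smiles_list):
    """Recombine every broken "C" fragment with every broken "N" fragment."""
    acyls, amines = [], []
    for smi in smiles_list:
        parts = split_amide(smi)
        if parts is None:
            continue  # no amide bond to break in this structure
        acyl, amine = parts
        if acyl not in acyls:
            acyls.append(acyl)
        if amine not in amines:
            amines.append(amine)
    # Joining the text re-forms the C-N bond under the assumptions above.
    return [acyl + amine for acyl, amine in product(acyls, amines)]


if __name__ == "__main__":
    for smi in enumerate_amides(["CC(=O)NC", "CCC(=O)NCC"]):
        print(smi)
```

With two input amides, the sketch yields the two originals plus the two cross-combinations. In a real solution, the toolkit would find the amide bonds by substructure matching, handle branches and rings, and write canonical SMILES so that duplicate products can be removed.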
Semantic Web for Chemical Informatics
For this project, you should search the web and the journal literature for information about the Semantic Web and discuss some ways in which chemical informatics could benefit from Semantic Web technologies. You should make reference to XML, CML, web services, WSDL, SOAP, ontologies, RDF, OWL, agents and any other technologies you think relevant. Your report should be approximately 5-8 pages of 12-point text.
Discussion of the relation of chemical informatics to bioinformatics, genomics, proteomics and Systems Biology
Chemical informatics is becoming increasingly related to other fields, particularly bioinformatics, genomics, proteomics, and the emerging umbrella field of “Systems Biology”. Discuss the ways in which interaction is currently developing between these fields (for example, the inter-relation of information generated by each of the disciplines), speculate on where this might go in the future, and describe some of the software available that allows integration of different kinds of data. You may want to look at resources such as COPASI (www.copasi.org) and Daylight CABINET (see http://www.daylight.com/meetings/mug05/Dixon/index.html). Your essay should be approximately 5-8 pages of 12-point text.
Data Download and Re-use Policies of Commercial Chemistry Database Producers
There is considerable variation among commercial database producers in terms of what data can be downloaded, re-packaged, and stored in local systems.
- Compare and contrast the policies of the following producers for their chemistry databases available at Indiana University, with respect to: (a) the number of references or records that can be downloaded and retained, (b) the time period during which they may be kept, (c) restrictions on sharing or re-use of the data, (d) any other significant differences you encounter in their policies.
  - Chemical Abstracts Service (SciFinder Scholar)
  - Elsevier MDL (Beilstein, Gmelin)
  - Cambridge Crystallographic Data Centre (Cambridge Structural Database)
  - Thomson Scientific (Web of Knowledge/Science Citation Index)
- Describe the techniques for downloading the data from the various databases (canned formats available, whether selected fields can be obtained, etc.).
- Create an integrated database with data from at least three different sources. (You might want to use EndNote or other citation manager software that is available free to IU students, or import the data into Excel or another appropriate software package.)
Evaluating the KNIME workflow engine
We have a number of cheminformatics web service workflows implemented in Taverna. Re-implement these workflows (and services) in KNIME, and compare the two workflow engines. You might even like to implement some new workflows.
Integration of Software Packages to Extract Chemical Information
Build an application that integrates two or more chemoinformatics software packages or algorithms in an innovative way. You might wish to use a web service / workflow infrastructure for this.
