Programming for Chemical and Life Science Informatics - I573
Spring 2009
Introduction

This class (formerly I590) is a 3CH graduate course aimed at giving students a broad knowledge of the
programming, algorithmic and software techniques used in the chemical and life science informatics
disciplines, though the main focus will be on cheminformatics. In the area of programming we'll
look at tools that help write programs and toolkits that allow us to focus on domain-specific
problems. We'll also look at some theoretical topics, such as graph theory and machine learning,
which are widely used in cheminformatics and bioinformatics problems. Finally, we'll cover a
variety of technologies that play an important role in today's cheminformatics and
bioinformatics projects, such as workflows, wikis and blogs, and ontologies.
You should be comfortable in at least one programming language. The class will mainly focus on
Python and Java, though the examples should be easily translatable to other languages. There is no
language restriction for assignments, so you can use whatever you're comfortable with.
The class is located in Bloomington, but is also offered as a Distance Education course to any
graduate student in the US through teleconference and web conferencing services. A few of the
lectures will be given by guest lecturers from industry and academia.
The address of the mailing list for the class is SP09-BL-INFO-I573-13748@oncourse.iu.edu. Mail
sent to this address will be received by all members of the class.
Use the links below to jump to the details:
- Place & Time
- Office Hours
- Distance Students
- Books & References
- Course Evaluation
- Course Outline
- Class Schedule
- Academic Policy
Place & Time
The class will be held on Tuesdays and Thursdays from 9:30am - 10:45am in I 105.
Office Hours
By appointment
Distance Students
- From your phone, dial 800-940-6112, and enter the passcode
- Go to http://breeze.iu.edu/i573 and log in as a guest
- If you have any problems accessing the class, try sending a chat message in Breeze. If
that doesn't work, interrupt the teleconference, or call Rajarshi's cellphone (the number will be
given out by email)
Books & References
Blogs
Scientific programming
Python
Java
R
- Introductory tutorial
- Brief overview of R and some basic statistics
- An excellent tutorial that covers some advanced topics
- R Graph Gallery is the best place to go if you want to see how to draw a certain type of graph
- ESS - R support for Emacs
- Extending R
SQL
- SQL Tutorials (focusing on the language itself)
- Postgres specific information
- Chemistry cartridges
  - gNova is specific to PostgreSQL and is based on OEChem (commercial)
  - Torus also allows Markush searching within an Oracle DB and is based on the BCI toolkit
    (commercial)
  - DayCart is a cartridge for Oracle based on the Daylight toolkit (commercial)
  - Tigress is an open source cartridge for PostgreSQL (based on OpenBabel)
  - MyChem is a cartridge for MySQL and is based on OpenBabel
- Programmatic Database Access
  - Java uses JDBC, and specific databases require their own JDBC drivers (jar files). PostgreSQL
    provides its own JDBC driver.
  - Python uses the Database API (DB-API) to support DB agnostic access. The preferred Python
    package for PostgreSQL access is psycopg2. A short tutorial is available.
  - You can access a PostgreSQL DB via C using libpq
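As a rough illustration of the DB-API pattern mentioned above (connect, get a cursor, execute a parameterized query, fetch rows), here is a minimal sketch. It uses the standard-library sqlite3 module with an in-memory database as a stand-in so that it runs without a server; this is an assumption made for the example, not the course's setup. The `compounds` table, its columns, and the function names are hypothetical. With psycopg2 the overall shape should be the same, but the connection would come from `psycopg2.connect(...)` and the placeholder would be `%s` rather than `?`.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database holding a small, hypothetical compounds table."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE compounds (name TEXT, smiles TEXT, mol_weight REAL)")
    cur.executemany(
        "INSERT INTO compounds VALUES (?, ?, ?)",
        [
            ("benzene", "c1ccccc1", 78.11),
            ("ethanol", "CCO", 46.07),
            ("aspirin", "CC(=O)Oc1ccccc1C(=O)O", 180.16),
        ],
    )
    conn.commit()
    cur.close()
    return conn


def find_heavy_compounds(conn, min_weight):
    """Return (name, smiles) rows with molecular weight above min_weight, sorted by name."""
    cur = conn.cursor()
    # Pass values as parameters instead of formatting them into the SQL string.
    # sqlite3 uses '?' placeholders; psycopg2 uses '%s'.
    cur.execute(
        "SELECT name, smiles FROM compounds WHERE mol_weight > ? ORDER BY name",
        (min_weight,),
    )
    rows = cur.fetchall()
    cur.close()
    return rows


if __name__ == "__main__":
    connection = build_demo_db()
    for name, smiles in find_heavy_compounds(connection, 70.0):
        print(name, smiles)
    connection.close()
```

Because the query function only relies on the cursor methods that DB-API defines, most of it should carry over to another DB-API driver; the placeholder style and the connect call are the parts that typically differ.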
Toolkits
Academic Policy
The principles of academic honesty and professional ethics will be vigorously enforced in this
course, following the IU Code of Student Rights, Responsibilities, and Conduct, and the School of
Informatics Academic Regulations. This includes the usual standards on acknowledgment of help,
contributions and joint work, even when you are encouraged to build on libraries and other software
written by other people. Cases of academic misconduct (including cheating, fabrication, plagiarism,
interference, or facilitating academic dishonesty) will be reported to the IUB Office of Student
Ethics, a branch of the Office of the Dean of Students. Your submission of work to be graded in this
class implies acknowledgement of this policy. If you need clarification or have any questions,
please see the instructor during office hours.
Evaluation
- 10% - Participation
- 30% - Homework assignments
- 50% - Project
- 10% - Presentation
Course Outline
- Scientific computing
- Software engineering practice
  - Methodologies
  - Debugging
  - Version control
- Domain specific toolkits
  - Chemistry
  - Biology
  - Statistics
- Databases for chemistry
  - Issues and challenges
  - Cartridges for chemistry
- Mathematical and computational tools
  - Machine learning
  - Optimization methods
  - Graph theory
- Programming for the web
  - Web services
  - Semantic web languages
- Distributed computing
- Scientific visualization
Course Schedule