CLARITY: A Common Lisp Data Alignment Repository

CLARITY is a tool and method for storing and comparing data with numeric and set-membership components. It is specifically geared toward timecourse microarray data, which is used to study the activity of genes across time. By using data that has been redescribed in Gene Ontology (GO) terms, and summarizing the regulation of these terms across time as characters (which together form sequences), we are able to apply standard DNA alignment techniques to these sequences. Given the prevalence and size of microarray experiments, a way to analyze new experiments while also considering their relation to prior work is a significant contribution. CLARITY aims to help researchers update their prior findings as new data becomes available, and to automate this process. The software contains a novel method of comparison, based on Gantt charts, which allows us to reason about numerical data in the way that we reason about DNA sequences. That is, rather than just comparing the numbers describing one gene across time with those of another, we can consider insertions, deletions, and mutations in these sequences of numbers. This is done efficiently using a global alignment with an affine gap score (a gap-opening cost plus a cost that grows linearly with gap length), a variant of the Needleman-Wunsch algorithm, which is the global counterpart of the Smith-Waterman local alignment algorithm.
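To make the idea concrete, here is a minimal sketch of the two steps described above: turning a numeric timecourse into a character sequence, then scoring a global alignment with an affine gap penalty. It is written in Python for brevity and is not CLARITY's own Common Lisp code. The character alphabet (U, D, N), the threshold, the function names, and all score values are assumptions chosen purely for illustration.

```python
from typing import List

NEG_INF = float("-inf")


def encode(values: List[float], threshold: float = 1.0) -> str:
    """Map each time point to a character (hypothetical alphabet):
    U = up-regulated, D = down-regulated, N = no notable change."""
    chars = []
    for v in values:
        if v >= threshold:
            chars.append("U")
        elif v <= -threshold:
            chars.append("D")
        else:
            chars.append("N")
    return "".join(chars)


def affine_global_score(a: str, b: str,
                        match: float = 1.0, mismatch: float = -1.0,
                        gap_open: float = -2.0,
                        gap_extend: float = -0.5) -> float:
    """Best global alignment score of a and b.
    A gap of length k costs gap_open + (k - 1) * gap_extend."""
    n, m = len(a), len(b)
    # M: a[i-1] aligned with b[j-1]
    # X: a[i-1] aligned with a gap; Y: b[j-1] aligned with a gap
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0.0
    for i in range(1, n + 1):
        X[i][0] = gap_open + (i - 1) * gap_extend
    for j in range(1, m + 1):
        Y[0][j] = gap_open + (j - 1) * gap_extend
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + s
            # Opening a new gap pays gap_open; continuing one pays gap_extend.
            X[i][j] = max(M[i - 1][j] + gap_open,
                          X[i - 1][j] + gap_extend,
                          Y[i - 1][j] + gap_open)
            Y[i][j] = max(M[i][j - 1] + gap_open,
                          Y[i][j - 1] + gap_extend,
                          X[i][j - 1] + gap_open)
    return max(M[n][m], X[n][m], Y[n][m])
```

For example, `encode([2.0, 0.1, -1.5])` gives `"UND"`, and two encoded timecourses can then be compared with `affine_global_score(encode(xs), encode(ys))`. Because extending a gap is cheaper than opening one, a single long insertion or deletion is penalized less than several scattered ones.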

The Gene Ontology (GO) is a controlled vocabulary used to aid the description of genes and gene product attributes in a variety of organisms. It comprises three ontologies, which describe molecular functions, biological processes, and cellular components.


CLARITY provides novel methods for computing and visualizing the relationships among large amounts of data. The graphical interface allows the user to browse the database contents in text form as well as see the data arranged into a phylogenetic tree. There is more information on this interface here.

Screenshot of the phylogeny interface

The CLARITY Dictionary

The CLARITY dictionary contains all the definitions comprising the library.


This project was sponsored by the Google Summer of Code program. The Lisp NYC group provided moral guidance and support. The NYU Courant Bioinformatics Group provided all the rest.

Site Map

None yet.

Questions? Queries? Suggestions? Comments? Please direct them to me.


News in chronological order, most recent on top.

  • 2006-06-09
    Started the site.