CCT Colloquium Series
Protein Data Mining and Feature Fusion for Translational Bioinformatics
Sumeet Dua, Associate Professor
Louisiana Tech University
Johnston Hall 338
December 12, 2008 - 11:30 am
One of the daunting challenges facing biology, and consequently multidisciplinary research in computer science, is to assign biochemical and cellular functions to the thousands of hitherto uncharacterized gene products discovered by several international gene-sequencing projects. These research endeavors are producing high-dimensional, heterogeneously distributed data at an unprecedented rate, much more rapidly than computational techniques capable of novel knowledge analysis and discovery are being developed. Data mining offers the promise of precise, objective, and accurate in-silico analysis of this emerging data using knowledge discovery routines that reveal embedded patterns, trends, and anomalies in order to create models for faster and more accurate physiological discovery. Protein structure classification and comparison have become a central area in the field of bioinformatics. Proteins serve as one of the major structural elements of living systems, and their interactions determine most of the molecular and cellular operations within these systems. The quantity, complexity, and availability of protein structure databases have been increasing at a nearly exponential rate, creating demand for automatic and expeditious techniques for protein structure comparison, classification, modeling, and functional prediction. Determining a protein's structure and function from its amino acid sequence is an equally exciting challenge. Successful methods for categorizing protein structural classes from sequence information combine multiple physicochemical properties with machine learning algorithms.
In this presentation, I will describe novel data mining algorithms that we have developed for three-dimensional (3D) structure-based classification of proteins. These algorithms use a coherent feature space derived from the multidimensional physicochemical property scales of proteins to accurately determine hydrophobic cores and to classify structures in a supervised fashion. Our results demonstrate that discriminatory residue interaction patterns shared among proteins of the same family can be employed for both the structural and the functional annotation of proteins. Extensive experimentation shows that the novel feature sets yield higher specificity and sensitivity in protein structural classification than previously reported results in the area. The presentation will conclude with some directions for future investigation and improvement.
Speaker's Bio:
Dr. Sumeet Dua is currently Upchurch Endowed Associate Professor, Graduate Coordinator of Computer Science, and Coordinator of IT Research at Louisiana Tech University, Ruston, LA. He is also an adjunct faculty member in the School of Medicine, Louisiana State University Health Sciences Center, New Orleans. The National Science Foundation (NSF), National Institutes of Health (NIH), Air Force Research Laboratory (AFRL), and Louisiana Board of Regents (LA-BoR) have funded his research with over $2.5 million in the past six years. He regularly serves as a member of a Special Emphasis Study Section at the NIH and has served as a panelist for the NSF. He has received numerous awards, including the Engineering and Science Foundation Award for Faculty Excellence in 2006 and the Faculty Research Recognition Award in 2007, and was granted early tenure and promotion in August 2007. He has advised over 22 graduate students who have found positions in leading industry and academic institutions, including IBM, Oracle, and Philips Research. His areas of expertise include database mining, pattern recognition, data warehousing, clinical informatics, and bioinformatics, and he has several publications in these areas. He is frequently invited to give talks at leading academic institutions, conferences, and companies, and he serves on the Louisiana Board of Regents Speaking of Science speakers' bureau. Dr. Dua is a Senior Member of the IEEE Computer Society, a Senior Member of the Association for Computing Machinery (ACM), and a member of SPIE, the International Society for Computational Biology, and the American Association for the Advancement of Science.
Refreshments will be served.
This lecture has a reception.