
Master Thesis Defense: Qing Ye

April 23, 2019


Speaker: Qing Ye

Supervisor: Dr. G. Butler

Examining Committee: Drs. T. Glatard, A. Krzyzak, T.-H. Chen (Chair)

Title: Classifying Transport Proteins Using Profile Hidden Markov Models and Specificity Determining Sites

Date: Tuesday, April 23, 2019

Time: 10:00 a.m.

Place: EV 1.162

ABSTRACT

This thesis develops methods to predict the substrates that a given transmembrane protein transports across a membrane. Our methods first compute a multiple sequence alignment (MSA), then apply tools that predict specificity determining sites (SDS), and finally build a Hidden Markov Model (HMM) using HMMER. HMMER is a widely used suite of bioinformatics applications for sequence analysis based on HMMs. Specificity determining sites are the key positions in a protein sequence that play a crucial role in the development of functional variation within a protein family over the course of evolution.

We have established a prediction pipeline that integrates data pre-processing, model building and model evaluation. The pipeline comprises similarity search, multiple sequence alignment, specificity determining site prediction and construction of a Hidden Markov Model.

We comprehensively tested and analyzed different combinations of MSA and SDS tools in our pipeline. The best-performing combination was MUSCLE with Xdet, which achieved an overall average Matthews Correlation Coefficient (MCC) of 0.71 across the seven substrate classes of the dataset.
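
For readers unfamiliar with the metric, the following is a minimal sketch of how an average MCC over several classes could be computed. It assumes a one-vs-rest MCC per class followed by an unweighted mean. The abstract does not state the exact averaging scheme, so this is an illustration rather than the thesis's evaluation code. The function names and the class labels "A", "B" and "C" are hypothetical.

```python
import math
from typing import Dict, Hashable, Sequence


def mcc_binary(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0.0 when the denominator vanishes.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def per_class_mcc(y_true: Sequence[Hashable],
                  y_pred: Sequence[Hashable]) -> Dict[Hashable, float]:
    """One-vs-rest MCC for every class seen in the true or predicted labels."""
    scores = {}
    for c in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)
        tn = sum(t != c and p != c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        scores[c] = mcc_binary(tp, tn, fp, fn)
    return scores


def macro_mcc(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of the per-class MCC values (assumed averaging scheme)."""
    scores = per_class_mcc(y_true, y_pred)
    return sum(scores.values()) / len(scores)


if __name__ == "__main__":
    # Hypothetical labels standing in for substrate classes.
    truth = ["A", "A", "B", "B", "C", "C"]
    preds = ["A", "B", "B", "B", "C", "A"]
    print(per_class_mcc(truth, preds))
    print(macro_mcc(truth, preds))
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), and it stays informative when class sizes are unbalanced.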





© Concordia University