Limsoon Wong KRDL

Page 1: Limsoon Wong KRDL

Show & Tell

Limsoon Wong
KRDL

Datamining: Turning Biological Data into Gold

Page 2: Limsoon Wong KRDL

What is Datamining?

Jonathan’s blocks

Jessica’s blocks

Whose block is this?

Jonathan’s rules: Blue or Circle
Jessica’s rules: All the rest
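The two rules on this slide can be written as a tiny classifier. This is only a sketch: the `Block` type and its `color` and `shape` fields are assumed names for illustration, since the slide shows pictures of blocks rather than a data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    # Field names are assumptions; the slide only shows pictures of blocks.
    color: str  # e.g. "blue", "red"
    shape: str  # e.g. "circle", "square"


def owner(block: Block) -> str:
    """Apply the slide's rules: blue or circle is Jonathan's, all the rest are Jessica's."""
    if block.color == "blue" or block.shape == "circle":
        return "Jonathan"
    return "Jessica"


for b in (Block("blue", "square"), Block("red", "circle"), Block("red", "square")):
    print(b, "->", owner(b))
```

Datamining runs in the opposite direction: given only the labelled blocks, the task is to discover rules like these.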

Page 3: Limsoon Wong KRDL

What is Datamining?

Question: Can you explain how?

Page 4: Limsoon Wong KRDL

What are the Benefits?

To the patient: Better drug, better treatment
To the pharma: Save time, save cost, make more $
To the scientist: Better science

Page 5: Limsoon Wong KRDL

The Datamining Process

Page 6: Limsoon Wong KRDL

Epitope Prediction

TRAP-559AA

MNHLGNVKYLVIVFLIFFDLFLVNGRDVQNNIVDEIKYSEEVCNDQVDLYLLMDCSGSIRRHNWVNHAVPLAMKLIQQLNLNDNAIHLYVNVFSNNAKEIIRLHSDASKNKEKALIIIRSLLSTNLPYGRTNLTDALLQVRKHLNDRINRENANQLVVILTDGIPDSIQDSLKESRKLSDRGVKIAVFGIGQGINVAFNRFLVGCHPSDGKCNLYADSAWENVKNVIGPFMKAVCVEVEKTASCGVWDEWSPCSVTCGKGTRSRKREILHEGCTSEIQEQCEEERCPPKWEPLDVPDEPEDDQPRPRGDNSSVQKPEENIIDNNPQEPSPNPEEGKDENPNGFDLDENPENPPNPDIPEQKPNIPEDSEKEVPSDVPKNPEDDREENFDIPKKPENKHDNQNNLPNDKSDRNIPYSPLPPKVLDNERKQSDPQSQDNNGNRHVPNSEDRETRPHGRNNENRSYNRKYNDTPKHPEREEHEKPDNNKKKGESDNKYKIAGGIAGGLALLACAGLAYKFVVPGAATPYAGEPAPFDETLGEEDKDLDEPEQFRLPEENEWN
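A predictor of this kind typically scores every short peptide window along the protein. The sketch below only enumerates the candidate windows; the scoring model is not shown. The window length of 9 residues is an assumption (a common length for HLA class I binders), not something stated on the slide, and `peptides` is a hypothetical helper name.

```python
def peptides(sequence: str, length: int = 9):
    """Yield (1-based start position, peptide) for every window of `length` residues."""
    for start in range(len(sequence) - length + 1):
        yield start + 1, sequence[start:start + length]


# First 18 residues of the sequence shown on the slide.
prefix = "MNHLGNVKYLVIVFLIFF"
candidates = list(peptides(prefix))

for position, peptide in candidates:
    print(position, peptide)
```

Each candidate would then be passed to a scoring model, such as a trained neural network or a binding matrix, and ranked by predicted binding.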

Page 7: Limsoon Wong KRDL

Epitope Prediction Results

Prediction by our ANN model for HLA-A11
29 predictions, 22 epitopes, 76% specificity

Prediction by BIMAS matrix for HLA-A*1101
Rank by BIMAS (axis marks): 1, 66, 100
Number of experimental binders: 19 (52.8%), 5 (13.9%), 12 (33.3%)
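The percentages on this slide are consistent with simple ratios, which the short check below reproduces. It assumes that the 76% figure is 22 epitopes out of 29 predictions, and that the three binder counts are shares of one pool of 36 experimental binders; both readings are inferred from the numbers, not stated on the slide.

```python
# ANN model: 22 confirmed epitopes among 29 predictions.
ann_predictions = 29
ann_epitopes = 22
ann_percent = round(100 * ann_epitopes / ann_predictions)

# BIMAS: experimental binders falling into the three rank ranges on the slide.
binders_per_range = [19, 5, 12]
total_binders = sum(binders_per_range)
shares = [round(100 * n / total_binders, 1) for n in binders_per_range]

print(ann_percent, total_binders, shares)
```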

Page 8: Limsoon Wong KRDL

Gene Expression Analysis

Clustering gene expression profiles
Classifying gene expression profiles

find stable differentially expressed genes

Page 9: Limsoon Wong KRDL

Gene Expression Analysis Results

The Discovery System
• Correlation test
• Voter selection
• Class prediction
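The three steps can be sketched as a weighted-voting classifier. This is an illustration under assumptions, not the Discovery System itself: the slide names only the steps, so the signal-to-noise score and the voting formula below follow a commonly used weighted-voting scheme, and all function names and the toy data are hypothetical.

```python
from statistics import mean, stdev


def signal_to_noise(a, b):
    """Correlation test for one gene: (mean_a - mean_b) / (sd_a + sd_b).

    Raises ZeroDivisionError if the gene is constant in both classes.
    """
    return (mean(a) - mean(b)) / (stdev(a) + stdev(b))


def select_voters(class_a, class_b, n_voters):
    """Voter selection: keep the n_voters genes with the largest |score|."""
    voters = []
    for g in range(len(class_a[0])):
        a = [sample[g] for sample in class_a]
        b = [sample[g] for sample in class_b]
        boundary = (mean(a) + mean(b)) / 2  # midpoint between class means
        voters.append((g, signal_to_noise(a, b), boundary))
    voters.sort(key=lambda v: abs(v[1]), reverse=True)
    return voters[:n_voters]


def predict(voters, sample):
    """Class prediction: each voter adds score * (expression - boundary)."""
    total = sum(score * (sample[g] - boundary) for g, score, boundary in voters)
    return "A" if total > 0 else "B"


# Toy data: rows are samples, columns are genes. Gene 0 separates the classes.
class_a = [[10.0, 5.0, 1.0], [12.0, 5.5, 9.0], [11.0, 4.5, 5.0]]
class_b = [[2.0, 5.2, 4.0], [3.0, 4.8, 6.0], [1.0, 5.0, 5.0]]

voters = select_voters(class_a, class_b, n_voters=1)
print(voters, predict(voters, [9.0, 5.0, 5.0]), predict(voters, [2.5, 5.0, 5.0]))
```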

Page 10: Limsoon Wong KRDL

Protein Interaction Extraction

“What are the protein-protein interaction pathways from the latest reported discoveries?”

Page 11: Limsoon Wong KRDL

Protein Interaction Extraction Results

Rule-based system for processing free text in scientific abstracts

Specialized in
• extracting protein names
• extracting protein-protein interactions
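A toy version of rule-based extraction can be written with one regular expression. Everything here is an assumption made for illustration: the naming pattern, the verb list, and the placeholder protein names `ProtA`, `ProtB` and `ProtC` are hypothetical, and the actual system's rules are not shown on the slide.

```python
import re

# Toy assumption: a protein name is a short token starting with a capital letter.
PROTEIN = r"[A-Z][A-Za-z0-9-]{1,9}"
# Toy assumption: a handful of interaction verbs.
VERBS = r"(?:interacts with|binds(?: to)?|phosphorylates|activates|inhibits)"
PATTERN = re.compile(rf"\b({PROTEIN})\s+({VERBS})\s+({PROTEIN})\b")


def extract_interactions(text):
    """Return (protein, verb, protein) triples matched by the single rule above."""
    return [(m.group(1), m.group(2), m.group(3)) for m in PATTERN.finditer(text)]


abstract = "We show that ProtA binds to ProtB. In addition, ProtC inhibits ProtA in vitro."
interactions = extract_interactions(abstract)
print(interactions)
```

A single pattern like this misses passive voice, coordination and negation, which is why practical systems combine many rules with a dedicated protein-name recognizer.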

Page 12: Limsoon Wong KRDL

Transcription Start Prediction

Page 13: Limsoon Wong KRDL

Transcription Start Prediction Results

Page 14: Limsoon Wong KRDL

Medical Record Analysis

Looking for patterns that are
• valid
• novel
• useful
• understandable

age  sex  chol  ecg   heart  sick
49   M    266   Hyp   171    N
64   M    211   Norm  144    N
58   F    283   Hyp   162    N
58   M    284   Hyp   160    Y
58   M    224   Abn   173    Y

Page 15: Limsoon Wong KRDL

Medical Record Analysis Results

DeEPs, a novel “emerging pattern” method

Beats C4.5, CBA, LB, NB, TAN in 21 out of 32 UCI benchmarks

Works for gene expressions
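An emerging pattern is, roughly, a combination of attribute values whose support rises sharply from one class to another. The sketch below computes support and growth rate on the five example records from the Medical Record Analysis slide. It illustrates the definition only and is not the DeEPs algorithm; the function names are hypothetical, and five records are far too few to support any clinical conclusion.

```python
records = [
    {"age": 49, "sex": "M", "chol": 266, "ecg": "Hyp", "heart": 171, "sick": "N"},
    {"age": 64, "sex": "M", "chol": 211, "ecg": "Norm", "heart": 144, "sick": "N"},
    {"age": 58, "sex": "F", "chol": 283, "ecg": "Hyp", "heart": 162, "sick": "N"},
    {"age": 58, "sex": "M", "chol": 284, "ecg": "Hyp", "heart": 160, "sick": "Y"},
    {"age": 58, "sex": "M", "chol": 224, "ecg": "Abn", "heart": 173, "sick": "Y"},
]


def support(pattern, rows):
    """Fraction of rows that contain every attribute-value pair in the pattern."""
    if not rows:
        return 0.0
    hits = sum(all(row[k] == v for k, v in pattern.items()) for row in rows)
    return hits / len(rows)


def growth_rate(pattern, background, target):
    """Support in the target class divided by support in the background class."""
    s_bg = support(pattern, background)
    s_tg = support(pattern, target)
    if s_bg == 0:
        # Absent from the background: infinite growth if present in the target.
        return float("inf") if s_tg > 0 else 0.0
    return s_tg / s_bg


sick = [r for r in records if r["sick"] == "Y"]
not_sick = [r for r in records if r["sick"] == "N"]

print(growth_rate({"age": 58, "sex": "M"}, not_sick, sick))
print(growth_rate({"ecg": "Hyp"}, not_sick, sick))
```

In this toy data the pattern {age=58, sex=M} occurs in both sick records and in no others, so its growth rate is infinite, while {ecg=Hyp} is more frequent among the records that are not sick.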

Page 16: Limsoon Wong KRDL

Under the Hood

• Artificial neural network
• Neighbourhood analysis
• Non-linear analysis
• Template matching
• Emerging pattern
• Hidden Markov models
• Bayesian inference
• Decision tree induction
...

Page 17: Limsoon Wong KRDL

Behind the Scene

Epitope Prediction: Vladimir Brusic, Judice Koh, Seah Seng Hong, Zhang Guanglan, Yu Kun

Transcription Start Prediction: Vladimir Bajic, Seah Seng Hong

Gene Expression Analysis: Zhang Louxin, Zhang Zhuo, Zhu Song

Medical Records: Li Jinyan

Protein Interaction Extraction: Ng See Kiong, Zhang Zhuo