Notes 20100511 CS 798 Course Notes Phylogeny

From SnOwy - Ed's Wiki Notebook

Lecturer: Dr. Dan G. Brown

Using Dissimilarity Matrices

A dissimilarity matrix should look like ...

          7
         / \
        5   4
       / \ / \
      A  B C  D

(We really wanted to label the leaves 1 up to n, but that was confusing because the weights are integers too, so the leaves are labelled A to D instead.)

  A B C D
A 0 5 7 7
B   0 7 7
C     0 4
D       0
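
The matrix above can be read off the tree: the entry for two leaves is the label of their lowest common ancestor. A minimal sketch of that reading, assuming the tree is stored as nested tuples (the names `tree`, `leaves` and `dissimilarity` are illustrative and not from the lecture):

```python
# The tree from the notes as nested tuples: (label, left, right).
# Leaves are plain strings. This representation is an assumption.
tree = (7, (5, "A", "B"), (4, "C", "D"))

def leaves(node):
    """Return the leaf names under a node, left to right."""
    if isinstance(node, str):
        return [node]
    _, left, right = node
    return leaves(left) + leaves(right)

def dissimilarity(node, D=None):
    """Fill D[(x, y)] with the label of the lowest common ancestor of x and y."""
    if D is None:
        D = {}
    if isinstance(node, str):
        D[(node, node)] = 0
        return D
    label, left, right = node
    dissimilarity(left, D)
    dissimilarity(right, D)
    # Every pair split by this node has this node as its lowest common ancestor.
    for x in leaves(left):
        for y in leaves(right):
            D[(x, y)] = D[(y, x)] = label
    return D

D = dissimilarity(tree)
print(D[("A", "B")], D[("A", "C")], D[("C", "D")])  # 5 7 4
```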

Let's consider a symmetric matrix...

What does this have to do with phylogeny?

Theorem -- ultrametric matrix
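
The notes record only the title of the theorem. A common way to test whether a matrix is ultrametric is the three-point condition: for every three elements, the two largest of the three pairwise distances are equal. Whether this is the exact form stated in lecture is an assumption. A small sketch of the check on the A to D matrix above (`dist` and `is_ultrametric` are illustrative names):

```python
from itertools import combinations

names = ["A", "B", "C", "D"]
# Upper triangle of the matrix from the notes.
upper = {("A", "B"): 5, ("A", "C"): 7, ("A", "D"): 7,
         ("B", "C"): 7, ("B", "D"): 7, ("C", "D"): 4}

def dist(x, y):
    """Symmetric lookup into the upper triangle; the diagonal is 0."""
    if x == y:
        return 0
    return upper[(x, y)] if (x, y) in upper else upper[(y, x)]

def is_ultrametric(names, dist):
    """Three-point condition: in every triple the two largest distances are equal."""
    for i, j, k in combinations(names, 3):
        _, middle, largest = sorted([dist(i, j), dist(i, k), dist(j, k)])
        if middle != largest:
            return False
    return True

print(is_ultrametric(names, dist))  # True
```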


1 2 3 4 5

(1,4) starting from 1: find the shortest path (it leads to 4), join; remove 1.

(1,4) 2 3 4 5

(4,3) continuing from 4: find the shortest path (it leads to 3), join; remove 4.

((1,4),3) 2 5

(((1,4),3),5) 2

((((1,4),3),5),2)
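
The walk above (start somewhere, step to the closest element not yet used, repeat) can be sketched as follows. The notes do not give the matrix for elements 1 to 5, so this sketch runs on the A to D matrix from earlier instead. Breaking ties by list order is an assumption, and `nearest_neighbour_order` is an illustrative name.

```python
names = ["A", "B", "C", "D"]
# Upper triangle of the A to D matrix from the notes.
upper = {("A", "B"): 5, ("A", "C"): 7, ("A", "D"): 7,
         ("B", "C"): 7, ("B", "D"): 7, ("C", "D"): 4}

def dist(x, y):
    """Symmetric lookup into the upper triangle; the diagonal is 0."""
    if x == y:
        return 0
    return upper[(x, y)] if (x, y) in upper else upper[(y, x)]

def nearest_neighbour_order(names, dist):
    """Order the elements by repeatedly stepping to the closest unused one."""
    remaining = list(names)
    current = remaining.pop(0)          # start from the first element
    order = [current]
    while remaining:
        # min() keeps the first minimum, so ties go to the earlier element.
        nxt = min(remaining, key=lambda x: dist(current, x))
        remaining.remove(nxt)
        order.append(nxt)
        current = nxt                   # continue the walk from the new element
    return order

order = nearest_neighbour_order(names, dist)
print(order)  # ['A', 'B', 'C', 'D']
```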

                  highest weight edge in L
                  |
*---*---*---*---*---*---*---*---*
        i    i' p   q     j
{---------------}   {-----------}
        |                 |
        Lp                Lq
            D[p,q]
           / \
          /   \
         /     \
   lp in Lp    lq in Lq
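
The picture above suggests a recursion: cut the ordered list L at its highest weight edge (p, q), make D[p,q] the root, and build the two subtrees from Lp and Lq in the same way. A minimal sketch under the assumptions that the matrix really is ultrametric and that L was produced by the shortest-path walk described earlier (`build_tree` is an illustrative name, and the ordering is passed in by hand here):

```python
# Upper triangle of the A to D matrix from the notes.
upper = {("A", "B"): 5, ("A", "C"): 7, ("A", "D"): 7,
         ("B", "C"): 7, ("B", "D"): 7, ("C", "D"): 4}

def dist(x, y):
    """Symmetric lookup into the upper triangle; the diagonal is 0."""
    if x == y:
        return 0
    return upper[(x, y)] if (x, y) in upper else upper[(y, x)]

def build_tree(order, dist):
    """Split the list at its heaviest consecutive edge and recurse on both halves."""
    if len(order) == 1:
        return order[0]                      # a single leaf
    # Weight of each edge between neighbours in the list.
    weights = [dist(order[i], order[i + 1]) for i in range(len(order) - 1)]
    # Index of the heaviest edge (p, q); the first one wins on ties.
    cut = max(range(len(weights)), key=weights.__getitem__)
    left = build_tree(order[:cut + 1], dist)   # Lp: everything up to p
    right = build_tree(order[cut + 1:], dist)  # Lq: everything from q on
    return (weights[cut], left, right)         # root labelled D[p,q]

print(build_tree(["A", "B", "C", "D"], dist))
# (7, (5, 'A', 'B'), (4, 'C', 'D'))
```

On this input the result is the tree drawn at the top of these notes.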

Stipulations on the above algorithm

Dirty Data: only almost ultrametric

Utilizing the Shortest Pair

Phylogeny is not Clustering

  1. Internal nodes are actually varieties of life
  2. Models are designed by population geneticists

UPGMA or Average Linkage Clustering

               /\              /\
              /  \            /  \
             /   /\          /   /\
            /\  /\ \        /   /\ \
           i j             k
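
For reference, a minimal sketch of UPGMA as it is usually described: repeatedly merge the two closest clusters i and j, then set the distance from the merged cluster to every other cluster k to the size-weighted average of d(i,k) and d(j,k). The notes do not spell out these details, so the update rule and all names below are assumptions. Internal nodes are labelled with the merge distance, to match the tree at the top of these notes; many presentations use half that value as the node height.

```python
from itertools import combinations

names = ["A", "B", "C", "D"]
# Upper triangle of the A to D matrix from the notes.
upper = {("A", "B"): 5, ("A", "C"): 7, ("A", "D"): 7,
         ("B", "C"): 7, ("B", "D"): 7, ("C", "D"): 4}

def dist(x, y):
    """Symmetric lookup into the upper triangle; the diagonal is 0."""
    if x == y:
        return 0
    return upper[(x, y)] if (x, y) in upper else upper[(y, x)]

def upgma(names, dist):
    """Average-linkage clustering; returns nested tuples (label, left, right)."""
    # Each active cluster id maps to (subtree, number of leaves).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    # d[{i, j}] is the current average distance between clusters i and j.
    d = {frozenset((i, j)): float(dist(names[i], names[j]))
         for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)
    while len(clusters) > 1:
        pair = min(d, key=d.get)             # closest pair of active clusters
        i, j = sorted(pair)
        (ti, ni), (tj, nj) = clusters.pop(i), clusters.pop(j)
        label = d.pop(pair)
        # Size-weighted average distance from the merged cluster to each other k.
        for k in clusters:
            dik = d.pop(frozenset((i, k)))
            djk = d.pop(frozenset((j, k)))
            d[frozenset((next_id, k))] = (ni * dik + nj * djk) / (ni + nj)
        clusters[next_id] = ((label, ti, tj), ni + nj)
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree

print(upgma(names, dist))
# (7.0, (4.0, 'C', 'D'), (5.0, 'A', 'B'))
```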

Stipulations on UPGMA
