CS798 Final Project Log 20100810

From SnOwy - Ed's Wiki Notebook


20100812.230514

Ideal Topology Template: BA BA BA BA BA BA BA BA

B = Beta Strand in template
A = Alpha Helix in template
b = beta strand outside of template
a = alpha helix outside of template
o = "random coil"
? = "missing density"
d = "domain"

0 = not feasible
1 = feasible
2 = expedite
  • firebrick = todo
  • blue = checkout
  • forestgreen = done
  • v:w:x:y:z => beta start, first non-sse start, alpha start, second non-sse start, end (exclusive); positions are 1-based.
  • Segment order within each unit: BETA, NON-SSE, ALPHA, NON-SSE
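The v:w:x:y:z notation can be turned back into the four segments of a unit. The following is a minimal sketch, assuming 1-based positions with an exclusive end (which is what the 1AJ0 entries in this log imply); `split_unit` is a hypothetical helper name, not part of DILTAG or any other tool used here.

```python
def split_unit(segment, bounds):
    """Split one beta-alpha unit into (beta, non_sse_1, alpha, non_sse_2).

    `bounds` is a "v:w:x:y:z" string; positions are assumed to be 1-based
    with z exclusive, so the segment must be exactly z - v residues long.
    """
    v, w, x, y, z = (int(n) for n in bounds.split(":"))
    if not v <= w <= x <= y <= z:
        raise ValueError("boundaries must be non-decreasing: " + bounds)
    if len(segment) != z - v:
        raise ValueError("segment length does not match boundaries")
    # Convert absolute chain positions to offsets inside this segment.
    a, b, c = w - v, x - v, y - v
    return segment[:a], segment[a:b], segment[b:c], segment[c:]


# First unit of 1AJ0 as recorded in this log.
parts = split_unit("HVMGILNVTPDSFSDGGTHNSLIDAVKHANLMINAGAT", "16:23:35:51:54")
# Sixth unit of 1AJ0, where 211:211 means there is no non-sse tail.
tail_less = split_unit("KEKLLLDPGFGFGKNLSHNYSLLARLAEFHHF", "179:186:194:211:211")
```

An empty string in the result corresponds to a pair of equal boundaries, such as the 211:211 and 222:222 cases noted under 1AJ0.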

  • 1 : 1A4M.pdb (7 ; mono; 352) - BaaaA BA BA BA BA BA BaA BAaa
  • 1 : 1A53.pdb (60; mono; 248) - aaBbbA BA BA BA BA BbbA BA BA
  • 1 : 1A80.pdb (72; mono; 277) - bbBA BA BA BA BA BA BaA BAa
  • 1 : 1ADS.pdb (68; mono; 315) - bbBA BA BA BA BA BA BAa BAao
  • 1 : 1AFS.pdb (42; mono; 322) - bbBA BA BA BA BA BA BaA BAa
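Entry lines like the ones above can be checked against the ideal template mechanically. This is a rough sketch, assuming the line layout used in this log; the three values in parentheses are kept as raw strings because the log does not define them, and `parse_entry`, `template_core` and `count_extras` are hypothetical names.

```python
import re

IDEAL = "BA BA BA BA BA BA BA BA"

# flag : PDBID.pdb (field; field; field) - topology
ENTRY = re.compile(
    r"^\s*(?:•\s*)?(?P<flag>[012?]?)\s*:\s*(?P<pdb>\w+)\.pdb\s*"
    r"\((?P<meta>[^)]*)\)\s*-\s*(?P<topology>.+?)\s*$"
)


def parse_entry(line):
    """Parse one entry line into flag, PDB id, raw metadata and topology units."""
    m = ENTRY.match(line)
    if m is None:
        raise ValueError("not an entry line: " + line)
    meta = m.group("meta").strip()
    return {
        "flag": m.group("flag"),
        "pdb": m.group("pdb"),
        "meta": [f.strip() for f in meta.split(";")] if meta else [],
        "units": m.group("topology").split(),
    }


def template_core(units):
    """Keep only the template symbols (B and A) of each unit."""
    return " ".join("".join(c for c in u if c in "BA") for u in units)


def count_extras(units):
    """Count symbols that fall outside the template (b, a, o, d, ?)."""
    return sum(1 for u in units for c in u if c not in "BA")


entry = parse_entry("  • 1 : 1A80.pdb (72; mono; 277) - bbBA BA BA BA BA BA BaA BAa")
matches_template = template_core(entry["units"]) == IDEAL
extras = count_extras(entry["units"])
```

An entry whose core differs from the ideal template (for example a unit written as Bo or Ba, with no template helix) is the kind of case marked 0 in this list, although the flags themselves were assigned by hand.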

1AJ0.pdb

DILTAG Analysis 1AJ0 tagQ.png
  • 2 : 1AJ0.pdb (29; hodi; 282) - bbBA BA BA BA BA BA BA BA
    • (1:16 MKLFAQGTSLDLSHP)
    1. 16:23:35:51:54 HVMGILNVTPDSFSDGGTHNSLIDAVKHANLMINAGAT
    2. 54:59:71:89:92 IIDVGGESTRPGAAEVSVEEELQRVIPVVEAIAQRFEV
    3. 92:97:100:109:113 WISVDTSKPEVIRESAKVGAH
    4. 113:117:125:133:135 IINDIRSLSEPGALEAAAETGL
    5. 135:140:157:175:179 PVCLMHMQGNPKTMQEAPKYDDVFAEVNRYFIEQIARCEQAGIA
    6. 179:186:194:211:211 KEKLLLDPGFGFGKNLSHNYSLLARLAEFHHF
      • Chimera labels KEK as helix-like
      • 211:211 => no non-sse tail
    7. 211:222:222:229:253 NLPLLVGMSRKSMIGQLLNVGPSERLSGSLACAVIAAMQGAH
      • Chimera labels 211:222 as non-sse-like, but it is a straight chain (X) that occupies the space where one would expect a strand
      • 222:222 => no non-sse between X and helix
      • underline used to delineate between two otherwise consecutive bold regions :(
      • 232:250 is labelled as helix-like but likely functions as non-sse
    8. 253:257:259:276:283 IIRVHDVKETVEAMRVVEATLSAKENKRYE
__prefix__ = MKLFAQGTSLDLSHP
HVMGILNVTPDSFSDGGTHNSLIDAVKHANLMINAGAT
IIDVGGESTRPGAAEVSVEEELQRVIPVVEAIAQRFEV
WISVDTSKPEVIRESAKVGAH
IINDIRSLSEPGALEAAAETGL
PVCLMHMQGNPKTMQEAPKYDDVFAEVNRYFIEQIARCEQAGIA
KEKLLLDPGFGFGKNLSHNYSLLARLAEFHHF
NLPLLVGMSRKSMIGQLLNVGPSERLSGSLACAVIAAMQGAH
IIRVHDVKETVEAMRVVEATLSAKENKRYE
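The prefix and unit sequences above can be cross-checked against the recorded boundaries. A minimal sketch, assuming 1-based positions and that the prefix plus the eight units cover the whole chain; `unit_starts` is a hypothetical helper.

```python
from itertools import accumulate


def unit_starts(prefix, units):
    """Return the 1-based start of each unit, followed by the exclusive chain end."""
    lengths = [len(prefix)] + [len(u) for u in units]
    # The running total of residues consumed so far, plus one, is the next start.
    return [1 + total for total in accumulate(lengths)]


prefix_1aj0 = "MKLFAQGTSLDLSHP"
units_1aj0 = [
    "HVMGILNVTPDSFSDGGTHNSLIDAVKHANLMINAGAT",
    "IIDVGGESTRPGAAEVSVEEELQRVIPVVEAIAQRFEV",
    "WISVDTSKPEVIRESAKVGAH",
    "IINDIRSLSEPGALEAAAETGL",
    "PVCLMHMQGNPKTMQEAPKYDDVFAEVNRYFIEQIARCEQAGIA",
    "KEKLLLDPGFGFGKNLSHNYSLLARLAEFHHF",
    "NLPLLVGMSRKSMIGQLLNVGPSERLSGSLACAVIAAMQGAH",
    "IIRVHDVKETVEAMRVVEATLSAKENKRYE",
]
starts_1aj0 = unit_starts(prefix_1aj0, units_1aj0)
```

The values should line up with the first and last numbers of each v:w:x:y:z entry for 1AJ0, and the final value minus one gives the chain length listed in the entry line.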
  • The above images are broken apart by visual inspection into eight and four parts respectively.
    • Now trying HMMrep for the cutting, since it is known to do well with TIM barrels.
  • HMMrep output below:
Repeats        3
P-value  0.00028
Length        60
Offset        66
ID Probab  P-value RepScore RepScoreNorm Cols Query HMM Template HMM
A1  78.50  2.4e-05    16.14         0.42   38   27-65      86-124  
A2  89.00  5.2e-08    34.90         0.58   60   66-125     66-125  
A3  81.42  1.7e-04    17.74         0.57   32  141-175     75-106  
A1  Mon_Aug_16_14:   27-65    +0 --------------------KHANLMINAGA...T-IIDVGGEStRPGAAEVSVEEELQRVIP-...............
A2  Mon_Aug_16_14:   66-125  +15 VVEAIAQRFEVWISVDTSKPEVIRESAKVGA...HIINDIRSLS.EPGALEAAAETGLPVCLMHmqgnpktmqeapkyd
A3  Mon_Aug_16_14:  141-175   +0 ---------DVFAEVNRYFIEQIARCEQAGIakeKLLLDPGFGF.-------------------...............
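To compare HMMrep cuts across proteins it helps to have the repeat table in a structured form. The sketch below parses only the table rows of output pasted as above; it assumes the column order shown in the header line, and the `Repeat` record and `parse_repeats` function are hypothetical names rather than anything shipped with HMMrep.

```python
import re
from typing import NamedTuple, Tuple


class Repeat(NamedTuple):
    rep_id: str
    probab: float
    p_value: float
    rep_score: float
    rep_score_norm: float
    cols: int
    query: Tuple[int, int]
    template: Tuple[int, int]


# ID Probab P-value RepScore RepScoreNorm Cols Query-HMM Template-HMM
ROW = re.compile(
    r"^(A\d+)\s+([\d.]+)\s+([\d.eE+-]+)\s+([\d.]+)\s+([\d.]+)\s+(\d+)"
    r"\s+(\d+)-(\d+)\s+(\d+)-(\d+)\s*$"
)


def parse_repeats(text):
    """Extract the repeat table rows; alignment and summary lines are skipped."""
    repeats = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m is None:
            continue
        g = m.groups()
        repeats.append(Repeat(
            g[0], float(g[1]), float(g[2]), float(g[3]), float(g[4]), int(g[5]),
            (int(g[6]), int(g[7])), (int(g[8]), int(g[9])),
        ))
    return repeats


output_1aj0 = """\
Repeats        3
ID Probab  P-value RepScore RepScoreNorm Cols Query HMM Template HMM
A1  78.50  2.4e-05    16.14         0.42   38   27-65      86-124
A2  89.00  5.2e-08    34.90         0.58   60   66-125     66-125
A3  81.42  1.7e-04    17.74         0.57   32  141-175     75-106
A1  Mon_Aug_16_14:   27-65    +0 --------------------KHANLMINAGA
"""
repeats_1aj0 = parse_repeats(output_1aj0)
```

Rows of the alignment block start with the same ID but carry a run label in the second column, so they do not match the numeric pattern and are ignored.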
  • Alignment:

1AJ0 align.png

  • Trace:

1AJ0 trace.png

  • DILTAG with new alignment:

1AJ0 tagHR.png

  • 1 : 1AK5.pdb (27; tetr; 503) - obbBA BA BA BA BA BA BA BbbAa
  • 1 : 1AQ0.pdb (4 ; mono; 312) - BA BA BaA BA BA BA BA BA
  • 0 : 1AQM.pdb (70; mono; 669) - BA BbbA BbbbbbbA BA Bo Ba BA BbbAd

1AW5.pdb

DILTAG Analysis 1AW5 tagQ.png
  • 2 : 1AW5.pdb (36; octo; 342) - oBboA BA BA BA BA BA BA BA
    • (1:43 MHTAEFLETEPTEISSVLAGGYNHPLLRQWQSERQLTKNMLI)
    1. 43:49:71:81:85 FPLFISDNPDDFTEIDSAPNINRIGVNRLKDYLKPLVAKGLR
      • underlined segments are a pair of matching anti-parallel strands
    2. 85:92:111:123:127 SVILFGVPLIPGTKDPVGTAADDPAGPVIQGIRFIREKFPEL
      • underlined segment is a spurious short helix region
    3. 127:133:154:172:175 YIICDVCLCEYTSHGHCGVLYDDGTINRERSVSRLAAVAVNYAKAGAH
    4. CVAPSDMIDGRIRDIKRGLINANLAHK
    5. TFVLSYAAKFSGNLYGPARDAACSAPSNGDRKCYQLPPAGRGLARRALERDMSEGAD
      • not a beta-strand, but is straight and fits into template: TFVLSYAAKFSG
      • tertiary structure for this segment is unsolved: RDAACSAPSNGDRK
    6. GIIVKPSTFYLDIVRDASEICKDL
    7. PICAYHVSGEYAMLHAAAEKGVVDLKTIAFESHQGFLRAGAR
      • skewed helix: SGEYAMLHAAAE
      • template fitting helix: LKTIAFESHQGFLRA
    8. LIITYLAPEFLDWLDE
__prefix__ = MHTAEFLETEPTEISSVLAGGYNHPLLRQWQSERQLTKNMLI
FPLFISDNPDDFTEIDSAPNINRIGVNRLKDYLKPLVAKGLR
SVILFGVPLIPGTKDPVGTAADDPAGPVIQGIRFIREKFPEL
YIICDVCLCEYTSHGHCGVLYDDGTINRERSVSRLAAVAVNYAKAGAH
CVAPSDMIDGRIRDIKRGLINANLAHK
TFVLSYAAKFSGNLYGPARDAACSAPSNGDRKCYQLPPAGRGLARRALERDMSEGAD
GIIVKPSTFYLDIVRDASEICKDL
PICAYHVSGEYAMLHAAAEKGVVDLKTIAFESHQGFLRAGAR
LIITYLAPEFLDWLDE 
Repeats        3
P-value   0.0091
Length        77
Offset       128
ID Probab  P-value RepScore RepScoreNorm Cols Query HMM Template HMM
A1  89.51  9.6e-10    43.19         0.56   77  128-204    128-204  
A2  81.72  2.0e-03    23.65         0.33   65  205-284    135-204  
A3  86.25  1.7e-05    24.55         0.61   43  285-327    135-177  
A1  Mon_Aug_16_21:  128-204   +0 IICDVCLCEYTS...HGHCGVLY............DDGTINRERSVSRLAAVAVNYAKAGAHCVAPSDMIDGRIRDIKRG
A2  Mon_Aug_16_21:  205-284   +0 -------LSYAAkfsGNLYGPARdaacsapsngdrKCYQLPPAGRGLARRA-LERDMSEGADGIIVKPSTF-YLDIVRDA
A3  Mon_Aug_16_21:  285-327   +0 -------CAYHV...SGEYAMLH............AAAEKGVVDLKTIAFESHQGFLRAGARLII---------------
A1  Mon_Aug_16_21:  128-204   +0 LINANLAHKTFV
A2  Mon_Aug_16_21:  205-284   +0 SE-I-C-KDLPI
A3  Mon_Aug_16_21:  285-327   +0 ------------

1AW5 align.png

1AW5 trace.png

1AW5 tagHR.png

1B54.pdb

DILTAG Analysis
  • 2 : 1B54.pdb (2 ; mono; 257) - BA BA BA BA BA BA BA BA
__prefix__ = MSTGITYDEDRKTQLIAQYESVREVVNAEAKNVHVNENASKI
LLLVVSKLKPASDIQILYDHGVR
EFGENYVQELIEKAKLLPDDI
KWHFIGGLQTNKCKDLAKVPN
LYSVETIDSLKKAKKLNESRAKFQPDCNP
ILCNVQINTSHEDQKSGLNNEAEIFEVIDFFLSEECKY
IKLNGLMTIGSWNVSHEDSKENRDFATLVEWKKKIDAKFGTSL
KLSMGMSADFREAIRQGTA
EVRIGTDIFGARPPKNEARII

1B5T.pdb

DILTAG Analysis
  • 2 : 1B5T.pdb (39; hote; 296) - BA BA BA BA BA BA BaoaoA BA
  • in the seventh unit (BaoaoA): IIPGILPVSNFKQAKKFADMTNVRIPAWMAQMFDGLDDDAETRKLVGANIAMDMVKILSREGVK
    • non-template helix: NFKQAKKFADM
    • non-template helix: PAWMAQMFD
    • correct helix: DAETRKLVGANIAMDMVKILSRE
__prefix__ = GQI
NVSFEFFPPRTSEMEQTLWNSIDRLSSLKPK
FVSVTYGANSGERDRTHSIIKGIKDRTGLE
AAPHLTCIDATPDELRTIARDYWNNGIR
HIVALRGDLPPGSGKPEMYASDLVTLLKEVADF
DISVAAYPEVHPEAKSAQADLLNLKRKVDAGAN
RAITQFFFDVESYLRFRDRCVSAGIDVE
IIPGILPVSNFKQAKKFADMTNVRIPAWMAQMFDGLDDDAETRKLVGANIAMDMVKILSREGVK
DFHFYTLNRAEMSYAICHTLGVRPA

1BD0.pdb

DILTAG Analysis
  • 2 : 1BD0.pdb (34; dime; 388) - aBA BA BA BA BA BA BA BAd
__prefix__ = MNDFHRDTWAEVDLDAIYDNVENLRRLLPDDT
HIMAVVKANAYGHGDVQVARTALEAGAS
RLAVAFLDEALALREKGIEA
PILVLGASRPADAALAAQQR
IALTVFRSDWLEEASALYSGPFP
IHFHLKMDTGMGRLGVKDEEETKRIVALIERHPH
FVLEGLYTHFATADEVNTDYFSYQYTRFLHMLEWLPSRPP
LVHCANSAASLRFPDRTFN
MVRFGIAMYGLAPSPGIKPLLPYPLKEA
__suffix__ = FSLHSRLVHVKKLQPGEKVSYGATYTAQTEEWIGTIPIGYADGWLRRLQHFHVLVDGQKAPIVGRICMDQCMIRLPGPLP
__suffix__ = VGTKVTLIGRQGDEVISIDDVARHLETINYEVPCTISYRVPRIFFRHKRIMEVRNAIGAGESSA

  • 1 : 1BF2.pdb (26; mono; 776) - dBA BA BA BA Bo BaoA BA BAd
  • 1 : 1BG4.pdb (53; mono; 302) - BA BA BA BA BA BboA BboA BboA
  • 1 : 1BGG.pdb (8 ; octa; 448) - bBaoA BA BA BaoA BAaao BbbA BA BbbA

1BQC.pdb

  • 2 : 1BQC.pdb (17; mono; 279) - bbBA BA BA BA BA BA BA BA

  •  : 1BQG.pdb () - BA BA BA BA BA BA BA BA
  • 1 : 1BYA.pdb (83; mono; 495) - BA BA BoaoA BoaoA BoaoA BA BA BAo
  •  : 1C7S.pdb () - BA BA BA BA BA BA BA BA
  •  : 1C9W.pdb () - BA BA BA BA BA BA BA BA
  • 1 : 1CB7.pdb (50; hete) - aaaBA BoaoA BA BA BA BA BA BAd
  • 1 : 1CIU.pdb (10; mono; 710) - BA BA BA BA BA BaA BA BAd

1CNV.pdb

  • 2 : 1CNV.pdb (47; mono; 324) - BA BA BA BA BA BA BA BA

  • 0 : 1CTN.pdb (59; mono; 563) - dBo BaoA BaA BA BA BA BdA BA
  •  : 1D8C.pdb () - BA BA BA BA BA BA BA BA

1D9E.pdb

  • 2 : 1D9E.pdb (75; hotr; 284) - bbBA BA BA BA BA BA BA BA

1DBT.pdb

  • 2 : 1DBT.pdb (5 ; hodi; 239) - BA BA BA BA BA BA BA BA

  • 1 : 1DE5.pdb (77; hote; 419) - aaBA BA BA BA BA BA BA BAoaaaa
  •  : 1DHP.pdb () - BA BA BA BA BA BA BA BA
  •  : 1DJX.pdb () - BA BA BA BA BA BA BA BA
  •  : 1DXE.pdb () - BA BA BA BA BA BA BA BA
  •  : 1ECE.pdb () - BA BA BA BA BA BA BA BA
  • 1 : 1EGM.pdb (80; cplx; 224) - dBA BA BA BAa BA BA BA BAd
  • 1 : 1EZW.pdb (78; ?  ; 348) - BA BA BA BA BA BA BAd BA
  •  : 1F2J.pdb () - BA BA BA BA BA BA BA BA
  •  : 1F61.pdb () - BA BA BA BA BA BA BA BA
  •  : 1F6Y.pdb () - BA BA BA BA BA BA BA BA
  •  : 1FCB.pdb () - BA BA BA BA BA BA BA BA
  • 0 : 1FIY.pdb (12; hote; 883) - dBaaA BAaaaa BoaaA BA BA BA BaoA BAd
  • 1 : 1FRB.pdb ( 5; mono; 315) - bbBA BA BA BA BA BA BaA BAao
  •  : 1FWJ.pdb () - BA BA BA BA BA BA BA BA
  •  : 1GHS.pdb () - BA BA BA BA BA BA BA BA
  •  : 1GOX.pdb () - BA BA BA BA BA BA BA BA
  •  : 1LUC.pdb () - BA BA BA BA BA BA BA BA
  •  : 1MNS.pdb () - BA BA BA BA BA BA BA BA
  •  : 1MUC.pdb () - BA BA BA BA BA BA BA BA
  •  : 1NAL.pdb () - BA BA BA BA BA BA BA BA
  •  : 1NAR.pdb () - BA BA BA BA BA BA BA BA
  •  : 1ONR.pdb () - BA BA BA BA BA BA BA BA
  •  : 1OYA.pdb () - BA BA BA BA BA BA BA BA
  •  : 1PII.pdb () - BA BA BA BA BA BA BA BA
  •  : 1PKL.pdb () - BA BA BA BA BA BA BA BA
  •  : 1PSC.pdb () - BA BA BA BA BA BA BA BA
  •  : 1PUD.pdb () - BA BA BA BA BA BA BA BA
  •  : 1PYM.pdb () - BA BA BA BA BA BA BA BA

1QFE.pdb

  • 2 : 1QFE.pdb (48; hodi; 252) - bbBA BA BA BA BA BA Bo BA

1QO2.pdb

DILTAG Analysis
  • 2 : 1QO2.pdb (79; hote; 241) - ? ? ? ? ? ? ? ?
__prefix__ = ML
VVPAIDLFRGKVARMIKGRKENTIFYEKDPVELVEKLIEEGFTL
IHVVDLSNAIENSGENLPVLEKLSEFAEH
IQIGGGIRSLDYAEKLRKLGYR
RQIVSSKVLEDPSFLKSLREIDV
EPVFSLDTRGGRVAFKGWLAEEEIDPVSLLKRLKEYGLE
EIVHTEIEKDGTLQEHDFSLTKKIAIEAEV
KVLAAGGISSENSLKTAQKVHTETNGL
LKGVIVGRAFLEGILTVEVMKRYAR

  •  : 1QPO.pdb () - BA BA BA BA BA BA BA BA
  •  : 1QR7.pdb () - BA BA BA BA BA BA BA BA
  •  : 1QRQ.pdb () - BA BA BA BA BA BA BA BA
  •  : 1QTW.pdb () - BA BA BA BA BA BA BA BA
  • 2 : 1RPX.pdb (9 ; hexa; 280) - BA BA BA BA BA BA BA BA

1THF.pdb

1THF tag.png 1THF tagQ.png
  • 2 : 1THF.pdb (82; mono; 253) - BA BA BA BA BA BA BA BA
__prefix__ = MLAK
RIIACLDVKDGRVVKGSNFENLRDSGDPVELGKFYSEIGID
ELVFLDITASVEKRKTMLELVEKVAEQIDI
PFTVGGGIHDFETASELILRGAD
KVSINTAAVENPSLITQIAQTFGSQA
VVVAIDAKRVDGEFMVFTYSGKKNTGILLRDWVVEVEKRGAG
EILLTSIDRDGTKSGYDTEMIRFVRPLTTL
PIIASGGAGKMEHFLEAFLAGAD
AALAASVFHFREIDVRELKEYLKKHGVNVRLEGL

  •  : 1TPF.pdb () - BA BA BA BA BA BA BA BA
  •  : 1UOK.pdb () - BA BA BA BA BA BA BA BA
  •  : 1URO.pdb () - BA BA BA BA BA BA BA BA
  • 1 : 1XYA.pdb (67; dime; 386) - BA BA BA BA BA BA BA BAa

2ALR.pdb

DILTAG Analysis
  • ? : 2ALR.pdb (20; ?  ; ? ) - ? ? ? ? ? ? ? ?
__prefix__ = AASCVLLHTGQKMPL
IGLGTWKSEPGQVKAAVKYALSVGYR
HIDCAAIYGNEPEIGEALKEDVGPGKAVPREEL
FVTSKLWNTKHHPEDVEPALRKTLADLQLEYLD
LYLMHWPYAFERGDNPFPKNADGTICYDSTHYKETWKALEALVAKGLVQ
ALGLSNFNSRQIDDILSVASVRPA
VLQVECHPYLAQNELIAHCQARGL
EVTAYSPLGSSDRAWRDPDEPVLLEEPVVLALAEKYGRSPAQILLRWQVQRKV
ICIPKSITPSRILQNIKVFDFTFSPEEMKQLNALNKNWRYIVP
__suffix__ = MLTVDGKRVPRDAGHPLYPFNDPY

  • 0 : 2AMG.pdb (3 ; mono; 551) - BA BbbA BA BA BA BA BA BAbbbbb
  • 0 : 2CHR.pdb (38; octa; 370) - dBA BA BA BA BA BA BA BAb
  • 1 : 2DIK.pdb (41; hodi; 873) - BA BaoA BoaA BA BA BaoA BA BA
  • 0 : 2DOR.pdb (63; hodi; 311) - bbBA BbbA BA BA BA BbbbbA BA BA
  • 0 : 2EBN.pdb (15; mono; 339) - Bb BboA BA BA Bo Bo BA BA
  • 2 : 2EXO.pdb (25; mono; 312) - aBA BA BA BA BA BA BA BAa
  • 2 : 2HVM.pdb (35; mono; 273) - BA BboA BA BA BA BA BA BA
  • 0 : 2TMD.pdb (61; hodi; 729) - bbBA BA BbbboA BA BA BA BA BAm
  • 2 : 2TPS.pdb (18; mono; 222) - aBA BA BA BA BA BA BA BA
  • 0 : 2WSY.pdb (54; tetr; 268) - aBA BaoA BA BA BA BA BA BaA
  • 0 : 4REQ.pdb (65; hedi; 637) - aaBA BA BaoA BaoAao BA BA BA BAd
  • 0 : 7ENL.pdb (32; dime; 436) - bB aA BbbA BA BA BA BA BAb
  • 0 : 7ODC.pdb (14; hodi; 461) - bBAb BAbo BA BA BA BA BA BAd*

1FQ0.pdb -- added from HMMrep paper

>1FQ0:A|PDBID|CHAIN|SEQUENCE
MKNWKTSAESILTTGPVVPVIVVKKLEHAVPMAKALVAGGVRVLEVTLRTECAVDAIRAIAKEVPEAIVGAGTVLNPQQL
AEVTEAGAQFAISPGLTEPLLKAATEGTIPLIPGISTVSELMLGMDYGLKEFKFFPAEANGGVKALQAIAGPFSQVRFCP
TGGISPANYRDYLALKSVLCIGGSWLVPADALEAGDYDRITKLAREAVEGAKL
Repeats        7
P-value   0.0096
Length        23
Offset        72
ID Probab  P-value RepScore RepScoreNorm Cols Query HMM Template HMM
A1  37.08  1.1e-02     3.88         0.30   13   35-47      82-94   
A2  18.31  2.7e-01     1.47         0.11   13   50-62      73-85   
A3  75.98  3.9e-05    13.03         0.57   23   72-94      72-94   
A4  63.90  7.1e-04     8.98         0.43   21   95-115     74-94   
A5  58.26  2.4e-03     7.01         0.47   15  116-130     75-89   
A6  32.21  2.0e-01     3.35         0.16   21  137-160     72-92   
A7  62.00  1.5e-02     8.85         0.40   22  161-182     72-93   
A1  Mon_Aug_16_16:   35-47    +2 ----------ALVAGG...VRVLEVTlr.......
A2  Mon_Aug_16_16:   50-62    +9 -TECAVDAIRAIAK--...-------evpeaivga
A3  Mon_Aug_16_16:   72-94    +0 GTVLNPQQLAEVTEAG...AQFAISP.........
A4  Mon_Aug_16_16:   95-115   +0 --GLTEPLLKAATEGT...IPLIPGI.........
A5  Mon_Aug_16_16:  116-130   +6 ---STVSELMLGMDYG...LK-----efkffp...
A6  Mon_Aug_16_16:  137-160   +0 AEANGGVKALQAIAGPfsqVRFCP--.........
A7  Mon_Aug_16_16:  161-182   +0 TGGISPANYRDYLALK...SVLCIG-.........
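The repeats reported by HMMrep do not tile the query completely, which matters when using them as cut points. As a small sketch, the query ranges can be scanned for residues that fall between consecutive repeats. It assumes the ranges are inclusive, which is consistent with 66-125 being reported with Length 60 for 1AJ0; `uncovered_gaps` is a hypothetical helper.

```python
def uncovered_gaps(ranges):
    """Return inclusive (start, end) stretches lying between consecutive repeats."""
    ordered = sorted(ranges)
    gaps = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start > prev_end + 1:
            gaps.append((prev_end + 1, next_start - 1))
    return gaps


# Query HMM ranges copied from the 1FQ0 and 1AJ0 tables in this log.
query_1fq0 = [(35, 47), (50, 62), (72, 94), (95, 115), (116, 130), (137, 160), (161, 182)]
query_1aj0 = [(27, 65), (66, 125), (141, 175)]

gaps_1fq0 = uncovered_gaps(query_1fq0)
gaps_1aj0 = uncovered_gaps(query_1aj0)
```

Residues before the first repeat and after the last one are not reported by this helper; they would need the chain length as an extra input.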

1FQ0 align.png

1FQ0 trace.png

1FQ0 tagHR.png

__prefix__ = MKNWKTSAESILTTGP
VVPVIVVKKLEHAVPMAKALVAGGVR
VLEVTLRTECAVDAIRAIAKEVPEA
IVGAGTVLNPQQLAEVTEAGAQF
AISPGLTEPLLKAATEGTIP
LIPGISTVSELMLGMDYGLK
EFKFFPAEANGGVKALQAIAGPFSQV
RFCPTGGISPANYRDYLALKSVLC
IGGSWLVPADALEAGDYDRITKLAREAVEGAKL
1FQ0 tag.png 1FQ0 tagQ.png

20100812.020205

20100810.095955

CS 798 Dan Brown Phylogeny Final Project
