Archive for category PhD Program

Python Crash Course — 4/5ths done!

This week is going to be crowded enough for me that I’m cancelling this week’s class. On the bright side, the classes have gone better than I thought they would. We will continue on February 9th.

The very first class ended up being too short, with the advanced students feeling that it moved too slowly. The second and third classes ended up being just the right speed– with the exception that the example fill-in-the-blank script from the third class was too difficult.

The difficulty arose when I introduced dictionaries whose values are lists too quickly.
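For anyone who wants a slower run at that idea, here is a minimal sketch of a dictionary whose values are lists. The function name and the sample data are made up for illustration; this is not the script from class.

```python
def group_by_key(pairs):
    """Build a dictionary whose values are lists from (key, value) pairs."""
    groups = {}
    for key, value in pairs:
        # setdefault returns the existing list for this key,
        # or stores and returns a new empty list the first time.
        groups.setdefault(key, []).append(value)
    return groups


# Hypothetical example data: (protein family, protein name) pairs.
pairs = [("kinase", "P1"), ("protease", "P2"), ("kinase", "P3")]
groups = group_by_key(pairs)
```

Each key maps to a list that grows as more values with the same key are seen, so `groups["kinase"]` ends up holding both kinase entries.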

The fourth class, held last week, was excellent– I completely ditched slides that week and produced five fill-in-the-blank scripts that were just the right tempo for everyone. I had a good mix of BIC (Bioinformatics Club), iGEM and chemistry graduate students– all who attended got something out of the hour, which was my objective.

We only had time for four out of the five scripts with the remaining script as a bonus that everyone could take home and try.

Now, it’s back to Structural Bioinformatics homework… It’s quite a daunting assignment, to be honest (having just formally shaken hands with Singular Value Decomposition), but the parts that are Python (particularly the bonus question) are familiar enough for comfort.


fsMSA Algorithm Context

What started as a meeting between me and my advisors ended up being a ball of unresolved questions about the cultural context of multiple sequence alignment and phylogenetic trees. While I had a good idea of what the field and its researchers had looked into and developed, I hadn’t a grasp of how far along we were. The result is the presentation I’ve just finished. In it, I discuss what I consider to be a representative sampling of the alignment and phylogenetic tree building algorithms available right now, at this very instant.

(PDF not posted, contact me if interested.)


fsMSA Algorithm… Monkeys in a β-Barrel…

I’ve finally finished documenting my foil sensitive protein sequence algorithm… This is part of Monkeys in a β-Barrel — a work in progress, this time continuing more on Andrew’s half of the problem rather than Aron’s.

I’ve decided on using the word “foil” to mean “internal repeat” since it’s easier to say and less awkward in written sentences. Andre suggested it after “trefoil” and “cinquefoil”, the plant.

Thumbnails below (if you are curious about the full slide show, contact me :D ).


Early draft of TIM Barrel problem

Here’s a slide show I presented at a lab group meeting earlier this month. There is a structure part and a sequence part to my research; this only overviews the structure half. The sequence part of my presentation was given with a white board since the figures were faster to draw on the white board than as vector graphics. I’ll come back to this when I’ve finished formalizing the big plan in its entirety.

Slideshow as a PDF with previews below.


Neighbor Joining Revisited

Brief: It’s been such a long time since I’ve actually implemented neighbor joining for any reason.

The neighbor-joining method (Saitou N, Nei M. 1987) is used to create binary trees by iteratively merging pairs of most-similar points in a collection. Every merge creates a new node, which adopts the two merged nodes as its children. This repeats until only one node is left; that single remaining node is the root of the tree.

Here it is on Wikipedia, wherein a transitional Q-matrix is used to store intermediate values, and here it is care of Fred Opperdoes, explained using a net divergence array and a distance matrix. The algorithm described is identical, but Opperdoes’s explanation is a lot faster to comprehend.

Incidentally, the Wikipedia article doesn’t explain how to break the tie to decide which child node gets what value against their newly merged parent. In Wikipedia’s formalization, one gets a pair of children at the same distance from the parent, but one has a negative sign (this is incorrectly ambiguous). In both worked examples, however, the actual (correct) score of one arbitrary child is equal to the difference between the score of the parent and the other child. In reality, the steps that follow treat the two children symmetrically anyway, so which child is which doesn’t matter… there is no favourite child– and barring the above error, no negative child either.
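The merge loop described above can be sketched roughly as follows. This is a minimal sketch under my own assumptions, not a reference implementation: the function name, the frozenset-keyed distance dictionary, and the "#0", "#1", … labels for internal nodes are all choices made for this example (leaf names are assumed not to start with "#"). Ties in Q go to the first pair encountered, and the last two nodes are joined by splitting their remaining distance evenly, which is an arbitrary way of placing the root.

```python
def neighbor_joining(names, dist):
    """Merge nodes pairwise until one remains; return (root, children).

    names: list of leaf labels.
    dist:  dict mapping frozenset({a, b}) -> distance between a and b.
    children maps each internal node to [(child, branch_length), ...].
    """
    nodes = list(names)
    d = dict(dist)
    children = {}
    next_id = 0
    while len(nodes) > 1:
        n = len(nodes)
        if n == 2:
            # Final merge: split the remaining distance evenly (assumption).
            a, b = nodes
            dab = d[frozenset((a, b))]
            la = lb = dab / 2
        else:
            # Net divergence: sum of distances from each node to all others.
            r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i)
                 for i in nodes}
            # Pick the pair with the smallest Q value.
            best = None
            for x in range(n):
                for y in range(x + 1, n):
                    i, j = nodes[x], nodes[y]
                    q = (n - 2) * d[frozenset((i, j))] - r[i] - r[j]
                    if best is None or q < best[0]:
                        best = (q, i, j)
            _, a, b = best
            dab = d[frozenset((a, b))]
            # One child gets the formula; the other gets the remainder,
            # so the two branch lengths always sum to d(a, b).
            la = dab / 2 + (r[a] - r[b]) / (2 * (n - 2))
            lb = dab - la
        u = "#%d" % next_id
        next_id += 1
        children[u] = [(a, la), (b, lb)]
        # Distance from the new node to every remaining node.
        for k in nodes:
            if k not in (a, b):
                d[frozenset((u, k))] = (
                    d[frozenset((a, k))] + d[frozenset((b, k))] - dab
                ) / 2
        nodes = [k for k in nodes if k not in (a, b)] + [u]
    return nodes[0], children


# Small five-taxon example with made-up labels.
names = ["a", "b", "c", "d", "e"]
raw = {
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
    ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
    ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3,
}
dist = {frozenset(pair): value for pair, value in raw.items()}
root, children = neighbor_joining(names, dist)
```

Computing the second child's branch length as the remainder (`lb = dab - la`) is what keeps the pair unambiguous: the two lengths sum to the distance between the merged nodes, and swapping which child is called `a` just swaps the two values.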


A Mobile Phone Photo Post!

One of the side effects of having a mobile phone is having a low quality camera handy to snap the occasional photo.

Here are some of the better photos I’ve snapped over the past month or so.

Photo0149Photo0154

Photo0155Photo0157

These muffins were prepared for a party that was hosted by Helen Stubbs for all of us at the Meiering Lab. They were the very first cinnamon muffins I ever baked. I think the next time I make these, I’ll have to add less baking soda, more sugar and more cinnamon.

Photo0172

Broken Glass … Yes, I’m boasting about my ability to make construction paper signs. Haha! The objective was to make something ugly and jarring, and thus visually impossible for the lab to dismiss.

Photo0248

Chris Go and Hannah Jantzi are confused by the black cellphone-like object I’ve pointed at them.

Photo0240Photo0239

Photo0238Photo0243

University of Waterloo iGEM team 2010: we’re getting organized for the coming year. I’ll have to update this post with names later. The blackboard notes contain the signature lack of flow that I’ve come to be known for in my group discussion diagrams…

Photo0177Photo0170

Photo0171

When T.A.’s Go to Tim Hortons – they might end up finding oddball photo ops… The strange sculpture is actually a water fountain; in this photo, the water’s been turned off for the winter, but the temporary floor boards haven’t been put in yet. The second photo is a picture of Ariana Marcassa, my T.A. partner posing with a headstone for school spirit. The last photo is of the archway to the Peter Russell Rock Garden, facing away from construction and into the Earth Science & Chemistry building.


Protein Project Progress…

Last week, Liz, Aron, Andrew, Brendan and I sat down to discuss the beta-trefoil project. It was a good chance for me to understand the methods used and the kinds of results we are interested in for my own TIM Barrel project.

Continuing on with the structural repeat problem, today I’ll be writing a short FSA (finite-state automaton) parser that can handle DSSP or DSS output. Simply put, a very primitive machine will be used to imitate a human’s visual inspection of repeated secondary structural elements in given proteins. This is in line with the work I did manually, staring at structures to get a grasp of how to look at protein models, and also in line with the objective to automate much of this work. Prior to that step, I reduced the probability of doing redundant work by using BLASTCLUST and selecting only a few known structures in each cluster to inspect… a sequence-based alignment for each cluster will inform me of where my manually detected repeat boundaries map to the remaining sequences.
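To give a flavour of what such a primitive machine might look like, here is a minimal sketch rather than the actual parser. It assumes the secondary structure has already been pulled out of the DSSP output as a plain string with one character per residue ("E" for strand, "H" for alpha helix, anything else treated as a gap). The function name, the state names and the minimum-length thresholds are all hypothetical.

```python
def count_beta_alpha_units(ss, min_strand=3, min_helix=4):
    """Count strand-then-helix units in a per-residue secondary-structure string."""
    # Collapse the string into runs, e.g. [["E", 5], ["-", 2], ["H", 8], ...].
    runs = []
    for ch in ss:
        kind = ch if ch in ("E", "H") else "-"
        if runs and runs[-1][0] == kind:
            runs[-1][1] += 1
        else:
            runs.append([kind, 1])

    # Two-state machine: look for a strand, then for the helix that follows it.
    state = "SEEK_STRAND"
    units = 0
    for kind, length in runs:
        if kind == "E" and length >= min_strand:
            state = "SEEK_HELIX"
        elif kind == "H" and length >= min_helix and state == "SEEK_HELIX":
            units += 1
            state = "SEEK_STRAND"
        # Anything else (gaps, short fragments, extra helices) is skipped.
    return units


# Made-up string: two beta-alpha units, one extra helix, one too-short strand.
example = "--EEEEE--HHHHHHHH--EEEE-HHHHHH-HHHH--EE--"
n_units = count_beta_alpha_units(example)
```

A helix that shows up without a preceding strand leaves the counter alone, which is roughly how one would skip an inserted helix when inspecting by eye. A real parser would also need to record where each unit starts and ends.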

Oddity: If you BLASTCLUST all the “FULL” (not “SEED”) sets of TIM Barrel sequences for the entire fold from PFAM along with the sequences of known TIM Barrel fold structures of SCOP, you’ll find that cluster fifteen (as of today) has these elements:

1YBE_A A6U5X9 A6WV52 Q2KDT0 Q2YNV6 Q6G0X7 Q6G5H6 Q8UIS9 Q8YEP2 Q92S49 Q98D24 A1UUA2

In the above listing, 1YBE:A (PDB code) is the sole known PDB structure, while the remainder are putative TIM Barrels (UniProt codes) as determined by the HMM from PFAM.

The enzyme 1YBE looks like this…1YBE_A

It’s an oddity because of the number of alpha-helices inserted within what is usually a hydrophobic beta barrel– the red pieces of ribbon should form a hollow cylinder, but it’s split apart for 1YBE and accommodates a bunch of cyan helices. Labeled in white are helices that break with the beta-alpha repeating secondary structural element (SSE) pattern by occurring before the first repeat. Labeled in green are breaks between beta-alpha SSE patterns.

Reference

Seetharaman, J., Swaminathan, S., Crystal Structure of a Nicotinate phosphoribosyltransferase [To be Published]


Squishy TIM Barrel Subunits

Again with the TIM Barrel pictures! Here’s some text about it from my notes…

1a5m (a urease) is a really interesting protein– it consists of three subunits. Each subunit consists of three unique domains: a very squashed TIM Barrel, an alpha-alpha-alpha-beta-beta domain and a beta-beta-alpha-beta-beta domain. I’m not yet sure what to call little broken alpha helices that have fewer than two complete turns. The TIM Barrel (though exceedingly asymmetrical) will still be accounted for in the data to be analyzed. The TIM Barrel (566 amino acids) is the alpha subunit of each symmetrical subunit. The remaining two domains are the beta and gamma subunits, though PDB is not clear which is which: they weigh in at 100 and 101 amino acids. 1a5m is one of several solved urease structures in the PDB– the collection {1A5K, 1A5L, 1A5M, 1A5N, 1A5O} was solved by Pearson et al. (1998).

References
Matthew A. Pearson, Ruth A. Schaller, Linda Overbye Michel, P. Andrew Karplus, and Robert P. Hausinger (1998). Chemical Rescue of Klebsiella aerogenes Urease Variants Lacking the Carbamylated-Lysine Nickel Ligand. Biochemistry. 37(17):6214-20.

Squishy squishy shapes– the giant pink object in the next picture is actually three such TIM Barrels, each of which belongs to one of the three subunits.
1a5m_holo
Each of the three subunits is shown separately below…

1a5m_1_3 1a5m_2_3 1a5m_3_3

Aesthetically pleasing– these images were captured from the JMol output available from RCSB PDB.


T.A.ing — half of the semester done!

Wow, it’s been half the semester already. It’s been a good experience so far. This job has highlighted two key things I should work on though. First, it’s impossible to know everything that can be asked in a given tutorial, so I should figure out what the boundaries for the module are (i.e. what the current module does not approach), and know the most likely places the answers are in the textbook so I can refer to an official explanation when I’ve drawn a blank. Providing answers that go beyond the scope of a course is actually harmful since it affects students’ recall during examinations! Second, I need to reduce the amount of repetition I use to explain something. Specifically, saying the same sentence twice doesn’t help comprehension– clearly, there’s something incompatible about that sentence.

Overall though, I’m happy with how the tutorials are progressing and how our (Ariana’s and my) introductory slide shows go– the quizzes at the end of class have become a very smooth transaction too.

In about half an hour, I have to head down to watch them all write a midterm along with the other TAs.

Applications for Winter TA positions are due soon.

It might be nice to do a hands-on laboratory course next, but tutorials are certainly pleasant.


TIM Barrels look like this…

It occurred to me that I didn’t actually post any images in the past few posts. Here are two TIM Barrels that I’ve arbitrarily picked from DATE. Note– the image files are horribly misnamed, so please use the text description underneath.

From left to right– these figures are 1A53 (Beta Strand C-Face), 1A53 (Beta Strand N-Face), 1BG4 (Beta Strand N-Face), 1BG4 (Beta Strand C-Face). 1A53 is called “indole-3-glycerol phosphate synthase” and 1BG4 is called “xylanase”, isolated from Penicillium simplicissimum. I won’t go into the function of each of these enzymes, but they do illustrate what a general beta-alpha TIM barrel looks like. TIM Barrels comprise eight beta-alpha secondary structure elements. Extra helices and sheets may occur but must flank the TIM Barrel-like portion of the protein domain (1A53 has a very prominent extra alpha helix close to the camera in the far left image). The “barrel” name derives from the twisted cylinder enclosed by the parallel beta sheets in the middle of the object. TIM barrels can be deformed quite a bit too if they’re a subunit of a larger holoprotein.

The four-character designations (1A53, 1BG4) are RCSB (A Resource for Studying Biological Macromolecules) Protein Data Bank identifiers– I’ve found that SCOP frequently links into PDB while PFAM frequently links to UniProtKB (Knowledge Base) and utilizes UniProtKB identifiers. More on that later…
