fsMSA Algorithm Context
What started as a meeting between me and my advisors ended up being a ball of unresolved questions about the cultural context of multiple sequence alignment and phylogenetic trees. While I had a good idea of what the field and its researchers had looked into and developed, I hadn’t a grasp of how far along we were. The result is the presentation I’ve just finished. In it, I discuss what I consider to be a representative sampling of the alignment and phylogenetic tree building algorithms available right now, at this very instant.
(PDF not posted, contact me if interested.)



Python Crash Course – Lesson 1
The first lecture of my Python Crash Course went really well! I ran it two evenings ago in the Dean’s Conference Room.
Because I geared the very first lecture toward absolute beginners, I had very little that catered to BIC (Bioinformatics Club) members. I did, however, take the opportunity to talk with them about the SOLVER group (more on that later); many of them seemed interested.
Overall there were roughly a dozen people that turned out, including Ariana, my TA partner from last term. There were about four iGEM members and six BIC members.
I also took the opportunity to poll for the kinds of things that students wanted to learn. Here are my findings.
- Object Orientation is something everyone wants to know– especially the people coming in with a JavaScript, Perl, C, C++ and Scheme background; I was surprised that the C++ people hadn’t been exposed to thinking in objects earlier.
- The beginners came in two groups. First, there are the ones who are happy to learn anything as long as it can be applied later.
- The second group of beginners want to data crunch PDBs, SDFs, FASTAs, Nucleotides etc.
In week two, we’ll take care of object orientation and in week three, we’ll take care of everything anyone ever needs to know about input output in order to do data crunching. I have added a link in the navigation of this blog for the Python Crash Courseware which will eventually include all the PDFs, code modules and examples used in class.
Oh right, I don’t know if I’ll get around to it– but I am missing instructions for setting environment variables in Windows. Perhaps I will add it later when I have time.
(iGEM attendees were John Heil, Danielle Nash, Tiffany and Lina; BIC members included Fiona, James and about four others whose names I have forgotten.)
Edit: Direct link to Python Crash Courseware; Direct link to Week 1: A Mad Mad Introduction, PDF.
Apache Optimized Finally! (Firebug, YSlow)
I didn’t realize I hadn’t added the mod_expires.c and mod_deflate.c items to my httpd.conf file in Apache yet– Andre clued me in!
Andre noticed my blog was taking a while to load, even when the browser cache should have significantly dented the page weight. He used Firebug and Yahoo’s YSlow to make a diagnosis and told me to do the same– this page ended up taking a whopping 17 seconds to load which is … very … sad. After I added these lines to my httpd.conf file, things were looking better (roughly 1.5 seconds — not perfect, but it’s far better).
The mod_expires.c chunk specifies how long files displayed on a webpage ought to live in the browser cache. Apache sends the caching information to the client browser as part of each file’s response headers. Without this, files were apparently expiring instantly, meaning that each refresh required downloading every single file again, including the images comprising this theme’s background.
The mod_deflate.c chunk specifies that file data should be gzipped before transmitting– this is again handled by Apache. The trade off between compressing a few text files (even dynamically generated ones) versus sending uncompressed text is more than fair.
<IfModule mod_expires.c>
FileETag MTime Size
ExpiresActive On
ExpiresByType image/gif "access plus 2 months"
ExpiresByType image/png "access plus 2 months"
ExpiresByType image/jpeg "access plus 2 months"
ExpiresByType text/css "access plus 2 months"
ExpiresByType application/js "access plus 2 months"
ExpiresByType application/javascript "access plus 2 months"
ExpiresByType application/x-javascript "access plus 2 months"
</IfModule>
<IfModule mod_deflate.c>
# these are known to be safe with MSIE 6
AddOutputFilterByType DEFLATE text/html text/plain text/xml
# everything else may cause problems with MSIE 6
AddOutputFilterByType DEFLATE text/css
AddOutputFilterByType DEFLATE application/x-javascript
AddOutputFilterByType DEFLATE application/javascript
AddOutputFilterByType DEFLATE application/ecmascript
AddOutputFilterByType DEFLATE application/rss+xml
</IfModule>
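For a rough sense of why the compression trade-off is fair, here is a minimal Python sketch. It is not something measured on this server: the sample markup is a made-up stand-in for a real page, and gzip_savings is a hypothetical helper name. It only shows that repetitive text, which is what most HTML and CSS is, shrinks a lot under gzip.

```python
import gzip
from typing import Tuple


def gzip_savings(text: str) -> Tuple[int, int]:
    """Return (raw_bytes, gzipped_bytes) for a piece of text."""
    raw = text.encode("utf-8")
    return len(raw), len(gzip.compress(raw))


# Hypothetical stand-in for a page: repetitive markup, like most HTML and CSS.
sample_page = '<div class="post"><p>Hello, world.</p></div>\n' * 200

raw_size, gz_size = gzip_savings(sample_page)
print(f"raw: {raw_size} bytes, gzipped: {gz_size} bytes")
```

The exact numbers depend on the content, so it is worth running this against a saved copy of a real page before drawing conclusions.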
I’ve also removed the custom truetype font files specified in the CSS… they aren’t handled correctly for whatever reason– even after I added ‘font/ttf’ entries to the mod_expires.c chunk above. Finally, I tried completely removing background images from the site and restoring them again– it doesn’t make things any faster after images have been cached (correctly, finally).
I am very happy.
Project management software I want to write
Update: if you’re from Waterloo iGEM and want to work on this with me, see here.
I’ve contemplated this for so long, I think it’s time to put the plans into writing– at least as a draft so I have something to build upon.
There are three pieces of software I’ve been wanting to write. The rules are simple. I’ve become so addicted to cloud computing, I wouldn’t even contemplate desktop applications– so this stuff will be browser powered and hosted on Zinc. Each piece of software must intuitively communicate with each other piece — this can only be done where logical as is discussed below. Each application must offer familiar interfaces, seemingly simple operation and few tunable parameters. Finally, this stuff should be released with a cheap as free (as in beer, birds and guilt) license.
Tarocchi – Task Rhythms
The first application is for time and task management, I call it tarocchi as in ‘tarot’ as in the cards– ironically to mean that we are masters of our own future (immediate or distant). As much as I contested the idea of working-sphere based task management, I came to realize that’s what I’d benefit from the most the more I analyzed what I was already doing. Of course, this is not quite as invasive/pervasive as a technology that would stop irrelevant phone calls or e-mails from reaching me– but it is designed to break a workday apart into manageable hour-long units.
Each hour, tarocchi’s heartless silicon clutches will deal the end user a card inscribed with the task for the next hour. For individuals with many projects, this provides a visual cue to switch gears so as not to burn out on any one task. The user must accept the task for the time spent on it to be logged. The user may accept the task at any point within the hour; just as with pausing a task in progress, tarocchi logs only the amount of time the user claims to have worked, and it does not track work done after the hour unit is complete.
At the time of card dealing, the user may comment on the previous card and indicate whether or not it has been completed so that it will be placed into a deck of finished tasks.
Although tarocchi has no soul, it is not meant to force the user to work but is designed to make it psychologically easier to let go of a current task, and to beat the early morning, mid-day and end-of-day mental blahs.
A user may opt to reject a dealt card, whereupon tarocchi will ask if a break is needed, or if another card should be drawn. If the user manages to exhaust the entire remaining deck by rejecting every card, tarocchi will shuffle the cards and present the deck again. Tarocchi doesn’t operate completely at random; a few heuristics and user-defined parameters tune how often, how long and in what order tasks may occur. There is also a facility to specifically tune when a particular deck should be played (date, time)– this functionality will make more sense after seeing mastermind and jigsaw.
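The dealing, rejecting and reshuffling described above could be sketched roughly as follows. This is only a draft under assumptions: the names Card, Deck and set_aside are hypothetical, the heuristics and user-defined parameters are left out, and putting an accepted but unfinished card back for a later hour is my guess at sensible behaviour rather than a settled rule.

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional

HOUR_UNIT_MINUTES = 60  # tarocchi breaks the workday into hour-long units


@dataclass
class Card:
    task: str
    minutes_logged: int = 0
    done: bool = False


@dataclass
class Deck:
    remaining: List[Card]
    set_aside: List[Card] = field(default_factory=list)  # rejected or unfinished cards
    finished: List[Card] = field(default_factory=list)
    rng: random.Random = field(default_factory=random.Random)

    def deal(self) -> Optional[Card]:
        """Draw the next card; once the deck is exhausted, reshuffle the set-aside cards."""
        if not self.remaining and self.set_aside:
            self.remaining, self.set_aside = self.set_aside, []
            self.rng.shuffle(self.remaining)
        return self.remaining.pop(0) if self.remaining else None

    def reject(self, card: Card) -> None:
        """A rejected card waits until the rest of the deck has been seen."""
        self.set_aside.append(card)

    def accept(self, card: Card, minutes_worked: int, completed: bool = False) -> None:
        """Log only the time the user claims, capped at the hour unit."""
        card.minutes_logged += max(0, min(minutes_worked, HOUR_UNIT_MINUTES))
        if completed:
            card.done = True
            self.finished.append(card)
        else:
            # Assumption: unfinished cards return to the deck for a later hour.
            self.set_aside.append(card)
```

A typical hour would call deal(), then either reject() the card or accept() it with the minutes claimed.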
Tarocchi has a guest mode which allows a logged-off user to browse a named user’s deck, each task’s progress and the time into the current task, should that user have made the given card, deck or their own profile available for prying eyes. This guest interface doubles as the weekly report generated by tarocchi, which allows the user to self-analyze how much time they spent on each task– which tasks were more often rejected, accepted, had breaks put in etc.
Mastermind – Team Brains
The second application fills a niche for iGEM, I call it mastermind. Mastermind is software designed to delegate tasks for a given team of individuals and to keep track of milestones. I will first describe mastermind as a stand alone. The guest-level interface shows the progress of particular projects, and its logical milestones. There is no enforcement on how large these milestones can be, as mastermind does not attempt to comprehend real-world implications, only to represent supervisors’ comprehension of tasks. Milestones should never be deleted, and have a five-valence progress descriptor: preproduction, churning, done, aborted, and paused. Tagged with each task or milestone are descriptions, notes, messages, links to materials they represent and other references. The guest may browse essentially the deliverable history of the team.
The user-level interface allows a user to own a particular task, edit its contents and change its progress– additionally, users may break a long task into smaller logical tasks. The supervisor-level interface additionally allows the creation and delegation of tasks and the administration of users.
Mastermind does not inherently keep track of time spent on milestones, only the materials used and how complete each milestone and task is. Mastermind does optionally keep track of when a particular project should be worked on by a specific user… this functionality will make more sense after you’ve read about jigsaw.
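As a first stab at the data model, the five progress states and the never-delete rule might look something like this. The class and field names (Progress, Milestone, split, retire) are placeholders I made up for illustration, not a finished spec.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Progress(Enum):
    """The five-valence progress descriptor."""
    PREPRODUCTION = "preproduction"
    CHURNING = "churning"
    DONE = "done"
    ABORTED = "aborted"
    PAUSED = "paused"


@dataclass
class Milestone:
    title: str
    owner: str = ""
    progress: Progress = Progress.PREPRODUCTION
    notes: List[str] = field(default_factory=list)
    subtasks: List["Milestone"] = field(default_factory=list)

    def split(self, *titles: str) -> List["Milestone"]:
        """Break a long task into smaller logical tasks with the same owner."""
        new_tasks = [Milestone(title=t, owner=self.owner) for t in titles]
        self.subtasks.extend(new_tasks)
        return new_tasks

    def retire(self, note: str = "") -> None:
        """Milestones are never deleted; mark them aborted and keep the history."""
        self.progress = Progress.ABORTED
        if note:
            self.notes.append(note)
```

Note that there is deliberately no delete method and no time tracking here, since mastermind leaves time to tarocchi.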
Tarocchi and Mastermind work together as follows. Mastermind would pass a deck of cards to Tarocchi corresponding to each of the teams and tasks that user has committed to. Tarocchi would then ask the user a few questions about priorities and shuffle these cards into its emotionless clutches to be dealt to the user. These cards also inherit the project descriptions provided by mastermind, and similarly these properties can be updated by every user with a copy of this card within tarocchi. Tarocchi would then tell mastermind whether or not the task depicted on the card has been completed, and also pass back the amount of time that was worked by this user. For tasks for which mastermind has defined a specific time frame (date and time), the corresponding cards are not dealt unless the user has entered that frame. Mastermind tasks thus display the time spent on tasks per user through tarocchi.
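The time-frame rule is the easiest piece of this hand-off to pin down, so here is a small sketch of it. ScheduledCard and dealable are hypothetical names, and treating the frame as start-inclusive and end-exclusive is my assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class ScheduledCard:
    task: str
    start: Optional[datetime] = None  # None means mastermind set no time frame
    end: Optional[datetime] = None


def dealable(cards: List[ScheduledCard], now: datetime) -> List[ScheduledCard]:
    """Keep cards with no time frame, plus cards whose frame contains `now`."""
    return [
        card for card in cards
        if card.start is None or card.end is None or card.start <= now < card.end
    ]
```

Tarocchi would run its deck through a filter like this before shuffling, so a card tied to a frame simply never appears outside of it.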
Jigsaw – Scheduler
Jigsaw plays well with calendaring software such as Google Calendar. This is basically a revival of the scheduling software I was interested in being involved with. When shifts are to be assigned and a specific number of people must show up for each shift, each person’s reported availability has to be fit together piecewise– jigsaw would magically puzzle the pieces into place. A supervisor defines the week, month or other time range they want filled, and what shifts exist. Users fill in their availability and guests may see the result. Users get an e-mail about which shifts they’ll take, and may subscribe with RSS or export to an offline or online calendar. Little hitches like insufficient availability, users forgetting to report their availability, incorrect reports etc., can be resolved through automated messages to supervisors and users involved.
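The "magic" still has to be worked out, but one simple starting point is a greedy fit: fill the hardest shifts first and prefer whoever has the fewest shifts so far. The sketch below assumes that approach; assign_shifts and its arguments are hypothetical names, and a greedy pass is not guaranteed to find the best possible schedule.

```python
from typing import Dict, List, Set, Tuple


def assign_shifts(
    required: Dict[str, int],
    availability: Dict[str, Set[str]],
) -> Tuple[Dict[str, List[str]], Dict[str, int]]:
    """Greedily fit people into shifts.

    required maps shift name -> number of people needed.
    availability maps person -> set of shifts they reported being free for.
    Returns (schedule, shortfalls), where shortfalls lists how many people
    each under-filled shift is still missing.
    """
    load = {person: 0 for person in availability}
    schedule: Dict[str, List[str]] = {}
    shortfalls: Dict[str, int] = {}

    def candidates(shift: str) -> List[str]:
        return [p for p, shifts in availability.items() if shift in shifts]

    # Fill the shifts with the fewest available people first.
    for shift in sorted(required, key=lambda s: (len(candidates(s)), s)):
        # Prefer people with the lightest load so far; break ties by name.
        pool = sorted(candidates(shift), key=lambda p: (load[p], p))
        chosen = pool[: required[shift]]
        for person in chosen:
            load[person] += 1
        schedule[shift] = chosen
        missing = required[shift] - len(chosen)
        if missing > 0:
            shortfalls[shift] = missing

    return schedule, shortfalls
```

The shortfalls dictionary is what would trigger the automated messages to supervisors when availability is insufficient.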
Jigsaw would likely communicate with Mastermind and Tarocchi as follows (if at all, as this gets complicated). Because there isn’t a clear logical entry point, I’ll have to make both jigsaw and mastermind applications that can begin the crosstalk. I generally avoid this kind of design because it is implicitly redundant– so in lieu of a better method, here we go. If a supervisor uses both jigsaw and mastermind and creates a task, they can specify on either that this task requires the other application. In mastermind, one would check off a box and indicate that this project must be scheduled in particular shifts. In jigsaw, one would indicate where these jigsawed shifts fit into the team’s projects. The functionality is far clearer from here. After all of the piecewise fitting is complete, jigsaw then tells mastermind who has been assigned to the task, and when. Mastermind does not communicate with jigsaw after this. Mastermind however does indicate to tarocchi when a particular project must and can only occur so that cards related to this project are dealt when a user enters a specific shift.
Overall
The interface that a user should see most often is tarocchi– it keeps often times overwhelming details of time management hidden while actual work is being done. Taking the worry away from choosing what to work on amongst self-defined, team-based and team-scheduled projects is a plus for people that have well defined tasks, but ill defined working schedules. No additional bloat such as chatting or person-to-person messaging should be added aside from the notes interface within mastermind– the focus is on simple, effective and time saving means to manage projects.
First Steps
I think I’ll develop a draft and specs for tarocchi while recruiting developers for tarocchi and mastermind.
Having an image logo AND logotype in Arclite (WordPress)
Update: See this post if you have Arclite 2.02.
Arclite is still by far my favourite theme available on WordPress. At some point in time I wanted to have both my snazzy Sigma/E logo up along with the logotype “Ed’s Big Plans”. Arclite doesn’t allow this (at least there is no such setting in this version), so after some more digging in header.php, I found this chunk of code.
<?php // logo image?
if(get_option('arclite_logo')=='yes' && get_option('arclite_logoimage')) { ?>
<h1><a href="<?php bloginfo('url'); ?>/">
<img src="<?php print get_option('arclite_logoimage'); ?>" title="<?php bloginfo('name'); ?>" alt="<?php bloginfo('name'); ?>" /></a></h1>
<?php } ?>
The above is code that checks if the user has selected to use an image logo– when this logo is available, Arclite displays the image and moves on down the page. Alternatively…
<?php else { ?>
<h1><a href="<?php bloginfo('url'); ?>/"><?php bloginfo('name'); ?></a></h1>
<?php } ?>
…when the user has selected not to use an image, Arclite handily prints out logotype.
Since I want Arclite to render both the image logo and logotype when they’re available, I’ve simply slapped the logotype printing in with the former snippet of code to make this…
<?php // logo image?
if(get_option('arclite_logo')=='yes' && get_option('arclite_logoimage')) { ?>
<h1><a href="<?php bloginfo('url'); ?>/">
<img src="<?php print get_option('arclite_logoimage'); ?>" title="<?php bloginfo('name'); ?>" alt="<?php bloginfo('name'); ?>" style="vertical-align:text-bottom;" />
<?php bloginfo('name'); ?>
</a></h1>
<?php }
Quick and painless
Some inline CSS was used to get the logo and the text to sit inline with one another; by default, the logo goes out of line up high and the text goes down low.
Note that when one chooses not to use an image logo in Arclite’s settings, it just goes back to the default logotype behaviour– exactly as you’d expect.
fsMSA Algorithm… Monkeys in a β-Barrel…
I’ve finally finished documenting my foil sensitive protein sequence algorithm… This is part of Monkeys in a β-Barrel — a work in progress, this time continuing more on Andrew’s half of the problem rather than Aron’s.
I’ve decided on using the word “foil” to mean “internal repeat” since it’s easier to say and less awkward in written sentences. Andre suggested it after “trefoil” and “cinquefoil”, the plant.
Thumbnails below (if you are curious about the full slide show, contact me).



Ed's Big Plans