Ed's Big Plans

Computing for Science and Awesome

Archive for the ‘Recombinatron’ tag

iGEM: Freedom Unhashed

An iGEM modeling meeting was held yesterday wherein Andre revealed his big plans for switching the team into enduserhood. Unfortunately, I didn’t follow along as well as I could have this time around and can really only document and comment on the bottom line.

We’ve again self-organized into two to three teams based on task. The first team is charged with creating a hashing function that produces a sequence of integrase-usable tokens from an integer. The second (and third?) team is responsible for creating a check to ensure that a given product corresponds correctly to a given pair of reactant sequences. Finally, the dangling task of creating an even bigger external harness, along with modifications to the present main.py program logic, is likely being handled by the latter team.

The Hashing Task is kind of interesting because it essentially calls for unhashing an integer into a meaningful sequence rather than hashing a meaningful sequence into a unique integer. Since the reactant strings can themselves be ordered lexicographically, the task quickly becomes an enumeration or counting problem: we need to find the most efficient way to count through the possible permutations of reactant tokens until we reach the integer that we want. The backward task (what we’re doing) may end up being implemented as the forward task with a sequential search.
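To make the counting idea concrete, here is a rough sketch. It assumes a made-up four-letter token alphabet and an ordering where shorter sequences come first and ties are broken alphabetically; it also allows tokens to repeat. None of this is taken from the actual Recombinatron code, and the function names are hypothetical.

```python
from itertools import count, product

TOKENS = "ABCD"  # hypothetical alphabet; the real integrase tokens are not shown here


def hash_sequence(seq, tokens=TOKENS):
    """Forward task: map a token sequence to its position in the ordering."""
    base = len(tokens)
    # Count every sequence strictly shorter than seq: base^0 + ... + base^(len-1).
    offset = sum(base ** k for k in range(len(seq)))
    value = 0
    for ch in seq:
        value = value * base + tokens.index(ch)
    return offset + value


def unhash_by_search(n, tokens=TOKENS):
    """Backward task implemented as a sequential search over the forward task."""
    for length in count(0):
        for candidate in product(tokens, repeat=length):
            seq = "".join(candidate)
            if hash_sequence(seq, tokens) == n:
                return seq


def unhash_direct(n, tokens=TOKENS):
    """Backward task by counting: find the length first, then read off the tokens."""
    base = len(tokens)
    length = 0
    while n >= base ** length:
        n -= base ** length
        length += 1
    digits = []
    for _ in range(length):
        n, r = divmod(n, base)
        digits.append(tokens[r])
    return "".join(reversed(digits))
```

The sequential search is the easy version to get right, and the direct version is what "the most efficient way to count" might look like if the ordering really is this regular.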

The hashing subteam is headed by Jordan, the modeling head from last year, who is joined by Wylee and me. I honestly think this is a task that one person could complete in a single bout of insanity, so it’s likely that I’ll hop over to Andre’s reactant-product verification team whenever this finishes.

We’ve planned another meeting for Tuesday 5pm next week to pull whatever we have together and to tackle any nascent problems.

Reactant-Product Verification is, I think, the more straightforward item, at least to explain. It is likely more technically challenging. Basically, we make the reaction go forward, and if the product matches what we wanted, then we favour the persistence of the product. … Err, at least that’s how I understood it… I’ll probably need to pop in and ask about it on Thursday before the big oGEM Skype meeting.

Side note– Oddly, both Shira and John were present at this meeting– it probably means we’re expecting progress 😀

Eddie Ma

July 22nd, 2009 at 5:36 pm

Big Bang Day! A Recombinatron Story

Matthew also has a post about Recombinatron & Big Bang Day here.

Integrase Enzyme Alphabet

‘Big Bang Day’ was this awesome Saturday morning where a bunch of the UWiGEM modeling folks came together and integrated our modules. We ended up delegating more work, understanding the problem better and fixing up some of the logic in the big picture. After everyone had filed in and we managed to figure out how to synchronize with the SVN… and after I managed to break the SVN and fix it again (thankfully!), it was time to get to work. I think the project as it stands right now is more or less done unless Giant Scaffold manages to find something that needs fixing.

Big Picture Drawn on the Chalk Board

The big picture was simplified to three giant objects that pass a big bag of DNA to one another in sequence.

The DNA bag is actually a list of DNAClass objects (DNAObjects). We decided not to create our own collection, since there’s already a Python list. The Giant Scaffold module controls the movement of the DNA bag, or a subset thereof, from storage to Operators to Filters.
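Here is a minimal sketch of what one trip of the bag from storage to Operators to Filters might look like. Plain strings stand in for DNAObjects, and run_round, toy_operator and toy_filter are made-up names for illustration, not the real Giant Scaffold, Operators or Filters code.

```python
def run_round(bag, operator, keep):
    """One pass of the bag: storage -> operator -> filter -> back to storage.

    bag      -- plain list standing in for the list of DNAObjects
    operator -- callable taking one item and returning a list of products
    keep     -- callable returning True for products that survive filtering
    """
    products = []
    for item in bag:
        products.extend(operator(item))
    return [p for p in products if keep(p)]


# Toy stand-ins (assumptions, not the real operators or filters):
def toy_operator(strand):
    # Pretend each strand yields itself plus a reversed copy.
    return [strand, strand[::-1]]


def toy_filter(strand):
    # Pretend strands starting with "X" are uninteresting.
    return not strand.startswith("X")


bag = ["AXB", "XAB"]
bag = run_round(bag, toy_operator, toy_filter)
print(bag)  # ['AXB', 'BXA', 'BAX']
```

Using an ordinary list keeps the scaffold trivial: extending, filtering and handing off a subset are all built in.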

I was working with the Operators team– Basically, Matthew finished our team’s work because I managed to get swamped with thesis defense preparations, and Andre managed to take down the UWiGEM server.

(Andre incidentally has a post about taking down servers and making backups here.)

The Operators team ended up producing two big functions (and their little internal functions) and one support function.

A Couple of Filters

reactOneStrand(DNAObject) – Produces a list of resultant DNAObjects when the integrase enzyme is used on a single strand of DNA– this function may produce a one-list, two-list or three-list of DNAObjects. One-lists result from inversion (indirect) reactions, two-lists result from excision (direct) reactions, and three-lists result from palindromic operators.

reactTwoStrands(DNAObject, DNAOther) – Produces a list of resultant DNAObjects when two strands of DNA are reacted together with integrase.

The Filters team split their filters into three big enclosing functions– these three functions are equivalent to categories based on the likelihood that a given event would happen in a cell (Frequent, Moderate, Infrequent).

I unfortunately had to leave roughly 2.5 hours into Big Bang Day on other business but was happy to continue the madness online and on a subsequent Monday.

And now… some more photos…

Look! Everyone’s on their laptops– gee, I didn’t know they had Python on computers now!

Wylee, Brandon, Jordan

Wylee, Brandon and Jordan are all part of the Filters team.

Chong, Matthew, Andre

Chong is part of the Giant Scaffold team. Matthew and Andre are part of the Operators team.

Eddie Ma

July 3rd, 2009 at 1:12 pm

Yesterday and Today on UWiGEM Recombinatron

So, I’ve been assigned the Recombinatron DNA submodule and spent a good part of yesterday morning and afternoon working on it at the UWiGEM office. I’ve brainstormed what features the submodule should have and have finished a sizeable chunk of it.

While doing so, I managed to learn how to use the Python yield keyword (along with raise StopIteration), and all about abstract functions and how to manipulate them under the hood. Abstract functions (what the Python documentation calls special methods) are the machinery behind items like “len()”, “some_collection[5:7]” and “some_object >= another_object”.

Basically, the DNAClass submodule will be the atomic type that will be passed between the different larger modules of the project. Each DNAClass instance (DNAObject) encapsulates a read-only string that is to be iterated in a loop either forward or backward, along with the ability to be sliced as a string. This all must be transparent to the user.
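As a rough illustration of the idea, and not the actual DNAClass, here is a small class that wraps a read-only string and hooks into len(), indexing, slicing, forward and backward iteration, and >=. The name DNASketch and the choice to compare token strings alphabetically are assumptions made for the example.

```python
class DNASketch:
    """Illustrative stand-in for a DNAObject; not the real DNAClass."""

    def __init__(self, tokens):
        self._tokens = str(tokens)

    @property
    def tokens(self):
        # No setter is defined, so assigning to .tokens raises AttributeError.
        return self._tokens

    def __len__(self):              # len(dna)
        return len(self._tokens)

    def __getitem__(self, key):     # dna[2] and dna[1:3]
        return self._tokens[key]

    def __iter__(self):             # for token in dna
        return iter(self._tokens)

    def __reversed__(self):         # for token in reversed(dna)
        return reversed(self._tokens)

    def __ge__(self, other):        # dna >= other
        # Assumption: compare the underlying token strings alphabetically.
        return self._tokens >= other._tokens


dna = DNASketch("ABCD")
print(len(dna))                 # 4
print(dna[1:3])                 # BC
print("".join(reversed(dna)))   # DCBA
```

Because the special methods delegate to the wrapped string, the caller can treat the object much like a string without ever touching the internals.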

I might go into detail later, but for now, here’s a good resource: the Ordered Dictionary [odict] class by Nicola Larosa and Michael Foord. The odict source is actually an excellent primer; it contains many useful comments that have really helped me figure out iterators, the slice object and abstract functions.

Finally, I only really have two items left that are mandatory– fixing up slices when a “None” object is used, and then the iterator-iterators…

The iterator-iterators (terrible idea actually) would be a list of iterators, with each iterator starting at a different position in the loop of DNA, and all of those positions corresponding to the same token.

I’m thinking now that I should replace it with an accessor that returns a list of positive integers corresponding to the tokens of interest instead; this can still be transparent to the user AND have the benefit of not being unwieldy to implement. Having nested yield statements is just asking for trouble.
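A minimal sketch of such an accessor, assuming the strand can be iterated token by token and that positions are counted from zero (the name token_positions is made up for this example):

```python
def token_positions(strand, token):
    """Return every index at which `token` occurs in `strand`."""
    return [i for i, t in enumerate(strand) if t == token]


print(token_positions("ABCAB", "A"))  # [0, 3]
```

The caller can then start an ordinary iteration at each returned position, which avoids nesting generators altogether.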

Update: Done. An initial version has been committed to the repository.

Eddie Ma

June 16th, 2009 at 11:24 am

Modeling Meeting

Modeling Team Selection with Flush();

A modeling meeting occurred on Wednesday. Andre headed off the discussion and revisited the entire program layout in a nice chalkboard cartoon. Unfortunately, Andre generally doesn’t push down hard enough or make wide enough lines with the chalk in order to make a high enough contrast image against the black board for photography (i.e. faint drawing => no photos, sorry).

The discussion saw the formalization and division of the programming problem into three distinct software components as follows.

  • Genetic Fragment Operators
  • Genetic Fragment Filters
  • Overall Program Logic

Genetic Fragment Operators

These are the functions that represent reverse-complementation, enzyme activity, etc.
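For a flavour of what the simplest operator looks like, here is reverse-complementation on plain nucleotide letters. This is only a sketch: the project uses its own single-letter token system, so the A/C/G/T alphabet here is an assumption, and reverse_complement is a made-up name.

```python
# Translation table pairing each base with its complement (A<->T, C<->G).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Complement every base, then reverse the strand."""
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))  # GCAT
```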

Genetic Fragment Filters

These are functions that represent removing uninteresting, ‘inert’, undesirable and fatal fragments of DNA. This definition will become more precise once we’ve worked on the project a bit and better understand the philosophical correctness of each of these notions.

Overall Program Logic

The overall program logic will constitute producing some structure that represents a Big Bag of DNA (as opposed to a cell), communication between this Big Bag, the Operator module and the Filter module and of course– our main program loop.

What I’m doing…

I’ve been tasked with producing a universal representation of DNA which includes a circular iterator on a loop of DNA with an arbitrary starting position. This is OK to do in Python with the use of the ‘yield’ keyword. I will be borrowing from Jordan / Brendan / my own previous ideas for this representation. We want to have an easy single-letter-token system and for the moment are happy with the single-byte space ASCII has to offer.
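A circular iterator of this kind can be sketched in a few lines with a generator. This is only an illustration, not the code that went into the project; circular_iter and its parameters are hypothetical names, and a plain string stands in for the loop of DNA.

```python
def circular_iter(loop, start=0, reverse=False):
    """Yield each token of a circular sequence once, beginning at `start`."""
    n = len(loop)
    step = -1 if reverse else 1
    for offset in range(n):
        # The modulo wraps the index around the ends of the loop.
        yield loop[(start + step * offset) % n]


print("".join(circular_iter("ABCDE", start=2)))                # CDEAB
print("".join(circular_iter("ABCDE", start=2, reverse=True)))  # CBAED
```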

Eddie Ma

June 12th, 2009 at 8:41 am