Archive for the ‘Mathematical Modeling’ tag
iGEM: Freedom Unhashed
An iGEM modeling meeting was held yesterday wherein Andre revealed his big plans for switching the team into enduserhood. Unfortunately, I didn’t follow along as well as I could have this time around and can really only document and comment on the bottom line.
We’ve again self-organized into two or three teams based on task. The first team is charged with creating a hashing function that produces a sequence of integrase-usable tokens from an integer. The second (and third?) team is responsible for creating a check to ensure that a given product corresponds correctly to a given pair of reactant sequences. Finally, the dangling task of creating an even bigger external harness, along with modifications to the present main.py program logic, is likely being handled by the latter team.
The Hashing Task is kind of interesting because it essentially calls for unhashing an integer into a meaningful sequence rather than hashing a meaningful sequence into a unique integer. Since the reactant strings can themselves be lexicographically sequenced, the task quickly becomes an enumeration or counting problem in which we find the most efficient way to count through the possible permutations of reactant tokens until we reach the integer that we want. The backward task (what we’re doing) may end up being implemented as the forward task with a sequential search.
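As a rough illustration of the counting idea, here is a minimal sketch that maps an integer directly to a token sequence instead of searching for it sequentially. It assumes sequences are ordered by length first and then lexicographically, and that tokens may repeat; the alphabet "BPLR" and the names unrank() and rank() are made up for this example and are not from the team's code.

```python
def unrank(n, alphabet):
    """Return the n-th (0-based) non-empty token sequence, shortest first,
    then in lexicographic order of the given alphabet (hypothetical helper)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    k = len(alphabet)
    tokens = []
    n += 1  # shift into bijective base-k numbering, which has no zero digit
    while n > 0:
        n, digit = divmod(n - 1, k)
        tokens.append(alphabet[digit])
    # digits come out least-significant first, so reverse them
    return "".join(reversed(tokens))


def rank(sequence, alphabet):
    """Inverse of unrank(): the 'forward' hash from sequence to integer."""
    k = len(alphabet)
    n = 0
    for token in sequence:
        n = n * k + alphabet.index(token) + 1
    return n - 1


# Assumed single-letter tokens, purely for illustration.
ALPHABET = "BPLR"
first_six = [unrank(i, ALPHABET) for i in range(6)]
```

With this ordering, rank(unrank(i, ALPHABET), ALPHABET) gives back i, so the same pair of functions covers both the forward and the backward task.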
The hashing subteam is headed by Jordan, the modeling head from last year, and is joined by myself and Wylee. I honestly don’t see this as a task that can’t be completed by one person in a single bout of insanity, so it’s likely that I’ll hop over to Andre’s reactant-product verification team whenever this finishes.
We’ve planned another meeting for Tuesday 5pm next week to pull whatever we have together and to tackle any nascent problems.
Reactant-Product Verification is, I think, the more straightforward item, at least to explain. It is likely more technically challenging. Basically, we make the reaction go forward, and if the product matches what we wanted, then we favour the persistence of the product. … Err, at least that’s how I understood it… I’ll probably need to pop in and ask about it on Thursday before the big oGEM Skype meeting.
Side note: oddly, both Shira and John were present at this meeting, which probably means we’re expecting progress.
Operator Group Meeting
The Operator Group (UWiGem/Modeling/Operators) had a meeting about a week ago. The meeting ended up being between three people (Matthew, Andre and me) at the iGEM office. We’ve basically figured out everything we needed to in terms of raw interfaces between our module and the remaining two modules (Filtering Group and Giant Scaffold Group). The DNAClass was updated with the needs Andre presented, one of which is the ability to iterate over a DNAObject while yielding both the token index and the token as a pair: (index, token).
The enumerate() built-in in Python (PEP 279) can’t be customized by the collection it iterates over. It simply counts upward by one as it goes, from zero by default or from the optional start argument, regardless of the collection’s own indexing. Ideally, the count should reflect the index on the circular DNA strand, which means it should be able to count forward or backward (iterating as the reverse complement) and from any arbitrary position in the loop.
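A minimal sketch of such an index-aware iteration, written as a standalone generator rather than as the actual DNAObject method (whose code isn't shown here). The function name circular_enumerate() and its parameters are assumptions for illustration; note that the reverse walk below only reverses direction and does not complement the tokens.

```python
def circular_enumerate(tokens, start=0, reverse=False):
    """Yield (index, token) pairs once around a circular sequence.

    Hypothetical helper: begins at `start` and walks forward, or backward
    when `reverse` is True, wrapping around the end of the sequence.
    """
    n = len(tokens)
    step = -1 if reverse else 1
    for offset in range(n):
        # modulo keeps the index on the loop, including for negative values
        index = (start + step * offset) % n
        yield index, tokens[index]


forward = list(circular_enumerate("ABCD", start=2))
backward = list(circular_enumerate("ABCD", start=1, reverse=True))
```

Unlike enumerate(), the index yielded here is the token's real position on the loop, so it stays meaningful whichever direction or starting point is used.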
Note that the reverse complement copy constructor (DNAObject.rc()) does not cause indexes to be reversed. It produces a reverse complement strand and doesn’t do anything special with the indices (i.e. the new strand’s indices increase as it iterates forward). This behaviour is being debated now. On the one hand, it’s correct because a reverse complement strand is a new strand; on the other, it is not a strand de novo, since it came from a positive sequence.
I’m now waiting for Andre to let me know about the functions and data frameworks needed for the Operators module; my feeling is that the functions will be the straightforward integrase enzyme actions and that the data framework will simply be a Python list.
Modeling Meeting

Modeling Team Selection with Flush();
A modeling meeting occurred on Wednesday. Andre led off the discussion and revisited the entire program layout in a nice chalkboard cartoon. Unfortunately, Andre generally doesn’t push down hard enough or make wide enough lines with the chalk to produce an image with enough contrast against the blackboard for photography (i.e. faint drawing => no photos, sorry).
The discussion saw the formalization and division of the programming problem into three distinct software components as follows.
- Genetic Fragment Operators
- Genetic Fragment Filters
- Overall Program Logic
Genetic Fragment Operators
These are the functions that represent reverse-complementation, enzyme activity, etc.
Genetic Fragment Filters
These are functions that represent removing uninteresting, ‘inert’, undesirable and fatal fragments of DNA. This definition will become more precise once we’ve worked on the project a bit and better understand the philosophical correctness of each of these notions.
Overall Program Logic
The overall program logic will consist of a structure that represents a Big Bag of DNA (as opposed to a cell), the communication between this Big Bag, the Operator module and the Filter module, and of course our main program loop.
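To make the division of labour concrete, here is a minimal sketch of how such a loop might tie the three components together. Everything in it is an assumption for illustration: the bag is modeled as a set of strings, and reverse_fragment() and short_enough() are placeholders standing in for real operators and filters, not the team's actual functions.

```python
def run(bag, operators, filters, rounds):
    """Hypothetical main loop: apply every operator to every fragment,
    then keep only the fragments that pass all filters."""
    bag = set(bag)
    for _ in range(rounds):
        produced = set()
        for fragment in bag:
            for operate in operators:
                # each operator returns an iterable of new fragments
                produced.update(operate(fragment))
        bag = {f for f in bag | produced if all(keep(f) for keep in filters)}
    return bag


def reverse_fragment(fragment):
    """Placeholder operator: returns the fragment read backward."""
    return [fragment[::-1]]


def short_enough(fragment):
    """Placeholder filter: discards fragments longer than four tokens."""
    return len(fragment) <= 4


result = run({"AB", "ABCDE"}, [reverse_fragment], [short_enough], rounds=2)
```

Here "ABCDE" and its reversal are filtered out for length, leaving only "AB" and "BA" in the bag.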
What I’m doing…
I’ve been tasked with producing a universal representation of DNA which includes a circular iterator on a loop of DNA with an arbitrary starting position. This is OK to do in Python with the use of the ‘yield’ keyword. I will be borrowing from Jordan’s, Brendan’s and my own previous ideas for this representation. We want to have an easy single-letter-token system and for the moment are happy with the single-byte space ASCII has to offer.
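A minimal sketch of what such a representation could look like, assuming one ASCII character per token. The class name DNALoop and the method iter_from() are invented for this example and are not the real DNAClass.

```python
class DNALoop:
    """Hypothetical circular DNA container with single-letter ASCII tokens."""

    def __init__(self, tokens):
        self.tokens = str(tokens)
        # raises UnicodeEncodeError if any token falls outside ASCII
        self.tokens.encode("ascii")

    def __len__(self):
        return len(self.tokens)

    def iter_from(self, start=0):
        """Yield every token exactly once, beginning at position `start`."""
        n = len(self.tokens)
        for offset in range(n):
            yield self.tokens[(start + offset) % n]

    def __iter__(self):
        return self.iter_from(0)


loop = DNALoop("ABCD")
from_three = "".join(loop.iter_from(3))
```

Because iter_from() is a generator, nothing is copied or rotated; the loop is simply read starting from a different offset.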
Integrase Problem Introduction
In a meeting with iGemmers @ Waterloo today (specifically the Modeling team headed by Andre and supplemented by core members Sheena and John), a discussion was held on this year’s modeling project. We’re currently interested in creating a solver that will yield an arrangement of attX sites on a chassis bacterial host chromosome that can accommodate several rounds of deterministic recombination.
Plainly, we need to write software that will create a solution that is a sequence of DNA. This DNA is arranged such that specific sites that can be operated on by the enzyme integrase are ordered so that the sequence can accept several loops of artificial DNA to recombine with. In this design, we’re interested in a sequence for the host chassis, a sequence for the artificial loops and another loop for integrase to be produced at some arbitrary tonic level inside the cell.
The first step is to mathematically formalize the problem and, along with it, write some working particles of software that successfully model the problem space. The solver is a yet more abstract piece of software that will use these particles in its solution. This is similar to designing the notion of integers and arithmetic operators prior to using those components to solve algebra.
This description is very coarse– I’ll refine it in a later post after I’ve had some time to analyze the problem constraints and what software particles are important to set down on paper.
Ed's Big Plans
I’d actually like to see a bit more “current product verification” — that is, verifying that the code we currently have actually works — before moving on to the distributed-computing-and-madness realm.
That aside, I’m glad you figured out the hashing stuff. Just out of curiosity, what exactly is an open-form, lexicographically sequenced, permuted, time-amortized, mathematical expression that falls under counting problems, anyway?
Okay, I accept your challenge: It is exactly as it sounds. Although I’m certain it didn’t sound *that* terrible when I said it
Current product verification? Of course.