Archive for the ‘Academic Life’ Category
I think it’s safe to say that I’m unlikely to post any more on this blog (although announcements related to my department will still be broadcast here until I find a suitable spot for them). The original intention was to quickly write down a few thoughts, especially when I learned something new, and then be able to refer back to them in future. At some point, there was a transition from shorter, less meaningful posts that were quick and easy to write to longer, more meaningful posts that took a lot of time to produce. (At the same time, I recall trying to write posts that became less and less personal, to the point where the narrative suffered: Who’s telling the story? Why would I even look back on these posts in future?) After that, posts became very sparse, as the time I had wanted to dedicate to side projects was consumed by a razor-sharp focus on my thesis. In retrospect, that focus served me very well.
My gut feeling is that there is still an exception: a need that has been left unattended ever since I stopped completing articles for the blog. I still need to write things down where I can review them later, and to be able to hand those articles to someone else as an answer to a question (passing a URL is easy).
My wiki is still up (again, rarely updated), and it seems to be a reasonable stop-gap. The problem is that there’s not really a sense of accomplishment in writing things down there (for the corporate type, what operational definition can we pull out as a key performance indicator? Bytes? Pages? Number of updates over a timespan? None of these describe the practical utility of a wiki’s content).
I’m going to try something different and create something like a journal: a codex that is the culmination of things learned. The codex will exist as a set of loosely organized LaTeX sources and accompanying figures, plus their PDF outputs, arranged in yet another SVN/git repo (I think I have like five on my home server now…). When articles from the codex seem reasonably mature, they’ll get a stamp stating their version number, and a copy will be released somehow for magical search engines to index, or archived somewhere locally if it isn’t appropriate to shove onto the net (e.g. oh hey, this particular article would work 1:1 as part of my dissertation). If the articles are any good, it’d then be possible to string chapters together, and eventually to bind them into a book to have for posterity. Now that’s physical secondary storage of stuff I learned that I’d get to keep and flip through (the ol’ primary storage isn’t as sharp as it used to be).
Having OSX and an SSD means searching for keywords in local PDFs is super quick. (I don’t know what the experience is like in Windows 7/8, but I assume it’s similarly quick to find documents containing phrases; there’s a search field in the system-wide start menu that appears to do the same thing.) I think the codex would become the best way for me to write, since it solves the problems of (1) the fluctuating personality I experienced when blogging with WordPress, and (2) the feeling of unfulfillment I had putting things in my wiki.
Finally, there’s the time commitment. Let’s see: I used to be terrible at budgeting time, and now I’m just bad. So that’s an improvement. Let’s say I spend about six hours each week writing in my codex. Without the pressure to actually release anything as a blog post, I’ll feel like my ongoing project isn’t stagnating. Rather than having a few dozen drafts, I’ll have a few dozen works in progress that are already usable and searchable fragments of chapters. Since each chapter can be on its own subject, I don’t have to worry about keeping to a narrow personality; in the limit, the codex will cover my own personality.
Sorry to see you go! I can definitely relate though: there just aren’t enough hours in a day, and more often than not I find my time would be better spent writing a paper or doing actual research.
Brief: Well, I’ve organized the first Wine & Cheese for the School of Computer Science graduate students slated for next week (University of Guelph). I had significant moral support from Richard Schwarting and Jason Ernst. We’re hoping for a turnout of about thirty including faculty and staff. I’m particularly proud of the logo I made.
I think we’ll have an excellent turnout. Richard and I did advertise as much as humanly possible, after all.
Since I’m heading into Reading Week, I’ve decided that this is a good time to take a moment to reflect on the semester. So far, it’s been very rewarding.
It’s fitting then that I should tell the story starting with the break before this semester. During that break, I spent a few days figuring out whether or not I was happy working on the symmetrical protein evolution problem at the University of Waterloo. As it turned out, my brain had become so trained to think like a computer scientist that there was an added cognitive tax in communicating well with my biology and biochemistry advisors and labmates. It was clear that I would serve better in a role elsewhere. The friends I made at Waterloo are the good ones: they are exactly as capable, willing to help and academically diverse as I need them to be. I’m happy to say that we’ll be picking each other’s brains for a long time. The evolution problem is a very interesting one, and I’ve promised my former advisors that I would help whoever they find to take my place, as that person would inherit a pile of my source code, data and in-house briefings.
I am here now, returned to the University of Guelph. The graduate programs at the two universities differ in many ways; the one most significant to my brain is when graduate courses are taken. At Waterloo, the culture has it that one works primarily on their thesis at all times and takes an occasional course to fulfill degree requirements. At Guelph, the culture has it instead that one front-loads all of their courses into the first semester, so that if the course material is even ostensibly useful during thesis work, that knowledge is already mentally installed and available.
(Of course, the cultural differences are likely influenced by the different disciplines too.)
So this semester, I’m taking Computer Security, Artificial Neural Networks and Image Processing.
Computer Security has been mostly to do with number theory. I’ve retrained myself in matrix multiplication and inversion while learning fast exponentiation, modular powers, modular inverses, multiplicative and additive groups, the totient function, sieves, the extended Euclidean algorithm and much more, I’m sure. I’m happy that I’m continuing to save my notes digitally in triplicate so that I’ll have a backup when my soft meat brain starts to forget. The number theory underpins nice security devices such as RSA encryption. There’s another half semester to go, so whether it will be equally math intensive, or more application intensive and built upon what we’ve already learned, remains to be seen.
Artificial Neural Networks is a nice course for me: it rounds out my repertoire of architectures and training algorithms, seeing as I had only worked with the feed forward, recurrent and recursive (back propagation) cases during my Master’s work.
Image Processing has been a curious class. We’ve been through one round of presentations and are heading into learning the math behind a number of transforms applicable to image processing. This class is very firmly grounded in symbolic math and algebra: each adjective, each concept is meticulously and correctly described mathematically. Consider the precise meaning of continuity below as an optional property of fuzzy inverses (the negation operator ¬).
∀x₀ ∈ [0, 1], ∀ε ∈ ]0, +∞[, ∃η ∈ ]0, +∞[, ∀x ∈ [0, 1], |x − x₀| < η → |¬(x) − ¬(x₀)| < ε
Assertion: For any point x0 and any tolerance ε, however small, there exists a distance η such that every x within η of x0 has a negation within ε of the negation of x0. In other words, a small enough change in x produces only a small change in its negation.
There have been many items to memorize in this course. Luckily, the above mathematical statement isn’t one of them.
I’ve chosen a thesis topic as well. As it turns out, I’ll be working in machine learning, but this time on something of an expert decision support system that replaces the human proof-reader during nucleotide sequencing. If I do things correctly, the system I build will be able (1) to anticipate when it will make a mistake, (2) to anticipate how the human would react to that mistake and (3) to replace the erroneous token with the repaired token that a human expert would choose. I’ll have to go into more detail as I discover it for myself, but it looks to be fun. You’ll notice that step (1) is probably recursively applicable to steps (2) and (3): this device should know when it will make a mistake about a mistake. How this plays out in the decision states of my system remains to be discovered.
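As a footnote to the Computer Security notes earlier in this post, here is a minimal Python sketch of how fast exponentiation, the extended Euclidean algorithm, modular inverses and the totient fit together in RSA. The function names and the tiny textbook primes are my own choices for illustration, not course material, and the key size is hopelessly insecure.

```python
def power_mod(base, exponent, modulus):
    """Fast (square-and-multiply) modular exponentiation."""
    result = 1 % modulus
    base %= modulus
    while exponent > 0:
        if exponent & 1:               # current low bit set: multiply it in
            result = (result * base) % modulus
        base = (base * base) % modulus  # square for the next bit
        exponent >>= 1
    return result


def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def mod_inverse(a, modulus):
    """Return the inverse of a modulo modulus, if a and modulus are coprime."""
    g, x, _ = extended_gcd(a, modulus)
    if g != 1:
        raise ValueError("no inverse: a and modulus are not coprime")
    return x % modulus


# Toy RSA key (illustrative values only)
p, q = 61, 53
n = p * q                    # public modulus
phi = (p - 1) * (q - 1)      # totient of n for two distinct primes
e = 17                       # public exponent, coprime to phi
d = mod_inverse(e, phi)      # private exponent

message = 65
ciphertext = power_mod(message, e, n)
recovered = power_mod(ciphertext, d, n)
```

Decrypting the ciphertext with `d` gives back the original message, because `e * d` leaves a remainder of 1 when divided by the totient. Python’s built-in three-argument `pow` does the same job as `power_mod`; the hand-written version is only there to show the square-and-multiply idea.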
Sounds like you have a pretty cool semester going on. Notational question: I take it you’re using ]a, b[ to represent the interval from a to b excluding a and b? I'm accustomed to using round brackets for that, such that [0, 2) would mean 0 to 2 including 0 but not 2.
[This is the part where I check Wikipedia while writing a comment.] Oh, huh, it turns out that the square brackets thing IS an alternative notation, apparently used in Europe. Good to know.
Yes, parentheses ( … ) are overloaded and could also indicate an ordered pair. In reality, I think that we could resolve everything by making all tuples, including ordered pairs, use angle brackets 〈 … 〉 and keeping parentheses to indicate non-inclusive ranges. This of course raises the issue of many, many people not knowing how to type angle brackets without resorting to < … >, and the whole process of finding non-overloaded notation happens all over again. No no, let’s end it here before it begins and try to grab the meaning from context.
Brief: This is the fourth semester of the PhD program I’ve found myself enrolled in. Here’s a quick rundown of what I’ll be up to this time around.
- Thesis proposal defense on Oct. 8th — Symmetrical protein phylogeny and engineering (final title) — I’m ready to have it torn apart by the committee
- TA BIOL 208 — Analytical methods in molecular biology; will be marking midterms next week in a tiny yet well lit room with half a dozen TAs. I’ll bring coffee.
- Scholarship applications just completed. They’re both to be submitted next week — wish me luck!
- BioCompiler (UW iGEM) moving along much faster thanks to Matthew’s hard work writing up the first AND second drafts of the project outline. Sidestep: it’s going to be more like BioMacro for a long time first.
- Taking BIOL 614 — Bioinformatics tools; and possibly CHEM 731 — Protein design and engineering. I’m happy that BIOL 614 complements the CIS 6060 bioinformatics course I took at Guelph.
It looks to be a full semester.
Brief: I got my TA evaluations from BIOL 208 back last week (which I taught with Ariana in fall 2009), and it looks like the students liked my instruction. The overall positive response is encouraging, but I’m concerned that they’ve actually been too kind. The whole thing was something of a learning experience for me, as I’d never given tutorials before. The written remarks were very informative too. There are two big things the students wanted more of: first, I should increase the depth of my background in the course; second, I should ensure there’s time to take up quiz and workbook questions.
The first item is difficult to address directly, mostly because it’s hard to predict what kinds of questions a student will have cooked up based on the readings available for a given week. It looks like a self-repairing problem, however: I simply have to TA more in and around the same topic area until the background information is second nature (or at least until all the keywords are loaded into my brain along with hints toward appropriate literature).
The second point is important. The time needed to take up questions can be built into the lesson plan, and I think the best way to do this is to reduce the amount that we try to cover in the tutorial slide show. Besides, there’s little advantage to repeating all of the same things as the instructor (particularly if we might say it differently, explain it in a way that’s even more confusing or, worse, disagree with the instructor). It’s thus better to focus on giving the background for the workbook questions. I figure that covering 25% to 33% less material on average will let us focus on the workbook and discuss quiz questions (and spurious questions) with sufficient time.
Overall, this was a very enjoyable and instructional experience.
In the beginning of the Winter semester, I had an idea to start up an undergraduate/graduate student group that would provide a scaffold for faster, computer-assisted research.
The semester was simply too full for me to try and get it running at that time. I’m tempted to do some work on it later this semester, after I’ve gotten more of my thesis done.
The basic idea is as follows. Faculty and graduate students in the biology and chemistry departments often need to analyze data with some elegant computing, whether this is as commonplace as hacking together an Excel sheet or learning some existing toolbox, or something slightly more in depth such as analysis in R, Python or Perl. Unfortunately, these research problems often fall by the wayside, as the time commitment to learn a new software package or programming language is not trivial without a stronger computing background. Undergraduate students who are raised in the computer science environment, particularly those with a bioinformatics interest, have some knowledge of the research problem semantics as well as some knowledge of how to do the above analysis by using and coding software.
The “SOLVER” group would fill the gap by performing a matchmaking and coaching service. In my vision, SOLVER creates working teams of three to four: (1) one or two graduate students or faculty with a research problem, (2) one or two undergraduate students with some knowledge in bio / chem and some talent in computing, (3) an experienced coach who can recommend best practices so that the team has a good shot at solving the problem in a reasonable amount of time. In the end, the researcher gets help and a good chance at a working solution (they might even learn some programming); educational / professional / social connections are made; and the student gets an item to add to their resume. In practice, a particular research step would be executed as week-long blocks; whether this means one block or four blocks (one month) depends on the complexity of the task.
One stretch of work that I have to do is to determine the needs of the department. I wanted to do this in the middle of Winter semester but didn’t find it a priority. For this group to work, there must be research problems. Similarly, I need to determine the capabilities of potential undergraduate students. I’m learning quickly that the key here is to start small and think big (i.e. start with one group, then two, then learn about the administrative logistics, then deploy some progress tracking mechanism, then four groups and onward).
Retuning the image
The almost idealized description above suffers from a few logistical issues. The first is publication credit. Because researchers must attribute tangible work (including written code and analyses), some graduate students may be hesitant to participate in the program. I don’t know if this will actually become a barrier, however, as a “tangible contributor” would never be the first nor the last name on a paper (this is true in all of CS, Bio and Chem). Furthermore, a paper with more than one author demonstrates the ability to collaborate in a team, and fostering the experience of another student is part of our culture (e.g. co-op students etc.).
That brings me to a second major issue. One of the frames that this group could find itself in is as a means to circumvent or short-circuit the co-op student appointment process, a frame that I readily reject. In fact, I should hope that this group becomes a means to introduce prospective co-op students, who might otherwise be overlooked for their differing background, to their future advisers.
Finally, there is the problem of attrition, whereby a group dwindles in size as members drop out. The only contract-based perks or penalties I can think of are tied to the group itself (e.g. unsatisfactory work naturally results in a time out or withdrawal from the program). That is really the only tangible leverage we can offer at the outset, and it is unsteady leverage at best (it is difficult to trade on a reputation when none has yet been built).
More research has to be done in terms of polling and documenting the needs of the department. Furthermore, a deeper understanding of the kinds of students we’d attract and want to appoint is needed, since this tells us what time commitment is reasonable (both the lower and upper limits need definition). Finally, the group must from the outset be understood as something beneficial to all parties involved. The solutions named above must be deployed at launch time to ensure minimal friction and maximum return. First steps were made last year by introducing this idea at the BIC-iGEM meeting, where BIC students were excited by it: an entire room of a dozen students raised their hands. Furthermore, there are no other groups on campus that attempt this activity, so SOLVER would provide a unique, non-conflicting service.
Of course, this all depends on the amount of work I get done on my own thesis in the first half of the semester.
As an aside, Isabelle Lam, a student I TA’d last semester in Biol-241 (microbio), has been planning on starting a job / volunteer / co-op mine for science students. I should go and bother her and see how far she’s gotten. Her project is called “SPORE”. Last I checked, her team was registering a subdomain with the university.
EDIT: I previously confused Isabelle with Lisa. This has been corrected.