Bowtie Final

Written by Hunter Jansen on December 06, 2014

So this is (probably) going to be my final post on bowtie and, as this (my final) semester wraps up, the last post for SPO600 at Seneca College. There’s not really too much left to say, so this post is most likely going to be shorter than the past few.

Since my last post, I’ve tested bowtie in every way I could think of, ensuring all the commands I know of still work after my changes and comparing performance with and without my changes (and with the fedpkg version). I’ll post briefly on my findings, and round everything off with some final thoughts.

Bowtie

In the last post, I mentioned that I’d be reaching out to upstream to ask them about some changes to the makefile, as well as to touch base, declare my intentions and say hi. I reached out via their SourceForge page, which appeared to be the only real avenue of conversation for the project, but have yet to hear back. In my research along the way, I’ve seen that the bowtie team is commonly unresponsive to issues and communications - so this isn’t very surprising, but I guess a little disappointing.

It occurs to me now that I haven’t written any updates since properly updating my code to include ifdefs - so I apologize for that. In the end, it didn’t take much work to get those in there, and I learned that the compiler predefines a macro called __aarch64__ that you can check for on arm64 systems. Instead of writing a large post about content that’s largely self-explanatory, I’ll just link to my GitHub repo commit.

The main files to look at are ebtw.h and third-party/cpuid.h. For a bit more in-depth reasoning behind the code, you can read the previous entry.
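For anyone who doesn’t want to dig through the commit, here’s roughly what that kind of ifdef check looks like. This is only a minimal sketch and not the actual code from ebtw.h or cpuid.h: the target_arch function is a made-up name for illustration, and the only thing it relies on is the __aarch64__ macro (plus the usual __x86_64__ and __i386__ macros on the x86 side) that GCC and Clang predefine.

```cpp
#include <string>

// Hypothetical helper (not part of bowtie): reports which branch the
// preprocessor picked when this file was compiled.
std::string target_arch() {
#if defined(__aarch64__)
    // GCC and Clang predefine __aarch64__ when building for 64-bit ARM,
    // so arm64-safe code paths can go in this branch.
    return "aarch64";
#elif defined(__x86_64__) || defined(__i386__)
    // x86-specific code, such as anything relying on the cpuid
    // instruction, would be guarded by a branch like this one.
    return "x86";
#else
    // Anything else gets a generic fallback.
    return "other";
#endif
}
```

Since the decision is made at compile time, the same source builds on both architectures and the x86-only pieces simply never get compiled on arm64.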

Tests

So, before I could accept that my changes don’t break the existing implementation, I had to run some tests to make sure that not only does stuff work, but that it performs as well as or better than the existing solution.

Unfortunately, bowtie doesn’t have an existing test suite, so the way I went about testing everything was by running a few of the more common commands in the getting started guide on the following setups:

  • The fedpkg code that I got on x86
  • The bowtie repo from upstream on x86
  • The updated code from me on x86
  • The updated code on arm64

The main way I did this was with the time command in a short for loop à la:

time for i in {1..10}; 
do 
time ./bowtie e_coli reads/e_coli_1000.fq; 
done

This essentially just runs bowtie against the e coli reads 10 times, printing the time for each individual run and then the total time for all 10 runs. Dividing the total by 10 gives the average.

So here are the results for that command; for each setup, the first block is a per-run time and the second is the total for all 10 runs. (Note that I did perform similar tests on other commands, but for brevity I’m just including this one):

  • Fedpkg:
real	0m0.074s
user	0m0.041s
sys	0m0.020s

real	0m0.717s
user	0m0.390s
sys	0m0.186s
  • Clean repo x86:
real	0m0.116s
user	0m0.080s
sys	0m0.023s

real	0m1.166s
user	0m0.757s
sys	0m0.268s
  • Updated repo x86:
real	0m0.070s
user	0m0.037s
sys	0m0.019s

real	0m0.745s
user	0m0.402s
sys	0m0.194s
  • Updated repo arm64:
real	0m0.120s
user	0m0.080s
sys	0m0.020s

real	0m1.210s
user	0m0.810s
sys	0m0.190s

Sooooo, everything looks fine from that. Oddly, the updated code is more on par with the fedpkg version than with the unadulterated git version. Also, an important thing to note is that the arm64 machine has consistently had slower execution times for everything throughout all these experiments, not just on bowtie - so it’s not necessarily slower on arm64, just on this machine.

Finally!

So with all that done, the final step was to remove all the extra files that got added along the way from running the bowtie test commands.

I also decided that the built version of bowtie I push should be the x86 version, as it’s currently still a program that’s primarily run on x86.

Following both of these, I made one final commit and sent the pull request, and now I’m waiting on the results. If there are any revisions required by upstream, I probably won’t hear about them until after the semester’s over, but I’ll post updates on here regardless of the outcome.

Final Thoughts

It’s almost kind of bittersweet to be done with bowtie and SPO600, especially as it’s really the last work I’ll be doing during my time at Seneca College. Even though my trajectory of front-end web development probably won’t use anything learned in the course, it’s still a neat tool and important knowledge to store in my utility belt of programming stuff.

As I mentioned earlier, if bowtie needs more work to get into upstream, I’ll finish it off and post about it here. I’ve been really wanting to contribute to an open source project for a long time, but have been shy about it for a bunch of self-conscious reasons. So while this is only my first contribution to the world, hopefully it’s just the start of many to come to all sorts of interesting projects.

Thanks to Chris for being such a knowledgeable prof and quickly providing help and feedback wherever it was needed, and thanks to whoever else ended up reading these posts. It was truly an experience!

Done for now, but until next time -Hunter