Bioinformatics often deals with sequential data laid out on a one-dimensional genomic coordinate system. Because these signals are often compared against functional regions in genome annotations, it is frequently necessary to identify contiguous regions of interest. I have never come across a function built into numpy or scipy to accomplish this, but I was inspired by two stackoverflow posts:
29 Nov 2019
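For the contiguous-region problem described above, one common numpy idiom is to pad a boolean mask and look for the positions where its value flips. The sketch below only illustrates that idea and is not necessarily the approach taken in the posts mentioned. The function name contiguous_regions, the threshold, and the example signal are all made up for illustration.

```python
import numpy as np


def contiguous_regions(mask):
    """Return an (n, 2) array of half-open [start, end) index pairs,
    one row per run of consecutive True values in `mask`."""
    mask = np.asarray(mask, dtype=bool)
    # Pad with False on both sides so runs touching either edge
    # still produce a start and an end transition.
    padded = np.concatenate(([False], mask, [False]))
    # Every run contributes exactly one False->True and one True->False flip,
    # so the flip positions pair up as (start, end).
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    return edges.reshape(-1, 2)


# Hypothetical per-base signal along a genomic coordinate.
signal = np.array([0, 3, 4, 0, 0, 5, 0, 6, 7, 8])
regions = contiguous_regions(signal > 2)
print(regions.tolist())  # [[1, 3], [5, 6], [7, 10]]
```

Half-open intervals are used so that end - start gives the region length directly and signal[start:end] slices out each region.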
I have been running a parameter sweep on a recurrent neural network (RNN) consisting of long short-term memory (LSTM) layers, and most of my long runs would eventually fail after being unable to allocate additional memory.
17 Oct 2019
I have been writing a Python module that uses threading, and I wanted it to have its own logging format so that messages from separate threads could be distinguished. However, when the module was imported, it inherited whatever (root logger) format had been configured before it.
26 Sep 2019
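One way to keep a module's log format independent of the root logger is to give the module's logger its own handler and formatter and turn off propagation. The following is a minimal sketch under that assumption, not necessarily the solution the post arrives at. The helper get_module_logger, the logger name demo_module, and the format string are hypothetical.

```python
import io
import logging
import threading


def get_module_logger(name, stream=None):
    """Return a logger whose format does not depend on the root logger."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = logging.StreamHandler(stream)
        # %(threadName)s is a standard LogRecord attribute.
        handler.setFormatter(
            logging.Formatter("[%(threadName)s] %(levelname)s: %(message)s")
        )
        logger.addHandler(handler)
    # Stop records from also reaching the root logger's handlers,
    # which would print them again in the root format.
    logger.propagate = False
    logger.setLevel(logging.INFO)
    return logger


# Simulate an application that configured the root logger first.
logging.basicConfig(format="ROOT %(message)s")

buffer = io.StringIO()  # captured here only so the result can be inspected
log = get_module_logger("demo_module", stream=buffer)

worker = threading.Thread(target=log.info, args=("hello",), name="worker-1")
worker.start()
worker.join()

output = buffer.getvalue()
print(output, end="")  # [worker-1] INFO: hello
```

Attaching a handler inside a library is a trade-off: it guarantees the format, but it also takes that choice away from the application doing the import.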
In my most recent project, rgc, I have been using the Python threading library for concurrent operations. Python threads are often overlooked because the Python GIL forces them to share a single CPU core, but they are great for scaling I/O or subprocess calls without worrying about communication.
21 Dec 2018
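As a rough illustration of threads driving subprocess calls, the sketch below starts one thread per command and lets each thread wait on its own child process. It is not taken from rgc. The helper run_command and the commands themselves are hypothetical, with the Python interpreter standing in for whatever external tool would really be called.

```python
import subprocess
import sys
import threading


def run_command(index, command, results):
    """Run one external command and store its stdout in a shared list."""
    completed = subprocess.run(command, capture_output=True, text=True)
    # Each thread writes to its own slot, so no lock is needed here.
    results[index] = completed.stdout.strip()


# Stand-in commands: each one asks a child interpreter to print n * n.
commands = [[sys.executable, "-c", f"print({n} * {n})"] for n in range(4)]
results = [None] * len(commands)

threads = [
    threading.Thread(target=run_command, args=(i, cmd, results))
    for i, cmd in enumerate(commands)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()  # wait for every child process to finish

print(results)  # ['0', '1', '4', '9']
```

The GIL is released while a thread waits on a child process, so the commands run concurrently even though the threads share one interpreter.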
I have always thought Python did a great job exposing parallel processing with the multiprocessing package. The Pool class in particular made it relatively simple to jump from the built-in map function, which is a good first step to accelerating loops, to utilizing all cores on a processor without any obscure hoops.
26 May 2018
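The step from the built-in map to Pool.map can be sketched roughly as follows. The function names serial_factorials and parallel_factorials are made up for this example, and math.factorial is just a placeholder for real per-item work.

```python
import math
from multiprocessing import Pool


def serial_factorials(values):
    """Baseline: the built-in map, running on a single core."""
    return list(map(math.factorial, values))


def parallel_factorials(values, processes=None):
    """Same call shape, but the work is spread across worker processes.

    With processes=None, Pool starts one worker per available CPU.
    The mapped function must be picklable, which in practice means
    defined at the top level of an importable module.
    """
    with Pool(processes=processes) as pool:
        return pool.map(math.factorial, values)


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block
    # on platforms that start workers by re-importing the main module.
    numbers = range(8)
    print(parallel_factorials(numbers) == serial_factorials(numbers))
```

For work this small the cost of starting processes and pickling arguments will outweigh any gain; the pattern pays off when each item takes a meaningful amount of CPU time.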