  • Mitigating a memory leak in TensorFlow's LSTM

    I have been running a parameter sweep on a recurrent neural network (RNN) consisting of long short-term memory (LSTM) layers, and most of my long runs would eventually fail after being unable to allocate additional memory.

    17 Oct 2019

  • Multiple Python Logging formats

    I have been writing a Python module that uses threading, and I wanted it to have a specific logging format so that messages from separate threads could be told apart. However, when the module was imported, it inherited whatever root logger format had been configured before the import.
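
One common way to give a module its own format is sketched below. This is an illustration, not necessarily the approach the post settles on: the logger name `mymodule`, the helper `make_module_logger`, and the format string are all hypothetical. The idea is to attach a dedicated handler and formatter to the module's logger and turn off propagation, so that records never reach the root logger's handlers.

```python
import io
import logging
import threading


def make_module_logger(name, stream):
    """Return a logger with its own format that ignores the root logger's handlers."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler(stream)
    # %(threadName)s lets messages from separate threads be told apart.
    handler.setFormatter(
        logging.Formatter("[%(threadName)s] %(levelname)s: %(message)s")
    )
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    # Without this, records would also be passed up to the root logger's handlers.
    logger.propagate = False
    return logger


# Hypothetical usage: log from a named worker thread into an in-memory buffer.
buffer = io.StringIO()
log = make_module_logger("mymodule", buffer)
worker = threading.Thread(target=log.info, args=("hello",), name="worker-1")
worker.start()
worker.join()
```

Calling the helper twice for the same name would attach a second handler and duplicate every message, so a real module would likely configure its logger once at import time.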

    26 Sep 2019

  • Interrupting Python Threads

    In my most recent project, rgc, I have been using the Python threading library for concurrent operations. Python threads are often overlooked because the global interpreter lock (GIL) effectively limits them to a single CPU core, but they are great for scaling I/O or subprocess calls without worrying about communication.
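
As a rough illustration of why threads still help with I/O-bound work, the sketch below starts several threads that each block on a simulated I/O call. The names `fetch` and `run_concurrently` are hypothetical, and `time.sleep` stands in for a real network or subprocess call. Because a blocked thread releases the GIL, the waits overlap instead of adding up.

```python
import threading
import time


def fetch(name, delay, results):
    # time.sleep is a stand-in for blocking I/O; the GIL is released while waiting.
    time.sleep(delay)
    results[name] = delay


def run_concurrently(jobs):
    """Run one thread per (name, delay) job and return the results and elapsed time."""
    results = {}
    threads = [
        threading.Thread(target=fetch, args=(name, delay, results))
        for name, delay in jobs
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, time.perf_counter() - start


results, elapsed = run_concurrently([("a", 0.2), ("b", 0.2), ("c", 0.2)])
```

On an idle machine, `elapsed` should be close to the longest single delay rather than the sum of all three, although the exact timing depends on the system.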

    21 Dec 2018

  • Multiprocessing Size and Rank

    I have always thought Python did a great job exposing parallel processing with the multiprocessing package. The Pool class in particular made it relatively simple to jump from the built-in map function, which is a good first step to accelerating loops, to utilizing all cores on a processor without any obscure hoops.

    26 May 2018

  • Generating Different Hash Functions

    Representing genetic sequences using k-mers, the biological equivalent of n-grams, is a great way to numerically summarize a linear sequence. Depending on how unique you need your k-mers to be, you may overallocate your system memory trying to keep track of all 4^k possible strings of length k over the 4 bases (A, G, C, T). Bloom filters circumvent this constraint by probabilistically tracking the presence (not the count) of items.
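
To make the idea concrete, here is a minimal sketch of k-mer extraction and a Bloom filter. It assumes one simple way of getting different hash functions, namely salting SHA-256 with an integer seed; that choice, along with the names `kmers`, `make_hash`, and `BloomFilter`, is for illustration only and is not necessarily the method the post describes.

```python
import hashlib


def kmers(sequence, k):
    """Return every overlapping substring of length k."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def make_hash(seed, size):
    """Build a hash function that maps an item to a slot in [0, size)."""
    def hash_item(item):
        # Prefixing the seed gives each function a different mapping (an assumption).
        digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % size
    return hash_item


class BloomFilter:
    def __init__(self, size, num_hashes):
        # One byte per slot keeps the sketch simple; a real filter would pack bits.
        self.bits = bytearray(size)
        self.hashes = [make_hash(seed, size) for seed in range(num_hashes)]

    def add(self, item):
        for hash_item in self.hashes:
            self.bits[hash_item(item)] = 1

    def __contains__(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[hash_item(item)] for hash_item in self.hashes)


seen = BloomFilter(size=1024, num_hashes=3)
for kmer in kmers("ACGTAC", 3):
    seen.add(kmer)
```

Memory use is fixed by `size` regardless of k, at the cost of occasional false positives: a k-mer that was never added can still test as present if all of its slots happen to be set by other items.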

    05 Feb 2018