  • Comparing loss values of different data sizes

    While running the hyperparameter optimization of a model, where one of the parameters was the actual data size, I realized that I didn’t know if loss values calculated from different data sizes were comparable. I knew that different loss metrics could not be compared, but I was not sure if different data sizes affected the final value.

    Read more...

    22 Jan 2020
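As a rough illustration of the question in this excerpt, the sketch below contrasts a summed loss with a per-sample mean loss on two datasets of different sizes. It assumes a squared-error loss and a fixed per-sample error of 0.5; both are illustrative assumptions, not details taken from the post, and the function names are hypothetical.

```python
def squared_errors(n, error=0.5):
    """Return n squared residuals, assuming every sample is off by +/- error."""
    residuals = [error if i % 2 == 0 else -error for i in range(n)]
    return [r * r for r in residuals]


def loss_sum(n):
    # Sum-reduced loss: grows with the number of samples.
    return sum(squared_errors(n))


def loss_mean(n):
    # Mean-reduced loss: normalized by the number of samples.
    return sum(squared_errors(n)) / n


small_n, large_n = 1_000, 10_000
print("sum :", loss_sum(small_n), loss_sum(large_n))
print("mean:", loss_mean(small_n), loss_mean(large_n))
```

With identical per-sample error, the summed loss scales with the data size while the mean loss does not, so whether two values are comparable depends at least on how the loss is reduced.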

  • Finding contiguous region coordinates with python

    Bioinformatics often deals with sequential data laid out on a one-dimensional genomic coordinate system. Since these signals are often compared against functional regions in genome annotations, it is often necessary to identify contiguous regions of interest. I have never come across a function built into numpy or scipy to accomplish this, but I was inspired by two Stack Overflow posts:

    Read more...

    29 Nov 2019
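One common way to find contiguous regions with numpy is to pad a boolean mask and look for the positions where its value changes. The sketch below is a minimal version of that idea; the function name `contiguous_regions` and the half-open `[start, end)` convention are assumptions for illustration, and this is not necessarily the approach taken in the post.

```python
import numpy as np


def contiguous_regions(mask):
    """Return an (n, 2) array of [start, end) indices for each run of True values."""
    mask = np.asarray(mask, dtype=bool)
    # Pad with False on both sides so runs touching either edge are closed.
    padded = np.concatenate(([False], mask, [False]))
    # Positions where the value flips mark alternating run starts and ends.
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    return edges.reshape(-1, 2)


signal = np.array([0.1, 0.9, 0.8, 0.2, 0.7])
# Regions where the signal exceeds a hypothetical threshold of 0.5.
print(contiguous_regions(signal > 0.5))
```

Half-open coordinates match Python slicing, so `signal[start:end]` recovers each region directly.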

  • Mitigating a memory leak in Tensorflow's LSTM

    I have been running a parameter sweep on a recurrent neural network (RNN) consisting of long short-term memory (LSTM) layers, and most of my long runs would eventually fail after being unable to allocate additional memory.

    Read more...

    17 Oct 2019

  • Multiple Python Logging formats

    I have been writing a Python module that uses threading, and I wanted it to have a specific logging format so that messages from separate threads could be told apart. However, when the module was imported, it inherited whatever root logger format had been configured before the import.

    Read more...

    26 Sep 2019
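One way to give a module its own log format is to attach a dedicated handler to a named logger and stop its records from propagating to the root logger. The sketch below uses only the standard `logging` module; the helper name `get_module_logger`, the logger name, and the format string are hypothetical, and this may differ from the solution described in the post.

```python
import logging


def get_module_logger(name="mymodule"):
    """Return a named logger whose format is independent of the root logger."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid adding duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        # %(threadName)s lets messages from separate threads be told apart.
        handler.setFormatter(
            logging.Formatter("%(asctime)s [%(threadName)s] %(levelname)s: %(message)s")
        )
        logger.addHandler(handler)
        # Do not hand records to the root logger, so its format is never applied.
        logger.propagate = False
    return logger


logging.basicConfig(format="ROOT %(message)s", level=logging.INFO)
log = get_module_logger()
log.warning("formatted by the module handler, not the root format")
```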

  • Interrupting Python Threads

    In my most recent project, rgc, I have been using the Python threading library for concurrent operations. Python threads are often overlooked because the global interpreter lock (GIL) allows only one thread to execute Python bytecode at a time, but they are great for scaling I/O or subprocess calls without worrying about communication.

    Read more...

    21 Dec 2018
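Python threads cannot be killed from outside, so a common pattern is cooperative interruption with `threading.Event`. The sketch below assumes a simple polling worker; the names `worker`, `stop_event`, and `results` are illustrative and are not taken from rgc.

```python
import threading
import time


def worker(stop_event, results):
    """Do small units of work until asked to stop."""
    while not stop_event.is_set():
        results.append(time.monotonic())  # stand-in for an I/O or subprocess call
        # wait() sleeps like time.sleep() but returns early once the event is set.
        stop_event.wait(0.01)


stop_event = threading.Event()
results = []
thread = threading.Thread(target=worker, args=(stop_event, results), name="worker-1")
thread.start()

time.sleep(0.05)          # let the worker run briefly
stop_event.set()          # request a stop
thread.join(timeout=1.0)  # wait for the worker to exit cleanly
print("worker alive:", thread.is_alive(), "units done:", len(results))
```

Using `stop_event.wait()` instead of `time.sleep()` inside the loop keeps the worker responsive, since it wakes as soon as the stop is requested.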