Posts

TiDE - Forecasting in the lodging world

Note that the development of this article was stopped after I noticed a problem in the official TiDE implementation. After noticing the bug, I realized an issue had already been opened in the repo: LINK. You can still read the article, as it provides some key insights into how to formalize the forecasting problem for lodging signals, but I'm not using this model in the end and won't provide any additional details for the implementation. I'm currently writing an article on how to implement the SOFTS paper in tensorflow for this same use case; it's a pretty cool paper (I made sure that there were no blatant issues this time 👀) and also quite similar to TiDE, as it mostly uses simple dense layers in an encoder-decoder style. So stay tuned! If you take the time to read the code and the paper carefully, you'll actually realize that the TiDE model is nothing but... a linear regression using the target lags! All of the deep learning technicalities, the encoder and the d...
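To make the "linear regression on the target lags" remark concrete, here is a minimal sketch (mine, not the official TiDE or SOFTS code) contrasting a plain linear map from lags to horizon with the same mapping routed through dense encoder/decoder blocks; the layer sizes and horizon are made up for illustration.

```python
import tensorflow as tf

# Hypothetical shapes for a lodging-style signal: predict the next
# `horizon` values from the last `lookback` lags of the target.
lookback, horizon, hidden = 64, 14, 32

# The baseline alluded to above: a plain linear regression on the target lags.
linear_on_lags = tf.keras.Sequential([
    tf.keras.Input(shape=(lookback,)),
    tf.keras.layers.Dense(horizon),  # one weight per (lag, step) pair + bias
])

# A TiDE/SOFTS-flavoured variant: the same lags-to-horizon mapping, but
# routed through dense "encoder"/"decoder" blocks (no attention, no recurrence).
dense_encoder_decoder = tf.keras.Sequential([
    tf.keras.Input(shape=(lookback,)),
    tf.keras.layers.Dense(hidden, activation="relu"),  # encoder
    tf.keras.layers.Dense(hidden, activation="relu"),  # decoder
    tf.keras.layers.Dense(horizon),                    # projection to the horizon
])

linear_on_lags.compile(optimizer="adam", loss="mse")
dense_encoder_decoder.compile(optimizer="adam", loss="mse")
```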

An Embedding Learning Framework for Numerical Features in CTR Prediction

In this post we're going to review a paper written by our Chinese friends, for which I couldn't find an implementation out there (at least in PyTorch or TF; you can find one in MindSpore, but I felt quite discouraged going through their code as it uses a lot of wrappers). This paper is perfect for beginner- to intermediate-level Machine Learning coders to get their hands on implementing NN papers. The story behind me reading this paper is pretty straightforward: I was doing some research on building an expressive deep learning model for a kind of CTR (click-through rate, meaning the number of clicks divided by the number of impressions) prediction model at work, and it turned out that DeepFM (Deep Factorisation Machines) was state-of-the-art (at least at the time of reading). Building a CTR model usually means you're working with some kind of search data, which usually comes flavoured with high-dimensional categorical features, with each feature...
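As a rough illustration of the setting (my own sketch, not the paper's mechanism), here is how a toy CTR model could embed a high-cardinality categorical feature with a standard lookup table, and give a numerical feature a learned embedding by softly mixing a small set of meta-embeddings; all names and sizes are hypothetical.

```python
import tensorflow as tf

# Toy setup: one categorical feature with a large vocabulary and one
# numerical feature, both mapped to the same embedding dimension.
vocab_size, emb_dim, n_meta = 10_000, 16, 8

# Categorical feature: the usual embedding lookup used in CTR models.
cat_ids = tf.keras.Input(shape=(1,), dtype="int32")
cat_emb = tf.keras.layers.Embedding(vocab_size, emb_dim)(cat_ids)
cat_emb = tf.keras.layers.Flatten()(cat_emb)

# Numerical feature: instead of feeding the raw float, learn a soft
# assignment over a small set of "meta" embeddings (a rough sketch of the
# general idea of embedding numerical features, not the paper's exact method).
num_val = tf.keras.Input(shape=(1,), dtype="float32")
logits = tf.keras.layers.Dense(n_meta)(num_val)               # score per meta-embedding
weights = tf.keras.layers.Softmax()(logits)                   # soft "bucket" assignment
num_emb = tf.keras.layers.Dense(emb_dim, use_bias=False)(weights)  # weighted mix

# Concatenate both embeddings and predict a click probability (deep tower only).
x = tf.keras.layers.Concatenate()([cat_emb, num_emb])
x = tf.keras.layers.Dense(32, activation="relu")(x)
ctr = tf.keras.layers.Dense(1, activation="sigmoid")(x)

model = tf.keras.Model(inputs=[cat_ids, num_val], outputs=ctr)
model.compile(optimizer="adam", loss="binary_crossentropy")
```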

Estimation of conditional mixture Weibull distribution with right-censored data using neural network for time-to-event analysis

Paper link. This paper extends classical parametric time-to-event models (a branch of Survival Analysis (SA)) with a neural network (NN) architecture. While providing the implementation, we will review the following concepts. Theoretical concepts: survival analysis; parametric models; maximum likelihood estimation; drawing random numbers from a given distribution. Implementation details (colab, p=1 and colab, any p): custom loss and architecture with Tensorflow; how to use Tensorflow Probability as an alternative way to build the model architecture and its loss; time-to-event prediction with simulated data (the ones discussed in the paper, cf. Figure 4). The key takeaway of this article is how to leverage Tensorflow Probability for survival analysis. Using the framework removes a lot of pain, typically yielding a code base that is easy to modify and that can accommodate every setting described in this article (i.e. using a Mixture or not, using the Weibull or some other distribution). ...
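As a hedged sketch of the kind of loss Tensorflow Probability makes easy to write (my own shapes and helper name, not the paper's or the notebooks' code), here is a right-censored negative log-likelihood for a mixture of p Weibull components, assuming a network head that outputs per-sample mixture logits and positive concentration/scale parameters.

```python
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

def censored_nll(t, event, logits, concentration, scale):
    """Right-censored negative log-likelihood for a Weibull mixture.

    t: observed times, shape (batch,)
    event: 1.0 if the event occurred, 0.0 if right-censored, shape (batch,)
    logits, concentration, scale: per-component parameters, shape (batch, p)
    """
    components = tfd.Weibull(concentration=concentration, scale=scale)
    mixture = tfd.MixtureSameFamily(
        mixture_distribution=tfd.Categorical(logits=logits),
        components_distribution=components,
    )
    # Uncensored samples contribute log f(t); censored ones contribute log S(t).
    log_f = mixture.log_prob(t)
    # Mixture survival built from the components:
    # log S(t) = logsumexp_k [ log pi_k + log S_k(t) ].
    log_pi = tf.nn.log_softmax(logits, axis=-1)
    log_s_k = components.log_survival_function(t[..., tf.newaxis])
    log_s = tf.reduce_logsumexp(log_pi + log_s_k, axis=-1)
    return -tf.reduce_mean(event * log_f + (1.0 - event) * log_s)
```

Setting p=1 recovers the single-Weibull case discussed in the first colab, and swapping `tfd.Weibull` for another positive-support distribution changes only one line, which is the "easy to modify" point made above.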