Annotating the Annotated Transformer

The Annotated Transformer is a detailed and instructive guide, offering comprehensive insights into the original Transformer architecture. I decided to add my own notes to make it clearer. In this...

Language models and transformers from scratch

I recently did some exercises on (small) language models. The field is still quite foreign to me, so the only way to appreciate it better is to start from...

Statistical Mechanics and Statistical Inference

I have to confess that when I was a physics student, I thought taking classes on probability theory and statistics was an unnecessary distraction from learning “real physics”. But while...

Evaluation Stores - a high bias, low variance view

“Feature Store” has been one of the hottest buzzwords in the machine learning community in recent years. In my view, however, “Evaluation Store” should be of equal or higher priority...

From Laplace to Neural Networks (Part 2)

We continue the discussion from Part 1, but now using neural networks. Can a neural network really predict a nonlinear system like the double pendulum? Well, we know from the universal approximation theorem...
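
Even at the level of this teaser, the setup is concrete enough to sketch. Below is a minimal, self-contained sketch of one way to pose the problem (my own illustrative code, not the post's): integrate a double pendulum with scipy, then train a small scikit-learn MLP to predict the state one step ahead. The masses, lengths, initial condition, and network size are all assumptions chosen for illustration.

import numpy as np
from scipy.integrate import solve_ivp
from sklearn.neural_network import MLPRegressor

G, L1, L2, M1, M2 = 9.8, 1.0, 1.0, 1.0, 1.0  # illustrative parameters

def derivs(t, s):
    # Standard double-pendulum equations of motion; s = [th1, w1, th2, w2].
    th1, w1, th2, w2 = s
    d = th2 - th1
    den1 = (M1 + M2) * L1 - M2 * L1 * np.cos(d) ** 2
    a1 = (M2 * L1 * w1 ** 2 * np.sin(d) * np.cos(d)
          + M2 * G * np.sin(th2) * np.cos(d)
          + M2 * L2 * w2 ** 2 * np.sin(d)
          - (M1 + M2) * G * np.sin(th1)) / den1
    den2 = (L2 / L1) * den1
    a2 = (-M2 * L2 * w2 ** 2 * np.sin(d) * np.cos(d)
          + (M1 + M2) * (G * np.sin(th1) * np.cos(d)
                         - L1 * w1 ** 2 * np.sin(d)
                         - G * np.sin(th2))) / den2
    return [w1, a1, w2, a2]

# Generate one trajectory and build one-step-ahead (state_t, state_t+1) pairs.
t = np.linspace(0, 50, 5001)  # dt = 0.01
sol = solve_ivp(derivs, (0, 50), [2.0, 0.0, 1.0, 0.0], t_eval=t, rtol=1e-9)
states = sol.y.T  # shape (5001, 4)
X, y = states[:-1], states[1:]

split = 4000  # train on the first 40 s, test on the last 10 s
model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
model.fit(X[:split], y[:split])
print("one-step R^2 on held-out tail:", model.score(X[split:], y[split:]))

One-step prediction is the easier half of the story; the chaos of the double pendulum really bites when such a model is rolled out recursively over many steps.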

From Laplace to Neural Networks (Part 1)

Time-series prediction has always been interesting to me (and always very hard for me), but I realized that I had never thought about predicting the time series of a physical system...

Biases in logistic regression - it is not about N (Part 1)

Here is a short script I used to run often:

from sklearn.linear_model import LogisticRegression

model = LogisticRegression()
model.fit(X, y)

But there could be a problem in this naive implementation of...
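
For what it is worth, one well-known pitfall of that exact snippet (which may or may not be the one the post dissects) is that scikit-learn's LogisticRegression regularizes by default: it applies an L2 penalty with C=1.0, so the fitted coefficients are shrunk relative to the plain maximum-likelihood estimates. A minimal sketch on synthetic data:

import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic data: a logistic model with known coefficients.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_beta = np.array([2.0, -1.0, 0.5])
y = (rng.random(200) < 1.0 / (1.0 + np.exp(-X @ true_beta))).astype(int)

default_fit = LogisticRegression().fit(X, y)  # L2 penalty, C=1.0 (the default)
# penalty=None needs scikit-learn >= 1.2; older versions use penalty='none'.
mle_fit = LogisticRegression(penalty=None).fit(X, y)  # unpenalized MLE

print("default (shrunk) coefficients:", default_fit.coef_.round(2))
print("unpenalized MLE coefficients :", mle_fit.coef_.round(2))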