A template for fine-tuning your own GPT-2 model.

GPT-3 has dominated the NLP news cycle recently with its borderline magical performance in text generation, but for everyone without $1,000,000,000 of Azure compute credits there are still plenty of ways to experiment with language models on your own. Hugging Face is a company focused on open source NLP tooling, and it provides one of the easiest ways to access pre-trained models and tokenizers for NLP experiments. In this article, I will share a method for fine-tuning the 117M-parameter GPT-2 model with a corpus of Magic: The Gathering card…

Language-agnostic BERT sentence encoding, ScaNN and neural collaborative filtering: a combined approach.


Language models are empowering thousands of products and millions of people through applications such as predictive text generation and classification. For instance, many companies use NLP in customer service interactions to route users to solutions more quickly, and GitHub Copilot generates code from comments. Perhaps most subtly, but most impactfully, BERT powers almost every English-language query on Google Search, fundamentally improving how you find things on the web. …

Logistic Regression from scratch with NumPy.


In my previous article, I wrote about linear regression, starting with linear equations and analytical solutions for fitting your data, then moving on to gradient-descent-optimized models and using PyTorch primitives to create a single-layer neural network that solves a continuous linear regression problem. This time, I will explain how to make discrete classifications using logistic regression on a binary breast cancer data set.


In the scikit-learn Breast Cancer data set, each of 569 breast tumours is labelled as benign or malignant and described by a row of thirty features, which we use to make the predictions. I…
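To make the idea concrete, here is a minimal sketch of logistic regression trained by batch gradient descent. It is written in plain Python rather than NumPy so that it runs without any dependencies, and it uses a small hypothetical two-feature data set (`X_toy`, `y_toy`) in place of the breast cancer data. The function names, learning rate and epoch count are illustrative assumptions, not the article's own code.

```python
import math


def sigmoid(z):
    """Map a real number to the open interval (0, 1), avoiding overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.1, epochs=5000):
    """Fit weights and a bias by batch gradient descent on the mean log loss."""
    n_samples, n_features = len(X), len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # gradient of the log loss with respect to the logit
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n_samples for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n_samples
    return w, b


def predict(X, w, b, threshold=0.5):
    """Return 1 where the predicted probability reaches the threshold, else 0."""
    return [
        1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= threshold else 0
        for xi in X
    ]


# Hypothetical, linearly separable toy data (not the breast cancer data set).
X_toy = [[0.5, 1.0], [1.0, 1.5], [1.5, 0.5], [3.0, 3.5], [3.5, 3.0], [4.0, 4.5]]
y_toy = [0, 0, 0, 1, 1, 1]

w, b = fit_logistic(X_toy, y_toy)
preds = predict(X_toy, w, b)
```

The same loop applies to the thirty-feature data set once the features are scaled; a vectorised NumPy version would replace the inner Python loops with matrix products.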

Linear regression from scratch using PyTorch and Autograd.

Neural network frameworks, AutoML solutions and staple numerical libraries, like scikit-learn and SciPy, have abstracted away much of the logic and math from the implementation of workhorse algorithms. Linear regressions fall into this category. Regressions are fundamental techniques that are often as performant as more complicated models, but we sometimes underestimate them as simply “drawing a line through the data”. This oversimplification leads us to forget that the most basic neural network is in fact just a linear regression with no hidden layers, meaning that understanding linear regressions is key to understanding deep learning techniques. …
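The claim that a network with no hidden layers is a linear regression can be checked with a small sketch. The example below is plain Python rather than PyTorch and Autograd, so the gradients are written out by hand. It fits a single linear unit (one weight, one bias, no activation) by gradient descent on the mean squared error and compares the result with the analytical least-squares solution. The data and all names (`xs`, `ys`, `closed_form`, `single_unit_fit`) are hypothetical and chosen only for illustration.

```python
def closed_form(xs, ys):
    """Analytical least-squares slope and intercept for one input feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def single_unit_fit(xs, ys, lr=0.05, epochs=5000):
    """Train one linear unit (y_hat = w * x + b) by gradient descent on the MSE."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Hand-written gradients of mean((w * x + b - y) ** 2).
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical noise-free data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

slope, intercept = closed_form(xs, ys)
w, b = single_unit_fit(xs, ys)
```

Both routes recover the same line here, which is the sense in which the single unit and the regression are the same model; an autograd framework would compute `grad_w` and `grad_b` automatically instead.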

Richard Bownes

BBC Data Scientist
