A template for fine-tuning your own GPT-2 model.

GPT-3 has dominated the NLP news cycle recently with its borderline magical performance in text generation, but for everyone without **$1,000,000,000** of Azure compute credits there are still plenty of ways to experiment with language models on your own. Hugging Face is an open-source company focusing on NLP tooling, and it provides one of the easiest ways to access pre-trained models and tokenizers for NLP experiments. In this article, I will share a method for fine-tuning the 117M-parameter GPT-2 model on a corpus of Magic: The Gathering card…
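To make the setup concrete, here is a minimal sketch of what fine-tuning GPT-2 with Hugging Face's `transformers` library can look like. The card-formatting scheme, the `gpt2-mtg` output directory, and the hyperparameters are illustrative assumptions, not the article's exact recipe.

```python
def format_card(name: str, text: str) -> str:
    # Hypothetical formatting: wrap each card in delimiter tokens so the
    # model learns where one card ends and the next begins.
    return f"<|startoftext|>{name}: {text}<|endoftext|>"

def fine_tune(cards):
    """Fine-tune the 117M-parameter GPT-2 checkpoint on (name, text) pairs."""
    # Heavy imports kept inside the function so the lightweight helper
    # above stays usable without transformers installed.
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    tokenizer = AutoTokenizer.from_pretrained("gpt2")  # the 117M model
    tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    lines = [format_card(name, text) for name, text in cards]
    ids = tokenizer(lines, truncation=True, max_length=128)["input_ids"]
    dataset = [{"input_ids": x} for x in ids]

    # mlm=False selects the causal (next-token) objective GPT-2 is trained with.
    collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)
    args = TrainingArguments(output_dir="gpt2-mtg", num_train_epochs=1,
                             per_device_train_batch_size=2)
    Trainer(model=model, args=args, data_collator=collator,
            train_dataset=dataset).train()
```

Calling `fine_tune([("Lightning Bolt", "Deal 3 damage to any target."), ...])` would download the pre-trained weights and run one epoch of training; on a real corpus you would feed in every card and train for several epochs.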

Language models are empowering thousands of products and millions of people through language generation tasks like predictive text and classification. For instance, many companies use NLP in customer service interactions to route users to solutions more quickly, and GitHub's Copilot generates code from comments. Perhaps most subtly, but most impactfully, BERT powers almost every English-language query on Google Search, fundamentally improving how you find things on the web. …

In my previous article, I wrote about linear regression: starting from linear equations and analytical solutions for fitting your data, moving on to gradient-descent-optimized models, and finally using PyTorch primitives to create a single-layer neural network that solves a continuous linear regression problem. This time, I will explain how to make discrete classifications using logistic regression on a binary breast cancer data set.

In the scikit-learn breast cancer data set, we have labels marking each tumour as benign or malignant, along with a matrix of thirty features describing each of the 569 breast tumours on which to make predictions. I…
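As a quick sketch of the task described above, the data set ships with scikit-learn and a baseline logistic regression fits in a few lines. The train/test split, scaling step, and `max_iter` value here are my own illustrative choices, not necessarily the article's.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# 569 tumours, 30 features each; y is 0 (malignant) or 1 (benign).
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

# Standardize features: logistic regression's solver converges much
# faster when the thirty features share a common scale.
scaler = StandardScaler().fit(X_train)
clf = LogisticRegression(max_iter=1000)
clf.fit(scaler.transform(X_train), y_train)

accuracy = clf.score(scaler.transform(X_test), y_test)
```

Even this untuned baseline classifies the held-out tumours with high accuracy, which makes it a useful reference point before reaching for anything more complicated.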

Neural network frameworks, AutoML solutions, and staple numerical libraries like scikit-learn and SciPy have abstracted away much of the logic and math behind the implementation of workhorse algorithms. Linear regression falls into this category. Regressions are fundamental techniques that are often as performant as more complicated models, yet we sometimes underestimate them as simply "drawing a line through the data". This oversimplification leads us to forget that the most basic neural network is in fact just a linear regression with no hidden layers, which means that understanding linear regression is key to understanding deep learning techniques. …
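The "no hidden layers" claim above can be demonstrated directly: a single `torch.nn.Linear` layer trained with gradient descent recovers the line's slope and intercept. The synthetic data (y = 2x + 1 plus noise), learning rate, and step count are made-up values for illustration.

```python
import torch

# Synthetic data: y = 2x + 1 with a little Gaussian noise.
torch.manual_seed(0)
X = torch.linspace(-1, 1, 100).unsqueeze(1)
y = 2 * X + 1 + 0.05 * torch.randn_like(X)

# A "network" with one linear layer, no hidden layers, and no activation
# function computes y_hat = w*x + b: exactly a linear regression.
model = torch.nn.Linear(1, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.MSELoss()

for _ in range(500):
    opt.zero_grad()
    loss = loss_fn(model(X), y)  # mean squared error, as in regression
    loss.backward()
    opt.step()

w, b = model.weight.item(), model.bias.item()
```

After training, `w` and `b` land close to the true slope 2 and intercept 1, matching what the analytical least-squares solution would give.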