aivancity NLP#
Main repository for the 2024-2026 Natural Language Processing class taught by Paul Lerner at aivancity (semesters 1 and 2).
Classes#
Practical Works#
Practical Work 1: Text Classification/Bag of Words/Naive Bayes https://colab.research.google.com/github/LernerPaul/pw1_cls/blob/main/notebook.ipynb
Practical Work 2: Distributional Semantics/Skipgram/word2vec https://colab.research.google.com/github/LernerPaul/pw2_embedding/blob/master/notebook.ipynb
Practical Work 3: N-gram language models and decoding from Large Language Models https://colab.research.google.com/github/LernerPaul/pw3_llm/blob/master/notebook.ipynb
Practical Work 4: Code a Transformer from scratch https://colab.research.google.com/github/LernerPaul/pw4_transformers/blob/master/notebook.ipynb
Practical Work 5: Fine-tuning and benchmarking a pretrained model https://colab.research.google.com/github/LernerPaul/pw5_finetune/blob/master/notebook.ipynb
Practical Work 6: Measuring biases of pretrained and aligned models https://colab.research.google.com/github/LernerPaul/pw6_bias/blob/master/notebook.ipynb
Contributing#
Add Google Colab badges to PWs with https://openincolab.com/
Build docs using sphinx-build -b html . docs
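The practical work links above all follow the same Colab URL pattern (GitHub user, repository, branch, notebook path), so a badge for a new PW can be derived from those four values. The sketch below is only an illustration: the helper names `colab_url` and `colab_badge_markdown` are hypothetical, and the Markdown badge format is the commonly used one, which may differ from the snippet generated by https://openincolab.com/.

```python
# Hypothetical helpers (not part of this repository) for building Colab links.

COLAB_BASE = "https://colab.research.google.com/github"
# Badge image commonly used for "Open In Colab" links.
BADGE_IMAGE = "https://colab.research.google.com/assets/colab-badge.svg"


def colab_url(user: str, repo: str, branch: str, path: str) -> str:
    """Return the Colab URL that opens a notebook hosted on GitHub."""
    return f"{COLAB_BASE}/{user}/{repo}/blob/{branch}/{path}"


def colab_badge_markdown(user: str, repo: str, branch: str, path: str) -> str:
    """Return a Markdown image link showing a Colab badge for the notebook."""
    return f"[![Open In Colab]({BADGE_IMAGE})]({colab_url(user, repo, branch, path)})"


if __name__ == "__main__":
    # Note that PW1 lives on the `main` branch while the other PWs use `master`.
    print(colab_badge_markdown("LernerPaul", "pw1_cls", "main", "notebook.ipynb"))
```

The branch is kept as an explicit argument because the repositories listed above do not all use the same default branch.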
Acknowledgements#
This class directly builds upon:
Jurafsky, D., & Martin, J. H. (2024). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition with Language Models (3rd ed.). https://web.stanford.edu/~jurafsky/slp3/ed3bookaug20_2024.pdf
Eisenstein, J. (2019). Natural Language Processing. 587. https://nlp.cs.princeton.edu/cos484-sp21/readings/eisenstein-nlp-notes.pdf
Yejin Choi. (Winter 2024). CSE 447/517: Natural Language Processing (University of Washington Paul G. Allen School of Computer Science & Engineering)
Noah Smith. (Winter 2023). CSE 447/517: Natural Language Processing (University of Washington Paul G. Allen School of Computer Science & Engineering)
Benoît Sagot. (2023-2024). Apprendre les langues aux machines (Collège de France)
Clément Morand (Oct 7, 2025). ISIR seminar. Artificial Intelligence (AI) and Machine Learning: an overview of environmental and social issues
Chris Manning. (Spring 2024). Stanford CS224N: Natural Language Processing with Deep Learning
Thomas Gerald. Ecole Automne 2024. Fine-tuning lecture https://gitlab.lisn.upsaclay.fr/ecole-automne/fine-tuning
Classes where I was/am a Teaching Assistant:
Christopher Kermorvant. Machine Learning for Natural Language Processing (ENSAE)
François Landes and Kim Gerdes. Introduction to Machine Learning and NLP (Paris-Saclay)
Also inspired by:
My PhD thesis: Répondre aux questions visuelles à propos d’entités nommées (2023)
Noah Smith (2023): Introduction to Sequence Models (LxMLS)
Kyunghyun Cho: Transformers and Large Pretrained Models (LxMLS 2023), Neural Machine Translation (ALPS 2021)
My former PhD advisors Olivier Ferret and Camille Guinaudeau and postdoc advisor François Yvon
My former colleagues at LISN