Teaching a course on LLMs and GenAI

Dmitry Kan
2 min read · Dec 1, 2024

This fall we are teaching a course on LLMs and Generative AI at the University of Helsinki, together with Aarne Talman (Accenture) and Jussi Karlgren (Silo.AI, now AMD).

Screenshot of the PDF RAG streamlit app

Syllabus:

Week 1: Introduction to Generative AI and Large Language Models (LLM)

  • Introduction to Large Language Models (LLMs) and their architecture
  • Overview of Generative AI and its applications in NLP
  • Lab: Learn about Tokenizers
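
To give a flavor of this lab, here is a minimal sketch of exploring a tokenizer with Hugging Face's transformers library. The model choice (gpt2) is my illustrative assumption, not taken from the course materials.

```python
# A minimal sketch of tokenizer exploration, assuming the transformers
# library is installed; the gpt2 tokenizer is an illustrative choice.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

text = "Large Language Models tokenize text into subword units."
tokens = tokenizer.tokenize(text)  # subword strings, e.g. ['Large', 'ĠLanguage', ...]
ids = tokenizer.encode(text)       # the integer IDs the model actually sees

print(tokens)
print(ids)
print(tokenizer.decode(ids))       # round-trips back to the original text
```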

Week 2: Using LLMs and Prompting-based approaches

  • Understanding prompt engineering and its importance in working with LLMs
  • Exploring different prompting techniques for various NLP tasks
  • Hands-on lab: Experimenting with different prompts and evaluating their effectiveness
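
As a taste of what such a lab can look like, here is a minimal sketch comparing a zero-shot prompt with a few-shot prompt on a small local model. The model (google/flan-t5-small) and the prompts are my illustrative assumptions, not the actual lab materials.

```python
# Zero-shot vs. few-shot prompting for sentiment classification; the model
# and prompts are illustrative assumptions.
from transformers import pipeline

generator = pipeline("text2text-generation", model="google/flan-t5-small")

review = "The plot dragged, but the acting was superb."

# Zero-shot: a task instruction and the input, nothing else.
zero_shot = f"Classify the sentiment of this review as positive or negative: {review}"

# Few-shot: prepend labeled examples to steer the model toward the format.
few_shot = (
    "Review: I loved every minute. Sentiment: positive\n"
    "Review: A total waste of time. Sentiment: negative\n"
    f"Review: {review} Sentiment:"
)

for prompt in (zero_shot, few_shot):
    print(generator(prompt, max_new_tokens=5)[0]["generated_text"])
```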

Week 3: Evaluating LLMs

  • Understanding the challenges and metrics involved in evaluating LLMs
  • Exploring different evaluation frameworks and benchmarks
  • Hands-on lab: Evaluating LLMs using different metrics and benchmarks
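
For intuition, here is a minimal sketch of one corner of LLM evaluation: scoring generated answers against references with exact match and token-level F1, in the style of SQuAD-like QA benchmarks. The sample predictions and references are made up for illustration.

```python
# Exact match and token-level F1 over (prediction, reference) pairs;
# the data below is made up for illustration.
from collections import Counter

def exact_match(pred: str, ref: str) -> float:
    return float(pred.strip().lower() == ref.strip().lower())

def token_f1(pred: str, ref: str) -> float:
    pred_tokens, ref_tokens = pred.lower().split(), ref.lower().split()
    common = Counter(pred_tokens) & Counter(ref_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(pred_tokens), overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

preds = ["Helsinki", "in the year 1640"]
refs = ["Helsinki", "1640"]
print(sum(exact_match(p, r) for p, r in zip(preds, refs)) / len(refs))  # 0.5
print(sum(token_f1(p, r) for p, r in zip(preds, refs)) / len(refs))     # 0.7
```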

Week 4: Fine-tuning LLMs

  • Understanding the concept of fine-tuning and its benefits
  • Exploring different fine-tuning techniques and strategies
  • Hands-on lab: Fine-tuning an LLM for a specific NLP task
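
To make the idea concrete, here is a minimal sketch of supervised fine-tuning of a small causal LM with the Hugging Face Trainer. The model (distilgpt2), the toy data, and the hyperparameters are my illustrative assumptions, not the actual lab setup.

```python
# Fine-tuning a small causal LM on a toy corpus with the HF Trainer;
# model, data, and hyperparameters are illustrative assumptions.
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "distilgpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # gpt2-family models have no pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# A toy corpus standing in for a task-specific dataset.
texts = ["Question: What is RAG? Answer: Retrieval Augmented Generation."] * 32
dataset = Dataset.from_dict({"text": texts}).map(
    lambda x: tokenizer(x["text"], truncation=True, max_length=64),
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ft-demo", num_train_epochs=1,
                           per_device_train_batch_size=8),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```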

Week 5: Retrieval Augmented Generation (RAG)

  • Understanding the concept of RAG and its advantages
  • Exploring different RAG architectures and techniques
  • Hands-on lab: Implementing a RAG system for a specific NLP task

Week 6: Use cases and applications of LLMs

  • Exploring various real-world applications of LLMs in NLP
  • Discussing the potential impact of LLMs on different industries
  • Hands-on lab: TBD

Week 7: Final report preparation

  • Students work on their final reports, showcasing their understanding of the labs and the concepts learned.

Aarne and Jussi took care of the foundation: intro to LLMs, prompt engineering, fine-tuning and evaluation.

I gave the Week 5 lecture on RAG, which included a live demo built with the Gemma LLM, SBERT, and Streamlit that lets you query your PDF files with natural-language questions. The code runs entirely locally, so there is no dependency on any API provider. The trade-off is that LLM inference can be much slower when run on a CPU (though having a local GPU isn’t a rarity today!). In return, such an approach has real benefits: for example, you can query a sensitive PDF (say, a statement from your bank) without worrying that your precious data goes onto the cloud.
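
For readers who want the gist before opening the repo, here is a minimal sketch of such a local RAG pipeline. The model names (all-MiniLM-L6-v2 for embeddings, google/gemma-2b-it for generation), the use of pypdf for text extraction, the naive chunking, and the prompt are my illustrative assumptions; the actual Streamlit app lives in the course repository linked below.

```python
# A minimal local PDF RAG pipeline: extract, chunk, embed, retrieve, generate.
# Model names, chunking, and the prompt are illustrative assumptions.
from pypdf import PdfReader
from sentence_transformers import SentenceTransformer, util
from transformers import pipeline

# 1. Extract and chunk the PDF text (naive fixed-size chunks for brevity).
reader = PdfReader("statement.pdf")
text = " ".join(page.extract_text() or "" for page in reader.pages)
chunks = [text[i:i + 500] for i in range(0, len(text), 500)]

# 2. Embed the chunks locally with an SBERT model.
embedder = SentenceTransformer("all-MiniLM-L6-v2")
chunk_embeddings = embedder.encode(chunks, convert_to_tensor=True)

# 3. Retrieve the chunks most similar to the question.
question = "What was the closing balance in November?"
q_embedding = embedder.encode(question, convert_to_tensor=True)
hits = util.semantic_search(q_embedding, chunk_embeddings, top_k=3)[0]
context = "\n".join(chunks[hit["corpus_id"]] for hit in hits)

# 4. Generate an answer with a local LLM (Gemma weights are gated on the
#    Hugging Face Hub, so you may need to authenticate first).
generator = pipeline("text-generation", model="google/gemma-2b-it")
prompt = (
    "Answer the question using only the context.\n\n"
    f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
)
result = generator(prompt, max_new_tokens=200, return_full_text=False)
print(result[0]["generated_text"])
```

In essence, the demo app wraps a pipeline like this one in a Streamlit UI with a file uploader and a question box.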

The GitHub repository for the course can be found here: https://github.com/Helsinki-NLP/LLM-course-2024 (it is still evolving at the time of this writing).

And here is the lecture recording:

I’d like to thank Aarne Talman for inviting me to co-teach this course, and Atita Arora for helping out with some of the RAG materials and sharing her slides. In this lecture I also relied on work by Daniel Bourke. My own contribution is the Streamlit app that lets you query your PDFs in natural language.
