Natural Language Processing and Generative Models with Python and Transformers («H36446»)
Course facts
- Gaining in-depth knowledge of the concepts and methods of language-based artificial intelligence
- Understanding the fundamental technologies and developing comprehensive knowledge of the transformer architecture as a key technology behind modern generative AI
- Learning to work with the most important Python frameworks and with pre-trained models on Hugging Face
- Acquiring in-depth theoretical knowledge of AI-based language processing and practical experience in applying its methods and frameworks
- Developing, adapting, and productively deploying your own machine-learning-based language systems and models
- Learning to use these technologies in your own projects
1 Python techniques for text processing
- Python basics for text processing
- Processing text and PDF files
- The most important regular expressions (see the sketch after this list)
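As a first taste of this module, here is a minimal sketch of the kind of regular-expression work covered; the sample text and patterns are illustrative, not course material:

```python
import re

# Illustrative sample text (invented for this example)
text = "Contact us at info@example.com or support@example.org by 31.12.2025."

# Find all e-mail addresses: word characters, dots, pluses, and dashes around an '@'
emails = re.findall(r"[\w.+-]+@[\w-]+\.[\w.]+", text)
print(emails)  # ['info@example.com', 'support@example.org']

# Find a date in DD.MM.YYYY format and pull out its capture groups
match = re.search(r"\b(\d{2})\.(\d{2})\.(\d{4})\b", text)
if match:
    day, month, year = match.groups()
    print(year, month, day)  # 2025 12 31
```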
2 Introduction to natural language processing (NLP)
- Concepts of natural language processing
- Using the spaCy library for text analysis (see the sketch after this list)
- Tokenization, stemming, and lemmatization
- Part-of-speech and named entity recognition
- Decomposing texts with sentence segmentation
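A minimal sketch of the spaCy workflow this module introduces, assuming spaCy and its small English model are installed (pip install spacy, then python -m spacy download en_core_web_sm):

```python
import spacy

# Assumes the small English model has been downloaded beforehand:
#   python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

doc = nlp("Apple is looking at buying a U.K. startup. It was founded in London.")

# Tokenization, lemmatization, and part-of-speech tags
for token in doc:
    print(token.text, token.lemma_, token.pos_)

# Named entity recognition
for ent in doc.ents:
    print(ent.text, ent.label_)  # e.g. Apple ORG, U.K. GPE, London GPE

# Sentence segmentation
for sent in doc.sents:
    print(sent.text)
```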
3 Text classification and text analysis
- Introduction to scikit-learn (see the sketch after this list)
- Evaluating classification models with precision, recall, and F1 score
- Semantic understanding and sentiment analysis
- Vector-based text representations with word vectors
- Sentiment analysis with the NLTK library
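To show how the scikit-learn workflow and the evaluation metrics above fit together, here is a minimal sketch; the labelled sentences are invented, and the report is computed on the training data purely to show its format:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.pipeline import make_pipeline

# Toy data, invented for this example
texts = [
    "great product, works well", "terrible, broke after a day",
    "really happy with it", "worst purchase ever",
    "excellent quality", "very disappointing",
]
labels = [1, 0, 1, 0, 1, 0]  # 1 = positive, 0 = negative

# TF-IDF text vectors feeding a logistic-regression classifier
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

# Precision, recall, and F1 score per class
print(classification_report(labels, model.predict(texts)))
```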
4 Topic modeling and long short-term memory
- Introduction to topic modeling
- Discovering topics with latent Dirichlet allocation (LDA)
- Recognizing structures with non-negative matrix factorization (NMF)
- Long short-term memory (LSTM), GRU, and text generation
- Implementing an LSTM for text generation with Keras (see the sketch after this list)
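A compressed sketch of a character-level LSTM for text generation with Keras, assuming TensorFlow is installed; the toy corpus is invented, and a real run would need far more text and training time:

```python
import numpy as np
from tensorflow.keras import layers, models

# Toy corpus, invented for this example
text = "the quick brown fox jumps over the lazy dog. " * 20
chars = sorted(set(text))
char2idx = {c: i for i, c in enumerate(chars)}

# Build (input sequence, next character) training pairs
seq_len = 10
X = np.array([[char2idx[c] for c in text[i:i + seq_len]]
              for i in range(len(text) - seq_len)])
y = np.array([char2idx[text[i + seq_len]] for i in range(len(text) - seq_len)])

# Embedding -> LSTM -> softmax over the character vocabulary
model = models.Sequential([
    layers.Embedding(len(chars), 16),
    layers.LSTM(64),
    layers.Dense(len(chars), activation="softmax"),
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
model.fit(X, y, epochs=5, verbose=0)

# Greedy generation: repeatedly predict the next character from the last seq_len
generated = list(text[:seq_len])
for _ in range(40):
    x = np.array([[char2idx[c] for c in generated[-seq_len:]]])
    generated.append(chars[int(model.predict(x, verbose=0).argmax())])
print("".join(generated))
```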
5 Transformers and attention
- The concept of self-attention (see the sketch after this list)
- Multi-head attention and its significance in NLP models
- Encoders and decoders for machine translation and language understanding
- Architectural concepts of common transformer models: GPT-2/3/4, BERT
- Building a transformer architecture with Python and Keras
- Training and evaluating a seq2seq transformer
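At the heart of this module is scaled dot-product self-attention, softmax(QK^T / sqrt(d_k)) V. A minimal NumPy sketch with random toy matrices, with shapes and dimensions chosen only for illustration:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention for a single sequence.
    X: (seq_len, d_model); Wq, Wk, Wv: (d_model, d_k) projections."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # (seq_len, seq_len) similarities
    weights = softmax(scores, axis=-1)       # attention weights per token
    return weights @ V                       # each output mixes all values

# Toy example: 4 tokens with 8-dimensional embeddings, d_k = 8
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)  # (4, 8)
```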
6 Transfer learning and fine-tuning with Hugging Face
- Introduction to Hugging Face and an overview of pre-trained models
- Selecting suitable models and tokenizers
- Transfer learning and fine-tuning pre-trained models (see the sketch after this list)
- Automatically configuring and adapting models
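A minimal sketch of loading a pre-trained Hugging Face model with the transformers library; the first call downloads the checkpoint, and the model name is one common choice rather than necessarily the one used in the course:

```python
from transformers import pipeline

# Requires the transformers library plus a backend such as PyTorch;
# the checkpoint is downloaded from the Hugging Face Hub on first use.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(classifier("Hugging Face makes transfer learning straightforward."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```

Fine-tuning proper follows the same pattern: load a pre-trained checkpoint and its tokenizer (for example with AutoModelForSequenceClassification and AutoTokenizer) and continue training on your own labelled data.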
7 Practical project: Training your own chatbot
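To give an idea of what the final project involves, here is a bare-bones chat loop around a small pre-trained generative model; the checkpoint (gpt2) and the prompt format are illustrative assumptions, not the course's solution:

```python
from transformers import pipeline

# Small pre-trained generative model; an illustrative choice only
generator = pipeline("text-generation", model="gpt2")

history = ""
while True:
    user = input("You: ")
    if not user:
        break
    history += f"User: {user}\nBot:"
    full = generator(history, max_new_tokens=40,
                     pad_token_id=50256)[0]["generated_text"]
    # Keep only the first generated line as the bot's reply
    reply = full[len(history):].split("\n")[0].strip()
    print("Bot:", reply)
    history += f" {reply}\n"
```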
This online seminar will be held in a group of no more than 12 participants using Zoom video conferencing software. Individual support from the instructors is guaranteed.
The practical exercises will be provided as Jupyter Notebooks, which you can run locally on your own computer. The computationally intensive model training will be carried out on freely available cloud GPUs.
The instructors will be available to assist you with the practical exercises – in the virtual classroom or individually in breakout sessions.
After registering, you will find all the information, downloads, and extra services related to this training program in your online learning environment.
This training is aimed at anyone who wants to understand machine learning, natural language processing, and generative AI in depth and use them in their own projects. The course is a valuable building block on the path to qualifying as a data scientist, data engineer, or machine learning engineer.
Basic knowledge of programming with Python is required. Additional technical, mathematical, and statistical knowledge is helpful but not required.
To ensure that you receive any necessary documents by mail in good time, we recommend booking at least 14 days before the seminar date.