Advanced Natural Language Processing (NLP) and Temporal Sequence Processing | SGInnovate
September 21, 2021
to
September 23, 2021

Location

ONLINE WORKSHOP

Advanced Natural Language Processing (NLP) and Temporal Sequence Processing

Organised by SGInnovate and Red Dragon AI

This is the last run of the Deep Learning Developer Workshop Series – Don’t miss this chance to learn new AI competencies from leading professionals!

Together with Red Dragon AI, SGInnovate is pleased to present the third module of the Deep Learning Developer Series. In this module, we dive deeper into some of the latest Deep Learning techniques for text and time series applications.

About the Deep Learning Developer Series:

The Deep Learning Developer Series is a hands-on series targeted at developers and data scientists who are looking to build Artificial Intelligence applications for real-world usage. It is an expanded curriculum that breaks away from the regular eight-week full-time course structure and allows for modular customisation according to your own pace and preference. In every module, you will have the opportunity to build your Deep Learning models as part of your main project. You will also be challenged to use your new skills in an application that relates to your field of work or interest.

Start here

    • Have an interest in Deep Learning? Join us if you are able to read and follow code
    • Module 1 is compulsory before you take the advanced modules
    • Modules 2 and 3 each require Module 1
    • Module 4 requires Module 1, plus Module 2 or 3
    • Module 5 requires Modules 1, 2 and 3
    • Attain a “Deep Learning Specialist” certification when you complete all five modules
     

About this module:

In this course, we go beyond the basic skills and dive deeper into some of the latest techniques for using Deep Learning for Natural Language Processing (NLP) and Natural Language Understanding (NLU) applications.

Since text classification is a general workhorse for NLP tasks, over the course we will build custom models for tasks such as sentiment analysis, spam detection, and classifying document subject matter.
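As a flavour of what this looks like in practice, here is a minimal sketch (not course material) of a Keras text classifier: a TextVectorization layer feeding an embedding, pooled into a single sigmoid output for binary sentiment. The toy sentences stand in for a real dataset.

    import tensorflow as tf

    # Toy sentences and labels standing in for a real sentiment dataset.
    texts = ["great product, loved it", "terrible, waste of money",
             "works as advertised", "broke after one day"]
    labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

    # Map raw strings to integer token ids.
    vectorizer = tf.keras.layers.TextVectorization(
        max_tokens=1000, output_sequence_length=16)
    vectorizer.adapt(texts)

    model = tf.keras.Sequential([
        vectorizer,
        tf.keras.layers.Embedding(input_dim=1000, output_dim=32),
        tf.keras.layers.GlobalAveragePooling1D(),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # P(positive)
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    model.fit(tf.constant(texts), tf.constant(labels), epochs=10, verbose=0)
    print(model.predict(tf.constant(["loved it, great value"])))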

A specific requirement of NLP systems is to reliably detect entities and classify individual words according to their parts of speech. We will look at how Named Entity Recognition (NER) works and how Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks are used for tasks like these. We will also show how these methods can be customised to find company-related entities that are often required in business, such as product names, components and mentions of job roles.
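The token-tagging pattern behind NER can be sketched in a few lines of Keras: an embedding feeding a bidirectional LSTM, with a softmax over the tag set at every token position. The vocabulary size, tag set and random data below are illustrative assumptions, not the course's materials.

    import numpy as np
    import tensorflow as tf

    VOCAB, TAGS, SEQ_LEN = 5000, 5, 20  # e.g. tags O, B-PER, I-PER, B-ORG, I-ORG

    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(VOCAB, 64),
        tf.keras.layers.Bidirectional(
            tf.keras.layers.LSTM(64, return_sequences=True)),
        # One softmax over the tag set per token position.
        tf.keras.layers.Dense(TAGS, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

    # Random stand-in data; real training would use tagged sentences (e.g. CoNLL).
    x = np.random.randint(0, VOCAB, size=(32, SEQ_LEN))
    y = np.random.randint(0, TAGS, size=(32, SEQ_LEN))
    model.fit(x, y, epochs=1, verbose=0)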

To provide a foundation for these methods, we explore the Deep Learning technique of using word, character and document vector embeddings. We will cover well-known models such as Word2Vec and GloVe: how they are created, their unique properties, and how you can use them to improve accuracy on Natural Language Understanding problems and applications.
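The well-known "vector arithmetic" property of these embeddings is easy to try with pre-trained GloVe vectors; the sketch below uses the gensim downloader as one convenient route (an illustrative choice, not necessarily the tooling used in class).

    import gensim.downloader as api

    # Downloads pre-trained 50-dimensional GloVe vectors on first use (~66MB).
    vectors = api.load("glove-wiki-gigaword-50")

    # The classic analogy: king - man + woman is closest to "queen".
    print(vectors.most_similar(positive=["king", "woman"],
                               negative=["man"], topn=3))

    # Nearest neighbours capture semantic similarity.
    print(vectors.most_similar("singapore", topn=3))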

Building chatbots is another useful NLP skill, and we will examine how combining text classification and slot filling can be used to create custom chatbots with better accuracy than off-the-shelf systems.
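One common pattern behind such chatbots, sketched below with illustrative sizes and random stand-in data, is a shared encoder with two heads: a sentence-level softmax that classifies the intent, and a per-token softmax that fills the slots.

    import numpy as np
    import tensorflow as tf

    VOCAB, SEQ_LEN, INTENTS, SLOTS = 2000, 12, 4, 6

    tokens = tf.keras.Input(shape=(SEQ_LEN,), dtype="int32")
    h = tf.keras.layers.Embedding(VOCAB, 32)(tokens)
    h = tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(32, return_sequences=True))(h)

    # Sentence-level head: which intent (e.g. book_flight) is this utterance?
    intent = tf.keras.layers.Dense(INTENTS, activation="softmax", name="intent")(
        tf.keras.layers.GlobalMaxPooling1D()(h))
    # Token-level head: which slot tag (e.g. B-city, O) does each word carry?
    slots = tf.keras.layers.Dense(SLOTS, activation="softmax", name="slots")(h)

    model = tf.keras.Model(tokens, [intent, slots])
    model.compile(optimizer="adam",
                  loss={"intent": "sparse_categorical_crossentropy",
                        "slots": "sparse_categorical_crossentropy"})

    # Random stand-in data; real training would use a labelled corpus like ATIS.
    x = np.random.randint(0, VOCAB, (16, SEQ_LEN))
    model.fit(x, {"intent": np.random.randint(0, INTENTS, (16,)),
                  "slots": np.random.randint(0, SLOTS, (16, SEQ_LEN))},
              epochs=1, verbose=0)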

While our course starts with techniques using RNNs/LSTMs and Attention, we allocate equal time to recent developments in transfer learning for text-related problems and language modelling with Transformer models. These models have produced recent state-of-the-art results on text classification problems such as sentiment analysis and many more. This section covers papers including ULMFiT; OpenAI's recent Transformer models (GPT-2 and GPT-3); Google's BERT, T5 and Reformer models; and the fundamentals of how Transformer architectures work and how they can be applied to many common tasks, with code examples.
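To make the transfer-learning idea concrete, the snippet below applies a pre-trained, fine-tuned Transformer to sentiment analysis via the Hugging Face transformers pipeline; this is one convenient entry point, not necessarily the exact tooling used in the workshop.

    from transformers import pipeline

    # Downloads a default fine-tuned BERT-family model on first use.
    classifier = pipeline("sentiment-analysis")
    print(classifier("This workshop series has been excellent."))
    # e.g. [{'label': 'POSITIVE', 'score': 0.99...}]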

We will also cover Neural Machine Translation, where you will learn about recent developments and models that use these techniques, and about the various types of attention mechanisms that have dramatically increased the quality of translation systems. This topic extends to the challenges of Multilingual NLP and to models for text similarity and sentence embeddings (a short sketch follows at the end of this overview).

This course is designed to give participants a practical, hands-on approach. Participants will be taught from real-world code examples as well as in-class challenges that will be worked through and completed during class. The goal is to prepare participants for the applications, challenges and needs that they will face day to day as data scientists dealing with Natural Language Processing.
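For the sentence-embedding and text-similarity topic above, here is a minimal sketch using the sentence-transformers package, with "all-MiniLM-L6-v2" as an illustrative model choice rather than a course-specified one.

    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")
    sentences = ["How do I reset my password?",
                 "I forgot my login credentials.",
                 "What time does the store open?"]
    embeddings = model.encode(sentences)  # one fixed-size vector per sentence

    # Cosine similarities: the first two sentences should score much higher
    # against each other than either does against the third.
    print(util.cos_sim(embeddings, embeddings))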

Participants will be given in-class challenges each day, which might involve fixing code with deliberate bugs or building a model from scratch given a dataset and some preprocessing code.

The in-class challenges for this Advanced NLP module are as follows:

  • Day 1: Toxic words LSTM/CNN/Embeddings for multiclass classification
  • Day 2: Transformer sentiment analysis - fixing some common code errors
  • Day 3: Full Transformer/BERT variant implementation challenge

Workshop Overview:

In this course, participants will learn:

  • Text classification models and how to build a text classifier
  • How to build a Named Entity Recognition (NER) system
  • Sequence-to-Sequence models
  • How to build NLP models from scratch
  • How to build a chatbot ML system
  • How to build a language model
  • Document embeddings
  • Text similarity models and pipelines
  • Longformer and Reformer: Transformers for long texts and contexts
  • Multilingual NLP systems

The course is taught with over 30 notebooks of code examples that students can use for their own projects.

Prerequisite(s):

  • *Must have attended Module 1: Foundations of Deep Learning (previously known as Deep Learning Jumpstart Workshop)
  • Attendees MUST have their own laptops

*Trainees who are versed in Deep Learning or are current practitioners may write in to request a waiver of the prerequisite course. Waivers will be considered on a case-by-case basis.

Pre-workshop Instructions:

  • You MUST have a laptop to attend this workshop
  • Please watch the introductory videos that will be sent out separately
  • Please experiment with the pre-exercises given
  • Please have a Google account with access to Google Colab

Day 1

Section 1: Recurrent Neural Networks Recap 
Prerequisites: Foundations of Deep Learning (previously known as Deep Learning Jumpstart Workshop)
Frameworks: TensorFlow, Keras
Abstract:

  • Recurrent Neural Networks
  • LSTMs (Long Short-Term Memory)
  • Word Embeddings: Word2Vec, GloVe
  • Basic Char RNN
  • Word RNN
  • Build LSTM networks

Section 2: Natural Language Processing
Prerequisites: NA
Frameworks: TensorFlow, Keras, PyTorch
Abstract:

  • Text Classification Models
  • Bidirectional LSTMs
  • Building a Named Entity Recogniser (NER) system
  • Sentiment analysis
  • Build a text classifier
  • Personal Text project
  • Project Briefing

Day 2

Section 3: Sequence-to-Sequence & CNN for Text
Prerequisites: Previous Modules 
Frameworks: TensorFlow, Keras, PyTorch 
Abstract:

  • Sequence-to-Sequence models
  • Convolutional networks for text
  • Clustering
  • Seq2Seq Chatbot
  • Project Clinic

Section 4: Time Series
Prerequisites: Previous Modules 
Frameworks: TensorFlow, Keras, PyTorch 
Abstract:

  • Dealing with date times in models
  • Time Series models
  • Rise of the language models

Day 3

Section 5: Transformers, BERT, RoBERTa, T5 and more
Prerequisites: Previous Modules
Frameworks: TensorFlow, Keras, PyTorch
Abstract:

  • Building Transformer models from scratch
  • Understanding BERT
  • BERT derivative models
  • Text generation models: GPT-2 and GPT-3
  • T5, Reformer, Meena and beyond

Section 6: Online Learning
Prerequisites: Previous Modules
Frameworks: TensorFlow, Keras, PyTorch, spaCy
Abstract:

  • Building NLP models from scratch
  • NLP pipelines
  • Guide to using spaCy
  • Building a Chatbot ML System
  • Building a language model

Assessments
Participants must fulfil the criteria stated below to pass and complete the course.

1.    Online Tests: Participants are required to score an average grade of at least 80% for the online quizzes.

2.    Project: Participants are required to submit a project that demonstrates the following:

  • Train a model on a text or sequence dataset using the advanced skills taught in this course. Examples of advanced skills include using Transformer models for text classification, entity recognition or similarity models.
  • The model must reach an acceptable level of performance and make use of industry-standard data pipelines.
  • The project should include written text explaining the project, the data used and the processes involved.
  • Example projects include using CNNs or Attention for text, or developing forecasting models.

$2,140 / pax (after GST)

Funding Support

CITREP+ is a programme under the TechSkills Accelerator (TeSA) – an initiative of SkillsFuture, driven by Infocomm Media Development Authority (IMDA).

Link to CITREP+ Programme Support table guide.

Funding Amount:

  • For professionals, CITREP+ covers up to 90% of your nett payable course fee, depending on eligibility

Please note: funding is capped at $3,000 per course application

Funding Eligibility: 

  • Singaporean / PR
  • Meets course admission criteria
  • Sponsoring organisation must be registered or incorporated in Singapore (only for individuals sponsored by organisations)

Please note: 

  • Employees of local government agencies and Institutes of Higher Learning (IHLs) will qualify for CITREP+ under the self-sponsored category
  • Sponsoring SME organisations that wish to apply for up to 90% funding support for the course must meet the SME status as defined here

Claim Conditions: 

  • Meet the minimum attendance (75%)
  • Complete and pass all assessments and / or projects

For more information on CITREP+ eligibility criteria and application procedure, please click here


For enquiries, please send an email to [email protected]



Dr Martin Andrews

Martin has over 20 years' experience in Machine Learning and has used it to solve problems in financial modelling and to build Artificial Intelligence (AI) automation for companies. His current area of focus and specialisation is Natural Language Processing and understanding. In 2017, Google appointed Martin as one of the first 12 Google Developer Experts for Machine Learning. Martin is also one of the Co-founders of Red Dragon AI.



Sam Witteveen

Sam has used Machine Learning and Deep Learning in building multiple tech startups, including a children's educational app provider with over 4 million users worldwide. His current focus is AI for conversational agents that allow humans to interact with computers more easily and quickly. In 2017, Google appointed Sam as one of the first 12 Google Developer Experts for Machine Learning in the world. Sam is also one of the Co-founders of Red Dragon AI.

Topics: AI / Machine Learning / Deep Learning