Overview
With increasing digitization across industries, we are creating and consuming more image-led digital data than ever before – both as organizations and as individuals. From photos captured on our mobile phones to medical scans at the doctor’s office, digital images have become ubiquitous. Consequently, the computer vision technologies needed to process and analyze these images have also evolved quickly. This has increased the demand for technology professionals with in-depth knowledge of this field.
The Computer Vision online program from the Executive Education division of Carnegie Mellon University’s School of Computer Science is designed to meet this demand. The program gives you a comprehensive introduction to the theory and algorithms used in computer vision systems. The technology has applications across a wide range of industries, including retail, manufacturing, sports, healthcare, and agriculture. It is also used in online services such as Facebook and Amazon, in autonomous driving, and in mobile phones. Along with insights into the real-world uses of this technology, the program will help you understand techniques including image detection, recognition, and processing; 3D reasoning; and video analysis.
The program is ideal for technology professionals, including machine learning (ML) engineers, who want hands-on experience with computer vision tools and techniques and an understanding of how the technology is used in the real world. You will focus on the skills needed to work on problems related to motion and structure, visual cognition, computational photography, and 3D reconstruction.
Course Description & Learning Outcomes
This is a 10-week online program designed to provide software developers, technology professionals, data scientists, data analysts, and ML professionals with an understanding of computer vision concepts, tools, and techniques. The program also explores real-world applications of this technology. In this program, you will:
- Implement fundamental image processing methods and learn about the various techniques they require
- Use neural networks to perform image recognition and classification
- Extract 3D information from images and learn the basic principles of geometry-based vision
- Align and track objects in video
Recommended Prerequisites
This program requires knowledge of multivariable calculus, linear algebra, probability, and statistics, as well as Python programming.
Pre-course instructions
To receive a 20% discount, sign up for the course on Deep Tech Central using the registration link available here.
Schedule
End Date: 13 May 2026, Wednesday
Location: Online
Agenda
Module | Topic |
---|---|
Module 1 | Introduction to Computer Vision |
Module 2 | Image Processing |
Module 3 | Feature Detection and Matching |
Module 4 | Image Classification and Neural Networks |
Module 5 | Convolutional Neural Networks (CNNs) |
Module 6 | Transformation and Homographies |
Module 7 | Camera Models |
Module 8 | Geometry-Based Vision |
Module 9 | Dealing With Motion |
Module 10 | Physics-Based Vision |
Pricing
Course fees: 2500 USD, before the 20% discount for Deep Tech Central members. To qualify for the 20% discount, sign up for the course on Deep Tech Central using the registration link available here, or email us at [email protected]
Skills Covered
PROFICIENCY LEVEL GUIDE
Beginner: Introduces the subject matter; no prerequisites are needed.
Proficient: Requires learners to have prior knowledge of the subject.
Expert: Involves advanced and more complex understanding of the subject.
- Computer Vision (Proficiency level: Expert)
Speakers
Trainer's Profile:
Kris M. Kitani, Associate Research Professor, Robotics Institute, School of Computer Science | Courtesy Professor, Electrical and Computer Engineering Department, Carnegie Mellon University
Kris Kitani works in the areas of computer vision, machine learning, and human–computer interaction. His research interests lie at the intersection of first-person vision, human activity modeling, and inverse reinforcement learning. Kitani’s work has applications in areas spanning personal and assistive robotics, surveillance and security, infrastructure, field robotics, and manufacturing. Kitani earned his Ph.D. and master’s degree in science from the University of Tokyo. He also has a bachelor’s degree in science from the University of Southern California.
Trainer's Profile:
Ioannis Gkioulekas, Assistant Professor, Robotics Institute, Carnegie Mellon University
Ioannis Gkioulekas works on computational imaging: the process of forming images from measurements using algorithms that rely on a significant amount of computing. While imaging involves optics, sensors, and illumination, computation includes physics-based modeling and rendering, inverse algorithms, and adaptive imaging. He is also broadly interested in computer vision and computer graphics. Gkioulekas earned his Ph.D. from the School of Engineering and Applied Sciences at Harvard University. He also has a diploma in Electrical and Computer Engineering (five-year degree) from the National Technical University of Athens.