Machine Learning for Computer Vision:
This introductory course on Machine Learning aims to give the student the skills to understand and apply the following topics:
- The ABC of computer vision: we will briefly review some concepts from image processing and computer vision, with a focus on object detection and recognition. Most of the subsequent concepts will be motivated by object detection and recognition in visual data.
- The ABC of the learning process: we will briefly review some concepts from the machine learning field, with a focus on the classification task and the independent-observations assumption. We will illustrate these concepts with some Matlab examples for object recognition in images. Finally, we will discuss the limitations of these assumptions in computer vision and pave the way for exploiting the interplay between objects and context in object recognition.
- Context-dependent data modelling in computer vision: we will start by discussing the sources of context, from the most common local pixel context, which captures the basic notion that the image pixels/patches around a region of interest carry useful information, to scene and cultural context. Then we will discuss how these sources of context can be used for improved object detection and recognition. This will motivate a review of some of the most common tools for modelling context: Hidden Markov Models, Conditional Random Fields, Martingales, Probabilistic Graphical Models, and Statistical Relational Learning. Some of these tools will be illustrated with applications to object recognition, again using Matlab. Some of the limitations of these techniques will be stressed, and the difficulties of applying them to very large databases of visual information will be addressed.
- Context-dependent data modelling in computer vision: are we there yet?
Despite the quality of the research in this field and the significant recent advances, we will argue that existing data models cannot yet naturally and directly represent such context-dependent information in computer vision. We will highlight open questions and point to promising directions for future research.
- Take-home messages: a conclusion and a suggested set of useful references in the field.
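The course illustrates the classification task with Matlab; as a language-neutral sketch of the same idea (toy data, all names hypothetical), a minimal k-nearest-neighbour classifier under the independent-observations assumption could look like:

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k=3):
    """Classify each test vector by majority vote of its k nearest neighbours."""
    preds = []
    for x in X_test:
        # Euclidean distance to every training sample (observations treated as i.i.d.)
        d = np.linalg.norm(X_train - x, axis=1)
        nearest = y_train[np.argsort(d)[:k]]
        # majority vote among the k nearest labels
        vals, counts = np.unique(nearest, return_counts=True)
        preds.append(vals[np.argmax(counts)])
    return np.array(preds)

# toy 2-D feature vectors for two "object classes"
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([[0.05, 0.1], [1.0, 0.9]])))  # -> [0 1]
```

Real object-recognition features would of course be image descriptors rather than hand-picked 2-D points, but the decision rule is the same.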
- Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
- David Forsyth and Jean Ponce, Computer Vision: A Modern Approach, Prentice Hall, 2002.
- Kevin Murphy, Machine Learning: A Probabilistic Perspective, MIT Press, 2012.
In this lecture I will give an overview of progress in the area of semantic segmentation, covering top-down and bottom-up approaches and different methodologies based on variational approaches and graph cuts. I will also review our group's recent work on semantic visual interpretation based on image segmentation techniques. Unlike existing bag-of-words or regular-grid description methods that bypass image segmentation entirely, and unlike methods that segment images and recognize objects by detecting known object parts or by fusing superpixel maps with random field models, we explore interpretation strategies based on multiple figure-ground segmentations. Central to our approach is a combinatorial parametric max-flow methodology (CPMC) that can explore, exactly, a large space of object layout hypotheses constrained at different image locations and spatial scales, in polynomial time. Once a potentially large ensemble of such hypotheses is obtained, we show that it is possible to distill and diversify a pool of a few hundred elements, at minimal loss of accuracy, by training category-independent models to predict how well each segment hypothesis exhibits real-world regularities based on mid-level properties like boundary smoothness, Euler number or convexity. I will show that such a simple combinatorial strategy, operating only on low-level and mid-level features, can generate segments that cover entire objects or parts in images with high probability and good accuracy, as empirically measured on most existing segmentation benchmarks. Moreover, the figure-ground segment pool can now be used within a sliding-segment (as opposed to sliding-window) strategy, or compositionally, and in conjunction with second-order pooled region descriptions, for object detection, semantic segmentation, video processing or monocular 3D human pose reconstruction.
A proof-of-concept system based on these principles has been demonstrated in the PASCAL VOC semantic segmentation challenge, where it was top-ranked over the past four editions.
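As a rough illustration of the s-t min-cut formulation that CPMC builds on (a toy sketch only, not the CPMC algorithm itself), the following runs a plain Edmonds-Karp max-flow on a four-pixel 1-D "image" with invented unary and pairwise costs; the min-cut separates "figure" from "ground" pixels:

```python
from collections import deque

def max_flow(cap, s, t):
    """Edmonds-Karp: repeatedly push flow along shortest augmenting paths."""
    n = len(cap)
    flow = [[0] * n for _ in range(n)]
    total = 0
    while True:
        # BFS for an augmenting path in the residual graph
        parent = [-1] * n
        parent[s] = s
        q = deque([s])
        while q and parent[t] == -1:
            u = q.popleft()
            for v in range(n):
                if parent[v] == -1 and cap[u][v] - flow[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if parent[t] == -1:
            return total, flow
        # bottleneck capacity along the path found
        v, push = t, float('inf')
        while v != s:
            u = parent[v]
            push = min(push, cap[u][v] - flow[u][v])
            v = u
        v = t
        while v != s:
            u = parent[v]
            flow[u][v] += push
            flow[v][u] -= push
            v = u
        total += push

# Toy 1-D "image" of 4 pixels: nodes 0..3, source 4 (figure), sink 5 (ground).
cap = [[0] * 6 for _ in range(6)]
unary_fg = [5, 4, 1, 0]   # cost of labelling each pixel "ground"
unary_bg = [0, 1, 4, 5]   # cost of labelling each pixel "figure"
for i in range(4):
    cap[4][i] = unary_fg[i]   # source -> pixel t-link
    cap[i][5] = unary_bg[i]   # pixel -> sink t-link
for i in range(3):            # pairwise smoothness between neighbours
    cap[i][i + 1] = cap[i + 1][i] = 1

value, flow = max_flow(cap, 4, 5)
# Pixels still reachable from the source in the residual graph are "figure".
reach = {4}
q = deque([4])
while q:
    u = q.popleft()
    for v in range(6):
        if v not in reach and cap[u][v] - flow[u][v] > 0:
            reach.add(v)
            q.append(v)
figure = [i for i in range(4) if i in reach]
print(value, figure)  # -> 3 [0, 1]
```

CPMC additionally varies a parametric term on the unary costs to sweep out a whole family of such cuts efficiently; this sketch shows only a single cut.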
- J. Carreira and C. Sminchisescu. CPMC: Automatic Object Segmentation Using Constrained Parametric Min-Cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012.
- F. Li, J. Carreira, and C. Sminchisescu. Object Recognition by Sequential Figure-Ground Ranking. In International Journal of Computer Vision, 2012.
- C. Ionescu, F. Li, and C. Sminchisescu. Latent Structured Models for Human Pose Estimation. In IEEE International Conference on Computer Vision, November 2011.
- A. Ion, J. Carreira, and C. Sminchisescu. Probabilistic Joint Image Segmentation and Labeling. In Advances in Neural Information Processing Systems, December 2011.
- J. Carreira, R. Caseiro, J. Batista, and C. Sminchisescu. Semantic Segmentation with Second-Order Pooling. In European Conference on Computer Vision, October 2012.
I will give an overview of methods used in Computer Vision to extract and describe image features, from the ad hoc methods that were originally proposed, to the more modern methods based on Machine Learning techniques.
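As one concrete example of the hand-designed end of that spectrum, a difference-of-Gaussians response (the scale-space operator underlying SIFT keypoint detection) can be sketched in a few lines of NumPy; the image and parameters below are synthetic and purely illustrative:

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur with a truncated kernel (pure NumPy)."""
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    # convolve columns, then rows
    out = np.apply_along_axis(lambda m: np.convolve(m, k, mode='same'), 0, img)
    out = np.apply_along_axis(lambda m: np.convolve(m, k, mode='same'), 1, out)
    return out

# Synthetic image: a single bright blob on a dark background.
yy, xx = np.mgrid[0:64, 0:64]
img = np.exp(-((yy - 32) ** 2 + (xx - 32) ** 2) / (2 * 4.0 ** 2))

# Difference of Gaussians approximates the scale-normalised Laplacian;
# its extremum localises the blob (the idea behind SIFT keypoint detection).
dog = gaussian_blur(img, 4.0) - gaussian_blur(img, 6.0)
peak = np.unravel_index(np.argmax(dog), dog.shape)
print(peak)  # -> (32, 32)
```

A full detector would search for extrema over both space and scale and add orientation and descriptor stages; the learned methods covered later replace these hand-crafted stages with trained models.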
- David G. Lowe, “Distinctive image features from scale-invariant keypoints,” International Journal of Computer Vision, 60, 2 (2004), pp. 91-110.
- Herbert Bay, Andreas Ess, Tinne Tuytelaars, Luc Van Gool, “SURF: Speeded Up Robust Features”, Computer Vision and Image Understanding (CVIU), Vol. 110, No. 3, pp. 346–359, 2008
- Jan Šochman, Jiří Matas. WaldBoost: Learning for Time Constrained Sequential Detection. In Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), 2005.
- T. Trzcinski, M. Christoudias, P. Fua, and V. Lepetit, Boosting Binary Keypoint Descriptors. In Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), 2013.
I will give a general introduction to user-centered design, focusing on some elements of intelligent interaction that make the creation of effective interactive systems a challenge. I will mix practical work with lectures and other materials, so that participants gain experience of designing a system and learn how to understand user requirements.
The advent of the Microsoft Kinect and other RGB-D sensors has resulted in great progress in dense mapping, object recognition and SLAM in recent years. Given the low cost of the sensor coupled with the high resolution visual and depth information provided at video frame rate, methods relying on RGB-D sensors are becoming more popular in tackling some of the key perception problems in robotics and computer vision. This course will feature an overview of many of the recent advances in RGB-D camera-based research, including methods which specifically exploit the wide-scale availability of general-purpose computing on graphics processing units. The following topics will be covered in the course:
- Sensor technology and calibration
- Dense tracking & GPGPU approaches
- 3D reconstruction
- Large scale dense SLAM
- Applications of RGB-D data
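To make the depth-sensing starting point concrete, the following sketch back-projects a depth image to a 3D point cloud using the pinhole camera model; the intrinsics are hypothetical values merely in the range of a Kinect-class sensor, not calibrated constants:

```python
import numpy as np

# Hypothetical pinhole intrinsics, roughly Kinect-like (focal length, principal point).
fx = fy = 525.0
cx, cy = 319.5, 239.5

def backproject(depth):
    """Turn a depth image (metres) into an Nx3 point cloud in the camera frame."""
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]
    z = depth
    x = (u - cx) * z / fx   # pinhole model: u = fx * x / z + cx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]   # drop invalid (zero-depth) pixels

# Example: a flat wall 2 m away fills the whole VGA frame.
depth = np.full((480, 640), 2.0)
cloud = backproject(depth)
print(cloud.shape)  # -> (307200, 3)
```

Dense tracking and reconstruction pipelines such as KinectFusion start from exactly this per-frame point cloud (plus normals) before alignment and fusion; GPGPU implementations compute it in parallel per pixel.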
“Kintinuous: Spatially Extended KinectFusion” by T. Whelan, M. Kaess, M.F. Fallon, H. Johannsson, J.J. Leonard and J.B. McDonald. In RSS Workshop on RGB-D: Advanced Reasoning with Depth Cameras, (Sydney, Australia), July 2012.
“Robust Real-Time Visual Odometry for Dense RGB-D Mapping” by T. Whelan, H. Johannsson, M. Kaess, J.J. Leonard, and J.B. McDonald. In IEEE Intl. Conf. on Robotics and Automation, ICRA, (Karlsruhe, Germany), May 2013.
“Deformation-based Loop Closure for Large Scale Dense RGB-D SLAM” by T. Whelan, M. Kaess, J.J. Leonard, and J.B. McDonald. In IEEE/RSJ Intl. Conf. on Intelligent Robots and Systems, IROS, (Tokyo, Japan), November 2013.
- Introduction to Biometric Recognition
- Biometric Traits: Comparison and Critical Review
- A Biometric Recognition System from the Pattern Recognition Perspective
- Data Acquisition
- Object detection
- Object Segmentation
- Feature Encoding
- Gabor Filters
- Multi-Lobe Differential Filters
- Biometric Systems Performance
- Inter-class and Intra-class Variability
- Performance Measures
- Receiver Operating Characteristic Curves
- Detection-Error Trade-off Curves
- Area Under Curve
- Equal Error Rate
- False Rejection Rate at a Given False Acceptance Rate
- Multimodal Biometrics
- Fusion at Different Levels: Data, Features, Scores and Responses
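The performance measures listed above can be made concrete with a small sketch: given score samples for genuine and impostor comparisons (synthetic data below, illustrative only), the FAR, FRR and Equal Error Rate can be estimated as follows:

```python
import numpy as np

def far_frr(genuine, impostor, threshold):
    """FAR: impostors accepted; FRR: genuine users rejected (higher score = accept)."""
    far = np.mean(impostor >= threshold)
    frr = np.mean(genuine < threshold)
    return far, frr

def equal_error_rate(genuine, impostor):
    """Sweep candidate thresholds; the EER is where FAR and FRR meet."""
    ts = np.sort(np.concatenate([genuine, impostor]))
    best = min(ts, key=lambda t: abs(np.subtract(*far_frr(genuine, impostor, t))))
    return np.mean(far_frr(genuine, impostor, best))

# Synthetic matching scores: inter-class and intra-class distributions overlap,
# which is exactly what the ROC/DET curves characterise.
rng = np.random.default_rng(0)
genuine = rng.normal(2.0, 1.0, 1000)   # matching-pair scores
impostor = rng.normal(0.0, 1.0, 1000)  # non-matching-pair scores
print(round(equal_error_rate(genuine, impostor), 3))
```

Sweeping the threshold and plotting FAR against 1 - FRR yields the ROC curve; plotting FAR against FRR on normal-deviate axes yields the DET curve.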
- Hugo Proença; Iris Biometrics: Indexing and Retrieving Heavily Degraded Data, IEEE Transactions on Information Forensics and Security, vol. 8, no. 12, pp. 1975-1985, 2013, doi: 10.1109/TIFS.2013.2283458.
- Hugo Proença, Luís A. Alexandre; Toward Covert Iris Biometric Recognition: Experimental Results From the NICE Contests, IEEE Transactions on Information Forensics and Security, vol. 7, no. 2, pp. 798-808, doi: 10.1109/TIFS.2011.2177659.
One of the major goals of image processing and computer vision is to extract meaningful high level information from image data. Image classification can refer to a variety of purposes, including the labelling of multi-spectral image pixels to produce thematic maps. This has been a major topic in processing Earth Observation Satellite (EOS) images since the 1970s.
In this talk, a brief overview of the various image classification approaches will be given. The evolution of EOS images will also be addressed, from the historic low-resolution multi-spectral images to the hyperspectral and very high spatial resolution datasets available today. The challenges posed by current and planned image datasets are huge, for a variety of reasons including the hyper-dimensionality and the sheer volume of the data itself. These issues will be addressed together with some of the most promising techniques for handling the image classification task.
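As a minimal illustration of labelling multi-spectral pixels to produce a thematic map, a minimum-distance (nearest-class-mean) classifier can be sketched as follows; the 4-band class signatures here are invented purely for illustration:

```python
import numpy as np

def nearest_mean_classify(pixels, class_means):
    """Label each pixel's spectrum by the closest class mean (minimum-distance classifier)."""
    # pixels: (N, bands); class_means: (C, bands)
    d = np.linalg.norm(pixels[:, None, :] - class_means[None, :, :], axis=2)
    return np.argmin(d, axis=1)

# Two hypothetical 4-band reflectance signatures: class 0 "water" (dark in all
# bands), class 1 "vegetation" (high near-infrared response in the last band).
means = np.array([[0.05, 0.04, 0.03, 0.02],
                  [0.03, 0.06, 0.05, 0.40]])
pixels = np.array([[0.04, 0.05, 0.03, 0.03],   # resembles water
                   [0.02, 0.07, 0.04, 0.35]])  # resembles vegetation
print(nearest_mean_classify(pixels, means))  # -> [0 1]
```

With hyperspectral data the same rule applies over hundreds of bands, which is where the hyper-dimensionality problems mentioned above (and more capable classifiers) come in.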
Professor André Marçal
INESC TEC, Faculdade de Ciências, Universidade Porto
Eng. Pedro Santos
Albatroz Engineering
Eng. Sérgio Copeto
Machine Learning for Computer Vision
Jaime S. Cardoso received a Licenciatura (5-year degree) in Electrical and Computer Engineering in 1999, an MSc in Mathematical Engineering in 2005 and a Ph.D. in Computer Vision in 2006, all from the University of Porto. Cardoso is an Assistant Professor at the Faculty of Engineering of the University of Porto (FEUP), where he has been teaching Machine Learning and Computer Vision in doctoral programmes and multiple graduate courses.
Cardoso is currently the leader of the ‘Information Processing and Pattern Recognition’ Area in the Telecommunications and Multimedia Unit of INESC Porto.
His research can be summed up in three major topics: computer vision, machine learning and decision support systems. Cardoso has co-authored 100+ papers, 30+ of which in international journals.
Cristian Sminchisescu is a Professor in the Department of Mathematics, Faculty of Engineering, at Lund University. He obtained a doctorate in computer science and applied mathematics with specialization in imaging, vision and robotics at INRIA, France, under an Eiffel excellence doctoral fellowship, and did postdoctoral research in the Artificial Intelligence Laboratory at the University of Toronto. He holds a Professor-equivalent title at the Romanian Academy and a Professor-rank status appointment at Toronto, and advises research at both institutions. During 2004-07 he was a faculty member at the Toyota Technological Institute, a philanthropically endowed computer science institute located at the University of Chicago, and during 2007-2012 on the faculty of the Institute for Numerical Simulation in the Mathematics Department at Bonn University. He is a member of the Editorial Board (Associate Editor) of IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI). He has offered tutorials on 3D tracking, recognition and optimization at ICCV and CVPR, the Chicago Machine Learning Summer School, the AERFAI Vision School in Barcelona, and the Computer Vision Summer School at ETH Zurich. Over time, his work has been funded by the United States National Science Foundation, the Romanian Science Foundation, the German Science Foundation, the Swedish Science Foundation, and the European Commission, under a Marie Curie Excellence Grant. Cristian Sminchisescu's research goal is to train computers to `see' and interact with the world seamlessly, as humans do. His research interests are in the areas of computer vision (articulated objects, 3D reconstruction, segmentation, and object and action recognition) and machine learning (optimization and sampling algorithms, structured prediction, and kernel methods).
Joao Carreira received his doctorate from the University of Bonn, Germany. His thesis focused on sampling class-independent object segmentation proposals using the Constrained Parametric Min-Cuts (CPMC) algorithm, and on studying their feasibility in object recognition and localization. Systems authored by him and colleagues won all four PASCAL VOC Segmentation challenges, 2009-2012. He did post-doctoral work at the Institute of Systems and Robotics in Coimbra, Portugal, and is currently with the EECS department at the University of California, Berkeley, USA. His research interests lie at the intersection of recognition, segmentation, pose estimation and shape reconstruction of objects from a single image.
Vincent Lepetit recently joined TU Graz, Austria, as a Full Professor. Before that, he was a Senior Researcher at the Computer Vision Laboratory, EPFL, Switzerland. He received his engineering and master's degrees in Computer Science from ESIAL in 1996, and his PhD in Computer Vision in 2001 from the University of Nancy, France, after working in the ISA INRIA team. He then joined the Virtual Reality Lab at EPFL (Swiss Federal Institute of Technology) as a post-doctoral fellow and became a founding member of the Computer Vision Laboratory in 2004.
His research interests include vision-based Augmented Reality, 3D camera tracking, object recognition, and 3D reconstruction. He often serves as program committee member and area chair of major vision conferences.
Thomas Whelan received his B.Sc. (Hons) in Computer Science & Software Engineering from the National University of Ireland Maynooth in 2011. He is currently finishing his Ph.D. at the National University of Ireland Maynooth under a three-year post-graduate scholarship from the Irish Research Council. In 2012 he spent three months as a visiting researcher in Prof. John Leonard's group at CSAIL, MIT, funded by a Science Foundation Ireland Short-Term Travel Fellowship. As the principal author of the Kintinuous RGB-D SLAM system for dense mapping over large-scale environments, his research focuses on developing methods for dense real-time perception and its applications in SLAM and robotics.
Human Computer Interaction
Russell Beale is the Director of the HCI Centre in the School of Computer Science at the University of Birmingham, UK, which promotes leading-edge research and development in theories, designs, methodologies, and systems to support people in whatever they want to achieve. His research interests include supporting social interaction, mobile and pervasive systems interaction, creative design, and information visualization and access. He received his DPhil in computer science from the University of York. He is chair of the British Computer Society's HCI Special Interest Group.
Hugo Proença received the B.Sc. degree from the University of Beira Interior, Portugal, in 2001, the M.Sc. degree from the Faculty of Engineering, University of Oporto, in 2004, and the Ph.D. degree from the University of Beira Interior, in 2007. His research interests are focused on artificial intelligence, pattern recognition and biometrics. Currently, he serves as Assistant Professor in the Department of Computer Science, University of Beira Interior. He is the area editor (ocular biometrics) of the IEEE Biometrics Compendium Journal and a member of the Editorial Board of the International Journal of Biometrics. He has also served as Guest Editor of special issues of the Pattern Recognition Letters, Image and Vision Computing, and Signal, Image and Video Processing journals.
André Marçal received the B.Sc. degree from the Faculty of Sciences, University of Porto, Portugal, in 1991, and the M.Sc. and PhD degrees from the University of Dundee, in 1994 and 1998, respectively. In 1999 he received the Remote Sensing Society Student Award (annual prize for the best PhD) and in 2007 the Syngenta Innovation Agriculture award. He is a member of the scientific committees of various international conferences on remote sensing and image processing, having reviewed 51 full papers and over 120 abstracts/short papers.
- Jaime S. Cardoso, INESC TEC, Faculdade de Engenharia da Universidade do Porto
- Hélder P. Oliveira, INESC TEC
- Eduardo M. Pereira, INESC TEC, Faculdade de Engenharia da Universidade do Porto
- Ana Rebelo, INESC TEC
- Ricardo Sousa, INEB, Universidade do Porto
Graphic Design Committee
- Andreia Pais Cunha, Faculdade de Belas Artes da Universidade do Porto
- André Matos, Faculdade de Ciências Sociais e Humanas da Universidade Nova de Lisboa
- Ana F. Sequeira, INESC TEC, Faculdade de Engenharia da Universidade do Porto
- Isidro Ribeiro, Faculdade de Engenharia da Universidade do Porto
- Renata Rodrigues, INESC TEC