VISion Understanding and Machine intelligence – VISUM 2017 was the fifth edition of the Summer School that aims to gather Ph.D. candidates, Post-Doctoral scholars and researchers from academia and industry with research interests in computer vision and machine intelligence.
VISUM 2017 was organized by INESC TEC in the scope of the FourEyes project. FourEyes aims to make critical and tangible advances in the digital media sector and transform the means by which multimedia content is created, distributed and consumed. It will provide the means for scientific, creative and industrial communities to understand how new forms of multimedia content, in a symbiotic relation with intelligent processing, can improve our way of life and can be used to produce innovation. Its long-term vision is to establish a vibrant online network of multimedia content where all users can be content creators as well as consumers, or so-called prosumers.
Considering the existing gap between the most fundamental concepts of computer vision and their application in real-world scenarios, the VISUM summer school seeks to bridge these two key domains. By creating an expert multicultural environment, VISUM aims to foster junior researchers’ awareness of computer vision topics and to enhance all attendees’ knowledge of the state of the art, as presented by leading international experts in the field. Since computer vision is an area of great potential for industrial applications, with a strong increase in the number of researchers in recent years, VISUM is an excellent opportunity to stay at the edge of knowledge.
VISUM comprised three main tracks: fundamental, industrial and application topics, each one with extensive practical “hands-on” sessions. In the industrial track, nationally and internationally renowned institutions presented their case studies and know-how to the attendees. The application track, which takes a different subject each year, aims to bridge the gap between the fundamental and industrial topics.
:: Target Audience ::
- MSc. and Ph.D. candidates;
- Post-Doctoral scholars and researchers;
- Academic and industrial professionals with (research) interests in computer vision;
- Anyone who wants to learn about cutting-edge topics.
Schedule
Speakers
Computer Vision and Machine Learning
Vincent Lepetit
Institute for Computer Graphics and Vision
Graz University of Technology
Dr. Vincent Lepetit is a Full Professor (Universitätsprofessor) at the University of Technology of Graz.
He received his PhD degree in Computer Vision in 2001 from the University of Nancy, France, after working in the ISA INRIA team. He then joined the Virtual Reality Lab at EPFL as a post-doctoral fellow and became a founding member of the Computer Vision Laboratory. He became a Professor at TU Graz in February 2014. His research interests include computer vision and machine learning, and their application to 3D hand pose estimation, feature point detection and description, and 3D registration from images. In particular, he introduced methods such as Ferns, BRIEF, and LINE-MOD for feature point matching and 3D object recognition.
He often serves as program committee member and area chair of major vision conferences (CVPR, ICCV, ECCV, ACCV, BMVC). He is an editor for the International Journal of Computer Vision (IJCV) and the Computer Vision and Image Understanding (CVIU) journal.
Computer Graphics and NVIDIA Research
Samuli Laine
NVIDIA / Aalto University
Dr. Samuli Laine is a principal research scientist and professor at NVIDIA and Aalto University.
He joined NVIDIA Research in 2007 after receiving his Ph.D. at Helsinki University of Technology. His research interests have ranged from shadow and visibility computation to signal processing and efficient representations, authoring, acquisition, and rendering of extremely detailed 3D content. Dr. Laine has had a pivotal role in NVIDIA’s ray tracing research, especially related to the implications of high-latency, wide-SIMD hardware.
Research Interests: Deep Learning, Parallel Architectures, Ray Tracing, Rendering Algorithms.
Deep Learning
Jaime Cardoso
Faculdade de Engenharia, Universidade do Porto/INESC TEC
Jaime dos Santos Cardoso is an Associate Professor with Habilitation/Professor Associado com Agregação at DEEC, in the Faculdade de Engenharia da Universidade do Porto (FEUP), Portugal.
Simultaneously, he is also involved in research and development activities at INESC TEC as Leader of the Information Processing and Pattern Recognition Area.
At INESC TEC, he is the co-founder and co-leader of the Breast Research Group and of the Visual Computing and Machine Intelligence Group (VCMI). VCMI focuses its research on computer vision and pattern recognition.
Motion Analysis
Ioannis (Yiannis) Patras
Associate Professor at School of EECS
Queen Mary University of London
His research is in the area of Machine Learning and Computer Vision with applications in Multimedia Analysis and Multimodal Human Machine Interaction. This includes low-level analysis as well as learning from large datasets or from user interaction. He is particularly interested in the problems of Motion Analysis (including tracking and motion estimation), (semantic) Object Segmentation and (Human) Action Recognition (including face and gesture analysis). A recent line of activity involves multimodal human sensing for Brain Computer Interfaces. He is an associate editor of the Image and Vision Computing Journal and of the Journal of Multimedia, was the general chair of the 10th Int’l Workshop of Image Analysis for Multimedia Interactive Services, and has served on the organising committees of a number of conferences and workshops.
Face Analysis
Hamdi Dibeklioglu
Pattern Recognition & Bioinformatics Group
Delft University of Technology
He is a Postdoctoral Researcher in the Pattern Recognition & Bioinformatics Group of Delft University of Technology. He is also a Research Affiliate in the Computer Vision Group of the University of Amsterdam. Earlier, he was a Visiting Researcher at Carnegie Mellon University, University of Pittsburgh, and Massachusetts Institute of Technology. His research interests include Affective Computing, Intelligent Human-Computer Interaction, Pattern Recognition, and Computer Vision.
Industry
Disney Research
by Leonid Sigal, USA
ABOUT Disney Research
Disney Research’s mission is to drive value for The Walt Disney Company by delivering scientific & technological innovation Company-wide. Our world-class research talent invents and transfers the most compelling technologies enabling the Company to differentiate its content, services, and products. Disney Research combines the best of academia and industry, by doing both basic and application-driven research. We utilize publication as a principal mechanism for quality control and encourage engagement with the global research community. Our research applications and technology are experienced by millions of people. We honor Walt Disney’s legacy by deploying our innovations on a global scale.
ABOUT Leonid Sigal
He is a Senior Research Scientist at Disney Research Pittsburgh and an Adjunct Faculty member at Carnegie Mellon University. His research focuses on recognition, scene understanding, articulated motion capture, motion modeling, action recognition, motion perception, manifold learning, transfer learning, character and cloth animation and a number of other directions on the fringe of computer vision, machine learning, and computer graphics.
Facebook Research
by Dario Garcia Garcia, USA
ABOUT Applied Machine Learning group at Facebook
Machine learning is essential to Facebook. It helps people discover new content and connect with the stories they care the most about. Their applied machine learning researchers and engineers develop machine learning algorithms that rank feeds, ads and search results, and create new text understanding algorithms that keep spam and misleading content at bay. New computer vision algorithms can “read” images and videos to the blind and display over 2 billion translated stories every day, speech recognition systems automatically caption the videos that play in your news feed, and they create new magical visual experiences such as turning panorama photos into fully interactive 360° photos.
ABOUT Dario Garcia
He is currently a Technical Lead for Applied Machine Learning at Facebook, helping push the state of the art and build cutting-edge intelligent systems that can make the world more open and connected by understanding people and content.
Before joining Facebook, he worked as a Senior Data Scientist and Product Owner for the Commonwealth Bank of Australia in the Big Data space. Prior to that, he was a Senior Researcher in Artificial Intelligence at Tecnalia Research & Innovation, designing and applying Machine Learning algorithms for complex industrial systems. Previously, he was a Research Fellow in the Research School of Information Sciences and Engineering (RSISE) of the Australian National University and in the Machine Learning Group at NICTA, working with Prof. Bob Williamson on theoretical aspects of Machine Learning.
NVIDIA Research
by Samuli Laine, FI
ABOUT NVIDIA Research
NVIDIA Research explores challenging topics on the frontiers of visual, parallel, and mobile computing. Their current work spans many domains including computer graphics, physical simulation, scientific computing, computational photography, programming languages, circuit design, and computer architecture. They support advances in these fields through collaboration with academic and industrial research institutions, and disseminate results in technical conferences, journals, and other academic venues.
ABOUT Samuli Laine
Samuli Laine joined NVIDIA Research in 2007 after receiving his Ph.D. at Helsinki University of Technology. His research interests have ranged from shadow and visibility computation to signal processing and efficient representations, authoring, acquisition, and rendering of extremely detailed 3D content. Dr. Laine has had a pivotal role in NVIDIA’s ray tracing research, especially related to the implications of high-latency, wide-SIMD hardware.
Research Interests: Deep Learning, Parallel Architectures, Ray Tracing, Rendering Algorithms.
United Technologies Research Center
by Eduardo Marques, IE
ABOUT United Technologies Research Center
As the innovation hub of United Technologies Corp. (UTC), United Technologies Research Center (UTRC) supports the development of new technologies and capabilities across the company and collaborates with external research organizations, universities and government agencies globally to push the boundaries of science and technology. Further, UTRC leads the monetization of UTC’s intellectual property through business model innovation. UTRC is headquartered in East Hartford, Connecticut, with additional operations at its affiliate in Berkeley, California, and its subsidiaries in Shanghai, China; Rome, Italy; and Cork, Ireland. UTC, based in Farmington, Connecticut, provides high technology systems and services to the building and aerospace industries.
United Technologies Research Centre Ireland, Ltd. (UTRCI), established in 2009, conducts research into the next generation of energy and security systems for high-performance buildings, generating world-class technologies for UTC’s commercial businesses worldwide. Recently, UTRC Ireland added aerospace systems research to its portfolio to support UTC’s large industrial presence in Europe.
ABOUT Eduardo Marques
Eduardo M. Pereira is currently a Senior Research Scientist at UTRC-Ireland. His research activity is related to supervised and unsupervised learning applied to video processing encompassing tasks such as visual retrieval, video summarization and event recognition. His industrial aim is to push research ideas into commercial prototypes as soon as possible.
He holds a PhD on the topic of human activity analysis in video, from a collaboration between UT Austin-Texas University (USA) and Universidade do Porto (Portugal), a Post-graduation in Computer Graphics and Virtual Environments from Universidade do Minho (Portugal) and a Licenciatura (5-year degree) in Electronics and Telecommunications Engineering from Universidade de Aveiro (Portugal). He also has extensive industry experience, having explored diverse topics at several companies, from physical simulation environments, through 3D animation, game development and the analysis of human body gestures in space, to interactive technologies and natural user interfaces.
MOG Technologies
by Alexandre Ulisses, PT
Leading-edge Technology for Broadcasters and New Media Channels.
Based on open industry standards and offering state-of-the-art technology, MOG Technologies has been establishing itself in the market as the worldwide supplier for Centralized Ingest Solutions, Cloud Services and MXF Development Tools.
For over a decade, MOG has been helping worldwide broadcasters and technology providers to increase the overall performance of their workflows while migrating to file-based environments, ensuring high interoperability between systems and formats.
From MXF Experts to worldwide leaders in File-Based Systems and a growing company on the expansion to the cloud, MOG aims to exceed media challenges and to break the workflow boundaries by delivering fully interoperable solutions to the broadcast industry.
Nonius
by António Silva, PT
Enable hospitality operators to provide their guests with a great experience.
Nonius offers state-of-the-art technology that enables hospitality operators to provide their guests with a great experience. It has a comprehensive portfolio of products and services for all Hotel segments, comprising solutions for Guest Internet Access (GIA/HSIA), TV solutions (Interactive/IPTV/COAX), digital signage, telephony (IP/Analog), mobile apps and APIs, and entertainment content (OTT/VoD/MoD).
Nonius services more than 145,000 rooms in Hotels, Hospitals and Cruise Ships across Europe, the Americas, Africa, Asia and Middle-East, which include international hotel chains such as: Accor, Belmond, Corinthia, Meliá, Hilton, Wyndham-Tryp, Starwood, IHG, Marriott, JJW, Uniworld, Viking Cruises, Scenic, Eurostars, Pestana, Transamérica, Estanplaza, Tivoli, Altis, Blue Tree, Sana and Turim.
Canal180
by João Vasconcelos, PT
Canal180 is the first Portuguese Open Source TV channel entirely dedicated to culture, arts and creativity. Following the ever-changing artistic agenda, the channel broadcasts innovative content, created by a new generation of artists. Canal 180 also exclusively produces and curates projects from around the world.
Combining internet and TV in the same platform, Canal 180 is targeted at a growing audience that can finally watch original content on art and culture with easy access. An award-winning TV channel based in Portugal, Canal 180 aims to broadcast worldwide via cable television operators.
Project
During the hands-on sessions of this VISUM 2017 edition, the students were gathered in groups to develop a project covering the subjects of VISUM, such as deep learning, motion analysis and automatic affect recognition.
The main idea was to encourage the students to put what they had learned in the theoretical classes into practice.
– Topics: Detection of criminal activities and automatic content analysis for the media industry.
– Main Objectives: detection of rogue/suspicious human behaviour; identification of criminal activities; content management and production; media asset management.
The winning group was Team Francesinhas, formed by:
- Véronique Gomes, Portugal, UTAD (Vila Real, Portugal)
- Samuel Medrano, Spain, Institute for Geoinformatics (Münster, Germany)
- Cristina Palmero, Spain, Noldus IT (Wageningen, Netherlands)
and they received:
- prize money of €1000, sponsored by Disney Research and UTRC.
- 3 NVIDIA Jetson TX2 Developer Kits, sponsored by NVIDIA.
Programme Committee
Organizing Committee
Local Committee
João C. Monteiro | INESC TEC, FEUP
Renata Rodrigues | INESC TEC
Sponsors
Organization