Tutorials

Call for Tutorial Proposals

Proposals are solicited for tutorials to be organized in conjunction with the 2018 IEEE Conference on Automatic Face and Gesture Recognition (FG 2018: http://www.fg2018.org/), which will be held from May 15-19, 2018, in Xi’an, China. The tutorials should complement and enhance the scientific program of FG 2018 by providing authoritative and comprehensive overviews of growing research themes relevant to the state of the art and the conference topics.

Accepted tutorials will be held on either May 15 or May 19, 2018, in the same venue as the FG 2018 main conference, the Grand New World Hotel, Xi’an, China. The tutorials are half-day events (3 to 4 hours including breaks).

We solicit proposals on any topic of interest to the FG community. Interdisciplinary topics that could attract a significant cross-section of the community are highly encouraged. We particularly welcome tutorials that address advances in emerging areas not previously covered in an FG-related tutorial.

A tutorial proposal should include:
- Title
- Proposer’s contact information and short CV
- Names of any additional lecturers and their short CVs
- Tutorial description, its relevance to the FG community, and an evaluation plan
- References and experience of the instructors with respect to the proposed tutorial topic
- Planned length of the tutorial
- List of relevant tutorials recently presented at other conferences
- Requirements (e.g., facilities, internet access, etc.)
- Other useful information (e.g., estimated attendance, slides/notes available, etc.)

The main conference will provide rooms, equipment, and coffee breaks for the tutorials. For any additional questions please contact Vitomir Struc (vitomir.struc@fe.uni-lj.si) and Yingli Tian (ytian@ccny.cuny.edu), FG 2018 Workshop Chairs.

For your reference, the titles of the four tutorials held in conjunction with the previous FG conference (FG2017, http://www.fg2017.org/index.php/tutorials/) were as follows:

- Multi-view Face Representation
- Remote Physiological Measurement from Images and Videos
- From Deep Unsupervised to Supervised Models for Face Analysis
- Statistical Methods for Affective Computing

REVIEW PROCESS

Tutorial proposals will be evaluated on the basis of their estimated benefit for the community and their fit within the tutorials program as a whole. Factors to be considered include: relevance, timeliness, importance, and audience appeal; suitability for presentation in a half or full day format; past experience and qualifications of the instructors. Selection will also be based on the overall distribution of topics, expected attendance, and specialties of the intended audiences.

SUBMISSION PROCESS

Tutorial proposals must be sent to the FG 2018 Workshop Chairs, Vitomir Struc (vitomir.struc@fe.uni-lj.si) and Yingli Tian (ytian@ccny.cuny.edu), with the email subject “FG 2018 Tutorial Proposal: [title of your tutorial]” before Dec. 22, 2017. Decisions will be communicated to the proposers by Jan. 15, 2018.

IMPORTANT DATES

Tutorial proposals due: Dec. 22, 2017
Notification of acceptance: Jan. 15, 2018
Workshops and tutorials: May 15 and May 19, 2018



PERSON RE-IDENTIFICATION: RECENT ADVANCES AND CHALLENGES


Date and time:
Room:
Presenters: Shiliang Zhang, Jingdong Wang, Qi Tian, Wen Gao, and Longhui Wei


Tutorial description:
Person Re-Identification (ReID), a research topic attracting growing interest in both academia and industry, aims to identify re-appearing persons across a large set of videos. It has the potential to address challenging data storage problems, to open unprecedented possibilities for intelligent video processing and analysis, and to enable promising public security applications such as cross-camera pedestrian search, tracking, and event detection.

This tutorial aims to review the latest research advances, discuss the remaining challenges in person ReID, and provide a communication platform for researchers working on or interested in this topic. It includes several talks given by researchers working closely on person ReID. The confirmed talks cover the following topics:
- Wide deep models for fine-grained pattern recognition
- Local and global representation learning for person ReID
- The application of Generative Adversarial Networks in person ReID
- Open issues and promising research topics of person ReID

These talks cover our latest work on person ReID, as well as our viewpoints on its unsolved challenges. We believe this tutorial will be helpful for researchers working on person ReID and related topics.


About the presenters:
Shiliang Zhang is currently an Assistant Professor in the School of Electronic Engineering and Computer Science, Peking University. He received the Ph.D. degree in computer science from the Institute of Computing Technology, Chinese Academy of Sciences, in 2012. He was a Postdoctoral Scientist at NEC Labs America and a Postdoctoral Research Fellow at the University of Texas at San Antonio. Dr. Zhang’s research interests include large-scale image retrieval, person re-identification, and computer vision for autonomous driving. He was awarded the National 1000 Youth Talents Plan of China, Outstanding Doctoral Dissertation Awards from both the Chinese Academy of Sciences and the China Computer Federation (CCF), the President Scholarship of the Chinese Academy of Sciences, the NEC Laboratories America Spot Recognition Award, and the Microsoft Research Fellowship. He is the recipient of a Top 10% Paper Award at IEEE MMSP 2011. His research is supported by the National 1000 Youth Talents Plan, the Natural Science Foundation of China (NSFC), and Microsoft Research.



Jingdong Wang is a Senior Researcher at the Visual Computing Group, Microsoft Research, Beijing, China. His areas of interest include computer vision, machine learning, and multimedia. He is currently working on CNN architecture design, human understanding, person re-identification, multimedia search, and large-scale indexing. He has served/will serve as an area chair for ACMMM 2018, ICPR 2018, AAAI 2018, ICCV 2017, CVPR 2017, ECCV 2016, ACMMM 2015, and ICME 2015, and as a track chair for ICME 2012. He is an editorial board member for IEEE Transactions on Circuits and Systems for Video Technology, IEEE Transactions on Multimedia, and Multimedia Tools and Applications. He has shipped 10+ technologies to Microsoft products, including the XiaoIce chatbot, Microsoft cognitive services, and Bing search.



Qi Tian is currently a Full Professor in the Department of Computer Science, the University of Texas at San Antonio (UTSA). During 2008-2009, he took a one-year faculty leave at Microsoft Research Asia (MSRA) as Lead Researcher in the Media Computing Group. Dr. Tian received his Ph.D. in ECE from the University of Illinois at Urbana-Champaign (UIUC) in 2002, his B.E. in Electronic Engineering from Tsinghua University in 1992, and his M.S. in ECE from Drexel University in 1996. Dr. Tian’s research interests include computer vision, multimedia content analysis, image and video indexing and retrieval, and machine learning. He has published over 400 refereed journal and conference papers (161 journal papers, including 94 in IEEE/ACM Transactions, and 76 CCF Category A conference papers). He was a co-author of a Best Paper at ACM ICMR 2015, a Best Paper at PCM 2013, a Best Paper at MMM 2013, a Best Paper at ACM ICIMCS 2012, a Top 10% Paper Award at MMSP 2011, and a Student Contest Paper at ICASSP 2006. He received the 2017 UTSA President's Distinguished Award for Research Achievement, the 2016 UTSA Innovation Award in the first category, the 2014 Research Achievement Award from the College of Science, UTSA, the 2010 Google Faculty Research Award, and the 2010 ACM Service Award. He is an Associate Editor of IEEE Transactions on Multimedia (TMM), IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), ACM Transactions on Multimedia Computing, Communications and Applications (TOMM), and Multimedia Systems Journal (MMSJ), and is on the editorial boards of the Journal of Multimedia (JMM) and Machine Vision and Applications (MVA). Dr. Tian is a Fellow of the IEEE.



Wen Gao received the B.S. degree in computer science from Harbin University of Science and Technology in 1982, the M.S. and Ph.D. degrees in computer science from Harbin Institute of Technology, Harbin, China, in 1985 and 1988, respectively, and the Ph.D. degree in electronics engineering from the University of Tokyo, Tokyo, Japan, in 1991. He is a professor in the Department of Computer Science and Technology at Peking University, Beijing, China. He is the founding director of the National Engineering Laboratory on Video Technology (NELVT) at Peking University. He has also been the Chief Scientist of the National Basic Research Program of China (973 Program) on Video Coding Technology since 2009, and a vice president of the National Natural Science Foundation of China since 2013. He works in the areas of multimedia and computer vision, including video coding, video analysis, multimedia retrieval, face recognition, and multimodal interfaces. He has published six books and over 700 technical articles in refereed journals and proceedings in these areas. His publications have been cited over 31,000 times according to Google Scholar. He has served on the editorial boards of several journals, such as IEEE Transactions on Circuits and Systems for Video Technology, IEEE Transactions on Multimedia, IEEE Transactions on Autonomous Mental Development, the EURASIP Journal of Image Communications, and the Journal of Visual Communication and Image Representation. He has chaired a number of prestigious international conferences on multimedia and video signal processing, such as IEEE ICME 2007, ACM Multimedia 2009, and IEEE ISCAS 2013, and has served on the advisory and technical committees of numerous professional organizations. He has earned many awards, including one second-class award in technology invention and six second-class awards in science and technology achievement from the State Council. He is a fellow of the IEEE, a fellow of the ACM, and a member of the Chinese Academy of Engineering.



Longhui Wei received the B.S. degree from Northwestern Polytechnical University, Xi’an, China, in 2016. He is currently a graduate student at Peking University, under the supervision of Prof. Shiliang Zhang and Prof. Qi Tian. His recent work focuses on deep learning and person re-identification.

REPRESENTATION LEARNING FOR FACE ALIGNMENT AND RECOGNITION


Date and time:
Room:
Presenters: Hao Liu, Yueqi Duan, and Jiwen Lu


Tutorial description:
Over the past decade, face alignment and recognition have become two of the most widely used applications in computer vision, and representation learning plays an important role in both tasks. In this tutorial, we will overview trends in face alignment and recognition and discuss how representation learning techniques boost the performance of the two tasks. First, we briefly introduce the basic concepts of face alignment and recognition and outline the key advantages and disadvantages of existing representation learning methods for these tasks. Second, we introduce some of our newly proposed representation learning methods from two aspects: representation learning for face alignment and representation learning for face recognition, each developed as an application-specific technique. Lastly, we will discuss open problems in face alignment and recognition and show how more advanced representation learning algorithms could be developed for these tasks in the future.


About the presenters:
Hao Liu received the B.S. degree in software engineering from Sichuan University, China, in 2011, the Master of Engineering degree in computer technology from the University of Chinese Academy of Sciences, China, in 2014, and the Ph.D. degree in Control Science and Technology from Tsinghua University, China, in 2018. He is currently an assistant professor with the School of Information Engineering, Ningxia University, China. His research interests include face alignment, facial age estimation, and deep learning. He has published several papers in journals such as IEEE TPAMI, TIP, TCSVT, TIFS, and PR. He serves as a reviewer for many journals and conferences, such as IEEE TPAMI, IEEE Access, Neurocomputing, WACV, ICME, and ICIP. He was a recipient of the National Scholarship at Tsinghua in 2017. He was invited to give oral and poster presentations at the Doctoral Consortium of IEEE FG 2017.



Yueqi Duan received the B.S. degree from the Department of Automation, Tsinghua University, China, in 2014. He is currently a Ph.D. candidate with the Department of Automation, Tsinghua University, China. His current research interests include deep learning, unsupervised learning, and binary representation learning. He has authored/co-authored 13 scientific papers in these areas, in venues including IEEE TPAMI, TIP, TCSVT, and CVPR. He serves as a reviewer for several journals and conferences, such as TCSVT, IEEE Access, Pattern Recognition, Neurocomputing, and ICIP. He received the National Scholarship at Tsinghua in 2017.



Jiwen Lu is currently an Associate Professor with the Department of Automation, Tsinghua University, China. His current research interests include computer vision, pattern recognition, and machine learning. He has authored/co-authored over 180 scientific papers in these areas, including 53 IEEE Transactions papers (8 of them in PAMI) and 27 papers at top-tier computer vision conferences (ICCV/CVPR/ECCV/NIPS). He is an elected member of the Multimedia Signal Processing Technical Committee and the Information Forensics and Security Technical Committee of the IEEE Signal Processing Society, and an elected member of the Multimedia Systems and Applications Technical Committee of the IEEE Circuits and Systems Society. He is an Associate Editor for IEEE Transactions on Circuits and Systems for Video Technology, Pattern Recognition, Pattern Recognition Letters, the Journal of Visual Communication and Image Representation, Neurocomputing, and IEEE Access. He serves/has served as an Area Chair for several international conferences, such as ICME 2018, ICIP 2018, ICPR 2018, WACV 2018, ICIP 2017, VCIP 2016, ICB 2016, BTAS 2016, WACV 2016, ICME 2015, and ICB 2015, as a Workshop Chair for WACV 2017 and ACCV 2016, and as a Special Session Chair for ICB 2019 and VCIP 2015. He is a Senior Member of the IEEE.

READING HIDDEN EMOTIONS FROM MICRO-EXPRESSION ANALYSIS


Date and time:
Room:
Presenters: Guoying Zhao and Matti Pietikäinen


Tutorial description:
Facial expressions are one of the major ways that humans convey emotions. Aside from the ordinary facial expressions that we see every day, under certain circumstances emotions can also manifest themselves in the special form of micro-expressions (MEs): rapid, involuntary facial expressions which reveal emotions that people do not intend to show. Studying MEs is valuable because recognizing them has many important applications, particularly in forensic science and psychotherapy. Although micro-expressions have been studied by psychologists for many years, they are relatively new to the computer vision field. There are three main problems related to MEs: spontaneous micro-expression inducement and collection, ME spotting, and ME recognition. All of them will be covered in the tutorial.


About the presenters:
Guoying Zhao (SM'12) is currently a Professor with the Center for Machine Vision and Signal Analysis, University of Oulu, Finland, where she has been a senior researcher since 2005 and an Associate Professor since 2014. She received the Ph.D. degree in computer science from the Chinese Academy of Sciences, Beijing, China, in 2005. In 2011, she was selected for the highly competitive Academy Research Fellow position. She was a Nokia Visiting Professor in 2016. She has authored or co-authored more than 160 papers in journals and conferences, which currently have over 6,900 citations in Google Scholar (h-index 35). She is a co-publicity chair for FG 2018, has served as an area chair for several conferences, and is an associate editor for the Pattern Recognition, IEEE Transactions on Circuits and Systems for Video Technology, and Image and Vision Computing journals. She has lectured tutorials at ICPR 2006, ICCV 2009, and SCIA 2013, and has authored/edited three books and seven journal special issues. Dr. Zhao was a co-chair of many international workshops at ECCV, ICCV, CVPR, ACCV, and BMVC. Her current research interests include image and video descriptors, facial-expression and micro-expression recognition, gait analysis, dynamic-texture recognition, human motion analysis, and person identification. Her research has been reported by Finnish TV programs, newspapers, and MIT Technology Review.



Prof. Matti Pietikäinen received his Doctor of Science in Technology degree from the University of Oulu, Finland. He is currently Senior Research Advisor at the Center for Machine Vision and Signal Analysis, University of Oulu. From 1980 to 1981 and from 1984 to 1985, he visited the Computer Vision Laboratory at the University of Maryland. He has made pioneering contributions, e.g., to local binary pattern (LBP) methodology, texture-based image and video analysis, and facial image analysis. He has authored over 340 refereed papers in international journals, books, and conferences. His papers have over 45,000 citations in Google Scholar (h-index 71), and eight of his papers have over 1,000 citations each. He was an Associate Editor of the IEEE Transactions on Pattern Analysis and Machine Intelligence, Pattern Recognition, and IEEE Transactions on Information Forensics and Security journals, and currently serves as an Associate Editor of the Image and Vision Computing journal. He was President of the Pattern Recognition Society of Finland from 1989 to 1992, and was named its Honorary Member in 2014. From 1989 to 2007 he served as a member of the Governing Board of the International Association for Pattern Recognition (IAPR), and he became one of the founding fellows of the IAPR in 1994. He is an IEEE Fellow for contributions to texture and facial image analysis for machine vision. In 2014, his research on LBP-based face description was awarded the Koenderink Prize for Fundamental Contributions in Computer Vision.

ACTIVE AUTHENTICATION IN MOBILE DEVICES: ROLE OF FACE AND GESTURE


Date and time:
Room:
Presenters: Vishal M. Patel and Julian Fierrez


Tutorial description:
Recent developments in sensing and communication technologies have led to an explosion in the use of mobile devices such as smartphones and tablets. With this increase in use, security and privacy have become constant concerns, as the loss of a mobile device would compromise the personal information of its user. To deal with this problem, active authentication (also known as continuous authentication) systems have been proposed, in which users are continuously monitored after the initial access to the mobile device. This tutorial will provide an overview of different continuous authentication methods on mobile devices, with a special focus on those based on face and gesture. We will discuss the merits and drawbacks of available approaches and identify promising avenues of research in this rapidly evolving field. The tutorial should prove valuable to security and biometrics experts, exposing them to the opportunities provided by continuous authentication approaches. It should also prove beneficial to experts in computer vision and signal and image processing, introducing them to a new paradigm of practical importance with very interesting research challenges.


About the presenters:
Vishal M. Patel [SM'16] is an A. Walter Tyson Assistant Professor in the Department of Electrical and Computer Engineering at Rutgers University. Prior to joining Rutgers University, he was a member of the research faculty at the University of Maryland Institute for Advanced Computer Studies (UMIACS). He received his Ph.D. in Electrical Engineering from the University of Maryland, College Park, MD, in 2010. His current research interests include signal processing, computer vision, and pattern recognition, with applications in biometrics and imaging. He has received a number of awards, including the 2016 ONR Young Investigator Award, the 2016 Jimmy Lin Award for Invention, the A. Walter Tyson Assistant Professorship Award, the Best Paper Award at IEEE AVSS 2017, the Best Paper Award at IEEE BTAS 2015, and Best Poster Awards at BTAS 2015 and 2016. He is an Associate Editor of the IEEE Signal Processing Magazine and the IEEE Biometrics Compendium, and serves on the Information Forensics and Security Technical Committee of the IEEE Signal Processing Society. He is a member of Eta Kappa Nu, Pi Mu Epsilon, and Phi Beta Kappa.



Julian Fierrez [M'02] received the MSc and PhD degrees in telecommunications engineering from Universidad Politecnica de Madrid, Spain, in 2001 and 2006, respectively. Since 2002 he has been affiliated with the ATVS Biometric Recognition Group, first at Universidad Politecnica de Madrid and, since 2004, at Universidad Autonoma de Madrid, where he has been an Associate Professor since 2010. From 2007 to 2009 he was a visiting researcher at Michigan State University in the USA under a Marie Curie fellowship. His research interests include general signal and image processing, pattern recognition, and biometrics. Since 2016 he has been an Associate Editor for IEEE Trans. on Information Forensics and Security and the IEEE Biometrics Council newsletter. Prof. Fierrez has been actively involved in multiple EU projects focused on biometrics (e.g., TABULA RASA and BEAT), has attracted notable impact with his research, and is the recipient of a number of distinctions, including the EURASIP Best PhD Award 2012, a Medal in the Young Researcher Awards 2015 from the Spanish Royal Academy of Engineering, and the Miguel Catalan Award to the Best Researcher under 40 in the Community of Madrid in the general area of Science and Technology. In 2017 he was also awarded the IAPR Young Biometrics Investigator Award, given every two years to a single researcher worldwide under the age of 40 whose research work has had a major impact in biometrics.

INTRODUCTION TO DEEP LEARNING FOR FACIAL UNDERSTANDING


Date and time:
Room:
Presenters: Raymond Ptucha


Tutorial description:
Deep learning has been revolutionizing the machine learning community. This tutorial will first review Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs). After understanding what CNNs and RNNs are and how they work, participants will cover techniques such as fully convolutional networks, sequence models, and latent representations, in preparation for understanding the latest methods for face detection, facial recognition, super resolution for faces, generative networks for faces, people detection, pose estimation, facial swapping, and facial video redaction. Sample code will be reviewed and distributed so that, upon completion, participants can run code on their own data. The final third of this tutorial includes a hands-on portion where participants will practice building and testing their own deep models using provided cloud resources.
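
To give a flavor of the kind of model participants might build in the hands-on portion, below is a minimal sketch of a small CNN classifier in PyTorch. This is an illustrative example only, not the tutorial's distributed sample code; the TinyFaceCNN name, the 64x64 input size, and the face/non-face task are assumptions made for the sketch.

```python
import torch
import torch.nn as nn

class TinyFaceCNN(nn.Module):
    """Toy face / non-face classifier (illustrative; not the tutorial's code)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),  # 64x64 -> 64x64
            nn.ReLU(),
            nn.MaxPool2d(2),                             # -> 32x32
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                             # -> 16x16
        )
        self.classifier = nn.Linear(32 * 16 * 16, 2)     # two class logits

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

# Sanity check on a random batch of eight 64x64 RGB crops.
model = TinyFaceCNN()
print(model(torch.randn(8, 3, 64, 64)).shape)  # torch.Size([8, 2])
```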


About the presenters:
Raymond Ptucha is an Assistant Professor in Computer Engineering and Director of the Machine Intelligence Laboratory at Rochester Institute of Technology.  His research specializes in machine learning, computer vision, and robotics, all with an emphasis on deep learning.  Ray was a research scientist with Eastman Kodak Company where he worked on computational imaging algorithms and was awarded 31 U.S. patents with another 19 applications on file. He graduated from SUNY/Buffalo with a B.S. in Computer Science and a B.S. in Electrical Engineering. He earned a M.S. in Image Science from RIT. He earned a Ph.D. in Computer Science from RIT in 2013. Ray was awarded an NSF Graduate Research Fellowship in 2010 and his Ph.D. research earned the 2014 Best RIT Doctoral Dissertation Award.  Ray is a NVIDIA Deep Learning Institute Certified Instructor and University Ambassador, a passionate supporter of STEM education, and is active in IEEE and FIRST robotics organizations.

SIGN LANGUAGE RECOGNITION AND GESTURE ANALYSIS


Date and time:
Room:
Presenters: Xiujuan Chai and Xin Zhang


Tutorial description:
In recent years, the analysis of hand-related activities, involving a complex non-rigid object, has attracted increasing research attention, especially sign language recognition (SLR), hand/fingertip detection, and hand pose estimation. This tutorial will overview trends in these three areas and discuss the mainstream techniques, including manifold learning, dictionary learning, metric learning, and deep learning. First, we briefly introduce the development of SLR and then present progress on sign representation and modeling, signer-independent SLR, and continuous SLR. Second, hand pose estimation is surveyed in three categories: frame-based, video-based, and 3D-skeleton-based. Then, hand/fingertip detection and classification are introduced from both first-person and third-person views. Lastly, we will discuss open problems in SLR and hand gesture analysis and show how more advanced learning algorithms for visual recognition could be developed in the future.


About the presenters:
Xiujuan Chai received the B.Eng., M.Eng., and Ph.D. degrees in computer science from the Harbin Institute of Technology, China, in 2000, 2002, and 2007, respectively. She was a postdoctoral researcher at Nokia Research Center (Beijing) from 2007 to 2009. She joined the Institute of Computing Technology, Chinese Academy of Sciences, Beijing, in July 2009, and is now an Associate Professor. Her research interests are computer vision, pattern recognition, and multimodal human-computer interaction, with a particular focus on sign language recognition. She has extensive publications in leading journals and international conferences/workshops, and serves as a reviewer and PC member for many top journals and international conferences. As team leader, she won first place in both the 2016 and 2017 ChaLearn LAP Large-scale Continuous Gesture Recognition Challenges. She is a recipient of China’s State Natural Science Award (2015) and a member of the Youth Innovation Promotion Association, CAS.



Xin Zhang received her B.S. degree in automatic engineering from Northwestern Polytechnical University in 2003, and the M.S. and Ph.D. degrees in electrical engineering from Oklahoma State University, USA, in 2005 and 2011, respectively. She joined the School of Electronic and Information Engineering, South China University of Technology (SCUT), in 2011, and is now an Associate Professor. Her research interests include computer vision, image processing, and intelligent human-computer interaction, especially 3D human/hand pose estimation and hand gesture recognition. Dr. Zhang has published nearly 30 articles in leading journals and conferences. She has been the principal investigator of research projects funded by NSFC, the Ministry of Education, and the Guangdong Science Foundation, as well as of a few collaboration projects with Microsoft Research Asia. She has also served as a reviewer for many international conferences and journals. Dr. Zhang won First Prize in the Young Faculty Teaching Contest of SCUT. She and her students won first place in the hand detection and hand classification tasks, respectively, of the 2016 CVPR Vision for Intelligent Vehicles and Applications (VIVA) contest. At the 2017 ICCV HANDS (Observing and Understanding Hands in Action) workshop, her paper on fingertip detection received the sole best paper award.

PHYSIOLOGICAL MEASUREMENT FROM IMAGES AND VIDEOS


Date and time:
Room:
Presenters: Daniel McDuff


Tutorial description:
Over the past 10 years there have been significant advances in remote imaging methods for capturing physiological signals. Remote measurement of physiology using cameras has numerous applications. In certain situations (e.g., burns, babies, delicate skin), long-term contact sensor application causes skin irritation and discomfort. In applications with high levels of body motion (e.g., athletics), contact sensors can be corrupted by muscle or sensor movement artifacts. Finally, there are applications in which the use of body-worn sensors is impractical (e.g., tele-health). Many measurement approaches involve analysis of the human face. The face has several advantages for analysis over other regions of the body: 1) it has high levels of blood perfusion (beneficial for optical and thermal imaging of blood flow); 2) it is typically not obstructed by clothing, thus allowing accurate motion tracking and measurement of visual signals; and 3) there are a number of effective, automated methods for face detection and tracking.
Approaches for the remote measurement of physiology utilize machine learning and digital signal and image processing to recover very subtle changes in videos caused by human physiology. Methods for measurement of pulse rate, respiration rate, pulse rate variability, blood oxygenation, blood perfusion, and pulse transit time from images have been presented. These signals are clinically important as vital signs and are also influenced by autonomic nervous system activity.
The first part of this tutorial will cover the fundamentals of remote imaging photoplethysmography. Following this, there will be a deeper dive into state-of-the-art techniques for motion and dynamic illumination tolerance. Newer methods that leverage deep neural networks will be discussed, as will related approaches for skin segmentation and face detection/tracking. The impact of frame rate, image resolution, and video compression on the blood volume pulse signal-to-noise ratio and on the accuracy of physiological parameters will be characterized and discussed. Advancements in multispectral and hyperspectral imaging will also be presented, highlighting how hardware as well as software can be adapted to improve physiological measurement. Finally, examples of visualization techniques and applications will be presented; specifically, applications in clinical settings (ICU and NICU) and examples of measuring heart rate variability as a measure of cognitive stress will be addressed.
Sensing physiological signals from the face is a natural complement to other forms of facial expression and gesture analysis. For example, facial expressions capture rich affective information particularly related to emotional valence, whereas physiological responses capture equally rich affective information related more strongly to emotional arousal. Furthermore, remote physiological analysis from the face leverages many techniques of interest to the AFGR community, including face detection, skin region detection, face tracking and registration, and the robustness of these approaches to motion, lighting, and appearance changes.
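
To make the basic pipeline concrete, here is a minimal sketch of the core imaging photoplethysmography idea described above: average the green channel over a tracked skin region, band-pass filter to plausible heart-rate frequencies, and read the pulse rate off the dominant spectral peak. The synthetic frames array and the 30 fps rate are assumptions for illustration; real systems add face tracking, skin segmentation, and the motion/illumination compensation the tutorial covers.

```python
import numpy as np
from scipy.signal import butter, filtfilt

fps = 30.0
# frames: (T, H, W, 3) uint8 video of a tracked face/skin region.
# Random data stands in here so the sketch runs end to end.
frames = np.random.randint(0, 255, (600, 64, 64, 3), dtype=np.uint8)

# Spatially averaged green-channel trace (green carries the strongest PPG signal).
g = frames[..., 1].reshape(len(frames), -1).mean(axis=1)
g = g - g.mean()

# Band-pass to 0.7-4.0 Hz (42-240 bpm), the physiologically plausible band.
b, a = butter(3, [0.7 / (fps / 2), 4.0 / (fps / 2)], btype="band")
pulse = filtfilt(b, a, g)

# Pulse rate = frequency of the dominant spectral peak, in beats per minute.
spectrum = np.abs(np.fft.rfft(pulse))
freqs = np.fft.rfftfreq(len(pulse), d=1.0 / fps)
band = (freqs >= 0.7) & (freqs <= 4.0)
bpm = 60.0 * freqs[band][np.argmax(spectrum[band])]
print(f"Estimated pulse rate: {bpm:.1f} bpm")
```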


About the presenters:
Daniel McDuff is a Researcher at Microsoft, where he leads research and development of affective computing technology, with a focus on scalable tools to enable the automated recognition and analysis of emotions and physiology. He is also a visiting scientist at Brigham and Women’s Hospital in Boston, where he works on deploying these methods in primary care and surgical applications. Daniel completed his PhD in the Affective Computing Group at the MIT Media Lab in 2014 and holds a B.A. and a master's degree from Cambridge University. Previously, Daniel was Director of Research at Affectiva and a post-doctoral research affiliate at the MIT Media Lab. During his Ph.D. and at Affectiva, he built state-of-the-art facial expression recognition software and led analysis of the world's largest database of facial expression videos.

Daniel is a leading researcher in the area of imaging photoplethysmography (iPPG). He has published numerous papers on the topic and organized several of the first workshops and special sessions dedicated to iPPG. His work has received nominations and awards from Popular Science magazine (as one of the top inventions of 2011), South-by-South-West Interactive (SXSWi), The Webby Awards, ESOMAR, and the Center for Integrated Medicine and Innovative Technology (CIMIT). His projects have been reported in many publications, including The Times, The New York Times, The Wall Street Journal, BBC News, New Scientist, Scientific American, and Forbes magazine. Daniel was named a 2015 WIRED Innovation Fellow and has spoken at TEDx Berlin and SXSW.

STATISTICAL METHODS FOR AFFECTIVE COMPUTING


Date and time:
Room:
Presenters: Jeffrey Girard and Jeffrey Cohn


Tutorial description:
Statistical methods of data analysis emphasize inference and interpretability. As such, they are indispensable tools for enhancing scientific understanding, and they deserve a place alongside machine learning in the toolkits of scientists and engineers working in affective computing.
This tutorial will provide training on contemporary statistical methods with high relevance to conference attendees. Its emphasis will be on providing high-level intuitions and practical recommendations rather than exhaustive theoretical and technical details. Prior exposure to statistics, while helpful, will not be required of attendees. Applied examples, complete with syntax and write-ups, will be provided in both R (www.r-project.org) and MATLAB; tutorial attendees are encouraged to bring a laptop with one of these software packages installed.
Cross-cutting themes will include (A) measurement, (B) validity, and (C) uncertainty. Specific methods to be discussed include (1) measures of inter-rater reliability, (2) measures of criterion validity, (3) effect sizes, (4) confidence intervals, and (5) generalized linear modeling. These tools will help attendees answer such questions as: What are we measuring? How well are we measuring it? When are our measurements wrong? Do our measurements systematically vary across groups, times, etc.? How can we design our research studies to be maximally informative?
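
Purely as an illustration of two of the listed tools, the sketch below computes Cohen's kappa (a common inter-rater reliability measure) with a nonparametric bootstrap confidence interval to express uncertainty. The tutorial's own worked examples are in R and MATLAB; this is a rough Python analogue with synthetic toy ratings, not material from the tutorial itself.

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(0)
rater_a = rng.integers(0, 2, size=200)          # binary behavior codes
noise = rng.random(200) < 0.15                  # ~15% simulated disagreement
rater_b = np.where(noise, 1 - rater_a, rater_a)

# Chance-corrected agreement between the two raters.
kappa = cohen_kappa_score(rater_a, rater_b)

# Nonparametric bootstrap 95% CI, resampling subjects with replacement.
boots = []
for _ in range(2000):
    idx = rng.integers(0, len(rater_a), size=len(rater_a))
    boots.append(cohen_kappa_score(rater_a[idx], rater_b[idx]))
lo, hi = np.percentile(boots, [2.5, 97.5])
print(f"kappa = {kappa:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```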


About the presenters:
Jeffrey Girard will complete a PhD in Clinical Psychology at the University of Pittsburgh in June 2018 and then begin a postdoc at Carnegie Mellon University. His work takes a deeply interdisciplinary approach to the study of human behavior, drawing insights and tools from social science, computer science, and data science. He is particularly interested in developing and applying technology to advance the study of emotion, personality, and psychopathology. He offers a unique perspective to the affective computing community, especially regarding research design, statistical analysis, and clinical applications.



Jeffrey Cohn is Professor of Psychology and Psychiatry at the University of Pittsburgh and Adjunct Professor of Computer Science at the Robotics Institute at Carnegie Mellon University. He leads interdisciplinary and inter-institutional efforts to develop advanced methods of automatic analysis and synthesis of facial expression and prosody and applies those tools to research in human emotion, social development, nonverbal communication, psychopathology, and biomedicine. His research has been supported by grants from the U.S. National Institutes of Health, National Science Foundation, Autism Foundation, Office of Naval Research, and Defense Advanced Research Projects Agency.

MS-CELEB-1M: LARGE SCALE FACE RECOGNITION CHALLENGE TUTORIAL


Date and time:
Room:
Presenters: Yandong Guo, Lei Zhang, and Yafeng Deng


Tutorial description:
Large-scale face recognition is of great business value these days. In recent years, large-scale training data, together with deep convolutional neural networks, has proven remarkably effective, especially in the face recognition domain. We published MS-Celeb-1M two years ago; it has been the largest training dataset for face recognition and has attracted a great deal of attention. Cutting-edge performance has been achieved by leveraging this dataset in many typical face recognition challenges, especially:
a. Face representation learning
b. Large-scale celebrity recognition based on face
In the tutorial, we will first review the training data (format, download method, etc.; a rough parsing sketch appears below). Then, we will introduce and summarize the research work based on this dataset for the above problems. One advantage of this tutorial is that we review not only the algorithms but also the corresponding training datasets used, so that the work can be reproduced by other researchers and institutes to inspire further research in this direction.
Last but not least, MS-Celeb-1M provides not only the training data but also concrete benchmark tasks to evaluate performance on each of the above tasks. We will also introduce in detail how we designed these benchmark tasks and how they should be used.
Please refer to http://www.msceleb.org/ for more details.
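
As a rough illustration of working with the data, the sketch below parses one line of a dataset TSV file and decodes a base64-encoded face image. The column layout (entity ID in the first column, image bytes in the last) and the file name are assumptions based on the dataset description; verify them against http://www.msceleb.org/ before relying on this.

```python
import base64

def parse_msceleb_line(line: str):
    """Parse one TSV line into (freebase_mid, jpeg_bytes). Layout is assumed."""
    fields = line.rstrip("\n").split("\t")
    mid = fields[0]                            # Freebase entity ID (assumed first)
    jpeg_bytes = base64.b64decode(fields[-1])  # base64 face crop (assumed last)
    return mid, jpeg_bytes

# Hypothetical usage: stream a TSV file and write out the face crops.
# with open("MsCelebV1-Faces-Aligned.tsv", encoding="utf-8") as f:
#     for i, line in enumerate(f):
#         mid, jpg = parse_msceleb_line(line)
#         with open(f"{mid}_{i}.jpg", "wb") as out:
#             out.write(jpg)
```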


About the presenters:
Yandong Guo is a researcher at Microsoft AI & Research, Redmond, WA, in the United States. His research interests are mainly in the machine intelligence areas, including computer vision and deep learning. He is a key contributor to Microsoft cognitive services, especially face recognition, as well as to the connected car project, Microsoft image search, and the Microsoft Knowledge Graph. Yandong Guo earned his Ph.D. in electrical and computer engineering at Purdue University, West Lafayette, under the supervision of Prof. Charles Bouman and Prof. Jan Allebach. Before that, he received his B.S. and M.S. degrees in ECE from Beijing University of Posts and Telecommunications, China, in 2005 and 2008, respectively. He serves as a reviewer/committee member for conferences and journals including ICML, NIPS, CVPR, ACM MM, ICIP, ICASSP, ICME, TIP, TCI, TMM, SPIE EI, and IJCAI.



Lei Zhang is a principal researcher and research manager at Microsoft AI & Research, leading a team working on visual recognition and computer vision. Prior to this, he worked at Microsoft Research Asia for 12 years as a senior researcher, leading a research team working on visual recognition, image analysis, and large-scale data mining. His years of work on large-scale, search-based image annotation have generated many practical impacts in multimedia search, including a highly scalable solution of duplicate image clustering for billions of images. From 2013 to 2015, he moved to Bing Multimedia Search as a principal development manager, helping develop cutting-edge solutions for web-scale image analysis and recognition problems, including image caption generation and high-precision image entity linking.
Lei is a senior IEEE member and a senior ACM member, and has served as an editorial board member for Multimedia Systems Journal and as a program co-chair, area chair, or committee member for many top conferences. He is the author or co-author of 100+ published papers in fields such as multimedia, computer vision, web search, and information retrieval, and holds 40+ U.S. patents for his innovations in these fields. Lei earned all his degrees (B.E., M.E., and Ph.D.) in Computer Science from Tsinghua University and currently also holds an adjunct professor position at Tianjin University.



Yafeng Deng is the CTO of Beijing DeepGlint Technology Corporation. He graduated from Tsinghua University and has 15 years of research and development experience in computer vision. He has published more than 10 papers and applied for more than 100 patents (95 issued). He previously took charge of face recognition at the Institute of Deep Learning at Baidu Research. He has led teams to win the LFW face recognition and FDDB face detection world championships multiple times.