Keynotes
Date: TBA
Prof. Chu-Song Chen
National Taiwan University, Taiwan
Title: Applying Multimodal Large Language Models to Joint Vision and Text Understanding
Abstract:
Recently, multimodal large language models (MLLMs) that combine visual and language information have been advancing rapidly. Building on their powerful ability to jointly understand images and text, they show broad application prospects in many tasks. In this talk, I’ll first review recent progress in foundation models and MLLMs for vision and text. Then, I’ll present promising techniques for fine-tuning large language models for downstream tasks. Finally, I’ll introduce several downstream applications leveraging MLLMs, such as open-vocabulary object detection, referring expression grounding and segmentation, and document information retrieval.
Biography:
Chu-Song Chen is currently Professor and Chair of the Department of Computer Science and Information Engineering (CSIE), and a joint-appointment professor of the Graduate Institute of Networking and Multimedia, National Taiwan University. He is also an Adjunct Research Fellow of the Center of Intelligent Healthcare, National Taiwan University Hospital. He serves as a Governing Board Member of the Image Processing and Pattern Recognition (IPPR) Society, Taiwan, which is one of the regional societies of the International Association for Pattern Recognition (IAPR). He received a B.S. degree in Control Engineering from National Chiao-Tung University, Taiwan, in 1989, and a Ph.D. degree from the Department of CSIE, National Taiwan University, in 1996. He has been a Research Fellow/Professor with the Institute of Information Science, Academia Sinica, Taipei, Taiwan. His recent research interests include multimodal LLMs, computer vision, multimedia, medical image analysis, signal and image processing, and e-health.
- Google Scholar: https://scholar.google.com/citations?user=WKk6fIQAAAAJ&hl=zh-TW&oi=ao
- DBLP: https://dblp.org/pid/67/1007.html
- Semantic Scholar: https://www.semanticscholar.org/author/Chu-Song-Chen/1720473
Date: TBA
Prof. Myung Hoon Sunwoo
Ajou University, Korea
Title: TBA
Abstract:
TBA
Biography:
Myung Hoon Sunwoo received B.S., M.S., and Ph.D. degrees from Sogang University in 1980, KAIST in 1982, and the University of Texas at Austin in 1990, respectively. He worked for ETRI in Korea (1982–1985) and Motorola DSP Operations in Austin, Texas (1990–1992). Since 1992, he has been with Ajou University in Suwon, Korea, where he is currently a professor. He has authored numerous papers, holds over 130 patents, and has received over 60 awards. He was an Associate Editor of IEEE TVLSI (2002–2003) and is a coeditor of several books, including “Selected Topics in Biomedical Circuits and Systems” in 2021 and “A Short History of Circuits and Systems” in 2024. He served as the General Chair of ISCAS 2012 in Seoul and General Co-chair of ISCAS 2021 in Daegu, Korea. In addition, he will serve as the General Co-chair of ISCAS 2028 in Twin Cities, Minnesota. He was an IEEE CASS Distinguished Lecturer (2009–2010) and has served on the IEEE CASS BoG (2011–2016), as VP-Conferences (2018–2021), and as President-Elect (2022–2023). During his term as CASS VP-Conferences, he founded the IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS) in 2019. AICAS became the CASS Premier Conference in 2021. He is the director of the Medical Image-Based Intelligent Diagnostic Solutions (MIIDS) Research Center. His research interests include AI/DL circuits and systems and medical imaging diagnosis. He is the IEEE CASS President (2024–2025) and is an IEEE Fellow.
Date: TBA
Mr. Susumu Matsuda
SANTOKU CORPORATION, Japan
Title: Safety and peace of mind based on human biological characteristics
Abstract:
Susumu Matsuda started his company in 2001 with the aim of developing high-value-added products that combine VR visualization with machine vision processing and microcomputer control. During the development of a VR training simulator, he realized that the main barrier to using VR was its initial cost, and in 2012 he began working to lower that barrier. He developed basic software that uses a modularized function library to edit programs in chart format. This allows VR software to be developed with less than 50% of the development time and cost of conventional software, which makes it easier to develop experiential content that raises safety awareness through trial and error. Since 2014, through work on commercializing products that let people experience disasters, he has learned that sight alone cannot fully affect people’s emotions, and that negative emotions are affected only when the body feels the experience through the five senses. This led to the development of a unit that reproduces the sensations felt by the body. Using robot control technology, he developed hardware that reproduces the sense of balance through a motion base, skin sensation through blown air, the sense of touch through a vibrator, and the sense of pain through high-frequency discharge. Disaster experiences are transmitted to the brain through the nerves of the whole body via the five senses. He learned that only by experiencing the appropriate sensations with the appropriate stimuli for each disaster can people’s emotions be aroused and the experience be memorized by the body. In 2014, when development began, society was reluctant to release disaster information to the outside world, and such information was extremely difficult to obtain. By continuing to call for disaster information to be compiled into databases in a way that does not identify the people and places that caused the disaster, and for it to be shared with society at the same time, he intends to demonstrate that safety can be increased beyond language and cultural barriers.
To reduce the mental impact of experiencing a disaster through the five senses, research and development is under way on a mechanism that optimizes the negative physical stimuli during the experience. In addition, to increase safety during the experience and to prove the effectiveness of the product, research and development is being conducted to derive its effects theoretically from human characteristics through bioscience. Besides confirming that the experience improves sensitivity to danger, work is also under way to verify that moderate emotional experiences improve people’s memory. As research in brain science, psychology, and neurology progresses, it is beginning to be understood that experiencing moderate fear activates neurogenesis, which is related to the brain’s memory function. It is also beginning to be understood that experiencing moderate joy causes neurogenesis. Research and development that learns about human characteristics from the biological sciences and uses them to improve safety is important for theoretically verifying the effectiveness of disaster experiences using VR.
Biography:
He studied automatic feedback control theory in the Department of Mechanical Engineering at university. While studying microcomputer architecture at an electronics manufacturer, he was engaged for 13 years in microcomputer hardware design, firmware design, machine control development, and machine vision processing research and development. He changed jobs after the 9/11 terrorist attacks and started a VR business on his own. He studied robotics while supporting universities through services to research institutes. He established a VR program company in Tokyo in 2005 and a 3D model production company in Ho Chi Minh City in 2009. He proposed the RiMM concept and began developing RiMM’s basic software and systems in 2012. He began commercializing RiMM in 2014 and announced a disaster experience VR in 2016. The business was merged into the current company in 2019. Since 2020, he has studied biological science and life science with the aim of “touching people’s hearts directly,” and mainly conducts research activities for safety and security. He announced the “Cyclone Safety and Security Philosophy” in 2020. While promoting the RiMM project, he continues to conduct research and development activities both domestically and internationally.