MS in Data Science - Seminar Series
Introduction of the speaker: Carlos Garcia is an experienced researcher, passionate about understanding complex problems and applying new technologies. He is currently working as a machine-learning researcher at the Data Institute of USF and collaborating with various external organizations such as Reddit and the UCSF Department of Radiation Oncology.
His research interests are in Natural Language Processing, specifically topic modeling and the automatic acquisition of hierarchically-structured knowledge by means of taxonomies and graphs. He also conducts research on computer vision models applied to radiation oncology images, studying how transfer learning and model complexity affect task accuracy. Previously, he was a post-doctoral researcher at the Donostia International Physics Center, where his work focused on developing numerical models to solve complex problems in computational chemistry and materials science. He received a Ph.D. in Theoretical Physics from Franche-Comte University in 2015.
Abstract: Convolutional neural networks (CNNs) have become state-of-the-art models in medical imaging diagnosis and prediction. Commonly used networks have millions of parameters and were designed mainly for ImageNet, a task with millions of training images and 1000 classes. However, most tasks in the medical domain have fewer than ten classes and only thousands of training images. In this talk, I will present a quantitative analysis showing that similar performance can be achieved with smaller architectures, improving the scalability of machine learning models on real-world tasks.
The second part of the talk will focus on Natural Language Processing and present some results from our research collaboration with the Reddit ML content department. I will briefly introduce the taxonomy learning task and show how the automatic acquisition of hierarchically-structured knowledge can be used for topic discovery and user recommendation.