Radu Ionescu

Biography

Radu Tudor Ionescu is a professor at the University of Bucharest, Romania, where he completed his PhD in 2013. He received the 2014 Award for Outstanding Doctoral Research in the field of Computer Science from the Romanian Ad Astra Association. His research interests include machine learning, computer vision, image processing, medical imaging, computational linguistics and text mining. He has published over 120 articles in international peer-reviewed conferences and journals, as well as a research monograph with Springer. He received the “Caianiello Best Young Paper Award” at ICIAP 2013 for the paper entitled “Kernels for Visual Words Histograms”. Radu also received the “Young Researchers in Science and Engineering” Prize for young Romanian researchers, and the “Danubius Young Scientist Award 2018 for Romania”, awarded by the Austrian Federal Ministry of Education, Science and Research and by the Institute for the Danube Region and Central Europe. Together with his co-authors, he has ranked highly in several international competitions: 4th place in the Facial Expression Recognition Challenge of WREPL 2013, 3rd place in the NLI Shared Task of BEA-8 2013, 2nd place in the ADI Shared Task of VarDial 2016, 1st place in the ADI Shared Task of VarDial 2017, 1st place in the NLI Shared Task of BEA-12 2017, 1st place in the ADI Shared Task of VarDial 2018, and 1st place in the ACM Multimedia 2023 Computational Paralinguistics Challenge (ComParE) on request and complaint detection.

Title: Recent Text and Audio Resources for the Romanian Language

Abstract: In this talk, we present two recent resources for the Romanian language. First, we introduce the Romanian Clickbait Corpus (RoCliCo), a novel corpus comprising 8,313 news samples that are manually annotated with clickbait and non-clickbait labels. We report preliminary results obtained with various machine learning methods. Among the considered models, we describe a novel BERT-based contrastive learning model that learns to encode news titles and contents into a deep metric space. Next, we present RoDia, the first dataset for Romanian dialect identification from speech. RoDia includes a varied compilation of speech samples from five distinct regions of Romania, covering both urban and rural environments and totaling 2 hours of manually annotated speech data. Along with RoDia, we introduce a set of competitive models to be used as baselines for future research. Moreover, we present empirical evidence showing that Automatic Speech Recognition is more challenging on dialectal speech than on non-dialectal speech.