Each day of the School will typically comprise two tutorials in the morning and one or two hands-on sessions in the afternoon, each lasting 1-1.5 hours (the balance of tutorials and hands-on sessions may vary from topic to topic). The final schedule will be available on the School’s website in the near future.
The School will begin on Sunday evening, September 14, with the registration of participants. The participants will be invited to discover Timișoara in a free tour in the afternoon/evening of Monday, September 15. A welcome reception will be given on Tuesday evening, September 16. During the School, evenings are usually reserved for engaging discussions, going out in the city, and networking. A day-long or half-day excursion will be organised on Saturday, September 20.
Sunday 14 September: Registration (afternoon)
Monday 15 September
8:00 - 9:00 -> Registration
9:00 - 9:30 -> Opening
9:30 - 11:00 -> Mădălina Chitez & Karla Csürös
Tutorial: Corpus-Driven NLP for Educational Insights: Tools, Techniques, and Use Cases
This lecture explores the development of corpus-driven NLP applications designed to generate educational insights from Romanian-language data. Focusing on domain-specific annotated corpora, it introduces the ROGER and EXPRES academic writing platforms, which enable advanced corpus queries and the extraction of multi-word expressions, rhetorical move patterns, and academic discourse features through shallow parsing and n-gram analysis. These resources have informed the creation of derived lexical datasets, such as structured academic phrase lists and a phrasal lexicon for writing support. The session also presents the LEMI platform, a readability assessment tool tailored to Romanian children’s literature, which integrates surface-level and syntactic complexity metrics to support age-appropriate text selection. Emphasis will be placed on corpus compilation, annotation schemes, feature engineering, and the integration of linguistic insights into NLP pipelines for educational technology.
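As an illustration of the n-gram analysis mentioned above, the following minimal Python sketch counts bigrams as multi-word-expression candidates; it is purely illustrative and does not reflect the actual ROGER/EXPRES implementation.

```python
from collections import Counter

def extract_ngrams(tokens, n):
    """Return all contiguous n-grams from a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Toy sentence; the real platforms query large annotated corpora.
tokens = "the results of this study suggest that the results are robust".split()
bigram_counts = Counter(extract_ngrams(tokens, 2))

# The most frequent bigram surfaces as a candidate multi-word expression.
print(bigram_counts.most_common(1))  # → [(('the', 'results'), 2)]
```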
11:00 - 11:30 -> Coffee break
11:30 - 13:00 -> Dan Cristea & Andrei Scutelnicu
Tutorial: DeLORo project
DeLORo (Deep Learning for Old Romanian) is a project intended to build a technology capable of deciphering old printed and uncial Cyrillic Romanian documents and transliterating them into the Latin script. This tutorial concentrates on the processes of organising and acquiring the primary data needed to train the deep learning recognition technology. After a brief overview of similar enterprises, we compare our approach with others. We then present in some detail the structure of DeLORo's data repository, which includes images of scanned pages, annotations made over them, and alignments between annotated objects in the images and sequences of decoded Latin characters. The presentation will focus on the practical module, with a tutorial on how to use the platform.
13:00 - 14:30 -> Lunch
14:30 - 16:00 -> Mădălina Chitez & Andreea Dincă
Hands-on session: From Text to Pedagogy: Data Mining Approaches to Educational Content
This hands-on session introduces participants to data mining techniques for analyzing educational materials, with a focus on Romanian school textbooks. Using pre-processed data from the ROTEX corpus and NLP pipelines for verb extraction and Bloom's Taxonomy labeling, participants will explore how instructional tasks reflect cognitive complexity across grades and publishers. Activities will include comparative analysis of textbooks and curriculum standards, identification of task patterns (e.g., prevalence of lower- vs. higher-order thinking), and visualization of cognitive progression using syntactic and semantic annotations. The session emphasizes how computational methods such as POS tagging, pattern matching, and task classification can be used to evaluate educational content quality and pedagogical coherence, particularly in low-resource educational contexts. Participants will leave with hands-on experience in aligning linguistic features with pedagogical frameworks and discussing the broader implications for curriculum development and instructional design.
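The verb-extraction and Bloom-labeling step can be sketched as follows; the verb inventory below is an invented English toy list, not the Romanian scheme used with the ROTEX corpus.

```python
# Illustrative verb-to-Bloom-level mapping (hypothetical; the actual
# pipeline uses a fuller Romanian verb inventory plus POS tagging).
BLOOM_VERBS = {
    "define": "remember", "list": "remember",
    "explain": "understand", "summarize": "understand",
    "apply": "apply", "solve": "apply",
    "compare": "analyze", "classify": "analyze",
    "evaluate": "evaluate", "justify": "evaluate",
    "design": "create", "compose": "create",
}

def label_task(instruction: str) -> str:
    """Label a textbook instruction with the Bloom level of its first known verb."""
    for word in instruction.lower().split():
        level = BLOOM_VERBS.get(word.strip(".,!?"))
        if level:
            return level
    return "unlabeled"

print(label_task("Compare the two poems and justify your answer."))  # → analyze
```

Aggregating such labels per grade or publisher yields the lower- vs. higher-order task distributions discussed in the session.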
16:00 -> Welcome reception
19:00 -> Concert @ Filarmonica
Attendance at the concert requires registration using the form received by email together with the logistical information.
Tuesday 16 September
9:00 - 10:30 -> Mark Finlayson
Tutorial: The Basics of Linguistic Annotation
This lecture will introduce the basics of linguistic annotation. We will motivate a continued interest in linguistic annotation by discussing its fundamental importance in the age of LLMs. Then we will discuss defining an annotation task, obtaining or implementing annotation tools, writing annotation guides and schemes, recruiting and training annotators and adjudicators, and the actual process of annotation itself.
10:30 - 11:00 -> Coffee break
11:00 - 12:30 -> Mark Finlayson
Hands-on session: The Process of Annotation
During this hands-on session, students will split into small teams and conduct a small-scale annotation with instructor-provided materials. Teams will go through at least two rounds of annotation and adjudication, to learn first-hand about the often subtle problems that arise when idealized annotation schemes meet real-world data.
12:30 - 14:30 -> Lunch
14:30 - 16:00 -> Mark Finlayson
Hands-on session: Inter-Annotator Agreement
Measuring agreement between annotators is absolutely fundamental to annotation, in particular for validating annotation schemes, monitoring the quality of annotator training, and assuring the final quality of the annotations. In the first half of this session we will review a number of standard ways of computing agreement. In the second half student teams will reform and compute agreement for their annotated data, and will use these computations as well as detailed disagreement analysis to drive development of insights into possible next steps in refinement of their annotation task.
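As a concrete instance of the agreement measures reviewed in the session, here is a minimal Cohen's kappa computation for two annotators over toy two-category data:

```python
from collections import Counter

def cohens_kappa(a, b):
    """Cohen's kappa for two annotators' parallel label sequences."""
    assert len(a) == len(b)
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each annotator's marginal label distribution.
    ca, cb = Counter(a), Counter(b)
    expected = sum(ca[label] * cb[label] for label in set(a) | set(b)) / (n * n)
    return (observed - expected) / (1 - expected)

ann1 = ["POS", "NEG", "POS", "POS", "NEG", "POS"]
ann2 = ["POS", "NEG", "NEG", "POS", "NEG", "POS"]
print(round(cohens_kappa(ann1, ann2), 3))  # → 0.667
```

Raw agreement here is 5/6, but kappa discounts the agreement expected by chance, which is why it is preferred for validating annotation schemes.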
16:00 - 16:30 -> Coffee break
16:30 - 18:00 -> Mark Finlayson
Tutorial: Corpus Assembly & Distribution; The Future of Annotation
For annotated data to be useful to the scientific enterprise, it must be made available to other scientists. In the first half of this session we discuss best practices for corpus collation, formatting, and public release. We will discuss ease of use, ease and permanence of access, and intellectual property concerns. In the latter half of the session we will have a general discussion of the future of annotation in light of recent advances in LLMs and human-machine teaming.
18:00 -> City Tour
Wednesday 17 September
9:00 - 10:30 -> Daniela Gifu & Diana Trandabat
Hands-on session: Agentic AI in Action
Agentic AI represents a new generation of artificial intelligence capable of making autonomous decisions and independently acting to achieve its goals. AI agents can solve real or fictional problems, combining creativity with an understanding of decision-making flows. Through interactive examples and scenarios, participants learn essential principles of AI autonomy and its potential impact across various fields. This session provides a practical and applied understanding of agentic AI, preparing students for future technological challenges.
10:30 - 11:00 -> Coffee break
11:00 - 12:30 -> Nancy Ide
Tutorial: From Mechanical Minds to Neural Networks: Charting AI’s Evolution from the 1950s to Today
Artificial Intelligence has traveled a remarkable path since its conceptual roots in the 1950s, evolving from symbolic reasoning and rule-based systems to the data-driven, learning-centric approaches that power today’s cutting-edge technologies. This talk traces the major milestones and paradigm shifts that have shaped the development of AI over the past seven decades. We’ll explore the early era of logic and expert systems, the "AI winters" that tested the field’s resilience, and the emergence of machine learning and neural networks that redefined what machines can do. Along the way, we’ll highlight key breakthroughs, the individuals and institutions that propelled progress, and the social and technological forces that influenced each stage. By understanding this historical trajectory, we gain deeper insight into where AI stands today, and where it may be headed next.
12:30 - 14:30 -> Lunch
14:30 - 16:00 -> Liviu Dinu
Tutorial: From comparative to assisted-by-computer methodologies in historical linguistics via modern computational tools
16:00 - 16:30 -> Coffee break
16:30 - 18:00 -> Adrian Spataru
Hands-on session: Entrepreneurial Thought and Action
This session will help students get to know each other through a playful series of exercises. Moreover, it will help students rediscover their inner child, the one who had no choice but to learn from mistakes. This is an introduction to the flow of Entrepreneurial Thought and Action.
18:00 -> Socializing and Brainstorming
Thursday 18 September
9:00 - 10:30 -> Adrian Iftene & Cristian Simionescu
Hands-on session: From Chat to Action: Integrating LLMs into Real-World Applications
Large Language Models (LLMs) are often presented as “conversational” engines, yet their true power lies in driving concrete actions within software systems. In these two hands-on sessions, you will learn how to adapt a pre-trained LLM to your own domain and seamlessly embed it in an application pipeline. We’ll start by fine-tuning or prompt-engineering an open-source model for a targeted vocabulary (e.g. online retail terminology), then connect the model to a simple web service or voice-driven interface. By the end of the two sessions you will have built two fully functional demos:
- Natural-Language e-Commerce: “Add three summer dresses and a pair of sneakers to my cart” – the model parses the request, looks up SKUs, and issues API calls to an example store.
- Voice-Activated Workflows: “Schedule tomorrow’s team meeting, send invites, and share the agenda” – the model interprets high-level commands and triggers microservice operations via a REST interface.
Along the way, we’ll cover best practices for prompt design, error handling, and performance monitoring in production. This session is ideal for linguists, developers, and data scientists who want to move beyond chatbots and build real-world applications powered by LLMs. Bring your laptop and get ready to turn language into action!
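The parse-then-dispatch flow behind both demos can be sketched without any model: a trivial keyword rule stands in for the LLM, and the store endpoint mentioned in the comment is hypothetical.

```python
import json

def parse_intent(utterance: str) -> dict:
    """Stand-in for the LLM: in the session a fine-tuned model would emit
    this JSON; here a keyword rule plays that role for illustration."""
    if "cart" in utterance.lower():
        return {"action": "add_to_cart", "items": ["summer dress", "sneakers"]}
    return {"action": "unknown"}

def dispatch(intent: dict) -> str:
    """Route the structured intent to a (hypothetical) store API call."""
    if intent["action"] == "add_to_cart":
        # In a real pipeline: requests.post("https://store.example/cart", ...)
        return json.dumps({"status": "ok", "added": intent["items"]})
    return json.dumps({"status": "error"})

print(dispatch(parse_intent("Add a summer dress and sneakers to my cart")))
```

The key design point is the JSON boundary between the language model and the application code: the model only ever produces structured intents, never direct side effects.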
10:30 - 11:00 -> Coffee break
11:00 - 12:30 -> Dan Tufiș & Vasile Păiș
Tutorial: European Language Resources Infrastructures: Case Study: CLARIN ERIC
Research Infrastructures (RIs) are large constructions requiring substantial investment, built to serve researchers in their work. Today there are 30 ERICs across various scientific domains, and new ones are expected to be launched in the future. This talk will provide a brief view of one of the most successful RIs, CLARIN, which Romania will hopefully join soon. CLARIN is dedicated to Open Science, promoting the sharing and re-use of language data and the interoperability of data and services. It promotes comparative perspectives, multidisciplinary collaboration, transnational research, and responsible data science, and it supports linguistic diversity: data covering many languages, tools for many languages, and language resources in all modalities, discipline- and language-agnostic. Following CLARIN's developments, we built an LT portal in the same spirit (though much smaller, mainly for the Romanian language), which will be demonstrated in the hands-on session.
12:30 - 14:30 -> Lunch
14:30 - 16:00 -> Dan Tufiș & Vasile Păiș
Hands-on session: RELATE – A Portal for Language Technologies
The students will be introduced to a CLARIN-like collection of tools, resources and services. RELATE is a modular state-of-the-art platform developed at RACAI (Dr. Păiș, Dr. Ion), that is used for processing written and spoken language (mainly Romanian but not only). Resources and technologies were developed in our institute as well as by partner institutions. RELATE is used in multiple national and international research projects. It was designed to use standardized file formats, ensuring interoperability with other language processing systems. Internal functions are available as JSON REST web services. In the Representational State Transfer (REST) architectural style, data and functionality are considered resources and are accessed using Uniform Resource Identifiers (URIs).
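The REST idea described above can be illustrated in-process: each URI names a resource, and a uniform call returns JSON. The endpoint path and parameters below are hypothetical, not RELATE's documented API.

```python
import json

def annotate(params):
    """Toy resource handler: whitespace-tokenize the submitted text."""
    text = params.get("text", "")
    return {"tokens": text.split(), "lang": params.get("lang", "ro")}

# Resources are named by URIs and reached through one uniform interface.
ROUTES = {"/api/v1/annotate": annotate}

def get(uri: str, **params) -> str:
    """Dispatch a GET-style request to the resource named by the URI."""
    handler = ROUTES.get(uri)
    if handler is None:
        return json.dumps({"error": 404})
    return json.dumps(handler(params))

print(get("/api/v1/annotate", text="Ana are mere", lang="ro"))
```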
16:00 - 16:30 -> Coffee break
16:30 - 18:00 -> Liviu Dinu
Hands-on session: From comparative to assisted-by-computer methodologies in historical linguistics via modern computational tools
Natural languages are living ecosystems: they are constantly in contact and, as a consequence, they change continuously. Traditionally, the main problems of Historical Linguistics (How are languages related? How do languages change across space and time?) have been investigated with the instruments of comparative linguistics. We propose here, for the Romance languages, computer-assisted methods for the main problems in HL (related-word discrimination, protoword reconstruction, language similarity, semantic divergence, etc.). Our studies on Romance languages rely on a digital resource for HL that we constructed and published, RoBoCoP (ROmance BOrrowing COgnate Package), containing a comprehensive and reliable database of Romance cognates and borrowings based on the etymological information provided by publicly available dictionaries in five languages: Spanish, Italian, French, Portuguese, and Romanian (the largest known database of this kind). To answer the first question, we are interested not only in the phylogenetic classification of natural languages but also in the degree of similarity between two languages; via various techniques and metrics we offer an answer at three levels: phonetic, lexical, and syntactic. For the second question, the RoBoCoP dataset enabled the most extensive experiments to date on a series of HL tasks for the Romance languages, including cognate identification, cognate-borrowing discrimination, borrowing-direction detection, automatic protoword reconstruction, and semantic divergence, using computational methods based on machine learning models for sequence modelling, including encoder-decoder transformers in the Flan-T5 family and conditional random fields. We recently obtained state-of-the-art results on these tasks, showing that computer-assisted methods, in which computational techniques are integrated with linguistic knowledge, are a viable direction for tackling these problems in historical linguistics.
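One basic ingredient of tasks such as cognate identification is string similarity; the sketch below scores a Romanian-Italian pair with normalized edit distance, a deliberately crude baseline rather than the sequence models used in our experiments.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance via dynamic programming (two-row variant)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; a crude cognate-candidate score."""
    return 1 - levenshtein(a, b) / max(len(a), len(b))

# Romanian "noapte" vs Italian "notte", both from Latin "noctem".
print(round(similarity("noapte", "notte"), 2))  # → 0.67
```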
18:30 -> Socializing and Brainstorming
Friday 19 September
9:00 - 10:30 -> Mihai Dascălu & Andreea Duțulescu
Tutorial: AI Tools for Education
10:30 - 11:00 -> Coffee break
11:00 - 12:30 -> Mihai Dascălu & Andreea Duțulescu
Hands-on session: Faster Inference and Structured Outputs with vLLM
The practical session focuses on leveraging vLLM, a high-performance inference library, to achieve structured and accelerated processing with LLMs. Students will learn how to optimize inference workflows, enabling faster response times and efficient resource utilization without compromising output quality. The session also covers generating structured outputs such as regex patterns and JSON formatting, enhancing clarity and usability for downstream applications and data extraction. Through hands-on exercises, attendees will gain practical skills to implement vLLM for scalable, low-latency NLP workflows that deliver both speed and precise, interpretable results.
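What guided decoding enforces at generation time can be approximated post hoc: parse the model's raw text as JSON and check each field against a constraint. The field names and regex patterns below are illustrative inventions; vLLM itself applies such constraints during inference rather than after it.

```python
import json
import re

# Hypothetical per-field constraints for a structured extraction task.
CONSTRAINTS = {"title": r".+", "year": r"\d{4}"}

def validate(raw: str) -> dict:
    """Parse raw model output as JSON and verify each field against its regex."""
    record = json.loads(raw)
    for field, pattern in CONSTRAINTS.items():
        value = str(record.get(field, ""))
        if not re.fullmatch(pattern, value):
            raise ValueError(f"field {field!r} violates pattern {pattern!r}")
    return record

ok = validate('{"title": "Corpus Methods", "year": "2025"}')
print(ok["year"])  # → 2025
```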
12:30 - 14:30 -> Lunch
14:30 - 16:00 -> Rada Mihalcea & Angana Borah
Tutorial: Using Multi-Agent Systems to Explore and Model Human Social Behavior
Recent advancements in multi-agent systems have led to a rapidly growing research area focused on simulating increasingly complex human behaviors — such as group consensus, implicit bias, persuasion, and, in some cases, cooperation or conflict. Yet, these systems exist in a paradox: they are computational and artificial, entirely lacking the intrinsic consciousness, emotions, and social intuition that define human individuals and societies. In this tutorial, we will explore the evolving relationship between AI agents and human behavior, drawing on large-scale generative agent experiments, studies on bias in multi-agent interactions, and insights into misinformation and group behavior. We will also discuss the broader implications of these systems — not only for the future of AI but also for human-centered disciplines such as psychology, sociology, and ethics, where they can challenge or facilitate our understanding of intelligence, agency, and collective decision-making.
This tutorial will also include a hands-on component, where we will demonstrate how to instrument LLM agents to explore various dimensions of social behavior in multi-agent settings, with additional exercises on evaluating agent interactions and testing for bias.
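A stripped-down version of such instrumentation, with rule-based stand-ins for the LLM agents (all agent behaviors here are invented for illustration):

```python
# Each "agent" is a function from the dialogue history to a message,
# standing in for an LLM call so the logging and a crude probe can be shown.
def agreeable_agent(history):
    return "I agree with that." if history else "Let us start with option A."

def contrarian_agent(history):
    return "I disagree; consider option B."

def run_dialogue(agents, turns):
    """Alternate agents for a fixed number of turns, logging every message."""
    log = []
    for t in range(turns):
        speaker = agents[t % len(agents)]
        log.append(speaker(log))
    return log

log = run_dialogue([agreeable_agent, contrarian_agent], 4)
# A toy probe: how often does the dialogue contain explicit disagreement?
disagreements = sum("disagree" in msg for msg in log)
print(disagreements)  # → 2
```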
16:00 - 16:30 -> Coffee break
16:30 - 18:00 -> Rada Mihalcea & Angana Borah
Hands-on session: Using Multi-Agent Systems to Explore and Model Human Social Behavior
The hands-on continuation of the morning tutorial (abstract above): instrumenting LLM agents to explore dimensions of social behavior in multi-agent settings, with exercises on evaluating agent interactions and testing for bias.
19:30 -> Excursion and Gala dinner
Saturday 20 September
9:00 - 10:30 -> James Davenport
Tutorial: Generative AI emerged when the AI Act was nearly finalised
The speaker will polish his crystal ball and try to guess what happens as the standardisation process reacts to this development. The EU AI Act is one of the most significant pieces of AI legislation in the world. This is accepted, even if reluctantly, by all in AI. What is not often realised is that it is written as product safety legislation, a point of view that is unfamiliar to many in AI. Like other product safety legislation, it is written in fairly general terms, using words such as "unbiased", leaving it to standards to describe what this actually means in detail. Again, this is unfamiliar to many in AI. We will therefore first look at the Act and the European standardisation process; in particular, we will look at the various actors in this process. Then we will look at the particular standardisation activities as they currently relate to Natural Language Processing.