Sara Papi

I am an AI Researcher at FBK (Fondazione Bruno Kessler), working on speech processing and multimodal LLMs within the MEETWEEN and DVPS Horizon Europe projects. I received my PhD cum laude in Information Engineering and Computer Science from the University of Trento in 2024, with a focus on simultaneous speech translation and subtitling. My research interests span multimodal and crosslingual instruction-following models, speech foundation models, and LLMs. My work has been recognized with awards, including the Best PhD Graduate 2024 Award in Information and Communication Technology from the University of Trento, an Outstanding Paper and SAC Award at ACL 2024, and a Social Impact Paper Award at EMNLP 2024. I actively contribute to the community as an organizer of the IWSLT Evaluation Campaign and as an Area Chair or reviewer for major conferences in speech and NLP, such as *ACL and Interspeech.

(I love elephants ♥️🐘)

Leave anonymous feedback here! 😊

Selected Publications

  1. ICLR
    Instruction Following
    MCIF: Multimodal crosslingual instruction-following benchmark from scientific talks
    Sara Papi, Maike Züfle, Marco Gaido, and 5 more authors
    In The Fourteenth International Conference on Learning Representations, 2026
  2. TACL
    Speech Translation
    How “Real” is Your Real-Time Simultaneous Speech-to-Text Translation System?
    Sara Papi, Peter Polák, Dominik Macháček, and 1 more author
    Transactions of the Association for Computational Linguistics, Apr 2025
  3. EMNLP
    Dataset
    MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation Model Training on EU Languages
    Marco Gaido*, Sara Papi*, Luisa Bentivogli, and 6 more authors
    In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, Nov 2024
  4. EMNLP
    Human-Centered AI
    What the Harm? Quantifying the Tangible Impact of Gender Bias in Machine Translation with a Human-centered Study
    Beatrice Savoldi, Sara Papi, Matteo Negri, and 2 more authors
    In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (Social Impact Paper Award), Nov 2024
  5. ACL
    Speech Translation
    StreamAtt: Direct Streaming Speech-to-Text Translation with Attention-based Audio History Selection
    Sara Papi, Marco Gaido, Matteo Negri, and 1 more author
    In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Aug 2024
  6. ACL
    Speech Translation
    Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing?
    Marco Gaido, Sara Papi, Matteo Negri, and 1 more author
    In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (Outstanding Paper and SAC Award), Aug 2024
  7. ACL
    Automatic Subtitling
    SBAAM! Eliminating Transcript Dependency in Automatic Subtitling
    Marco Gaido, Sara Papi, Matteo Negri, and 2 more authors
    In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Aug 2024
  8. ACL
    When Good and Reproducible Results are a Giant with Feet of Clay: The Importance of Software Quality in NLP
    Sara Papi*, Marco Gaido*, Andrea Pilzer, and 1 more author
    In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Aug 2024
  9. TACL
    Automatic Subtitling
    Direct Speech Translation for Automatic Subtitling
    Sara Papi, Marco Gaido, Alina Karakanta, and 3 more authors
    Transactions of the Association for Computational Linguistics, Nov 2023