MATA (మాట): Mindful Assessment of the Telugu Abilities of Large Language Models

LREC 2026

1University of Potsdam, Potsdam, Germany 2National Research Council, Ottawa, Canada

Abstract

In this paper, we introduce MATA, a novel evaluation dataset to assess the abilities of Large Language Models (LLMs) in the Telugu language, comprising 729 carefully curated multiple-choice and open-ended questions that span diverse linguistic dimensions. We evaluate 11 open-weight and closed-source LLMs on our dataset and present a fine-grained analysis of their performance. Further, we empirically show how LLMs rely on superficial heuristics such as answer position and distractor patterns for multiple-choice questions. Finally, we compare LLM-as-a-judge evaluation with human evaluation for open-ended questions to assess its reliability in a low-resource language. We argue that such fine-grained evaluation is essential for understanding model limitations and can inform the development of more linguistically capable LLMs, while also serving as a foundation for future research in Telugu NLP.

Video in Telugu (auto-generated by NotebookLM)

BibTeX

@inproceedings{kranti-etal-2026-mata,
  title = {MATA: Mindful Assessment of the Telugu Abilities of Large Language Models},
  author = {Kranti, Chalamalasetti and Vajjala, Sowmya},
  booktitle = {Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)},
  month = {May},
  year = {2026},
  pages = {4239--4256},
  address = {Palma, Mallorca, Spain},
  publisher = {European Language Resources Association (ELRA)},
  doi = {10.63317/2qyza2xt6xac}
}