CALAMITA pre-proposal

AILC, the Italian Association for Computational Linguistics, is launching a collaborative effort to develop a dynamic and growing benchmark for evaluating LLMs’ capabilities in Italian.

In the long term, we aim to establish a suite of tasks in the form of a benchmark which can be accessed through a shared platform and a live leaderboard. This would allow for ongoing evaluation of any existing and newly developed Italian or multilingual LLMs.

In the short term, we are looking to start building this benchmark through a series of challenges collaboratively constructed by the research community. Concretely, this happens through the present call for challenge contributions. In a style similar to standard Natural Language Processing shared tasks, participants are asked to contribute a task and the corresponding dataset with which a set of LLMs should be challenged.

Please fill in all the fields and provide as much information as possible, so that the organisers can assess the nature of the challenge you propose.

Proposers *
Contact email *
Short description of the proposed challenge. Please specify which model abilities you would like to test. (max 2500 chars) *
Please provide a brief description of the data, including: whether it is already available; whether it was originally created or translated from an existing benchmark; and whether it is or will be annotated, by whom, and with which labels. *
How do you plan to evaluate the models? Zero-shot or few-shot? (optional)
Provide an example of a prompt if one is already available. (optional)
In the case of few-shot evaluation, how many examples are allowed? (optional)
Please provide the URL if the benchmark is already available. (optional)
Additional notes (optional, max 2500 chars)
This form was created inside of University of Groningen.