AILC, the Italian Association for Computational Linguistics, is launching a collaborative effort to develop a dynamic and growing benchmark for evaluating LLMs’ capabilities in Italian.
In the long term, we aim to establish a suite of tasks in the form of a benchmark which can be accessed through a shared platform and a live leaderboard. This would allow for ongoing evaluation of any existing and newly developed Italian or multilingual LLMs.
In the short term, we are looking to start building this benchmark through a series of challenges collaboratively designed by the research community. Concretely, this happens through the present call for challenge contributions. In a style similar to standard Natural Language Processing shared tasks, participants are asked to contribute a task and the corresponding dataset with which a set of LLMs should be challenged.
Please fill in all the fields and provide as much information as possible, so that the organisers can assess the nature of the challenge you propose.