HIDDEN VOICES is an open-source project with the initial goal of building intelligent tools to aid in adding 10,000 women's biography drafts to Wikipedia. The project was born at Superbloom Studios (www.superbloomstudios.co), RBCDSAI (https://rbcdsai.iitm.ac.in/), and IITMAA (https://iitmaa.org/), along with many other partners.
The gender data gap is a major barrier to more equitable solutions across domains. Spoken and written content on the web (text, audio, and video) is vastly outpacing any other form of data. Curated online content is also the building-block data source for many AI/ML solutions, such as automated speech recognition and language models, that form the basis of many products and services. Yet there is a measurable lack of representation of gender-diverse voices in these core digital data sources.
Natural language models increasingly form the basis of consumer interaction services, and these models depend on open web datasets, including Wikipedia. Independently, all of us use digital-first tools like Wikipedia to begin forming our worldview on many subjects.
Numerous past analyses have discussed the gender asymmetry in Wikipedia participation and biographical representation, as well as the challenges to the continued presence of biographies when the subjects' contributions are contested as not “notable”. At the same time, one survey found that almost half (47%) of Wikipedia users are women, which is a critical observation. Hence, while there are multiple layers of complexity in resolving equitable representation across all digital platforms, we believe there is significant value in increasing representation in Wikipedia.
The major barriers reported include not only editors' gender and interests but also contributions from external sources. To address these, Project HIDDEN VOICES aims to develop information-theoretic approaches, ML-assisted automatic identification and validation of external sources, and textual analysis methods to auto-generate a first draft of a Wikipedia-style biography. As part of the project, we also develop resources to educate our community participants on the need for continued participation.
We appreciate your interest and ask for your participation in combating the gender data gap and building the foundation for a balanced and equitable future.