Project HIDDEN VOICES: Reducing the gender data gap in curated web content
HIDDEN VOICES is an open-source project with the initial goal of building intelligent tools to help add 10,000 women's biography drafts to Wikipedia. This project was born at Superbloom Studios (www.superbloomstudios.co), RBCDSAI (https://rbcdsai.iitm.ac.in/), and IITMAA (https://iitmaa.org/), along with many other partners.

The gender data gap is a major barrier to more equitable solutions across domains. Spoken and written content on the web (text, audio, and video) is vastly outpacing any other form of data. Curated online content is also a building-block data source for many AI/ML solutions, such as automated speech recognition and language models, that form the basis of many products and services. But there is a measurable lack of representation of gender-diverse voices in these core digital data sources.

Natural language models increasingly form the basis of various consumer interaction services, and these models depend on open web datasets, including Wikipedia. Independently, all of us use digital-first tools like Wikipedia to begin forming our worldview on many subjects.

Many past analyses have discussed the gender asymmetry in Wikipedia participation and biographical representation, as well as challenges to the continued presence of biographies that arise from contested notions of “notable” contributions. At the same time, it is a critical observation that, in one survey, almost half (47%) of Wikipedia users were women. Hence, while there are multiple layers of complexity to resolve in achieving equitable representation across all digital platforms, we believe there is significant value in increasing representation in Wikipedia.

The major barriers reported include not only editors' gender and interests but also contributions from external sources. To address these, Project HIDDEN VOICES aims to develop information-theoretic approaches, ML-assisted automatic identification and validation of external sources, and textual analysis methods to auto-generate a first draft of a Wikipedia-style biography. As part of the project, we are also developing resources to educate our community participants on the need for continued participation.

We appreciate your interest and ask for your participation in combating the gender data gap to build the foundation for a balanced and equitable future.

Email *
Would you like to contribute to this open source project? *
What is your name? *
Please identify your area of expertise *
Approximately how much time would you be able to volunteer/contribute in a month? *
Would your employer credit time for this effort?
Are you interested in a paid full-time role in this project?
Which country are you based in?
This form was created inside of Rajashree Baskaran.