Appen’s range of AI projects and diverse global
SAN FRANCISCO — April 29, 2021 — Appen Limited (ASX:APX), the leading provider of high-quality training data for organizations that build effective AI systems at scale, is enabling
AI projects based on biased or incomplete data don’t work for everyone. According to a report published in March 2020 in PNAS (Proceedings of the National Academy of Sciences), popular automated speech recognition (ASR) systems that are used for virtual assistants, closed captioning, hands-free computing and much more,
“The quality and diversity of training data directly impacts the performance and bias present in AI models,” said Appen CEO Mark Brayan. “As a data partner, we can supply complete training data for many use cases to ensure AI models work for everyone. It’s critical that we engage a diverse group of individuals to produce, label, and validate the data to ensure the model being trained is not only equitable, but also built responsibly.”
Range of Appen Language Projects
Appen demonstrates its commitment to creating AI for everyone through a variety of projects and partnerships focused on the diversity of languages and dialects.
· Translators without Borders (TWB) partnership – Appen, in partnership with TWB, Amazon, Carnegie Mellon University, Facebook, Google, Johns Hopkins University, Microsoft, and Translated, joined the Translation Initiative for COVID-19 (TICO-19), which supported the development of language technology to make COVID-19 information available in as many languages as possible, including languages spoken in developing countries, such as Congolese Swahili, Tigrinya, and Nigerian Fulfulde.
· The Inuktitut translation project – In collaboration with the Government of Nunavut, Microsoft added
· The Canadian French translation project – Appen coordinated with native language consultants to help Microsoft add “Canadian French” as a language option in Microsoft Translator.
· African American Vernacular English (AAVE) off-the-shelf datasets – Most existing training datasets used in ASR, search engines, voice assistants and sentiment analysis are not representative of AAVE. To make high-quality AAVE data available, Appen is working with AAVE speakers among its crowd of annotators to collect data for an OTS dataset based on conversations about a broad range of topics.
“Biased AI data leads to projects that can fail to deliver the expected business results and harm individuals they are supposed to benefit,” said Dr. Judith Bishop, Senior Director of AI Specialists at Appen. “The scale and complexity of AI projects makes it impossible for most companies to acquire sufficient unbiased high-quality data without partnering with an AI data expert. Appen’s commitment to developing the most diverse and expert crowd of data annotators provides the industry with a clearly differentiated resource for building fair and ethical AI projects.”
Appen’s Leading Approach to Diversity
Appen relies on training data annotators
Appen also offers off-the-shelf (OTS) datasets designed to make it easier and faster for businesses to acquire the high-quality training data they need to accelerate their AI and machine learning projects. OTS datasets are available for 80 languages and multiple dialects, including hard-to-acquire languages such as multiple varieties of Arabic, Croatian, Greek, Hungarian, Thai and more.
According to the United Nations Department of Economic and Social Affairs, “about 97 percent of the world’s population speaks just 4 percent of its languages.” That 4 percent is only 280 languages – yet the number of languages well served by core AI technologies is a fraction of that number. Appen aims to help increase that number through these and future projects.
About Appen Limited
Appen collects and labels images, text, speech, audio, and video used to build and continuously improve the world’s most innovative artificial intelligence systems. With expertise in more than 235 languages, a global crowd of over 1 million skilled contractors, and the industry’s most advanced AI-assisted data annotation platform, Appen solutions provide the quality, security, and speed required by leaders in technology, automotive, financial services, retail, manufacturing, and governments worldwide. Founded in 1996, Appen has customers and offices around the world.