How do we protect Māori data in the era of generative AI?

I’m a firm believer in the power of artificial intelligence to improve our wellbeing, productivity and economic performance.

This new wave of generative AI applications is making that potential apparent to people all over the world, with large language models (LLMs) underpinning a range of compelling applications that grows by the week.

Megan Tapsell, Chair of the AI Forum

It’s exciting to watch, but as with any rapidly evolving technology, there are already warning signs about the potential downsides of its use. We are seeing that right now in the strikes by Hollywood actors and writers who fear AI will be used to replace their creative work.

We are seeing it in lawsuits over copyright infringement, as content owners push back against efforts to scrape their data for inclusion in LLMs. Looking at these developments through a cultural lens, the picture gets even murkier.

Feeding the models

Over the last 30 years we’ve seen some egregious appropriation of te reo Māori as well as Māori imagery and knowledge for commercial purposes. This led to embarrassing and high-profile U-turns by local and foreign companies forced to ditch brand names and artwork because they didn’t consult Māori to begin with.

In the era of AI, there’s the potential for this type of appropriation to become more insidious and widespread. Take, for example, moko kauae, which many wāhine Māori proudly wear as a taonga gifted to them by their tūpuna.

The distinctive facial tattoo is a deeply personal representation of a wahine’s cultural identity. But images of it could easily be scraped from social media profiles and online news stories for inclusion in the training data of generative AI models.

It is in the interests of the creators of text-to-image models like Midjourney and Stable Diffusion to be able to generate compelling and realistic imagery in response to the widely varying text prompts their users enter. But the companies behind these models have some serious questions to ask themselves about where they source the data to train their models, and whether it's culturally appropriate to include it.

The same goes for te reo Māori, which is considered a taonga under Te Tiriti o Waitangi. To an LLM, it is just another language that can be fed into the model to allow for easy and rapid translation, and the generation of fully-formed articles and search engine answers.

Don’t get me wrong, the use of AI could really help achieve the Government’s goal of having one million New Zealanders speaking basic te reo Māori by 2040. It could help tailor language lessons to the needs of learners, and serve up answers to questions about the history of Aotearoa that properly reflect Māori culture and heritage.

Guidance from Māori

But we all know that any AI model is only as good as the data used to train it. In the rush to create generative AI apps and services, the data landgrab could end up misrepresenting and exploiting indigenous cultures.

It doesn’t have to be that way. There are no specific regulations governing use of Māori data in the private sector, and certainly no legislation specific to the use of AI. But we are seeing really positive initiatives emerging that offer guidance for how Māori data can be used responsibly.

Many of these efforts, such as Te Kāhui Raraunga’s Māori Data Governance Model, have been designed with the system-wide governance of Māori data in the public sector in mind. That makes sense, as that’s where Māori data is used in ways that have the most tangible impact on the lives of Māori.

But Te Kāhui Raraunga and the AI Forum’s own AI Principles can help any organisation to take a responsible approach to developing AI systems. The Data Iwi Leaders Group is also doing incredibly valuable work to inform efforts in this area.

Making proper use of Māori data in AI comes down to three things:

Consult from the beginning: Before you go out and start gathering data for an LLM or AI platform, talk to Māori about what it might actually mean for Māori. If you don’t know who to talk to, the AI Forum can point you in the right direction.
Employ more Māori: If you want to design and build AI products and services for the New Zealand population, reflect the population in the team that is building those products. Māori and Pasifika are underrepresented in STEM-related fields and that will take some time to turn around. But you can make a difference right now by opening up opportunities to young Māori and Pasifika to play a role in the development and deployment of AI products and services, drawing on their knowledge of te ao Māori in the process.
Practise responsible AI: Are your language models being trained appropriately for the Aotearoa context? Have you put the right guardrails in place to prevent your AI-powered chatbot from spewing out misinformation or offensive answers? Do you have sufficient human oversight of model training efforts? These are the sorts of questions anyone involved in developing AI systems should be asking themselves. The AI Forum’s AI Principles are a good starting point, in conjunction with other valuable guides such as Australia’s AI Principles, and Microsoft’s AI Impact Assessment tool.

We are in the midst of an AI revolution which will shake up virtually every industry, and impact how all of us work, learn, and entertain ourselves. But in the months and years to come, we need to do everything in our power to ensure that the cultural appropriation that Māori have been subjected to for over a century isn’t simply amplified with the use of AI. We all have the power to play a role in preventing that outcome.

Peter Griffin

Science and Technology Journalist