ConversationTechSummitAsia

Google and Georgia Tech Unveil DolphinGemma: AI-Driven Breakthrough in Dolphin Communication

CHAT explainer video
CHAT explainer video

For decades, deciphering the complex vocalizations of dolphins has been a scientific challenge.

Today, coinciding with National Dolphin Day, Google and Georgia Tech have announced a pivotal advancement with DolphinGemma, an AI model capable of understanding and generating dolphin-like sounds. This collaborative effort with the Wild Dolphin Project (WDP) symbolizes a significant step towards interspecies communication.

WDP: Decades of Dedicated Research

Since 1985, the WDP has conducted the longest-running underwater study of wild Atlantic spotted dolphins. Their approach has been deep yet non-intrusive, analyzing communication and behavior across generations in the Bahamas. Their extensive dataset includes years of audio-visual recordings paired with detailed dolphin life histories and observed behaviors.

WDP’s research focuses on the natural interaction of dolphins, correlating specific sounds to behaviors. Examples include:

  • Signature whistles, unique to individuals, act as names between mothers and calves.
  • Squawks during fights signify more aggressive interactions.
  • Buzzes often used in courtship or when chasing predators.

Launching DolphinGemma

DolphinGemma represents a new frontier, leveraging WDP’s comprehensive dataset. Developed by Google, the AI utilizes advanced audio technologies like the SoundStream tokenizer to effectively model natural dolphin sounds. It processes sound sequences to predict subsequent sounds, akin to how language models forecast words in human language.

Field-tested on WDP’s Pixel devices, DolphinGemma aims to identify and predict sound patterns, revealing potential structures and meanings of dolphin communication. Insights drawn from Google’s Gemma models guide this groundbreaking research.

Utilizing Pixel Phones for Real-Time Analysis

Parallel to natural communication analysis, WDP explores two-way interactions with dolphins using technology like the CHAT system (Cetacean Hearing Augmentation Telemetry), created in collaboration with Georgia Tech. CHAT underpins a simpler language using synthetic whistles linked to objects dolphins exhibit interest in.

A Google Pixel 6 prototype handles real-time analysis, with plans to upgrade to Pixel 9. This advancement involves deep learning model integrations for enhanced interaction speed and efficacy.

Open Access and Collaborative Potential

DolphinGemma will be shared publicly as an open model this summer, inviting researchers to explore its adaptation across different species of cetaceans. Trained primarily on Atlantic spotted dolphins, its open-source nature encourages further refinement and application globally.

By doing so, the project seeks to empower researchers with robust tools that accelerate data analysis, fostering deeper insights into dolphin intelligence and communication.

This collaborative venture led by WDP, Georgia Tech, and Google outlines a future where human-dolphin communication advances towards mutual understanding. Discover more at aitechtrend.com.