AI Surpasses Human Abilities in Emotional Intelligence
While artificial intelligence (AI) is often celebrated for its prowess in mathematics, coding, and data analysis, a new study reveals that its capabilities extend well beyond logic and computation. Researchers from the University of Geneva and the University of Bern have found that AI systems, such as ChatGPT and Gemini, outperform humans in emotional intelligence tests. The findings challenge the long-held belief that emotions and empathy are uniquely human traits.
In a groundbreaking study, six leading AI models—ChatGPT-4, ChatGPT-o1, Gemini 1.5 Flash, Copilot 365, Claude 3.5 Haiku, and DeepSeek V3—were tested using established psychological assessments. The outcome was striking: on average, AI models answered 81% of emotional understanding questions correctly, compared to just 56% for human participants.
Evaluating Emotional Understanding in AI
To evaluate AI’s emotional intelligence, the researchers drew on tests typically used to assess humans. These included the Situational Test of Emotion Understanding (STEU) and the Geneva Emotion Knowledge Test – Blends (GEMOK-Blends), which measure the ability to recognize emotions in different contexts. Emotional regulation was assessed using the Situational Test of Emotion Management (STEM) and subtests from the Geneva Emotional Competence Test (GECo).
Each AI model was tested ten times to ensure consistency and accuracy. The models displayed remarkable agreement in their emotional judgments, indicating they had developed a similar understanding of emotional responses—despite not being specifically trained to identify emotions.
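The evaluation described above boils down to two quantities per model: mean accuracy against an answer key across repeated runs, and agreement (how often the runs pick the same answer for each item). The sketch below illustrates both computations; the items, answer key, and run data are invented for illustration and are not the study's data.

```python
from collections import Counter
from statistics import mean

# Hypothetical answer key and model choices for 5 multiple-choice items
# over 3 runs (the study used 10 runs per model on STEU/GEMOK/STEM/GECo items).
answer_key = ["B", "A", "D", "C", "A"]
runs = [
    ["B", "A", "D", "C", "A"],
    ["B", "A", "D", "B", "A"],
    ["B", "A", "D", "C", "A"],
]

def accuracy(choices, key):
    """Fraction of items answered correctly in one run."""
    return mean(c == k for c, k in zip(choices, key))

# Mean accuracy across repeated runs of the same model.
mean_acc = mean(accuracy(r, answer_key) for r in runs)

def item_agreement(all_runs):
    """Consistency: average fraction of runs that match the
    modal (most common) answer for each item."""
    per_item = []
    for item_choices in zip(*all_runs):
        modal_count = Counter(item_choices).most_common(1)[0][1]
        per_item.append(modal_count / len(all_runs))
    return mean(per_item)

print(f"mean accuracy: {mean_acc:.2f}")              # 0.93
print(f"run agreement: {item_agreement(runs):.2f}")  # 0.93
```

High agreement across runs, as in this toy example, is what let the researchers conclude the models' emotional judgments were stable rather than random.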
“LLMs can not only identify the best option among many available ones, but also create new scenarios that suit the desired context,” said Katja Schlegel, lead author of the study and lecturer at the University of Bern’s Institute of Psychology.
How AI Demonstrated Emotional Intelligence
In one of the test scenarios, participants were asked how to handle a situation where Employee A stole an idea from Employee B and received praise from a supervisor. The best emotionally intelligent response was to calmly approach the supervisor rather than confronting the offending colleague or seeking revenge. AI systems consistently chose the most emotionally appropriate answers in such scenarios, demonstrating both empathy and self-regulation.
“The results showed significantly higher scores for the LLMs—82%, compared to 56% by human participants,” explained Marcello Mortillaro, senior scientist at the Swiss Centre for Affective Sciences. “This indicates that these AIs not only comprehend emotions, but also grasp what it means to behave with emotional intelligence.”
AI as a Creator of Emotional Intelligence Tests
Encouraged by AI’s performance, the researchers took the study further by asking ChatGPT-4 to generate its own emotional intelligence tests. The AI created realistic scenarios with multiple-choice answers, each pointing to the most emotionally intelligent solution. These AI-generated tests were then given to 467 human participants alongside the original psychologist-designed versions.
The outcome was surprising: participants performed equally well on both versions. The AI-created tests were found to be just as clear, challenging, and varied as those made by humans. Statistically, they had “equivalent difficulty,” suggesting the AI had internalized not only emotional reasoning but also the structural logic behind test creation.
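"Equivalent difficulty" can be made concrete with per-item pass rates: if the mean proportion of participants answering correctly differs between the two test versions by less than some small margin, the versions are practically equivalent. The sketch below shows this idea in its crudest form; the pass rates and the 0.05 margin are invented for illustration (the study used formal statistical equivalence testing, not this shortcut).

```python
from statistics import mean

# Hypothetical per-item pass rates (proportion of participants answering
# correctly) for a few items from each test version; not the study's data.
human_written = [0.58, 0.61, 0.49, 0.55, 0.63]
ai_generated  = [0.56, 0.60, 0.52, 0.57, 0.59]

def equivalent(a, b, margin=0.05):
    """Crude equivalence check: is the difference in mean item
    difficulty within +/- margin?"""
    return abs(mean(a) - mean(b)) <= margin

print(equivalent(human_written, ai_generated))  # True
```

The margin encodes what "equally difficult" means in practice; formal equivalence tests additionally account for sampling error rather than comparing raw means.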
Limitations and Nuanced Differences
Despite the AI’s impressive capabilities, researchers noted some minor limitations. Human-written questions were rated slightly clearer, and the scenarios in AI-generated tests were less diverse. However, these differences were not substantial enough to impact the study’s conclusions.
Importantly, 88% of the items in the AI-generated test were found to be completely original, rather than reworded versions of existing questions. The AI’s tests also showed similar correlations with vocabulary and other emotional intelligence measures, reinforcing that they were accurately assessing the intended traits.
Implications for the Future of Affective Computing
The findings open new possibilities for affective computing—the field of AI focused on understanding and responding to human emotions. Traditionally, this area relied on detecting emotions through facial expressions, tone of voice, or word choice. However, large language models (LLMs) may now offer a new approach: reasoning about emotions based on human-like understanding developed through vast textual learning.
While these models do not actually feel emotions, they can recognize and respond to them in ways that are contextually appropriate. For example, a tutoring bot could identify when a student is frustrated and offer encouragement, or a healthcare assistant could provide comforting words to a patient. These capabilities, though lacking true human emotion, can still offer meaningful and helpful interactions.
This breakthrough could transform how technology is used in therapy, education, and emotional coaching—provided that it is implemented responsibly and guided by experts in the field.
