Google contractors are assessing the performance of its AI model Gemini by comparing its responses to those generated by Anthropic’s Claude. The process involves scoring both models on factors such as accuracy, verbosity, and safety, with contractors spending up to 30 minutes per evaluation. Some of the responses under review reportedly identified themselves as Claude, raising questions about Google’s reliance on a competitor’s model.
Claude, recognized for prioritizing safety, often refuses to answer unsafe prompts. In contrast, Gemini has faced scrutiny for generating responses flagged as inappropriate. Notably, although Google is an investor in Anthropic, Claude’s terms of service restrict its use for training competing AI models without approval.
Google DeepMind clarified that while comparative evaluations are standard practice, Gemini is not trained using Claude’s outputs. Anthropic has declined to comment on whether Google secured permission for these comparisons. Concerns have also been raised about Gemini’s accuracy, particularly on sensitive topics like healthcare.