Crowdsourced AI benchmarks have serious flaws, some experts say

TechCrunch 11h ago

Quick Summary:

AI labs are increasingly relying on crowdsourced benchmarking platforms such as Chatbot Arena to probe the strengths and weaknesses of their latest models.

But some experts say that there are serious problems with this approach from an ethical and academic perspective.

Hadgu and Kristine Gloria, who formerly led the Aspen Institute’s Emergent and Intelligent Technologies Initiative, also made the case that model evaluators should be compensated for their work.

Article Details

Author / Journalist: Kyle Wiggers

Category: Technology

Published On: 2025-04-22 @ 12:30:00 (11 hours ago)

News Timezone: GMT -5:00

News Source URL: techcrunch.com

Language: English

Copyright Owner: © TechCrunch
