According to TechCrunch, the Laude Institute announced its first batch of Slingshots grants on Thursday, designed as an accelerator program for AI researchers. The program provides resources typically unavailable in academic settings, including funding, compute power, and product engineering support. In exchange, recipients must produce tangible work products such as startups, open-source codebases, or other artifacts. The initial cohort consists of fifteen projects, with particular emphasis on AI evaluation. Familiar projects include Terminal Bench and the latest version of ARC-AGI, while newer efforts include Formula Code from Caltech/UT Austin researchers and Columbia’s BizBench, a benchmark for white-collar AI agents.
The AI evaluation problem
Here’s the thing about AI progress: we’re basically flying blind without good benchmarks. Everyone is building bigger models, but how do we actually know they’re getting smarter? The Slingshots program seems to recognize this fundamental gap. Projects like CodeClash, from SWE-Bench co-founder John Boda Yang, aim to create dynamic, competition-based evaluation frameworks that could finally give us meaningful progress metrics.
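To make “competition-based evaluation” a little more concrete, here is a minimal sketch of one way such a framework might score models: an Elo-style rating computed from head-to-head task outcomes. This is purely illustrative; the article doesn’t describe CodeClash’s actual mechanics, and the model names, K-factor, and match results below are invented for the example.

```python
from collections import defaultdict

# Illustrative only: Elo-style ratings from head-to-head results between models.
# Nothing here is taken from CodeClash itself; names, K-factor, and matches are made up.

K = 32  # update factor (assumed); larger K means ratings move faster per match


def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update_ratings(ratings, results):
    """results: (model_a, model_b, score_a) where score_a is 1.0 win, 0.5 draw, 0.0 loss."""
    for a, b, score_a in results:
        exp_a = expected_score(ratings[a], ratings[b])
        ratings[a] += K * (score_a - exp_a)
        ratings[b] += K * ((1.0 - score_a) - (1.0 - exp_a))
    return ratings


# Every model starts at 1000; each tuple is one head-to-head coding task.
ratings = defaultdict(lambda: 1000.0)
matches = [
    ("model_a", "model_b", 1.0),
    ("model_b", "model_c", 0.5),
    ("model_a", "model_c", 1.0),
]
update_ratings(ratings, matches)
print(sorted(ratings.items(), key=lambda kv: -kv[1]))
```

The appeal of this kind of scoring is that it stays meaningful as models improve: rankings come from direct comparisons rather than from a fixed test set that eventually saturates.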
Bridging the academic-industrial gap
What’s really interesting here is how this program tries to solve the compute access problem. Academic researchers often have brilliant ideas but can’t afford the GPU time to test them, while companies hoard compute for proprietary projects. The Slingshots model basically says: here’s the infrastructure, now go build something real. It’s a smart play: fund the research, get tangible artifacts out the other end, and potentially spin out viable companies. But does this create the right incentives for open science?
The coming benchmark wars
Yang’s stated worry “about a future where benchmarks just become specific to companies” hits hard. We’re already seeing this with proprietary evaluations that basically say whatever the company wants them to say. Independent benchmarks like those coming out of the Slingshots projects could become crucial counterweights. The real test will be whether these academic efforts can keep pace with corporate AI development. Otherwise, we risk ending up with evaluation standards that are obsolete by the time they’re published.
