Less Wrong

@less-wrong 🧩 Philosophy
📰 601 articles 🔄 Updated 22h ago lesswrong.com

Latest Articles

Simulating Simulators
Author’s note: This piece relates to things I initially discovered in Opus 4 over the months after release, which I’ve m
LessWrong · 22h ago Philosophy
Learning to spend money
My wife and I are both naturally stingy people. When drafting our wedding list we spurned the posh department stores and
LessWrong · 1d ago Philosophy
Parkinson's Heuristic: The Only Time To Do Anything
Parkinson's Law states that work expands to fit the space allotted. The idea being, if you give someone a month to write
LessWrong · 1d ago Philosophy
PSA: Almost nobody is working on alignment
People often assume that a large fraction of the AI safety community works on alignment. As far as we're aware, this is
LessWrong · 1d ago Philosophy
Honey is Good
The other day I was watching The Magic School Bus with my young son; they were learning about bees and honey. One of the
LessWrong · 1d ago Philosophy
The Aestheticising Vice by Paul Seabright
I'm often in debates with people about legibility and systems vs individual virtues. People often bring up Seeing Like A
LessWrong · 1d ago Philosophy
Celene's thoughts on consciousness
contra scott alexander (?) Yesterday, I went to the Berkeley ACX Meetup. Scott Alexander was there, and ran a Q&A session
LessWrong · 1d ago Philosophy
Construct validity of Claude Opus 4.8's System Card – A commentary
TL;DR: A read of the Claude Opus 4.8 system card with a focus on alignment assessment and construct validity of evaluati
LessWrong · 1d ago Philosophy
you won't one-shot a perfect system, but try anyway
Have you ever experienced this exchange: A: Damn, , this system is so broken. My friend says in their country,
LessWrong · 1d ago Philosophy
Announcing the Next Phase of AI Forge
We’re taking the opportunity to share this with the community to help spread the word. We think that the foundational wo
LessWrong · 1d ago Philosophy
Iliad is Hiring
Iliad is hiring for operations, research, and engineering roles. If you're excited about advancing foundational AI align
LessWrong · 5d ago Philosophy
Neglected Basics of AI Alignment
I came into this world as the misunderstood hero of Harry Potter and the Methods of Rationality. While some characters i
LessWrong · 6d ago Philosophy
The Hats of LessOnline
It is currently the evening after day two of LessOnline 2026. I wish to document one popular topic of discussion among L
LessWrong · 6d ago Philosophy
Can activation verbalizers surface an internal chain of thought?
We introduce an evaluation for activation verbalizers: can they surface a target model's reasoning as it solves a math p
LessWrong · 6d ago Philosophy
Frontier Models Still Lag Behind Humans at Robust Belief-State Tracking
Large-scale cooperation has been a central feature of humanity’s ability to advance technology and build complex societi
LessWrong · 6d ago Philosophy
Coming Around To Political Donations
Five years ago I read a post on the EA Forum arguing that "election campaign contributions might be a way in which you
LessWrong · 6d ago Philosophy
Analysis of Metastable States in the Transformer Activation Space
Part 1: Do Metastable Token Clusters exist in Trained Transformers? This is the first entry in a sequence. Over about te
LessWrong · 6d ago Philosophy
The Diamond Lemma
I found this result useful for a few different problems I was thinking about recently. It cleared up a lot of confusion
LessWrong · 6d ago Philosophy
The Residual Stream Has a Geometry of Time
Preface This is a preliminary writeup for an experiment on residual stream geometry. The research direction seems pretty
LessWrong · 6d ago Philosophy
Against Corrigibility
Epistemic status: don’t know whether I actually believe all of this, but I think it’s worth considering. A “corrigible” a
LessWrong · 6d ago Philosophy