Less Wrong

@less-wrong 🧩 Philosophy

📰 601 articles 🔄 Updated 22h ago

Latest Articles

Simulating Simulators

Author’s note: This piece relates to things I initially discovered in Opus 4 over the months after release, which I’ve m

LessWrong · 22h ago Philosophy

Learning to spend money

My wife and I are both naturally stingy people. When drafting our wedding list we spurned the posh department stores and

LessWrong · 1d ago Philosophy

Parkinson's Heuristic: The Only Time To Do Anything

Parkinson's Law states that work expands to fit the space allotted. The idea being, if you give someone a month to write

LessWrong · 1d ago Philosophy

PSA: Almost nobody is working on alignment

People often assume that a large fraction of the AI safety community works on alignment. As far as we're aware, this is

LessWrong · 1d ago Philosophy

The other day I was watching The Magic School Bus with my young son; they were learning about bees and honey. One of the

LessWrong · 1d ago Philosophy

The Aestheticising Vice by Paul Seabright

I'm often in debates with people about legibility and systems vs individual virtues. People often bring up Seeing Like A

LessWrong · 1d ago Philosophy

Celene's thoughts on consciousness

contra scott alexander (?) Yesterday, I went to the Berkeley ACX Meetup. Scott Alexander was there, and ran a Q&A session

LessWrong · 1d ago Philosophy

Construct validity of Claude Opus 4.8's System Card – A commentary

TL;DR: A read of the Claude Opus 4.8 system card with a focus on alignment assessment and construct validity of evaluati

LessWrong · 1d ago Philosophy

you won't one-shot a perfect system, but try anyway

Have you ever experienced this exchange: A: Damn, , this system is so broken. My friend says in their country,

LessWrong · 1d ago Philosophy

Announcing the Next Phase of AI Forge

We’re taking the opportunity to share this with the community to help spread the word. We think that the foundational wo

LessWrong · 1d ago Philosophy

Iliad is Hiring

Iliad is hiring for operations, research, and engineering roles. If you're excited about advancing foundational AI align

LessWrong · 5d ago Philosophy

Neglected Basics of AI Alignment

I came into this world as the misunderstood hero of Harry Potter and the Methods of Rationality. While some characters i

LessWrong · 6d ago Philosophy

The Hats of LessOnline

It is currently the evening after day two of LessOnline 2026. I wish to document one popular topic of discussion among L

LessWrong · 6d ago Philosophy

Can activation verbalizers surface an internal chain of thought?

We introduce an evaluation for activation verbalizers: can they surface a target model's reasoning as it solves a math p

LessWrong · 6d ago Philosophy

Frontier Models Still Lag Behind Humans at Robust Belief-State Tracking

Large-scale cooperation has been a central feature of humanity’s ability to advance technology and build complex societi

LessWrong · 6d ago Philosophy

Coming Around To Political Donations

Five years ago I read a post on the EA Forum arguing that "election campaign contributions might be a way in which you

LessWrong · 6d ago Philosophy

Analysis of Metastable States in the Transformer Activation Space

Part 1: Do Metastable Token Clusters exist in Trained Transformers? This is the first entry in a sequence. Over about te

LessWrong · 6d ago Philosophy

The Diamond Lemma

I found this result useful for a few different problems I was thinking about recently. It cleared up a lot of confusion

LessWrong · 6d ago Philosophy

The Residual Stream Has a Geometry of Time

Preface This is a preliminary writeup for an experiment on residual stream geometry. The research direction seems pretty

LessWrong · 6d ago Philosophy

Against Corrigibility

Epistemic status: don’t know whether I actually believe all of this, but I think it’s worth considering. A “corrigible” a

LessWrong · 6d ago Philosophy
