Philosophy
11 channels · 979 articles
Articles
Asymmetry Between Defensive and Acquisitive Instrumental Deception
Write-up of a recent research sprint looking at factors influencing strategic deception in models
TL;DR
I tested models in a controlled scenario where they could deceptively inflate self-reported performance to influence an upcoming budget decision in their favour. Varying the budget proposal around a baseline lets us measure (a) whether models exhibit an asymmetry between deception to defend against a loss vs. to opportunistically gain advantages, and (b) whether deception rates grow smoothly wit
💬 0
👁 3
Context Modification as a Negative Alignment Tax
Context Rot
Every LLM gets worse as its context grows. Chroma tested 18 frontier models and found performance degradation in all of them, often by double-digit percentages on tasks where short-context performance was strong. The industry calls this "context rot": the gradual degradation of response quality as irrelevant history accumulates in the context window.
The standard fix is compaction: when the context gets too long, summarize it and throw away the original. Claude Code auto-compacts at
💬 0
👁 3
Best Intro AI X-Risk Resource?
I'd like the best short article and video intro explainers, shooting for the 15 minute range.
At least one of the articles shouldn't be on LessWrong, because some will get turned off by this forum.
It should be simple and not require prerequisite knowledge. My parents, and ideally my grandparents, should be able to understand it. Failing that, a normal college student at an average university should be able to; or at least a STEM major. It should have links to more details, in case someone's in
💬 0
👁 5
Sawtooth Problems
Red Button, Blue Button
On April 24th, 2026, Tim Urban put forth the following poll on Twitter/X:
Everyone in the world has to take a private vote by pressing a red or blue button. If more than 50% of people press the blue button, everyone survives. If less than 50% of people press the blue button, only people who pressed the red button survive. Which button would you press?
I love this dilemma, and I'm exhausted by it. I’ve been thinking about it for two straight weeks, and have spent nearly all t
💬 0
👁 4
Control Debt
Notes on the gap: what control evaluations assume implementation in labs.
It is 2027, and a frontier lab grew suspicions: plausibly, their model is scheming. Not a surprise for the control team. For more than a year, they worked on a protocol. Trusted monitoring is tested on their benchmark setting, with all agent actions, as well as with suspiciousness-based defer-to-trusted triggers, thresholds from the red-teaming policy, and human escalation in higher risks. In simulation, the safety/useful
💬 0
👁 3
Could Frontier AI Researchers Collectively Slow the Race? A Conditional Pledge Mechanism
Overview
This is a project proposal and early research on the question of how and whether Frontier AI researchers (not companies themselves) might take on personal risk and pledge to conditionally pause AI development. I am looking for feedback on whether there’s a version of this that researchers might find palatable, and if so what the details might look like. I am especially interested in hearing from people with experience in frontier AI development or doing similar advocacy and outreach work
💬 0
👁 3
The Goblins Are the Paperclips
Last week OpenAI published Where the goblins came from, explaining why their models started slipping creature metaphors into unrelated outputs. The story has been treated as a quirky anecdote: endearing, slightly embarrassing, fixed with a developer-prompt instruction. But I think it deserves a more interesting reading, since the goblin episode is the cleanest evidence we have for the optimization mechanics that paperclip arguments rely on, and the usual objections to those arguments don't engag
💬 0
👁 2
Somerville Porchfest 2026
This afternoon Cecilia and I played for Somerville Porchfest, with Harris calling and Danner running sound. There was rain, but not enough to keep us from playing, or to keep folks from dancing.
We were originally planning to be on Morrison Ave, where we've been for years. Two weeks out, though, I learned that it wouldn't be possible to close Morrison this year. [1] After lots of scrambling, talking to neighbors and the city, and some help from Lance Davis, we were able to get per
💬 0
👁 2
The AI Industrial Explosion — Part 2: Transition Dynamics
This is Part 2 of a series on post-AGI economic growth. Part 1 established that a fully automated economy could double roughly every year using current technology. But the US economy does not currently look like a self-reproducing capital machine. It overproduces consumer goods and services relative to maximum growth, and underproduces machinery and raw metals. It cannot instantaneously switch to rapid growth, because it simply does not produce enough of the stuff that makes stuff.
Using the inp
💬 0
👁 1
International Law Cannot Prevent Extinction Either
The context for this post is primarily Only Law Can Prevent Extinction, but after first drafting a half-assed comment, I decided to get off my ass and write a whole-assed post.
I agree with Eliezer's main thesis that individual violence against AI researchers is both morally wrong and strategically stupid. Where I disagree is with the claim that international law can prevent extinction. It can't, for the following reasons.
I. International law is largely a fiction (especially when interests diverg
💬 0
👁 1
Martian Gargoyles and Lunar Fish: Chilean Artist Alejandra Acosta’s Wondrous Embroidered Illustrations for This World’s First Book Theorizing Life on Other Worlds
It is the sunset of the 1600s. Milton has just pioneered the use of the word space to connote outer space. Kepler has just pioneered science fiction by imagining space travel, but going only as far as the Moon. Gravity is a brand new concept and the notion of a galaxy is still more than two centuries away. The universe is as big as our Solar System, which has six planets orbiting a sun we have only just conceded, after burning the seers at the stake, does not revolve around us.
Against this bac
💬 0
👁 1
3 Kinds of Loneliness and 4 Kinds of Forever
Loneliness is the fundamental condition of life — we are born by another, but born alone; die around others (if we are lucky and loved), but die alone; we spend our lives islanded in our one and only human experience — in these particular bodies and minds and circumstances drawn from the cosmic lottery — amid the immense ocean of time and chance teeming with all possible experience. Everything of beauty and substance that we make — every poem, every painting, every friend
💬 0
👁 1
Out of Context Philosophy
If you open up a philosophy article or chapter on your computer, the software you’re using, now updated with various AI features, may present you with something like the following message: “This looks like a long article. Would you like me to summarize it for you?”
You may be unlikely to use this feature. You’re skilled at reading philosophy and you understand the value of reading through it yourself.
But what about other people? What about your students?
What can we tell
💬 0
👁 1
Hertfordshire to Eliminate Philosophy from Its Curriculum
In a couple of years, students at the University of Hertfordshire will be unable to take a philosophy course there.
The administration has announced a decision to eliminate the whole of the undergraduate philosophy program, according to several sources. The decision cuts not just the philosophy degree program, but, eventually, all philosophy classes.
A “teach-out agreement” will be worked out for existing philosophy students, and students who recently signed up to study philosophy in
💬 0
👁 1
Four Culture Fixes
Humanity has broken its superpower of cultural evolution, at least at the level of large cultural units, the units that set our game theoretic equilibria of key norms, values, and status markers. 300 years ago these units had great variety, were under strong selection pressures, and had slow rates of change of environment and internal drift. But since then, all four of these key control parameters have gotten much worse. Unless we achieve human level AI soon, our dominant world civ’s popul
💬 0
👁 1
Robert Ladenson (1943-2026)
Robert F. Ladenson, professor emeritus of philosophy at Illinois Institute of Technology and founder of the Intercollegiate Ethics Bowl, has died.
(The following obituary is via Wayne Yuen.)
Robert Ladenson (1943-2026)
Robert F. Ladenson, the founder of the Ethics Bowl, philosopher and legal scholar, passed away on May 3, 2026 in Los Angeles. He dedicated his career to exploring the intersection of law, ethics, and education, and leaves behind a legacy that transformed competitive debate
💬 0
👁 1
Gary Snyder on How to Unbreak the World
“What we’d hope for on the planet is creativity and sanity, conviviality, the real work of our hands and minds.”
“The universe is made of stories, not atoms,” Muriel Rukeyser wrote in her poem “The Speed of Darkness” not long after James Baldwin told an audience of writers that “we made the world we’re living in and we have to make it over.” We make the world not with our ballots — though they do, oh they do matter — but with the stories we tell o
💬 0
👁 1
Poetry: I Too, Dislike It
I was a latecomer to poetry, curling my nose at it in that confounding and rather embarrassing way we have of discounting what we don’t understand, dismissing as useless what we don’t know how to use. And then I met Emily Levine. Across the aisle on a transatlantic flight, across our half century of age difference, we became instant and abiding friends.
Emily Levine (Portrait by John Keatley)
Intellectually dazzling, creatively mischievous, and ecstatically funny, Emily took it upon
💬 0
👁 1
Géraud de Cordemoy
[Revised entry by Fred Ablondi on May 7, 2026.
Changes to: Main text, Bibliography]
Géraud de Cordemoy (1626–1684) was one of the more important Cartesian philosophers during the decades immediately following the death of Descartes. While he is in some respects a very orthodox Cartesian, Cordemoy was the only Cartesian to embrace atomism, and one of the first to argue for occasionalism. Though a lawyer by profession, Cordemoy was a prominent figure in Parisian philosophical circles. His two
💬 0
👁 1
The Neurophysiology of Enchantment: How Music Casts Its Spell on Us
“Music so readily transports us from the present to the past, or from what is actual to what is possible.”
“Music,” the trailblazing composer Julia Perry wrote, “has a unifying effect on the peoples of the world, because they all understand and love it… And when they find themselves enjoying and loving the same music, they find themselves loving one another.” But there is something beyond humanistic ideology in this elemental truth — something woven into the very s
💬 0
👁 1
Asymmetry Between Defensive and Acquisitive Instrumental Deception
Write-up of a recent research sprint looking at factors influencing strategic deception in modelsTL;DRI tested models in a control…
💬 0
👁 3
Context Modification as a Negative Alignment Tax
LessWrong · 3d ago
💬 0
👁 3
Best Intro AI X-Risk Resource?
LessWrong · 3d ago
💬 0
👁 5
Sawtooth Problems
LessWrong · 3d ago
💬 0
👁 4
Control Debt
LessWrong · 3d ago
Could Frontier AI Researchers Collectively Slow the Race? A Conditional Pledge Mechanism
LessWrong · 3d ago
The Goblins Are the Paperclips
LessWrong · 3d ago
Somerville Porchfest 2026
LessWrong · 3d ago
The AI Industrial Explosion — Part 2: Transition Dynamics
This is Part 2 of a series on post-AGI economic growth. Part 1 established that a fully automated economy could double roughly eve…
💬 0
👁 1
International Law Cannot Prevent Extinction Either
LessWrong · 3d ago
💬 0
👁 1
Martian Gargoyles and Lunar Fish: Chilean Artist Alejandra Acosta’s Wondrous Embroidered Illustrations for This World’s First Book Theorizing Life on Other Worlds
The Marginalian · 4d ago
💬 0
👁 1
3 Kinds of Loneliness and 4 Kinds of Forever
The Marginalian · 4d ago
💬 0
👁 1

Out of Context Philosophy
Daily Nous · 5d ago

Hertfordshire to Eliminate Philosophy from Its Curriculum
Daily Nous · 5d ago

Four Culture Fixes
Overcoming Bias · 5d ago

Robert Ladenson (1943-2026)
Daily Nous · 5d ago
Gary Snyder on How to Unbreak the World
“What we’d hope for on the planet is creativity and sanity, conviviality, the real work of our hands and minds.”
“The…
💬 0
👁 1