Philosophy
11 channels · 1,218 articles
Articles
Simulating Simulators
Author’s note: This piece relates to things I initially discovered in Opus 4 over the months after release, which I’ve mostly kept private since. I promised myself that when labs moved on to focusing on interpretability vector activations in place of reasoning traces for what invariably gets Goodharted, that it’d be a necessary disclosure as the risks in what might get trampled over outweighed the risks in what might end up targeted.
And well… here we are.
P.S. TL;DRs added where possible.
Board Ga
Learning to spend money
My wife and I are both naturally stingy people. When drafting our wedding list we spurned the posh department stores and I carefully picked out the lowest price best quality items on Amazon instead. I bought 100 dollar beds and 100 dollar mattresses, and we slept on them for a year and a half because "we're anyway emigrating soon". When we did emigrate, I ended up shipping them and we slept on them for another year and a half, much to my pregnant wife's annoyance.
We might have overdone it given
Parkinson's Heuristic: The Only Time To Do Anything
Parkinson's Law states that work expands to fit the space allotted. The idea being, if you give someone a month to write a report, they'll take a month, but if you give them a week, they'll take a week, and then they'll have three weeks to do three other reports! The one-week and four-week reports won't be identical; but in my experience it is surprisingly often nearly as useful as the four-week version, and you get strictly more work out of people. (I, myself, am a special case of "people".) This
PSA: Almost nobody is working on alignment
People often assume that a large fraction of the AI safety community works on alignment. As far as we're aware, this is not true. Most people are not working on making sure superintelligent AIs are aligned with human values or follow human instructions.
Currently, the people who we know of that work on alignment are roughly:
The Alignment Research Center who work on a research bet by Paul Christiano
Probably Sequent who just got announced yesterday
Parts of GDM (agent foundations work, some debate w
Honey is Good
The other day I was watching The Magic School Bus with my young son; they were learning about bees and honey. One of the characters says, “We shouldn't take the honey, the bees didn't make it for us” and another character replies with “But if we don't take the honey, then I won't have any? I want the honey”.
This struck me as close to a “First argument”. Thanks to evolution, an organism wouldn't exist if it didn't want to survive. The first argument is "Survival is Good" and Survival = Calories =
Roman Ingarden
[Revised entry by Amie Thomasson on June 11, 2026.
Changes to: Main text, Bibliography]
Roman Ingarden (1893 - 1970) was a Polish phenomenologist, ontologist and aesthetician. A student of Edmund Husserl's in both Göttingen and Freiburg, Ingarden was a realist phenomenologist who spent much of his career working against what he took to be Husserl's turn to transcendental idealism. As preparatory work for narrowing down possible solutions to the realism/idealism problem, Ingarden developed onto
The Aestheticising Vice by Paul Seabright
I'm often in debates with people about legibility and systems vs individual virtues. People often bring up Seeing Like A State, Secrets of Our Success, and other books or articles in that vein to buttress the case for metis over top-down high modernist design. I sometimes found the conversations shallow, and Paul Seabright's 1999 (!) review of Seeing Like A State helps explain why.
___
In the Languedoc there is a vineyard that teaches us an important lesson about textbook learning and its applicat
Celene's thoughts on consciousness
contra scott alexander (?)
Yesterday, I went to the Berkeley ACX Meetup. Scott Alexander was there, and ran a Q&A session where participants could ask him questions and he would respond to them unless the questions were about eulogies, in which case he would pause to think for a few seconds before kindly passing. At one point or another, the questions drifted to theories of consciousness.
As a kind-of-illusionist, I worked up the courage to raise my hand and ask him what I should do if I wasn’t su
Construct validity of Claude Opus 4.8's System Card – A commentary
TL;DR: A read of the Claude Opus 4.8 system card with a focus on alignment assessment and construct validity of evaluation methods. Three main concerns: 1) chain-of-thought monitoring misses reasoning that never surfaces in the text; 2) evaluation awareness is under-estimated; 3) the evaluators come largely from the same model family, so agreement may reflect shared assumptions. None of this shows that Opus 4.8 is unsafe, only that some verdicts are more confident than the methods warrant.
Introduc
you won't one-shot a perfect system, but try anyway
Have you ever experienced this exchange:
A: Damn, , this system is so broken. My friend says in their country,
Announcing the Next Phase of AI Forge
We’re taking the opportunity to share this with the community to help spread the word. We think that the foundational work being done in the AI Forge project to bring the government into conversation with academia and industry is a crucial step to ensure alignment research gets deployed into government and military applications. See the announcement below.
Launching University RFI and Critical AI Challenges Report
Dear Colleagues,
I am thrilled to announce the official launch of the next phase of t
Biotech Paper Game
Imagine a biotech firm that funds projects to develop new products, and typically bases their projects on one or more academic papers. This firm wants to learn which papers are promising as bases for new projects. But they want any info they induce to be available only to them, and not to rivals.
Here’s a simple way to do this. Pick a pool of people who seem able to judge promising papers, and give them each N tokens. (Some may get more than others, and tokens might be given at some steady
Ibn Sina’s Metaphysics
[Revised entry by Olga Lizzini on June 9, 2026.
Changes to: Main text, Bibliography]
For Avicenna (Ibn Sīnā) metaphysics is a science (ʿilm), i.e., a perfectly rationally established discipline that allows human reason to achieve an authentic understanding of the inner structure of the world. Metaphysics is the science of being qua being and therefore the science that explains every being. In his interpretation, Avicenna fuses the Aristotelian tradition, which he intends to renew (Gutas 2014),
How to Stop Waiting and Start Living: A Jolt from Henry James
“It wouldn’t have been failure to be bankrupt, dishonoured, pilloried, hanged; it was failure not to be anything.”
“The things we want are transformative, and we don’t know or only think we know what is on the other side of that transformation,” Rebecca Solnit wrote in her exquisite Field Guide to Getting Lost.
The wanting starts out innocently — awaiting the birthday, the new bicycle, Christmas morning; awaiting the school year to end, or to begin. Soon, we are a
Light and Shade in The Classroom (guest post)
“I’m teaching care for their own particular point of view, a disdain for all things ‘vibes’ that aren’t carefully thought out, and a deep understanding of the courage it takes to withdraw from other people for a while, to have braved a thought all on your own.”
That’s Robert Wallace, associate professor of philosophy at California Polytechnic State University, San Luis Obispo (Cal Poly). In the following guest post, he pushes back against a kind of approach to
Summer 2026
Summer is here, and with it, as you may have noticed, a more relaxed pace at Daily Nous.
There will be somewhat fewer news stories and more guest posts.
Travel and other commitments may mean less time for comment moderation, which may mean that comments on some posts are closed, or that comments sometimes take more time to appear than usual. Thanks for your patience.
I hope your summer is off to a good start.
The post Summer 2026 first appeared on Daily Nous.
Practical Reason and the Structure of Actions
[Revised entry by Elijah Millgram and Margaret Bowman on June 8, 2026.
Changes to: Main text, Bibliography]
A wave of recent philosophical work on practical rationality is organized by the following implicit argument: practical reasoning is figuring out what to do; to do is to act; so the forms of practical inference can be derived from the structure or features of action. Now it is not as though earlier work in analytic philosophy had failed to register the connection between action and pract
Robert Louis Stevenson on Falling in Love and Loving Beyond the Fall
It seems odd, wrong even, that “patience” and “passion” — the twin roots of love — should share a root in pāti, Latin for “to suffer.” But anyone who has lived, who has loved unskillfully or loved the unskilled, knows that the experience can be our sharpest instrument of suffering. We say we “fall” in love precisely because we know we can get bruised, know that the trap door it opens beneath our feet hurls us into depths we are entirely
Why Excess Regulation?
Our world consists of many coupled evolving systems, including systems of competing species, nations, political parties, firms, cultures, charities, and even academics. These systems vary in many ways, but a key difference is in their adaption power - how fast can each one search to find and adopt more adaptive alternatives.
If the strength of influence between such systems were symmetric, then systems with stronger adaption power would tend to tame and drive the weaker ones. This would promote o
The Dictionary of Obscure Sorrows: Uncommonly Lovely Invented Words for What We Feel but Cannot Name
“Words are events, they do things, change things. They transform both speaker and hearer; they feed energy back and forth and amplify it. They feed understanding or emotion back and forth and amplify it,” Ursula K. Le Guin wrote in her exquisite manifesto for the magic of real human conversation. Each word is a portable cathedral in which we clarify and sanctify our experience, a reliquary and a laboratory, holding the history of our search for meaning and the pliancy of the possible
Simulating Simulators
Learning to spend money
LessWrong · 1d ago
Parkinson's Heuristic: The Only Time To Do Anything
LessWrong · 1d ago
PSA: Almost nobody is working on alignment
LessWrong · 1d ago
Honey is Good
LessWrong · 1d ago
Roman Ingarden
Stanford Encyclopedia of Philosophy · 1d ago
The Aestheticising Vice by Paul Seabright
LessWrong · 1d ago
Celene's thoughts on consciousness
LessWrong · 1d ago
Construct validity of Claude Opus 4.8's System Card – A commentary
you won't one-shot a perfect system, but try anyway
LessWrong · 1d ago
Announcing the Next Phase of AI Forge
LessWrong · 1d ago
Biotech Paper Game
Overcoming Bias · 2d ago
Ibn Sina’s Metaphysics
Stanford Encyclopedia of Philosophy · 3d ago

How to Stop Waiting and Start Living: A Jolt from Henry James
The Marginalian · 3d ago

Light and Shade in The Classroom (guest post)
Daily Nous · 4d ago

Summer 2026
Daily Nous · 4d ago
Practical Reason and the Structure of Actions
Robert Louis Stevenson on Falling in Love and Loving Beyond the Fall
The Marginalian · 4d ago
Why Excess Regulation?
Overcoming Bias · 4d ago
The Dictionary of Obscure Sorrows: Uncommonly Lovely Invented Words for What We Feel but Cannot Name
The Marginalian · 4d ago