News

lesswrong.com

A brief list of ways AI safety efforts could be net negative - LessWrong

2+ hour, 1+ min ago  (245+ words) I'm not aware of a good list of downside risks for AI safety broadly[1], so I decided to make one. This is not intended to be fully comprehensive; these are just the ones that I personally take seriously[2][3]: (This list…


Futarchy is insecure without a proposal gatekeeper - LessWrong

5+ hour, 51+ min ago  (916+ words) Asset futarchy is attractive because it lets markets compare a proposal's expected effect on token value. That comparison is only reliable when conditional prices track the proposal's causal effect rather than strategic behavior around the decision rule. The attacks below…


Typical Minds Aren't - LessWrong

3+ hour, 3+ min ago  (308+ words) We all know the typical mind fallacy: the bias where we assume that other people's minds are much like our own. It happens because most of our evidence for what minds are like comes from experiencing what our own mind is…


The one-week sprint - LessWrong

5+ hour, 28+ min ago  (441+ words) Recently I've been working in one-week sprints, and I've really enjoyed it! TL;DR: I need to do a lot of creative knowledge work, and have recently fallen into a routine which IMO is pretty good at facilitating that. Monday…


Adversarial Proposal Design in Asset Futarchy - LessWrong

5+ hour, 51+ min ago  (652+ words) Asset futarchy is hardest to attack when conditional prices stay tightly coupled to a proposal's real causal effect on ASSET value. The proposal strategies below work by loosening that coupling. A proposer promises value-creating work, but treats delivery as the…


Research agenda: Interpretive debate - LessWrong

18+ hour, 27+ min ago  (674+ words) One sentence pitch: our goal is to develop a piece of epistemic infrastructure for iteratively and empirically answering interpretive questions about AI models, where the accumulation of empirics leads to resolution of interpretive ambiguity and/or calibration of uncertainty. This…


Does it feel any different to be reverse-chiral life? - LessWrong

19+ hour, 17+ min ago  (1637+ words) I will examine the concept of chirality (the difference between a right hand and a left hand, generalized) and its relevance to philosophy of mind. Philosophy of mind often deals with colors: colors of worldly objects and of mental representations…


Midjourney's Spa, or when sci-fi tries to become mundane - LessWrong

19+ hour, 46+ min ago  (522+ words) Midjourney has just announced their jump from being just the "makes funny images" AI company to being the "revolutionises diagnostics and human medicine forever" AI company, as a side gig. Here's the post. Basically, they've announced the creation of a…


The distillation double bind: Distilling misaligned models either transfers misalignment or it doesn't - LessWrong

20+ hour, 53+ min ago  (368+ words) Suppose we have a dangerous misaligned AI that can fool alignment audits, and distill it into a student model. Two things can happen: …
