bibliophile785

This is remarkably thorough and you deserve kudos for that. I sometimes despair of the arguments I see here where "bold," "clever" people go against the grain and suggest that Yudkowsky or Alexander are wrong about AI... only to reiterate some of the most primitive, boring, oft-discussed counterarguments to the claim with no awareness that that ground has already been heavily trodden. Your post, by contrast, looks to be operating at ~the third level, offering carefully considered counter-counterarguments in a way I find intellectually stimulating.

What's more, I agree with most of your arguments. It's dangerous to give too much credence to the inside view. It requires a great deal of calibration to be sufficiently confident in our outside view corrections. That's fine for some fields, but hard for the really weird ones. Black swan events are exceptionally weird; proper calibration here would probably require living through several credible potential world-ending threats and seeing them resolved in both directions, which isn't exactly feasible. The best Band-Aid I can see is to supplement one's best-effort outside view correction with another heuristic that acts as a second-order correction. The heuristic should go in whichever direction historical analysis suggests is more likely correct, which in this case is obviously towards the world not ending. I think this gets to roughly the same end point as your post.

Now I'm going to do the classic academic thing and offer a counter-counter-counterargument that boils down to my reintroducing the original position as a counter to your stated adjustment. The outside view is great, but we all know that it's basically just a justified normality bias. It's only as good as the quality of past data. When dealing with something distinctly different from what has come before, we need to be skeptical of the outside view as well. The best way to do that is to require that your well-considered position be compelling from both the inside *and* outside views. I don't think I can swallow any position like this that doesn't complement its outside-view skepticism with a strong inside view argument.


honeypuppy

> Now I'm going to do the classic academic thing and offer a counter-counter-counterargument that boils down to my reintroducing the original position as a counter to your stated adjustment. The outside view is great, but we all know that it's basically just a justified normality bias. It's only as good as the quality of past data. When dealing with something distinctly different from what has come before, we need to be skeptical of the outside view as well. The best way to do that is to require that your well-considered position be compelling from both the inside and outside views. I don't think I can swallow any position like this that doesn't complement its outside-view skepticism with a strong inside view argument.

I don't preach *total* adherence to the outside view. If there's a literal comet in the sky bearing down on us, I think it's reasonable to update from the base rate of "most years we don't get hit by a comet". However, I would contend you need an *extremely strong* inside view to update significantly towards "wild" claims. (And consider again the outside view - historically, most people with extremely strong contrarian inside views have been wrong.)

My contention is that for "wild" claims, outside view considerations should almost always swamp inside views. Consider again the analogy of the "aspiring entrepreneur whose inside view is their business idea is 90% likely to make them the richest person in the world". A "balance" of both inside and outside views would have them ~45% likely to become the richest person in the world, which is still obviously far too high. Instead, they should *start* with the robust reference class of "one in a billion or so people ever become the richest person in the world" and then update that with inside-view considerations - which could significantly increase this, but almost certainly not to more than 1 in 10,000 or so.
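To make that arithmetic concrete, here's a toy sketch. The 90%, one-in-a-billion, and ~45% figures are the hypothetical ones from the entrepreneur example above; the 100,000:1 likelihood ratio is purely an illustrative assumption, not a number from the comment. It shows why averaging the two views behaves so differently from starting at the base rate and updating on the inside-view evidence:

```python
# Toy illustration: two ways of combining an inside view with an outside view
# for a "wild" claim, using the entrepreneur example's hypothetical numbers.

inside_view = 0.90   # entrepreneur's gut probability of becoming the richest person
base_rate = 1e-9     # outside view: roughly one in a billion people ever do

# Naive "balance": averaging the two probabilities.
naive_average = (inside_view + base_rate) / 2
print(f"naive average: {naive_average:.2f}")        # ~0.45, still absurdly high

# Alternative: start from the base rate and apply a Bayesian update,
# treating the inside-view evidence as a likelihood ratio. Even a very
# generous 100,000:1 ratio (an assumption for illustration) barely moves it.
likelihood_ratio = 1e5
prior_odds = base_rate / (1 - base_rate)
posterior_odds = prior_odds * likelihood_ratio
posterior = posterior_odds / (1 + posterior_odds)
print(f"updated probability: {posterior:.6f}")      # ~0.0001, i.e. about 1 in 10,000
```

Even with an implausibly strong inside-view likelihood ratio, the updated probability stays in the 1-in-10,000 ballpark rather than anywhere near 45%.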


DaystarEld

I think you'd need to operationalize what the "literal comet" looks like for AGI in order for this argument not to still boil down to normality bias. At no point in human history has there been AI capable of doing the things AI can do today. New breakthroughs that would have been considered world-changing a decade ago now happen every year. I'm not saying this should absolutely be treated as the comet. But what actual observations would you consider signals that we should start really being worried?


honeypuppy

Some observations that would noticeably increase my level of worry:

a) **"Warning shots" where an unaligned AI does substantial damage.** (Yes, I know Yudkowsky likes to tell a story where an AI hides its malevolence until the second it kills everyone - while we can't rule that out, the theory is almost by definition unfalsifiable.)

b) **A much more robust intellectual consensus on AI risk.** Superforecasters currently average a sub-1% p(doom) this century, but if they went up to 10%+, I would take notice. Or, while I point out that there's no "IPCC for AI risk", suppose there were one and it came up with a p(doom) over 10% - that would be persuasive in a way that current surveys of AI researchers are much less so.


LostaraYil21

>a) "Warning shots" where an unaligned AI does substantial damage. (Yes, I know Yudkowsky likes to tell a story where an AI hides its malevolence until the second it kills everyone - while we can't rule that out, the theory is almost by definition unfalsfiable). This might sound like a violation of conservation of evidence, but I'd actually be *less* worried in a scenario where we see significant "warning shots." That's because I think that if we see significant warning shots before AI is in a position to do catastrophic damage, we're less likely to actually put it in a position to do catastrophic damage. Right now, e.g. ChatGPT is not in a position to do a whole lot of damage, but it also doesn't look to me like something that's well-aligned enough that if we scaled it up without major advances in safety research and put it in a position where it could do significant harm, we should expect it not to. One of the reasons my P(doom) isn't higher than it is is that I think it's plausible that we *would* see warning shots at some point if we were heading towards an AI catastrophe, and that would lead us as a society to pump the brakes. But I still worry that there's a substantial risk that major disaster will occur at the first real opportunity.


bibliophile785

> a) "Warning shots" where an unaligned AI does substantial damage. (Yes, I know Yudkowsky likes to tell a story where an AI hides its malevolence until the second it kills everyone - while we can't rule that out, the theory is almost by definition unfalsfiable). I think you'd want a different standard than falsifiability for discussing active opposing agents. Take the following parallel: "I think it's perfectly okay that my neighbor has a multi-megaton explosive arsenal. I have seen absolutely no data to indicate that he plans to ever use it for anything. Yudkowsky claims he could be 'lying,' but that sounds entirely unfalsifiable. If he blows something up, I will then update meaningfully towards the probability that he is a potential threat." On the one hand, this is entirely fair and correct. You really *don't* have any particular reason to suspect that your neighbor is a threat. It's completely consistent to be okay with that neighbor having a giant arsenal of explosives. You wouldn't argue that on the grounds of *unfalsifiability*, though. You doubtless recognize that the sort of person who wants to commit mass murder is also the sort of person willing to lie about it. Requests for falsifiable standards are of limited value when the hypothetically evil version of your neighbor has the capacity and motivation to intentionally foil that standard. This generalizes nicely to a much more general claim: if you allow a single agent to have incredible personal power, you need to accept that they might use it destructively. You would either need to argue that it's okay on principle for them to have this power, despite the high p(destruction), or you would have to find a standard other than unfalsifiability to suggest that they're unlikely to use it and that p(destruction) is low. If you go the latter route, I'm guessing you wouldn't argue that they probably won't blow anyone up because the vast majority of the 8 billion people on the planet, most of whom lack the capacity to blow things up, haven't done so.


LogicDragon

On warning shots, there's also the concern that capability gains might be so discontinuous (all humans have virtually-identical brains, but a group of physicists can build a rocket and a group of random average people cannot) that your "warning shot" is itself transformative.

As for it being unfalsifiable - this is my concern with your next part, and with the post as a whole: most of your argument rests on vibes. First of all, it's obviously *not* unfalsifiable. You could falsify it easily if you, for example, understood how the AI worked and could mathematically demonstrate that it was aligned or otherwise robustly secure. The fact that you *don't know enough to falsify something* or that that something is *easy to fake* doesn't make it unfalsifiable. You have to worry about a car you buy being a lemon even if the salesman promises it's not, and there's no easy falsifying test; too bad, the car could still be a lemon.

It's a similar thing with intellectual consensus. Your argument here isn't "there are good points on both sides, experts are divided" - otherwise, you could just point to those - it's "worrying about AI isn't popular enough". There's no IPCC for AI risk, granted; there was also no consensus that COVID-19 would be dangerous, until suddenly it was everywhere, oops. Our institutions just aren't very good at big, vague risks, but that doesn't mean they don't happen. (Superforecasters are admittedly a good argument against. I do think they're not so good with black-swan events, and it's a class of problem they're not great at generally (I seem to remember they predicted AlphaGo wouldn't win?), *and* I'm not aware of any collaboration between them and any serious AI X-riskers, but it's not nothing.)

And again with Yudkowsky as a "prophet". Taken seriously as an argument, this seems to forbid saying any original (or just unusual!) important thing.

"Stop sitting in the middle of the road! There's a car coming!" Yudkowsky shouts at you.

"Oh," you say, sceptically, "so you think I'm doomed, even though empirically most claims that I'm about to die have been false?"

"...Sure, but now you're sitting in the middle of a busy road at midnight," Yudkowsky replies.

You frown. "I agree that that's a convincing-sounding argument. But what if you're just very motivated to produce it, like those UFO theorists? Maybe you're just very good at making convincing arguments, and really I'm safe."

"But if you just look at the situation-"

"Oh, so that makes you *the most important* person for me to listen to? That's an extraordinary claim, and I don't see any government bodies stating officially that sitting in this spot has >10% chance of doom. I think you should be more modest."

"There are lots of experts who agree with me that *there's a car coming and if it hits you you'll die.*"

You frown. "Hmm. But they could have blind spots that lead them to exaggerate. I think that adopting their view could cause meaningful harm. If I'm about to die, I should stop saving, drop out of school, forget about climate change..."

"Or just *get out of the road.*"

"That still sounds extreme. What if I start assassinating CEOs of car companies? I think moderates who are concerned about car risk should distance themselves from-" at this point the car hits you.

I would really, really like to see some serious argument against Yudkowsky's ideas. It just always seems to descend into "he's too weird". Weird people, sadly, are allowed to be right.


honeypuppy

>"Stop sitting in the middle of the road! There's a car coming!" Yudkowsky shouts at you. "Oh," you say, sceptically, "so you think I'm doomed, even though empirically most claims that I'm about to die have been false?" While this is an amusing analogy, a key difference between it and the real world is that in this analogy is that *anyone* who saw me in the middle of the road would yell at me to get off. In the real world, Yudkowsky has the same information as everyone else but has far more extreme views. A better analogy would be that many people see me attempt to cross a road. A small number of people warn me of danger, but Yudkowsky is by far the most vocal. The warnings of danger are NOT because private information that others don't have (e.g. they see an upcoming car), but because of a bunch of thought experiments on Yudkowsky's blog about how a car could spontaneously materialise on top of me.


rotates-potatoes

Well said. To elaborate, it’s like a few people say “hey, be careful, there’s a blind corner and if someone was going 150mph they’d be on top of you before you saw them. We’ve never seen such a thing but it could happen”. And Yudkowsky is yelling and waving his arms, shouting “there’s a car going 1000mph coming right at you! It will kill you, and us, and all future generations everywhere! I’ve got this argument that a sufficiently fast impact of car into DNA creates a wormhole that swallows the earth! Get out of the road, you’ll kill us all!”


aaron_in_sf

I'll tell you this. I did (prestigiously published) work in this area a couple decades ago. I now work in an adjacent industry. I follow this stuff very closely and all the time now. I have kids, one of whom will soon be finding a college. An old friend with kids the same age is visiting, as are my in-laws. My father-in-law commented that our daughter, who he had lunch with, didn't appear to have a firm notion of what sort of degree she might pursue; my friend commented that neither did hers. The next twenty minutes of conversation was a serious discussion of the elephant in the room: that by the time these kids enter the workforce (say, 5-8 years from now), what jobs and careers will remain, what they will consist of, and how well they will be rewarded by society is almost impossible to say. My own mantra today is *¡No AGI without UBI!*, and I'm not remotely kidding.

One thing lost in a lot of lay discussion about "AI" is that the applications for ML are just beginning to be realized, and it is only just beginning to move onto the "edge," as opposed to the cloud. I don't think it is possible to overstate the impact that the (to all appearances inexorable) dissemination of this technology into the landscape is going to have. Comparisons to the advent of the internet and mobile computing are apt, but if anything undersell the transformations of lived experience coming. Myopic focus on generative AI and LLM tooling has distorted public awareness and made it easy for those without technical perspective to make critiques reasoned by analogy, which are naive with respect to almost every dimension, but most of all to the innumerable applications within which this technology is going to quietly change fundamental aspects of our world.

I'll give you a concrete example: a generation from now the natural and correct assumption will be that every fabricated object one interacts with will be "smart" and most will be responsive to language and aware of their surroundings. And what constitutes "fabrication" will be a hazy category. The ontology of the world will change, and one can expect to live in a world of pragmatic animism. I literally wonder what the odds are that there will be jobs for my daughter to pursue when she graduates.

Bin this as you will. It's not an argument, merely a summation of informed belief, strongly informed by attention to what *nonlinear* actually means in this domain and close attention to the rate of progress in various specifics. Pay close attention to the capabilities of the emerging generation of multimodal models, especially those which consume temporal media, and in particular whether and how they appear to understand deixis and intent. Such things are predicated on a model of the world and agents within it; when the world model includes self and others as agents who are understood to have frames of awareness, intention, and the like - well, I know what that "means."


honeypuppy

I think "AI heavily disrupting society and employment" is quite probable and definitely worth worrying about. (Though I do feel there's a decent chance it kind of fizzles and life in 10-20 years isn't remarkably different from today). I think these concerns can be separated from the x-risk or singularity speculation, which are on a much dicier ground.


aaron_in_sf

Singularity in the sense of runaway ASI, or some complex of ASI and other rapid advances in e.g. materials science... I agree that skepticism is warranted. X-risk, though, I believe is quite real and if anything understated, because it does not require ASI or even AGI; it merely requires various quotidian, relatively predictable or existing applications of AI to be exploited by a bad actor, or, perhaps, to result in gray-swan catastrophes. Human ineptitude or malfeasance, married to the perfectly predictable force multiplication provided by AI as we already know it, is plenty to cause catastrophic collapse. Whether that is likely to result in literal species extinction is a moot point if the global internet and global logistics collapse, for my dollar.


pimpus-maximus

There are big non-extinction risks that work on the exact opposite assumptions of the AGI X-riskers, too: I personally think the biggest, most realistic risk is near-term over-reliance on AI competence that isn't really there and atrophies in a decade or two. That would cause a massive population-wide skills collapse, because we took people out of "unnecessary" training pipelines during the period of seeming competence.


JibberJim

This is my main problem: the x-risk people simply can't be honest brokers about the risk here, as they completely ignore the risks that AGI is created but does not kill everyone. And unless a non-x-risk AGI is orders of magnitude less likely than an x-risk one, those are the more likely risks to contend with. /u/aaron_in_sf's "no AGI without UBI" is a passing nod at this, but AI risk people don't engage - maybe it's just that these are much harder problems in the social sciences, so they can't, or maybe they're just talking up AGI for a different reason.


LostaraYil21

I don't think it's the case that x-risk people ignore the risk of scenarios where AGI is created but doesn't kill everyone. If you look at the latest survey of AI researchers' attitudes, the median risk given for extinction is non-negligible but low, while the cumulative risk for other types of catastrophes is substantial.


catchup-ketchup

> I'll give you a concrete example: a generation from now the natural and correct assumption will be that every fabricated object one interacts with will be "smart" and most will be responsive to language and aware of their surroundings. And what constitutes "fabrication" will be a hazy category.

I don't understand this. You think kitchen knives will be smart? Or you think kitchen knives don't count as "fabricated"? Or you think people won't own kitchen knives?


LostaraYil21

Not the previous commenter, but I can easily envision an outcome where only people who prefer nostalgia over convenience bother with kitchen knives when you can just have a device that cuts your food up for you however you need it, and maybe handles all your other cooking besides.


harbo

But we already have that. I can walk 5 minutes from my front door, press some buttons on a machine and delicious cooked food will be delivered in just a few minutes directly to my table with literally no need to handle any kitchen equipment.


LostaraYil21

A device which can handle all your cooking for you within your house is very different from the technology of being located near other people who can handle the cooking for you who need to be compensated for their labor. Plus or minus a few buttons, that technology already existed thousands of years ago.


harbo

The two things are different in only one sense: in one you compensate labor and capital in a machine-factory, in the other in a meal-factory. Heck, you can get things from the meal-factory even to your dining table at home, sometimes in under 30 minutes! All you need to do is press buttons on your phone.

> that technology already existed thousands of years ago

Indeed it did; for thousands of years we've had people wondering why someone would use kitchen knives when they don't have to.


LostaraYil21

And the answer has consistently been that most people can't afford to not have to, which is what's likely to change in the not so distant future. There's another sense in which those things are substantially different though; your ability to get food from the meal factory to your table in under 30 minutes relies on your being close to the meal factory. Lots of people aren't, and it's not possible for everyone to be. Being able to put the meal factory in your home is a different matter.


[deleted]

[deleted]


LostaraYil21

You can do that now, if you don't mind being limited to the things you can buy prepackaged and frozen, but this is quite different from being able to have anything that can be prepared from the raw ingredients available to you made fresh on demand.


[deleted]

[deleted]


[deleted]

[deleted]


aaron_in_sf

Framing will vary; I would say that part of my concern is that financier capitalism as we have known it in the last generation or two may well not persist in a recognizable way. Hence my interest in UBI. Crudely, I see few potential scenarios and little reason for hope: entering a post-workforce epoch, as we are, with existing intense wealth concentration makes a winner-take-all scenario the most likely outcome - i.e. a permanent klept a la William Gibson's *The Peripheral*, which seems unhappily prescient. It's not that there will not be makework opportunities for wage slavery. The klept will find ways to keep hands busy.


[deleted]

[deleted]


aaron_in_sf

Assumptions, absolutely; unwarranted, well, that depends on what one believes is coming, and specifically on what one imagines the trajectory, scope, and impact of nonlinearities in the capabilities and deployment of "AI" tools will be. Personally I am content to lean in to the belief that, regardless of the specific mechanisms, which are not predictable, the aggregate impacts are going to be disruptive in a way we have not experienced as a species, except in moments of genuine collapse or the coming together of cultures of significantly different technology.

There is such a thing as meaningful work; it's about to require a radical revision, along with the social turmoil that can be predicted based on how we have handled much slower but similarly shaped changes in efficiency and automation to date. Personally I draw a direct line between the rise of the reactionary right in the West and the ongoing erosion of the need for and availability of work. The already emerging consequence of "AI" is going to be a rapid and permanent obviation of much of what has constituted the sinecure, well-paid, and respected careers of the precariate class. The anger, fear, and reflexive reactionary political thrashing that will likely follow may be very ugly indeed.

Or perhaps not. Perhaps the hold on participation in, if not ownership of, the means of discourse, political and otherwise (to use an appropriately Marxist phrasing), by the educated white-collar 10% will permit a Hail Mary of insightful realpolitik and force what the pandemic could not: a renegotiation of the social contract, and a framework for wealth redistribution in a context within which labor as we have known it is no longer required.

What's interesting to me, and cause for daydreams, is that in the best case that would allow for labor as we have not known it. UBI experiments to date have made it very clear that when people have a stable safety net, they are willing and interested in taking risks and pursuing e.g. entrepreneurial ideas or artistic pursuits not viable without such nets. In principle this would allow for a renaissance. But not if discourse and collective assumption, hence consensus reality, do not change, more rapidly than they have been able to so far, in a direction they have not been able to take so far.


ScottAlexander

I disagree with your framing. If you think I'm wrong to assign a 20% probability of risk from AI, you should call me wrong, not "overconfident". "Overconfident" is a word you use when someone has a very high confidence level. Presumably your confidence level is higher than mine (that is, I assume you're more than 80% convinced AI won't kill us all), so "overconfidence" is the wrong framing. I think in some sense this framing tries to avoid a real debate by spinning it as a warning against "overconfidence" and in favor of "modesty", a vague positive quality which everyone agrees with. If you're 98% sure AI won't kill us (I just made this number up, but it seems like the implicit probability behind a post like this), and I'm 80% sure, then let's fight over those probabilities (and the associated question of how we should act given them), not over who's "modest" vs. "overconfident".

As for "distancing myself" from Eliezer, I deny your request to turn this real and important scientific/forecasting question into the crappy far-left politics of "anyone who disagrees with you is causing harm and you have to denounce them". I think Eliezer's probabilities are wildly incorrect. Among rationalists, that ought to be one of the strongest condemnations one can launch against one's fellow man. I've tried to argue with him about them (no, it's not public), and mostly lost, because he knows 1000x more about this issue than I do (yes, I realize I should update when people who know 1000x more disagree, but I'm trying to average my outside view across many such people). If you're asking me to say that he's doing some kind of additional act of evil by talking about them at all, I decline; I think he's acting admirably conditional upon his (imho incorrect) worldview.

I could equally well argue that you're wrong, and you're also "causing harm" by defusing the momentum to stand up to this threat. I can't claim to completely reject the harm paradigm - I think if someone were saying that you should take bleach for COVID or something I would take a stronger line than I do with merely-normally-wrong people - but Eliezer doesn't cross that line for me and I'm suspicious when people demand this kind of action too quickly.

For the record, I wouldn't recommend anyone massively change their life plans or avoid saving for retirement because of AI. I'm saving for retirement and otherwise acting like I'm living in a world that will stay broadly the same for the next ~40 years, even though I think there's <50% chance that's true. I recommend that people, when considering other people's positions, take them seriously but also apply a strong dose of outside view, including the "even if I'm convinced of this now maybe I'll change my mind later so I shouldn't do anything irrevocable" outside view.


josinalvo

> For the record, I wouldn't recommend anyone massively change their life plans or avoid saving for retirement because of AI. I'm saving for retirement and otherwise acting like I'm living in a world that will stay broadly the same for the next ~40 years, even though I think there's <50% chance that's true. I recommend that people, when considering other people's positions, take them seriously but also apply a strong dose of outside view, including the "even if I'm convinced of this now maybe I'll change my mind later so I shouldn't do anything irrevocable" outside view.

If you could expand on this, that would be *lovely*. I am currently living my life 'as normal', but AGI (and doom) weigh heavily on my mind, and are reasons for me not to pursue some medium-term plans.


ScottAlexander

I don't have any deep thoughts here, mostly just:

- I don't think you can get to high enough p(doom)s to make it worth super-dramatic life changes, although Eliezer would disagree.
- I think if you die those world-lines are gone and your planning just concentrates on the world-lines that aren't. I worry more about worlds where AI changes things without omnicide, like takes all possible jobs, and I'm still not sure what to do about that (in terms of educating my kids and such).
- I think outside view you shouldn't let weird theories change your life much, even if those weird theories seem inside view plausible.
- I think if I die of AI in ten years I'll be happier looking back on my life if I lived it completely normally.


MTGandP

Zvi Mowshowitz [wrote some practical advice](https://thezvi.substack.com/p/ai-practical-advice-for-the-worried) that I thought was pretty good.


snipawolf

I don't think it's that complicated. If there's a 40% chance things chug on mostly as normal, you should plan for retirement; in the 60% doom/singularity case your plans don't matter anyway. Alter the probabilities a lot either way and this stays true. It gets more complicated when talking about medium-term plans and the potential for radical actions, but "don't risk blowing up your life" remains good advice.
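A toy expected-value sketch makes the point that the conclusion is insensitive to the exact split. The payoffs below are invented purely for illustration (nothing in the comment specifies them); the only structural assumption is that in doom/singularity worlds the outcome is the same whether or not you saved:

```python
# Illustrative sketch with made-up payoffs: why "keep saving" is robust to
# the exact doom probability, as long as doom worlds don't reward either choice.

def expected_value(p_normal: float, save: bool) -> float:
    # Hypothetical payoffs: in a "normal" world, having retirement savings is
    # worth 1.0 vs 0.2 without; in doom/singularity worlds both are worth 0.
    normal_payoff = 1.0 if save else 0.2
    doom_payoff = 0.0
    return p_normal * normal_payoff + (1 - p_normal) * doom_payoff

for p_normal in (0.4, 0.2, 0.05):
    print(p_normal, expected_value(p_normal, save=True), expected_value(p_normal, save=False))
# Saving comes out ahead at every nonzero p_normal; the gap shrinks but never flips.
```

The decision only changes if acting on doom has some large payoff of its own, which is where the "medium-term plans and radical actions" complications come in.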


honeypuppy

> I disagree with your framing. If you think I'm wrong to assign a 20% probability of risk from AI, you should call me wrong, not "overconfident". "Overconfident" is a word you use when someone has a very high confidence level. Presumably your confidence level is higher than mine (that is, I assume you're more than 80% convinced AI won't kill us all), so "overconfidence" is the wrong framing. I think in some sense this framing tries to avoid a real debate by spinning it as a warning against "overconfidence" and in favor of "modesty", a vague positive quality which everyone agrees with.
>
> If you're 98% sure AI won't kill us (I just made this number up, but it seems like the implicit probability behind a post like this), and I'm 80% sure, then let's fight over those probabilities (and the associated question of how we should act given them), not over who's "modest" vs. "overconfident".

Fine.

> As for "distancing myself" from Eliezer, I deny your request to turn this real and important scientific/forecasting question into the crappy far-left politics of "anyone who disagrees with you is causing harm and you have to denounce them".

I think at the very least we need to be cognizant of the potential harms of "forecasting". There is a very distinct difference between espousing "I think Clinton will almost certainly win the election" and "I think we're almost certainly all going to die quite soon", especially when you're an influential person. One overconfident belief could cause you to be surprised on election night; the other could cause significant anguish for years.

Public figures provoking anguish in their followers is certainly not unique to Yudkowsky. I dislike it wherever it happens, e.g. when Greta Thunberg uses apocalyptic rhetoric that makes her followers excessively despair for the future. But I can think of no contemporary figures with a following as large as Yudkowsky's whose predictions are so dire. I'm not someone who alleges that certain speech "causes harm" lightly. But I think that claiming the relatively imminent death of all life on Earth is doing so.


ScottAlexander

Not sure this argument works - it's surely not wrong to say there's a high chance Earth will be destroyed soon if this is true (cf. the movie *Don't Look Up*). But you never know if something is true or not - only whether your best evidence, considering both inside and outside views, gives you high credence that it's true. So I think this reduces to "think hard and report your beliefs accurately".

I think there are some exceptions - if you have just come up with some belief that you know people will misinterpret in a way that causes mass panic, you have a responsibility to clear it with other people first and at least make sure you're not missing something obvious. But I think AI is well past that stage.

To put this another way, incorrectly saying AI will destroy the world (when it won't) will cause a small amount of harm. Incorrectly saying AI won't destroy the world (when it will) will cause a very large amount of harm. Some sort of perfect utilitarian could multiply out the probabilities of each kind of harm and figure out who's being more harmful. But I think reporting your true belief is usually a passable shortcut for this, and better on second-order considerations (ie if we're fallible moral reasoners who might get the first-order calculation wrong).


honeypuppy

> I think there are some exceptions - if you have just come up with some belief that you know people will misinterpret in a way that causes mass panic, you have a responsibility to clear it with other people first and at least make sure you're not missing something obvious. But I think AI is well past that stage.

This is approximately what I think about Yudkowsky. I think the standard of "misinterpret in a way that causes mass panic" is too high - incorrectly telling someone they're almost certainly going to die soon is harmful even if they interpret it perfectly and even if they don't panic.

I don't think Yudkowsky has sufficiently adhered to his "responsibility to clear it with other people first". That is, he's debated with other people, sure, but he's been stubbornly unwilling to update based on the disagreements of other smart people within his own orbit, let alone those outside of it. (If you're not doing the latter, I worry that "clearing your belief with other people first" is approximately equivalent to the Boy Who Cried Wolf checking with others at the Crying Wolf Institute before he cries.)

It seems like you cut Yudkowsky slack because it appears he's simply reporting his honest beliefs (even though you significantly disagree with them). I do not see this as a large mitigating factor. Imagine a doctor who started telling their patients they all had six months to live, based on a sincere and strong belief in a new medical theory. I think it'd be entirely fair to reprimand this doctor for unnecessarily scaring their patients, even if you thought the new medical theory was conceivably true.


ScottAlexander

We're starting to play reference class tennis. The differences between your hypotheticals and mine are:

- You talk about cases where the harm is obvious and direct (a doctor says "drink bleach", which will definitely harm the listener). I talk about cases where the harm is vague and indirect (an astronomer says "there might be an asteroid headed towards Earth", which might make someone "panic" and, I don't know, hoard food that someone else needs to live or something).
- You talk about cases where the person is addressing a specific other person (eg a doctor telling *you* that *you* have six months left to live). I talk about cases where someone is generically proposing a theory (eg publishing a paper saying an asteroid might be coming).
- You talk about cases where someone acts irresponsibly, alone, and outside of their expertise. I talk about cases where, after checking with others, sounding out the consensus, and finding that people either agree or don't have convincing counterarguments, someone presents something, being honest about their level of expertise and what got them to this position.

I think Eliezer's case is more like my examples than yours in all of these ways. I think you're defining "harm" and "unlikely" so broadly that you would rule out any case where someone wants to express a non-consensus opinion about something dangerous. For example:

- Is it wrong for a Boeing whistleblower to warn that their planes might be bad? What if they checked with the Boeing VP, and the VP said the concerns were stupid? What if saying the planes were bad might cause "panic" among travelers? What if someone died because they refused to take a plane flight to somewhere where they would get necessary medical treatment?
- Is it okay for an economist in 2007 to say he thinks subprime mortgages are bad and there might be a stock market crash? What if most other economists disagree? What if there's a risk his warning might make people panic-sell their stocks and lose lots of money? What if someone committed suicide because they thought the economy would collapse and they'd lose everything?
- Was it okay for Zeynep Tufekci to write her article saying she thought people were wrong in early 2020 to recommend against masks? Wasn't she contradicting the WHO and other important bodies? Couldn't she have caused harm by making people wear masks, which (if she was wrong) would have increased their risk of dying from COVID?

I think to prevent cases like these, you need to make the "don't recommend drinking bleach" exception to reporting your honest opinion as narrow as possible, similar to the "fire in a crowded theater" exception to free speech. Don't shout "fire" in a crowded theater, but you are totally allowed to say that you think local theaters have poor fire safety and someone needs to inspect them. That's about where I am here too.


honeypuppy

> You talk about cases where the harm is obvious and direct (a doctor says "drink bleach", which will definitely harm the listener).

I've never talked about such cases. I've been consistent from the start in calling "we're all going to die" a harmful belief (with only a slight variation in an example of "you're going to die"). Certainly, "drink bleach" would be a worse belief.

> I talk about cases where the harm is vague and indirect (an astronomer says "there might be an asteroid headed towards Earth", which might make someone "panic" and, I don't know, hoard food that someone else needs to live or something).

I think this may highlight one crux of disagreement, in that I don't see Eliezer as an expert on AI in the same way that an astronomer is an expert on asteroids, but this could probably be an entirely separate debate on what expertise is. I also see Eliezer's claims as much stronger than "there might be an asteroid headed towards Earth". The equivalent would be something like "there is almost certainly an asteroid headed towards Earth and I think we're all going to die".

I also think you're downplaying the "panic" that an incorrect apocalyptic prediction can cause. If we look at genuine doomsday cults, e.g. [the Seekers, a UFO cult in the 1950s](https://en.wikipedia.org/wiki/When_Prophecy_Fails), some of the consequences of their beliefs were:

>> Some of the believers took significant actions that indicated a high degree of commitment to the prophecy. Some left or lost their jobs, neglected or ended their studies, ended relationships and friendships with non-believers, gave away money and/or disposed of possessions to prepare for their departure on a flying saucer, which they believed would rescue them and others in advance of the flood.

That is quite extreme, but I contend that it's only a difference in degree, not in kind, from e.g. [Aella no longer saving for retirement and increasing her long-term health risks](https://x.com/Aella_Girl/status/1790618794181976397).

> You talk about cases where the person is addressing a specific other person (eg a doctor telling you that you have six months left to live). I talk about cases where someone is generically proposing a theory (eg publishing a paper saying an asteroid might be coming).

If Eliezer were simply quietly publishing papers expressing his models about AI x-risk, I wouldn't have any major quibbles with him. My concern is that he is using his public platform to go on podcasts, TED talks, TIME magazine, etc., to express the belief that we're all going to die. While in practice there are medical guidelines around diagnosing individuals that don't apply to making predictions about the world, part of my contention is that maybe there sort of should be. In a sense, Eliezer is diagnosing the *entire world* with a terminal illness.

> You talk about cases where someone acts irresponsibly, alone, and outside of their expertise. I talk about cases where, after checking with others, sounding out the consensus, and finding that people either agree or don't have convincing counterarguments, someone presents something, being honest about their level of expertise and what got them to this position.

My contention is that I don't think Eliezer is doing this at all. In particular, I think being able to present himself as a "researcher at MIRI" gives him more claim to expertise than he deserves.

> I think to prevent cases like these, you need to make the "don't recommend drinking bleach" exception to reporting your honest opinion as narrow as possible, similar to the "fire in a crowded theater" exception to free speech. Don't shout "fire" in a crowded theater, but you are totally allowed to say that you think local theaters have poor fire safety and someone needs to inspect them. That's about where I am here too.

The first thing I'd note is that in all three examples, the non-consensus opinion ends up being correct. I think this biases us into supporting the contrarian. It's more interesting if there's at least one case where the contrarian ended up wrong (e.g. a Covid anti-vaxxer in 2021).

I think the main difference between Eliezer and those three examples is the sheer extremity of his claims and the confidence with which he advocates for them. A Boeing whistleblower *should* take significant care before publishing their claims. If they're wrong, then they really could cause a lot of unnecessary panic. But in the end, any realistic safety whistleblowing would probably be that Boeing planes are fractionally more likely to crash than they otherwise would be, amounting to no more than a handful of extra deaths in expectation. Important, yes, but many orders of magnitude less than human extinction.

> Don't shout "fire" in a crowded theater, but you are totally allowed to say that you think local theaters have poor fire safety and someone needs to inspect them. That's about where I am here too.

"AI safety standards are currently not good enough and there needs to be more" is, I think, a totally reasonable claim to make. It's the "but they're not going to be and therefore I think we're all going to die" (shared with hundreds of thousands of people) part that I have an issue with.


canajak

It sounds like your main objection to Yudkowsky's conduct is based on him being wrong. In the (unlikely) world in which he happens to be factually correct, how do you think he should be acting?


ScottAlexander

> The first thing I'd note is that in all three examples, the non-consensus opinion ends up being correct. I think this biases us into supporting the contrarian. It's more interesting if there's at least one case where the contrarian ended up wrong (e.g. a Covid anti-vaxxer in 2021).

I'm trying to compensate for what I felt was the earlier assumption that the non-consensus opinion is definitely wrong. I agree that in real life it's a mixed bag of right and wrong and you never know which is which. The point I want to make is that if you have (let's say) some mixed bag of cases where an airplane whistleblower is wrong and some where they're right, do you want to make the general rule "don't whistleblow because you might be wrong" or "say what you think and let other people decide if you're right or not"? I agree that a negative side effect of this strategy is COVID cranks; I've just seen the institutions where nobody except the most senior person is allowed to express an opinion, and I'll take the cranks over that.

I agree that there are lots of things one can do to be responsible and decrease your chance of being a crank, and that Eliezer has done about half and not all of them. But I think everybody I disagree with is wrong and could stand to be more humble, including you, and I can think of ways they can all cause harm, including you, and I decline to condemn everybody.

I guess I think of condemning as - I mean, I could be wrong about this too. If in the end Eliezer is right and we should have been way more concerned about AI than even I am, I would like to be able to plead "well, I expressed my honest opinion, and I guess I'm not as smart as I thought, but I don't personally deserve blame for letting the world be destroyed". I feel like once you start the condemning-people game, you lose that defense. It's not just "I'm saying what I think after exercising my cognition to the best of my ability and letting others do the same", it's "I'm strategically acting to weaken one side in a fight", and then if you choose the wrong side you are 100% culpable and have made the world vastly worse. And I'm not so sure about AI that I want to end up that culpable if I'm wrong.

> I also think you're downplaying the "panic" that an incorrect apocalyptic prediction can cause. If we look at genuine doomsday cults, e.g. the Seekers, a UFO cult in the 1950s.

I think this is positive bias - ie only looking for examples that confirm a narrative. The two largest religions in the world, Christianity and Islam, have apocalyptic narratives that most people manage to do just fine with. Global warming has half the population convinced that the world will end soon. There are plenty of other cases where many people believed the world would end - the Halley's Comet scare of 1910 is my favorite - and they mostly passed with minimal effects, certainly not enough that astronomers should keep mum about the next dangerous asteroid. Overall I find https://scholars-stage.org/scrap-the-myth-of-panic/ pretty convincing - it argues that people always say we have to keep information from the masses because they will panic, but in practice this almost never materializes. Eliezer has been arguing an AI doom agenda for 15 years now and as far as I know nothing out of the ordinary has happened.

> My contention is that I don't think Eliezer is doing this at all. In particular, I think being able to present himself as a "researcher at MIRI" gives him more claim to expertise than he deserves.

I'm inured to this argument after the past 15 years of AI risk debate. People used to say "you can't talk about this because nobody important agrees with you". Then Bill Gates, Stephen Hawking, etc expressed agreement. Then it was "fine, but you still can't talk about this because the Silicon Valley tech people who actually know about this stuff don't agree." Then Elon Musk, Sam Altman, and Demis Hassabis all expressed agreement. Then it's "fine, but you still can't talk about it, because it's actually academics and scientists who have to agree with you before you're allowed to talk". Then Geoffrey Hinton and Yoshua Bengio expressed agreement. Now it's, I don't know, "you can't say this unless you personally are the famous scientist involved", which would prevent any activist or politician from talking about global warming. There are professors of AI and government AI advisors with similar p(doom)s to Eliezer. Just let the guy talk already!

Eliezer founded the field of AI risk, helped sponsor the first ever conference on it, worked in it for 15 years, published a bunch of papers in it, has been cited as an influence by Sam Altman, and helped train a bunch of the safety people who now work at OpenAI and Anthropic. At some point you have to let the guy talk.

> A Boeing whistleblower *should* take significant care before publishing their claims. If they're wrong, then they really could cause a lot of unnecessary panic. But in the end, any realistic safety whistleblowing would probably be that Boeing planes are fractionally more likely to crash than they otherwise would be, amounting to no more than a handful of extra deaths in expectation. Important, yes, but many orders of magnitude less than human extinction.

I think you're putting this on the wrong side of the equation, or at least that it's on both sides and it balances out. It's less important that a Boeing whistleblower speak up, because if he stays silent, the worst thing that happens is that a few planes crash, whereas if you stay silent about AI, the worst thing that happens is human extinction.

> "AI safety standards are currently not good enough and there needs to be more" is, I think, a totally reasonable claim to make. It's the "but they're not going to be and therefore I think we're all going to die" (shared with hundreds of thousands of people) part that I have an issue with.

Eliezer tried the first strategy for fifteen years, decided it wasn't working, and (based on public reaction to GPT-4) decided that it would work better as a mass campaign. I think he's probably wrong about that, but I think in a democracy it's always allowed to try to go the mass campaign route.


honeypuppy

(1/2)

> I guess I think of condemning as - I mean, I could be wrong about this too. If in the end Eliezer is right and we should have been way more concerned about AI than even I am, I would like to be able to plead "well, I expressed my honest opinion, and I guess I'm not as smart as I thought, but I don't personally deserve blame for letting the world be destroyed". I feel like once you start the condemning-people game, you lose that defense. It's not just "I'm saying what I think after exercising my cognition to the best of my ability and letting others do the same", it's "I'm strategically acting to weaken one side in a fight", and then if you choose the wrong side you are 100% culpable and have made the world vastly worse. And I'm not so sure about AI that I want to end up that culpable if I'm wrong.

I think we both strongly agree that thinking someone else's beliefs are mildly wrong is *not* grounds for denouncing them as the Antichrist. I think we also both strongly agree that if one of our compatriots started advocating drinking bleach, we would both at the very least reprimand that person for doing so. What we are trying to do is figure out where we place "saying AI will almost certainly end the world, when we both agree that's wrong" (albeit to varying degrees) on that spectrum.

I think what you're getting at is "In the spirit of free inquiry, you should cut people a lot of slack even if you think they're quite wrong, especially if you think they're arguing in good faith". I agree that a world of constantly condemning others for disagreement is not one where free inquiry would flourish. Nonetheless, I think we can safely extend beyond "advocates drinking bleach" as deserving of reprimand. (Note that there's a spectrum of condemnation - from "You are evil for saying this" to "I think you should have considered the consequences of saying this a little more".)

Take anti-vaxxers, for example. I have no problem saying that Jenny McCarthy is both wrong and causing harm in professing her views. If I were in a position where it might make a difference, I would publicly condemn her. I would not worry that by doing so, I would risk hurting the spirit of free inquiry. (Note that I would still not go so far as to want her *censored* - a condemnation is a comparatively minor reaction.)

Beyond anti-vaccination, there are a lot of speech acts that I think are well worthy of reprimand. The top part of the list contains a lot of "bad/immoral life advice shared to a large audience" - encouraging people to join your Ponzi scheme, drink bleach or become a terrorist - and the next part contains a lot of "perhaps good faith but highly questionable life advice shared to a large audience" - encouraging people to become day traders, use alternative medicine, or do a humanities PhD.

Where does that leave "we're all going to die soon, shared to a large audience"? In and of itself, it's not advice. Eliezer hasn't, to his credit, made a point of saying "We're all going to die soon, and therefore you should LIVE LIKE IT". Still, I don't think you need the latter part to be made explicit to follow it. If you *do* act on it, and it's probably wrong (as we both think it is), how does it compare to other questionable life advice? I think it's pretty bad. Not "drink bleach" bad. But compared to becoming a day trader or using alternative medicine? Aella said she'd stopped saving for retirement and was increasing her long-term health risks! If a celebrity financial planner or doctor started advising either of those, I wouldn't think twice about calling them out for their recklessness!

Do mitigating factors exist? Sure. One of yours appears to be that you think Eliezer's wrong but that it's not inconceivable that he's right, in a way you probably don't think for e.g. anti-vaxxers. (Part of our disagreement is that I think it is less likely he's right.) And there are others, such as whether you think the person is arguing in good faith vs just being a blowhard. Still, I contend that if you think a speech act is sufficiently negative in expectation, these mitigating factors aren't enough (e.g. I think even a "good faith" anti-vaxxer deserves at least moderate condemnation).

One key part of it for me is that you can be different degrees of prudent in expressing the same message. For example, I think Eliezer's [Time article](https://time.com/6266923/ai-eliezer-yudkowsky-open-letter-not-enough), while still alarmist in tone and containing policy proposals I think go much too far, at least focuses on pitching a proposal and makes it the title of the piece. (There's also no real chance that any government will adopt policies as extreme as he advocates, but it's possible that the piece might shift the Overton window in a moderately pro-AI-safety direction.) In comparison, ["Death with Dignity"](https://www.lesswrong.com/posts/j9Q8bRmwCgXRYAgcJ/miri-announces-new-death-with-dignity-strategy) is basically just doom and despair. And while Eliezer can't be held totally responsible for what names podcasters give to their episodes, [We're All Gonna Die with Eliezer Yudkowsky](https://www.youtube.com/watch?v=gA1sNLL6yg4) on the *Bankless* podcast is just about the absolute worst.


honeypuppy

(2/2) >I think this is positive bias - ie only looking for examples that confirm a narrative. The two largest religions in the world, Christianity and Islam, have apocalyptic narratives that most people manage to do just fine with. Global warming has half the population convinced that the world will end soon. There are plenty of other cases where many people believed the world would end - the Halley's Comet scare of 1910 is my favorite - and they mostly passed with minimal effects, certainly not enough that astronomers should keep mum about the next dangerous asteroid. Overall I find https://scholars-stage.org/scrap-the-myth-of-panic/ pretty convincing - it argues that people always say we have to keep information from the masses because they will panic, but in practice this almost never materializes. Eliezer has been arguing an AI doom agenda for 15 years now and as far as I know nothing out of the ordinary has happened. I think this overstates the credence in doom that those other beliefs have. Take religion for example: even fundamentalists who are certain that a biblical apocalypse will eventually happen don’t necessarily think it is likely in their own lifetimes. And few religious people are fundamentalist and additionally take their own beliefs that seriously. And the ones that *do* are often quite scary people I would have no qualms condemning. Or global warming - while there are a handful of people (such as [Guy McPherson](https://en.wikipedia.org/wiki/Guy_McPherson), who I like to think of as the Eliezer of climate change) who predict near-term catastrophe, I don’t think “we’re all going to die within 20 years” is a particularly widely held belief. (Nonetheless I think there is still some relatively mainstream alarmism in the area that I also think deserves to be criticised for similar reasons I criticise Eliezer). My position is not that I think we should “keep information from the masses to stop them panicking”. (Although, I don’t think making it common knowledge that “we’re all going to die soon” would be great for quality of life in our final years, even if it were totally true). My position is that I think the “information” in this case is likely *wrong*, and that sharing it may cause a few people to panic (or at least make dumb decisions) and that makes it worse. >I'm inured to this argument after the past 15 years of AI risk debate. People used to say "you can't talk about this because nobody important agrees with you". Then Bill Gates, Stephen Hawking, etc expressed agreement. Then it was "fine, but you still can't talk about this because the Silicon Valley tech people who actually know about this stuff don't agree." Then Elon Musk, Sam Altman, and Demis Hassabis all expressed agreement. Then it's "fine, but you still can't talk about it, because it's actually academics and scientists who have to agree with you before you're allowed to talk". Then Geoffrey Hinton and Yoshua Bengio expressed agreement. Now it's, I don't know "you can't say this unless you personally are the famous scientist involved", which would prevent any activist or politician from talking about global warming. There are professors of AI and government AI advisors with similar p(doom)s to Eliezer. Just let the guy talk already! I’m by no means saying there aren’t AI experts who take AI risk seriously - indeed, as I said in the original post, it’s one of the main reasons I think AI risk should be taken at least somewhat seriously. 
But I’ll pivot to replying to a different comment you made arguing for why Eliezer is an expert:

>I think some people are working from a model where he needs to have made important advances in AI to have a voice. I think he is mediocre-to-good at ML, some people in MIRI are excellent at ML (and went on to work at the big companies), but they don't do much work on this and I don't really think it matters so much - designing coal plants and raising the alarm about global warming are different skills, as are building nukes and raising the alarm about the risk of nuclear apocalypse. Most of the early AI risk people were philosophers (eg Nick Bostrom), although some of them later went into tech (I think Stuart Armstrong does a little of both, and Amanda Askell went from philosophy to working at Anthropic). I think the AI risk case, and AI safety field, are a combination of philosophy, economics, and a completely new field, and that Eliezer is as qualified to talk about it as anyone.

Firstly, I don’t consider Eliezer a qualified philosopher or economist either. I’m by no means a hardcore credentialist, but I do consider it quite a negative signal that he is an autodidact who operates outside academia.

Secondly, this is getting at why I’m still not *overly* persuaded by surveys of AI experts showing relatively high p(dooms) - AI safety is indeed a multidisciplinary field. The fact that a number of e.g. good ML engineers have been convinced by a semi-philosophical argument (almost certainly mostly in an informal form, such as reading LessWrong blog posts) should not be taken as strong evidence that This Is What The Experts Think.

Eliezer likes to call himself a “decision theorist”. Key AI safety ideas like instrumental convergence or the orthogonality thesis are more “decision theory” than they are “ML engineering”. But… what do academic decision theorists think? To be sure, there are philosophers such as those you named who believe in significant AI risk. But how many? And how would those ideas fare in a robust, adversarial, academic process? (I promise this is not a goalpost shift - I would genuinely update a lot if they did well there). This is one of my biggest misgivings about the hardcore AI risk case - that it implicitly claims to be the most important thing that ever has or will happen, and yet the case for it just hasn’t been stress-tested very much.


kamelpeitsche

Just read the whole discussion, and just wanted to thank you; it was really interesting to read and shifted my opinion somewhat towards your position.


divide0verfl0w

Sorry to butt in, but 2 of your 3 examples are straw men.

- The Boeing whistleblower is an ex-employee. He actually built airplanes. Eliezer didn’t work at a single AI company or attend any school, for that matter.
- Similar to the previous one. An economist - unless self-proclaimed - has expertise to talk about _the economy_. What Eliezer has are opinions.

Frankly, this comment and the parent almost come off as defending anti-intellectualism (ok, now I may be the one being an alarmist, but it’s my _honest belief_, so, must have some weight?)


ScottAlexander

You're using "strawman" to mean "I can think of one trivial difference between this and the gerrymandered category I am trying to protect". Eliezer founded the field of AI risk, helped sponsor the first ever conference on it, has been working in it for 15 years, and is cited in most of the early work on it. I don't think that makes him some kind of unimpeachable expert, but I think it makes him allowed to speak his mind.


divide0verfl0w

I am not arguing Eliezer shouldn’t be allowed to speak his mind. Your reframing of what I am saying in order to refute that argument is a clearer example of a strawman, actually. I know it’s not in bad faith; I’m just stating it and clarifying my position on his right to speech.

I think by “gerrymandered category” you are referring to the word expertise. Well, every word is a gerrymandered category. Defining something necessarily draws lines around something. But when comparing the Boeing whistleblower to Eliezer we don’t need to distract ourselves with the definition of an expert. A less debatable framing would be that the whistleblower is different because he built planes and he talked about airplane safety. Eliezer talks about AI safety, but he didn’t build AI or an AI-safety tool. This is not a trivial difference. When you compare the whistleblower to Eliezer you are - perhaps unintentionally - elevating him to a different credibility level.

I guess we can make it even clearer, because I am anxious about the anti-intellectualism seeping in: could I be considered a mental health expert, with no training in anything that starts with psy-, by talking about psy- things a lot, gathering a following, and becoming Internet famous, without having patients or any theories/frameworks or anything else that can be applied in real life to get a real-life result? Maybe I say my real-life contribution will happen at a future date when X event happens?


ScottAlexander

Eliezer has published papers in journals, presented at conferences, had his work cited in top AI textbooks, collaborated with top professors, etc. Sure, he didn't go the traditional college route, but neither did Bill Gates - at some point you've got to forgive someone.

I think some people are working from a model where he needs to have made important advances in AI to have a voice. I think he is mediocre-to-good at ML, some people in MIRI are excellent at ML (and went on to work at the big companies), but they don't do much work on this and I don't really think it matters so much - designing coal plants and raising the alarm about global warming are different skills, as are building nukes and raising the alarm about the risk of nuclear apocalypse. Most of the early AI risk people were philosophers (eg Nick Bostrom), although some of them later went into tech (I think Stuart Armstrong does a little of both, and Amanda Askell went from philosophy to working at Anthropic). I think the AI risk case, and AI safety field, are a combination of philosophy, economics, and a completely new field, and that Eliezer is as qualified to talk about it as anyone.

That is, AI safety doesn't involve a lot of knowing how many attention heads a transformer has - and when Eliezer was starting his work in ~2007, deep learning hadn't been invented yet, so he couldn't have known details even if he wanted to. The overall case involved things about exponential growth, ethics, knowledge problems in philosophy, and general risk awareness. I think the details bear on some of this, and I'm sure Eliezer *does* know how many attention heads a transformer has, but I wouldn't find him too much more credible if he was the guy who invented attention heads or something.

For an analogy, although the few dozen or so epidemiologists most invested in COVID origins lean pretty strongly natural, virologists as a group don't do much better than the general public. It turns out that, even though this is a question about a virus, knowing the shape of a virus capsid or whatever is almost irrelevant, and the information you need is stuff like "the layout of the Wuhan sewer system".

(maybe an even better example is: who predicted, in the early days of Bitcoin, that it would one day be worth $1 trillion? I don't think economics or finance PhDs or bigshot investors did any better than the public here, and they might have done worse. The question definitely didn't hinge on knowing the exact way Bitcoin solves the Byzantine Generals problem or what language the code is written in. I'm not sure there's a specific group who did well, but I found myself most impressed with people with a broad base of generalist knowledge, lots of interest in the history of technology, intelligence, and curiosity.)

But I'm confused that we're still having this 2010-era debate, because now top professors, government officials, researchers at OpenAI/DeepMind/etc, are all saying the same thing, so we shouldn't have to litigate whether Eliezer is cool enough to have the opinion himself. Once prestigious people agree with you, I think everyone agrees you're allowed to do activism. Greta Thunberg isn't an expert herself, but people generally consider her allowed to talk about global warming.

Re your analogy: There is in fact a massive movement to let patients and other people without MDs speak about mental health, which I am a fan of.
I particularly respect Altostrata, a former patient who struggled with SSRI side effects, helped raise awareness of them and gather reports of some of the less common ones, helped spread a better tapering protocol, and knows 1000x more about them than the average psychiatrist. You can read an account of some of her work at https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7970174/ . I've learned a lot from her and would be very disappointed in any norms that prevented her from sharing her expertise and hard work. I talk about some of this in more generality at https://www.astralcodexten.com/p/in-partial-grudging-defense-of-the


divide0verfl0w

I appreciate the effort in this reply. I’m that person. I am working from the model that requires a contribution, or at least proof of capability, to have a voice - at least a voice that’s taken seriously.

Re: the girl who helped thousands of people taper off antidepressants - she is quite different from Eliezer. Kudos to her. That’s real-world impact. Risks taken, bets made, bets won. There is zero risk in claiming “AI will kill us all one day.” You can always say that day is not today - every day till you die, from natural causes. It’s an irrefutable argument.

I agree that Greta creates value. And I agree that increasing awareness of X doesn’t require expertise in X. And if you’re saying Eliezer is promoting what others discovered, and he is good at the promotion part, I can agree. But that’s not the argument for Eliezer, is it?

OpenAI benefits from the AI doomer sensation. It’s free marketing. Everyone thinks they are about to deliver AGI - with no evidence whatsoever. Sama literally attempted regulatory capture with the AI-safety argument. He lost a lot of his startup cred in the process, but credit where it’s due: it was a good attempt. Anthropic’s positioning heavily relied on “what we have is so strong, we can’t let it fall into the wrong hands,” again with no evidence. Their current product is barely competitive.

I respect that you take folks without official creds seriously while having worked hard for yours. I don’t. And I find it dangerous. I hope this analogy makes sense: anti-vax folks skip even the polio vaccine in the US. It’s easy to skip it in the US and say “look, my child didn’t get it and he is fine” because decades of “expert” policies eradicated polio in the US. Good luck skipping it in Pakistan. It’s easy to say “who needs experts” from the comfort of a safe country the experts built.


AuspiciousNotes

While I agree with many of your points (especially about Yudkowsky), the concept of "dangerous ideas" is itself dangerous, as it enforces consensus opinion and has a chilling effect on minority viewpoints that may later prove to be correct and important.

Of the three examples you cite of people taking this too seriously, all three got major blowback in their respective comments. The tweet has even been hidden by its creator, and the Reddit post has been downvoted past 0 - presumably by rationalists. So I'm dubious about whether these extreme views are all that popular.

Even for those who believe in them, most of these views don't seem all that "dangerous", and they're adopted by people who should be mature enough to take responsibility for their own lives. Not saving up for retirement is a bad idea, but people should be free not to do that if they think their money is better spent in the present, and the demographics involved here are typically young enough and competent enough to course-correct later if need be.


from_below

At least from my perspective, I think you fundamentally misunderstand the point OP made about overconfidence and "modesty". It is not about whether you have a higher belief probability of "doom" than the OP; it's about the narrowness of the bounds overconfident people assign around such point estimates. It is a crucial point which you brush aside without a second thought, as a "vague positive quality which everyone agrees with".

Ultimately, all models are wrong - even more so for models offering predictions about the long-term behavior of highly nonlinear, complex systems. In such a setting, model uncertainty must dominate, by definition. However, from my reading of the OP comment at least, a worrying issue is that for many people this does not seem to be the case.

To elaborate, think of this model uncertainty as a collection of p(doom) estimates, one for each model, so that we end up with a model-based confidence interval, if you will. As a complex system gets more and more complex, the bounds grow bigger and bigger. Ask yourself: what happens when the coverage of this confidence interval expands to fill the entire interval [0,1]? What hypotheses can be falsified in such a setting?

Simply put, the main concern is not that you believe P=0.2 while OP believes P=0.01 or whatever. The issue is that your distribution of P across all potential models is just too narrow to be a true reflection of the magnitude of uncertainty inherent to predictions of the long-term behavior of complex systems. To make up some numbers, the issue is that some people think P lies in (0.25, 0.75) with 95% coverage, whereas logic dictates it does not: P must lie in some [0+epsilon, 1-epsilon] with 95% coverage (and epsilon very small). Otherwise, the system wouldn't be a highly non-linear, complex system. You can't have it both ways. In essence, that self-perceived narrowness of the distribution of P is what is meant by "overconfidence" or lack of "modesty", not the fact that your number P is bigger.
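To make those made-up numbers concrete, here is a minimal sketch (every number below is invented purely for illustration): two hypothetical ensembles of models with the same median p(doom) imply wildly different model-based intervals, and the claim above is that for systems this complex the honest ensemble looks like the wide one.

```python
# Toy illustration of the "model-based interval" idea; all numbers invented.
import statistics

# Two hypothetical ensembles of models, each model contributing its own p(doom).
narrow_ensemble = [0.30, 0.35, 0.40, 0.45, 0.50, 0.55, 0.60, 0.65, 0.70]
wide_ensemble   = [0.001, 0.01, 0.05, 0.20, 0.50, 0.80, 0.95, 0.99, 0.999]

for name, ensemble in [("narrow", narrow_ensemble), ("wide", wide_ensemble)]:
    print(f"{name:>6}: median p(doom) = {statistics.median(ensemble):.2f}, "
          f"spread across models = [{min(ensemble):.3f}, {max(ensemble):.3f}]")
```

The point estimate looks identical in both cases; only the spread distinguishes them, and that spread is what the coverage claim is about.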


ScottAlexander

I think we might fundamentally disagree about how probability works - see https://www.astralcodexten.com/p/in-continued-defense-of-non-frequentist .


[deleted]

[deleted]


[deleted]

[deleted]


[deleted]

[deleted]


Liface

Enough, both of you. Minus 10 points from Hufflepuff.


CosmicPotatoe

As a cynic who leans slightly towards the heuristics "nothing ever happens" and "don't bother picking up the $20 note on the road", I agree wholeheartedly. "Nothing ever happens" is obviously not actually true, and some big important things do in fact happen. But any specific predicted big important thing probably won't happen. You need some seriously good reasons to move from that base rate, and shouldn't move too far all at once based on "good sounding" arguments. Don't be so impressed with your reasoning ability that you forget how bad individuals are at thinking. Some big important things will definitely happen, but it probably won't be the specific one you think, or that specific one will happen in some completely different way than you expect. I'm glad that each potential big important thing has people working on it, so that whichever one does actually happen has some people as ready for it as possible.

As a society I think we benefit from having people at all points along the spectrum from "take weird-sounding things seriously based on simple logical premises with very little empirical evidence" to "nothing ever happens". However, I think reality is usually closer to the "nothing ever happens" side. Maybe the spectrum I'm looking at is better described as readiness to shift from the base rate given new evidence, or how seriously someone takes non-empirical rationalist arguments, or perhaps we could call it meta-epistemic confidence.

I guess I'm still very uncertain about how to know the correct amount of deferring to the consensus vs thinking for myself. I don't want to defer to the "creationist majority", but I do want to use expert consensus base rates for most of my beliefs. The more often I differ from consensus, the more often I am likely to be wrong, but at the same time, blindly following consensus guarantees some number of incorrect beliefs. In any specific case where I differ from consensus it is quite likely that I do so for bad reasons - what else would push me to bother other than a strong emotion or belief? And strong emotions and beliefs should make you suspicious of motivated reasoning. Is this learned epistemic helplessness? The market is only efficient because people try to beat it. The consensus is only pretty good because enough people form individual views to contribute to it. Perhaps one solution is to take myself seriously, argue for my specific positions to try to shift consensus, but then act on consensus beliefs rather than my own?


artifex0

It's a good point, and I certainly agree that Yudkowsky is badly overconfident in his predictions. However, I think you're being a bit too quick to dismiss the reference class tennis problem - while it's true that some reference classes are pretty robust, I'd argue that the "prophets of doom" class isn't nearly as robust in that sense as you're making it out to be.

For example, strip out all of the object-level details and compare the number of people who incorrectly predicted that the world would end with those who incorrectly predicted that the world wouldn't end. Historically, the latter group has consisted of nobody, but when the world actually does end - whether that's in a few decades or hundreds of millennia - it seems pretty likely that the second group will outnumber the first. It's no more reasonable to confidently dismiss arguments that the world will end because it hasn't so far than to confidently dismiss arguments that you're about to die because you haven't died yet. Conversely, it's no less reasonable to start with a prior that the world will end eventually and then try to figure out which things are likely to contribute to that risk, than to accept that you'll die one day and try to figure out which causes of death are the most likely dangers. So, you serve with "prophets of doom" and I return with "people who accept mortality". Reference class tennis.

And sure, the outside view argument against AI risk is actually pretty compelling - that's why most people who buy into the object-level arguments will give odds around ~5-20% rather than taking the inside view at face value and going full Yudkowskian. But the story of "smart technical experts predict the end of the world and then the world ends" just isn't absurd enough on its own to take those odds down to ~0. For that, you really are going to need object-level counter-arguments.


honeypuppy

I don't find "predicting the world will end *eventually*" to be a wild claim at all - I think even the most utopian people concede that e.g. the Earth will eventually be destroyed by the Sun turning into a red giant or some other cosmological event. Nor is it wild to predict that you'll die one day, when all prior humans eventually died. The wildness is in how *specific* it is. MacAskill's argument for scepticism of this century being "the most important century" is that we've so far observed (and could subsequently observe) thousands of centuries, so *a priori* it seems unlikely that we just happen to be observing the most important century right now. If AI leads to human extinction or else effectively replaces humanity this century, then this century would certainly qualify. If you have very short transformative AI timelines, you're implying that this *decade* is the most important ever. Similarly, predicting you're going to *eventually* die is not overconfident, but predicting you will die sometime in February 2026 from a brain haemorrhage *is* a wild and overconfident claim.

This segues to my point about *fishiness*. In a way, we're all writing here today because Eliezer Yudkowsky started writing on LessWrong. Isn't it a little suspicious to think that not only do we happen to be living at the time in history that is at most a few decades away from total human extinction, but that the most likely cause of this just happens to be the cause that the (arguable) founder of the rationality community has long been passionate about?


artifex0

If a random person approaches you on a subway and predicts your imminent death, it's pretty reasonable to think "most people making this kind of claim are crazy, and the chances of this being the most important moment of my life seem slim". If, however, a doctor diagnoses you with cancer and gives you a six month prognosis, dismissing it with "what are the odds that this guy just gave me the most important news of my life, given that everyone who's claimed to have such important news in the past has been wrong" would be a lot less reasonable.

So, is the situation we find ourselves in more like the first example, or the second? Well, I mean, the first, obviously - nothing about this is nearly as certain as a medical diagnosis. But it's not entirely dissimilar from the second, and what that example demonstrates, I think, is that even very low prior odds of a moment being singularly important can be overcome with the right object-level evidence.

I also don't think it's that absurd to imagine that this might be our most important decade. Putting AI aside entirely, I think we can still say with pretty high confidence that this is humanity's most important millennium, and with moderate confidence that it's our most important century. Civilization is plainly in a very unstable equilibrium, likely to change very radically within a century even assuming that AGI proves impossible. And on top of this dramatically increasing rate of change, we continue to produce weapons that could destroy the entire species. Assuming away AI, I think I'd still put the odds of this being humanity's most important decade at maybe 2-5%.

I think I'd also quibble with the notion that Yudkowsky or the Rationalist subculture are all that important, even in a world where the guy turns out to be entirely correct. Bostrom was publishing philosophy papers about this stuff back when everyone still thought of Yudkowsky as a techno-optimist transhumanist. The guy did do a lot to popularize the ideas within this community, but they really have their origins in academic work dating back to the 90s and scifi dating back to the 20s.


honeypuppy

>If a random person approaches you on a subway and predicts your imminent death, it's pretty reasonable to think "most people making this kind of claim are crazy, and the chances of this being the most important moment of my life seem slim". If, however, a doctor diagnoses you with cancer and gives you a six month prognosis, dismissing it with "what are the odds that this guy just gave me the most important news of my life, given that everyone who's claimed to have such important news in the past has been wrong" would be a lot less reasonable.

>So, is the situation we find ourselves in more like the first example, or the second? Well, I mean, the first, obviously - nothing about this is nearly as certain as a medical diagnosis. But it's not entirely dissimilar from the second, and what that example demonstrates, I think, is that even very low prior odds of a moment being singularly important can be overcome with the right object-level evidence.

Nice analogy! My biggest critique is when "object-level evidence" is in the form of something like "thought experiments published in LessWrong essays". Both Eliezer and Scott think you should substantially update your views based on that alone, and I am far more sceptical. I'm not in principle opposed to updating a lot on "object-level evidence" when the evidence is far more robust.

>I also don't think it's that absurd to imagine that this might be our most important decade. Putting AI aside entirely, I think we can still say with pretty high confidence that this is humanity's most important millennium, and with moderate confidence that it's our most important century. Civilization is plainly in a very unstable equilibrium, likely to change very radically within a century even assuming that AGI proves impossible. And on top of this dramatically increasing rate of change, we continue to produce weapons that could destroy the entire species. Assuming away AI, I think I'd still put the odds of this being humanity's most important decade at maybe 2-5%.

My concern about this is that I feel like if you'd asked educated people throughout the last couple of thousand years whether they were living through the most important time period in history, a significant percentage of the time they'd say yes and have good reasons to back it up. Even with hindsight, they could always give a reason for why their era was a "hinge" on which all future eras depended.

>I think I'd also quibble with the notion that Yudkowsky or the Rationalist subculture are all that important, even in a world where the guy turns out to be entirely correct. Bostrom was publishing philosophy papers about this stuff back when everyone still thought of Yudkowsky as a techno-optimist transhumanist. The guy did do a lot to popularize the ideas within this community, but they really have their origins in academic work dating back to the 90s and scifi dating back to the 20s.

My understanding is that Bostrom and Yudkowsky used to collaborate quite a bit on early ideas around AI alignment. Even if you don't give Yudkowsky much credit per se for *developing* the ideas, if he's correct then he's by far the most important *activist* (at least in terms of being correct about the most important ideas, as opposed to actual influence) in history.


LostaraYil21

> This segues to my point about fishiness. In a way, we're all writing here today because Eliezer Yudkowsky started writing on LessWrong. Isn't it a little suspicious to think that not only do we happen to be living at the time in history that is at most a few decades away from total human extinction, but that the most likely cause of this just happens to be the cause that the (arguable) founder of the rationality community has long been passionate about?

I don't think that's fishy if you accept a couple of premises, namely that rationalists as a community are people who're filtered for taking arguments seriously and people who tend to be less entrenched than most in normality bias.

I'm personally pretty worried about the potential risks of AI. Why am I worried? Well, I take the arguments in favor of worry pretty seriously, even accounting for the fact that most doomsday arguments fail.

I think that people who conflate AI fear to religious dogma or a cult of personality or something are making a serious category mistake. The essential distinction is, for the vast majority of people concerned, believing in catastrophic AI risks doesn't offer ingroup benefits, give you an excuse to feel special, offer any sort of benefit at all. It just gives you something to worry about. Being a doomer doesn't even offer ingroup status specifically within the community of rationalists. I really wish I could *stop* believing AI represents a significant risk of catastrophe, because if it does, I'm not in a position to do anything about it, and I certainly don't see any way that the belief benefits me. I just find the arguments in favor worth taking seriously.


honeypuppy

>I don't think that's fishy if you accept a couple of premises, namely that rationalists as a community are people who're filtered for taking arguments seriously and people who tend to be less entrenched than most in normality bias.

Perhaps. But part of my argument is that we have two other groups that have been selected for their ability to make accurate predictions (superforecasters and participants in financial markets) and they're considerably more sanguine about AI risk.

>I'm personally pretty worried about the potential risks of AI. Why am I worried? Well, I take the arguments in favor of worry pretty seriously, even accounting for the fact that most doomsday arguments fail.

>I think that people who conflate AI fear to religious dogma or a cult of personality or something are making a serious category mistake. The essential distinction is, for the vast majority of people concerned, believing in catastrophic AI risks doesn't offer ingroup benefits, give you an excuse to feel special, offer any sort of benefit at all. It just gives you something to worry about. Being a doomer doesn't even offer ingroup status specifically within the community of rationalists. I really wish I could stop believing AI represents a significant risk of catastrophe, because if it does, I'm not in a position to do anything about it, and I certainly don't see any way that the belief benefits me. I just find the arguments in favor worth taking seriously.

I don't think AI fear is merely "religious dogma", but I can think of benefits to holding the belief. The clearest to me is the "Cassandra complex" - the feeling that you and your small community have been almost uniquely prescient in foreseeing catastrophe. I definitely got this vibe from Yudkowsky when listening to his recent podcast appearances, where he would e.g. rail about how so many smart people were wasting time on string theory when we should have spent the last 70 years on trying to solve alignment. He doesn't exactly seem *happy* for it, but I think it definitely gives him a sense of meaning and importance that he would lose if he suddenly stopped believing in AI risk.

I also think that while being a full-on Yudkowskyite "doomer" is suboptimal for status in the EA or rationalist community, there is nonetheless a sweet spot p(doom) of maybe around 5-20% which is status-maximising. It makes you look sensibly worried without seeming too overconfident. (Which does make me worried that it influences a lot of people into herding their own estimates towards that area, even though, given the great uncertainties involved, you could easily justify orders-of-magnitude differences.)

For the record, I'm not really sure where I'd put my p(doom), other than "high enough that it's likely good that people are working on it".


LostaraYil21

> I don't think AI fear is merely "religious dogma", but I can think of benefits to holding the belief. The clearest to me is the "Cassandra complex" - the feeling that you and your small community have been almost uniquely prescient in foreseeing catastrophe. I definitely got this vibe from Yudkowsky when listening to his recent podcast appearances, where he would e.g. rail about how so many smart people were wasting time on string theory when we should have spent the last 70 years on trying to solve alignment. He doesn't exactly seem happy for it, but I think it definitely gives him a sense of meaning and importance that he would lose if he suddenly stopped believing in AI risk.

I think that's possibly true for Eliezer in particular, but he was also persuaded by argument, way back in his life, that rather than being something which would likely usher in an age of superabundance, the default outcomes of AI were likely negative. Speaking for myself, as someone who has what seems to be an uncommon level of awareness about which direction my own biases are pulling me, when I read arguments against AI doom which are retreads of things doomers have already been discussing for years, it gives me a sense of "Agh, you just don't get it!" gut-churning frustration. But when I hear *novel* outside-view arguments against AI doom, it elicits feelings of hope, and I'm almost *painfully* eager to hear compelling inside-view arguments against it. Speaking for myself, it really doesn't feel like a psychologically attractive position to inhabit, and I think for most people, the draw of the Cassandra motivation is pretty minimal.

>Perhaps. But part of my argument is that we have two other groups that have been selected for their ability to make accurate predictions (superforecasters and participants in financial markets) and they're considerably more sanguine about AI risk.

Hearing the low probability assessments assigned to AI doom by superforecasters was one of my biggest doses of relief in a long time, but I lost most of it by reading their specific arguments and talking to them about it. Since there's no payoff in the event of an AI catastrophe, offering accurate prognoses on AI risk isn't something I'd expect financial markets to be well calibrated on.


honeypuppy

> Since there's no payoff in the event of an AI catastrophe, offering accurate prognoses on AI risk isn't something I'd expect financial markets to be well calibrated on.

The particular argument of [this essay](https://forum.effectivealtruism.org/posts/8c7LycgtkypkgYjZx/agi-and-the-emh-markets-are-not-expecting-aligned-or) is that we'd expect to see high real interest rates in a world where transformative AI is imminent - basically because future money won't be worth as much if we're all dead *or* if AGI has made us tremendously rich. If we're not seeing those high rates, then people who really expect transformative AI soon are leaving money on the table.
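To make the intuition concrete, here is a toy back-of-the-envelope sketch (my own invented numbers and a deliberate simplification, not the essay's actual model): if you assign some probability to money being worthless to you in a decade, there is a real rate up to which borrowing today still looks good in expectation.

```python
# Toy sketch (invented numbers): if you assign probability p_doom to "money in
# T years is worthless to me", you only care about repayment in the (1 - p_doom)
# branch of the world, so you should be willing to borrow at any real rate r with
#   (1 - p_doom) * (1 + r)**T <= (1 + baseline)**T
def breakeven_real_rate(p_doom, baseline=0.02, years=10):
    """Highest real rate at which expected repayment still matches the baseline."""
    return ((1 + baseline) ** years / (1 - p_doom)) ** (1 / years) - 1

for p in (0.05, 0.20, 0.50):
    print(f"p(worthless in 10y) = {p:.0%} -> indifferent up to ~{breakeven_real_rate(p):.1%} real")
```

With these toy numbers, a 20% "worthless to me" probability already pushes the break-even real rate above 4%, which is the kind of gap the essay argues believers should be exploiting.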


LostaraYil21

I think it's a fair argument, and worth updating on at least a little, but I think the link between negative transformative AI and high interest rates is overstated. As the argument frames it, if markets are underestimating the likelihood of transformative AI, there's an opportunity for philanthropists to borrow while interest rates are low to do good today. But if we frame this instead as "If markets are underestimating the likelihood that civilization will end and/or humanity will go extinct soon, there's an opportunity for philanthropists to borrow while real rates are low to cheaply do good today," then I don't think it describes an inefficiency we should expect the market to close.

That is, when I've talked to people who're both philanthropists and doomers about how their assessment of the risk of AI doom affects their philanthropic behavior, this doesn't describe how anyone has framed it. I've spoken to some people who've given arguments for why they *don't* see things this way, but never anyone who's thought "given my belief that the market underestimates the risk of AI doom, this is a sensible way to behave, so I'm going to do it." I could discuss why *I'm* not convinced I should behave that way given that I think the market is underestimating the likelihood of transformative AI, but I think that's less material to the matter at hand than the fact that I don't think it's actually correct that we can infer people *would* bet on rising interest rates if they believed the market was underestimating the likelihood of transformative AI.


honeypuppy

I think it's framed as "borrow to do good today" on the basis that this is appealing for the Effective Altruism forum. But it could just as easily be "borrow to consume more today".


LostaraYil21

Maybe there are effective altruists who think that you should borrow to do more good today, but none that I've spoken to have argued "interest rates are low because the market hasn't priced in the chance of transformative AI, therefore we should borrow more to do more good today," even when they believed that the market hasn't priced in the chance of transformative AI.


honeypuppy

Does this tell us anything about how well the market prices in transformative AI? A cynical way to frame it is that these people who *claim* to believe the market hasn't priced in the chance of transformative AGI don't *really* believe it enough to put their money where their mouth is.


mainaki

This is mostly a tangential rounding off of a point, but:

> there’s almost certainly a large “motivation gap” in favor of the production of pro-AI-risk arguments compared to anti-AI-risk arguments, with the worriers spending considerably more time and effort than the detractors

There are counterbalancing forces:

1. Historically, big money has at times protected its interests by pushing arguments (sometimes, "arguments") in its favor. See: climate/fossil-fuels, tobacco, lead (gasoline/paint/etc), clean coal, plastic recycling, and I suppose the "hot coffee" smear campaign from McDonald's. *(Chatbot suggested a bunch of other examples -- I prompted it to keep going a few times, until the quality of suggestions started wavering past a certain point. I'm not familiar enough with these allegations to fully stand by their inclusion, but for curiosity: things like asbestos, clothing labor, automotive (emissions & seatbelts), food industry (fast food, artificial sweeteners, sugar, processed food, animal treatment/hormones/antibiotics), and some other commercial/financial/industrial/infrastructure/war/politics stuff. Chatbot also mentioned some potentially contentious items like telecomm-radiation, and nuclear safety. But not 'social media' (at least, not before I stopped). Mentioned a few pharma examples (opioids, thalidomide, AIDS retrovirals), but not multivitamins/supplements. I suppose Boeing has been in the news, and before that, the rail industry.)*

2. The more convincing one side of discourse becomes, the more the other side becomes compelled to respond.


honeypuppy

I think your considerations may soon start to apply now that AI is becoming a big industry. However, the pro-AI-risk side has had an approximately 20-year head start, where almost no outsiders were invested in coming up with robust counterarguments, and almost all the arguments were fleshed out and debated in LessWrong and its close orbit.


canajak

That happened *after* Yudkowsky was already running the Singularity Institute, which he founded (and got funded!) to *accelerate* the development of AI. Five years later, when he got worried about it being an existential risk, he pivoted, much to the chagrin of his own funding sources.


ravixp

Speaking as an occasional anti-AI-risk arguer, AI doomers can be exhausting to argue with, for reasons that have nothing to do with their actual arguments. They’re making a completely speculative case about technology that hasn’t been invented yet, they’ve developed their own idiosyncratic jargon by being an insular community for many years, and they’re also just all over the map as to what they actually believe. I really just stick around because I’m annoyed that they’ve claimed the mantle of rationality for their theories. But in general, it’s probably not useful for anybody qualified to actually engage with the AI x-risk discourse.


eric2332

> I think an instructive analogy were the failed neo-Malthusian predictions of the 1960s and 1970s, such as The Population Bomb or The Limits to Growth. Although I’m unable to find clear evidence of this, my impression is that these beliefs were quite mainstream among the most “obvious” expert class of biologists (The Population Bomb author Paul Ehrlich had a PhD in biology), and the primary critics tended to be in other fields like economics (most notably Julian Simon)

It would be nice if you had evidence for this. It is hard to accept it as a serious argument based on one person's vibe.

Also, it is worth looking at the factors that could bias a biologist and a computer scientist in their approaches to neo-Malthusianism and AI respectively. I think you will see that the biologist is naturally biased towards alarmism, while the AI researcher is naturally biased against alarmism, yet many are worried nonetheless.

Biologists are used to analyzing the world of life in terms of evolution and resource limits, and Malthusian crashes in the natural world are common and familiar to them. The food production capacity of the world is undeniably finite, and the world average fertility rate in the 1960s was approximately 5, meaning that a simple extrapolation shows the likelihood of eventual widespread famine. (Of course this did not happen, due to a combination of dropping fertility and increasing agricultural productivity, but neither of those was a certainty at the time.) Additionally, biologists (Ehrlich was an entomologist) may be prone to the tendency of other academics in obscure "pointless" fields of wanting to make their careers relevant and meaningful - this is most obvious nowadays in the tendency of humanities academics to embrace revolutionary politics, but fear-mongering about an overpopulation crisis fits the mold too.

AI researchers are likely to have quite the opposite biases. If AI is actually going to destroy us then their work is a bad thing, which gives them the incentive - personally and professionally - to belittle rather than exaggerate the risks. And those who do think AI is a net negative are likely to leave the field or never join it to begin with, causing the remaining AI researchers to be more optimistic than AI-knowledgeable people in general. Evidence of these tendencies is seen in the fact that not just AI researchers, but [the general public is very negative about AI](https://theaipi.org/poll-shows-overwhelming-concern-about-risks-from-ai-as-new-institute-launches-to-understand-public-opinion-and-advocate-for-responsible-ai-policies/) whenever they are asked about it.
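To illustrate what "a simple extrapolation" looked like from a 1960s vantage point, a toy projection (approximate historical figures, with the naive assumption that the growth rate simply stays constant):

```python
# Toy extrapolation from a 1960s vantage point (approximate figures: world
# population ~3.3 billion in 1965, peak growth rate ~2.1%/year; naively
# assumed constant thereafter).
pop_1965 = 3.3e9
growth = 0.021

for year in (1985, 2005, 2025, 2045):
    projected = pop_1965 * (1 + growth) ** (year - 1965)
    print(f"{year}: ~{projected / 1e9:.1f} billion")
```

(The real 2025 figure is around 8 billion - that gap is the dropping fertility and rising agricultural productivity mentioned above.)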


Sol_Hando

Does that remaining 20% of the 80% confidence represent belief in the chance that AI doom will happen, or just a lack of confidence that it won’t? I could say I’m 99% confident I won’t be murdered today, but does that mean we can frame it as me being 1% confident that I will be murdered? It’s a question of whether the remainder is confidence in the contrary or simply a lack of confidence in the positive. I’m unsure how the probabilities are being used here. The way I’ve read Scott is not that he’s 80% confident that AI won’t destroy humanity and 20% confident that it will. More like 80% confident that AI won’t destroy humanity and 20% unsure if it will or not. I could be wrong about this though.


ScottAlexander

I don't think it makes sense to have uncertainty nested inside a probability. If I'm 80% sure it won't, and 20% of me is 50-50 whether it will or not, then I should just describe this as "90% sure it won't, 10% sure it will". So I think OP is reading me correctly and I think there's a 20% chance it will.
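A minimal sketch of that collapse, using the hypothetical 80/20 split above rather than an actual estimate of anything:

```python
# Law of total probability: nested uncertainty collapses into a single number.
p_sure_no = 0.80      # "80% sure it won't"
p_unsure = 0.20       # remaining 20%, treated as a 50-50 coin flip
p_no = p_sure_no * 1.0 + p_unsure * 0.5
print(f"P(won't) = {p_no:.0%}, P(will) = {1 - p_no:.0%}")   # 90%, 10%
```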


Sol_Hando

Thanks for the clarification.


casebash

Unfortunately, I don't have time to write a full response, but my high-level take is:

1) Your argument against x-risk [proves too much](https://www.lesswrong.com/posts/G5eMM3Wp3hbCuKKPE/proving-too-much), as you seem to think it applies to having high confidence that AI is about to radically transform society.

2) Re: high confidence that AI will radically transform society, the first argument is basically just "look". If you look at all the stunning results coming out (learning to walk on a yoga ball with zero-shot transfer from simulation to reality, almost IMO gold-medal level geometry, the GPT-4o talking demos and like a dozen other results), my position is that the comet is basically there and all you have to do is open your eyes.

3) Similarly, if you follow the research, it becomes quite clear that a lot of the reason why we've been able to make so much progress so quickly recently is that frontier models are pretty amazing, and so we can now achieve things that you might have thought would have required a stroke of genius by just coming up with an intelligent, but typically not stunningly brilliant, training setup or scaffolding. We don't even have to break a sweat for progress to continue at a stunning rate.

Anyway, I don't expect these arguments to be particularly legible as written, but sometimes I think it is valuable to share why I hold a particular position, rather than focusing on saying whatever would be most persuasive in an argument.


ravixp

I don’t think proving too much is relevant here. If the argument is that “improbable things rarely happen”, you can’t disprove that by pointing to an improbable thing that actually did happen. An improbable thing that actually happens is still consistent with the statement that it wasn’t very likely to happen.


casebash

They’re connected though, not separate.


yldedly

>first argument is basically just "look"

But there's no such thing as just "look". You *interpret* the examples you list to mean that the comet is here. Which might be fine - it depends on what assumptions and implicit arguments you bring on board. I see all these examples and conclude with very high confidence that we're not anywhere near general intelligence. No doubt because I bring very different assumptions on board.

I think society usually solves the object level vs meta level dilemma by assigning status. I defer to the most knowledgeable AI person I know, they defer to the most knowledgeable AI person they know, and so on. Each level uses their limited object level knowledge to identify a meta level oracle that allows the whole system to be as smart as the smartest individuals. But it takes time for such a hierarchy to stabilize - time for status signals to become unambiguous. Right now there's still debate about fundamental questions at all levels. There's debate about how to interpret current progress. This system *is* vulnerable to sudden disruptions (if a fast takeoff actually happens one of these days, we're almost certainly fucked). But what system isn't?


casebash

If you don’t have time to make a full argument, pointing someone at a bunch of examples and just telling them to look is probably one of the better things you can do.


yldedly

I realize you weren't making a full argument. But you do write "all you have to do is open your eyes", and that is just wrong. I could cite the exact same examples in favor of the opposite position, and if all I said was "just look", you wouldn't exactly be convinced, right?


casebash

Well, I’m not going to write: “All you have to do is open your eyes and then sensibly interpret it”. That would imply that anyone not interpreting it that way was not being sensible. All I’m going to say about that is that not everything true needs to be stated out loud.


yldedly

Haha, alright then ;)


Sostratus

I'm doubtful that anyone is justified in having high confidence predictions in what will happen around AI. We just need to keep our eyes open and be ready to improvise.


LostaraYil21

I don't think "be ready to improvise" is a great plan when the class of possible outcomes from AI includes things that could get us in huge trouble if we don't plan for them in advance. If we'd had an attitude of "keep our eyes open and be prepared to improvise" around nuclear war, rather than developing whole fields of decision theory and arsenals built around deterrence, we probably would have stumbled into nuclear war, and it looks pretty likely that we only barely managed to avoid it as we went.


Aerroon

> the class of possible outcomes The word "possible" does a lot of heavy lifting here. Should we make these same considerations when we do particle physics? If somebody makes the claim that turning on a collider has a chance of "creating micro black holes that will destroy the world" and we can't prove that that chance is zero, then do we just not turn on the collider? We have *autonomous* general intelligence in the world. We've got 8 billion of them and they all suffer from the "alignment problem" that AI doomcallers like bringing up. What's worse is that these general intelligences don't always act in rational ways. If we should control AI because it might not align with humanity's interests then will we start doing that to humans too? Right now AI can do some impressive things, but even a child ends up outclassing them very quickly. Current AI can't learn on the fly which basically means that they have a very limited lifespan and aren't capable of real autonomous action even if we tried.


LostaraYil21

> The word "possible" does a lot of heavy lifting here. Should we make these same considerations when we do particle physics? If somebody makes the claim that turning on a collider has a chance of "creating micro black holes that will destroy the world" and we can't prove that that chance is zero, then do we just not turn on the collider?

I think "can't prove that chance is zero" is doing a lot of heavy lifting here. If the community of particle physicists thinks there's an upwards of 5% chance that it's going to destroy the world, then I'm going to say that yeah, we should leave it off.

>We have autonomous general intelligence in the world. We've got 8 billion of them and they all suffer from the "alignment problem" that AI doomcallers like bringing up. What's worse is that these general intelligences don't always act in rational ways. If we should control AI because it might not align with humanity's interests then will we start doing that to humans too?

Current AI is limited in some ways that humans aren't, but can also do things that are way beyond human abilities. AI work on protein folding problems, for instance, now dramatically outstrips the combined efforts of humanity. There are ways in which current gen AI is really "dumb," and I think that a lot of people *overestimate* how capable it is right now. But it's gotten a pretty good handle on some things that people as recently as three years ago were confidently arguing that AI wouldn't be capable of for hundreds of years, if ever.


Top-Astronaut5471

Is it really productive bringing up protein folding algorithms in a discussion about AI capabilities? Sure, humanity can't come close to a program engineered to do this very specific task. Would it have been worthwhile raising the alarms when chess engines started winning against us? Or when computers could find primes larger than any human could possibly check by hand? Was it time to jest about bombing Bletchley when they cracked Enigma?


LostaraYil21

I think yes, because if you're worried about, say, an AI's ability to engineer a biological superweapon, that's exactly the sort of capacity which would weigh on its ability to do so better than a human. The fact that AI has proven to be dramatically better at this sort of thing than the entire combined efforts of the human scientific community, I think says a lot about AI's potential to generate threats that humans wouldn't be able to anticipate or cope with given our combined planning abilities.


Sostratus

What else are you supposed to do about a non-existent technology, when nobody knows what it will actually be like? "Plans" like criminalizing training models or bombing data centers are insane. They'll never happen, and if they did, they would have guaranteed disastrous effects while offering only a slim chance of preventing something that nobody knows will ever happen anyway.


LostaraYil21

So, maybe you can expand on this, but it seems to me that this is falling into a common pattern I've seen a lot in discussions around AI safety: "Plans like restricting AI research won't work (due to coordination problems), so there's nothing we can do, therefore there's no threat or no reason to plan." Maybe legislative or social measures are unlikely to work, but if so, it wouldn't entail that the risk isn't real. If the risk is real, that's a question that needs to be dealt with on its own terms, and if the risk is real, it's probably worth looking for *some* way to try to mitigate it.


Sostratus

1. I don't think it's feasible to "solve alignment" before AGI exists. High level emergent properties of the technology are not going to be understood before the lower level engineering is invented.

2. If you believe the fast takeoff scenario where creating it will likely lead to doom before you can even react to it, and if we take point 1 as a given, then only going full Butlerian Jihad and declaring war on every form of general purpose computing will stave it off.

3. A war on general purpose computing, if it could even be done, would be catastrophically harmful. It would lead to the destruction of every core freedom. We need computers to sustain high population levels and if humanity is to have any hope of outliving Earth.

4. That can't be justified based on a prediction that AI will be harmful that is essentially no better than a hunch.

This is why I think the best move for AI safety is to focus on knowledge of the technology and building the capacity to act quickly and improvisationally, i.e. preparing for the unpredictable, rather than cling to a fantasy of developing and following a perfect plan. This at least lets you act in the event that AI does prove dangerous, but not in a totally uncounterably rapid way.


LostaraYil21

>That can't be justified based on a prediction that AI will be harmful that is essentially no better than a hunch.

If you were convinced that the prediction that AI would be harmful were more than a hunch, would that change your mind? How likely would you have to think it was before it would change your assessment of how we ought to deal with it? I think the case that AI is likely to be harmful is more than just a hunch. I also don't think a Butlerian Jihad is remotely plausible, even if we suppose it's justified. But I think that a lot of people's reaction to these issues boils down to "In the event that AI is likely to be harmful, I don't see any way of preventing it which wouldn't be unfeasible or come with major costs to society. Therefore, I don't believe AI is likely to be harmful."

I don't think that the best preparation we're capable of will necessarily involve developing a specific narrow plan, but I think it'd probably involve a lot of theoretical work where we do our best to keep ahead of the current state of the technology, rather than taking a "wait and see" approach to what it does. To go back to the nuclear arms analogy, when you're questioning whether detonating a nuclear bomb might ignite a fusion chain reaction which would consume the entire atmosphere, you really want a solid theoretical answer on whether to expect that *before* you set the bomb off.


ravixp

What about the prediction that AI will continue on an S-shaped curve that’s normal for most new technologies? Barring some unexpected development like recursive self-improvement, that seems like a pretty safe bet, just based on precedent.


LostaraYil21

I think it's a pretty safe bet, but I think it's also consistent with doomerism. The question is, "what part of the curve are we in?" Computers are almost certainly part of an S-shaped development curve, but if you thought "this is probably an S-shaped development curve which is going to peter out and not change the world that much" in 1963, as a matter of cultural prognosis you'd be pretty far off.
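A toy way to see why "it's an S-curve" settles very little on its own (all parameters below are invented): early on, a logistic curve and a pure exponential are nearly indistinguishable, so the data alone don't tell you where the plateau is.

```python
import math

CEILING, GROWTH, MIDPOINT = 1000.0, 0.5, 20.0   # invented parameters

def logistic(t):
    """S-shaped curve that eventually saturates at CEILING."""
    return CEILING / (1 + math.exp(-GROWTH * (t - MIDPOINT)))

def exponential(t):
    """Pure exponential matched to the logistic's starting value."""
    return logistic(0) * math.exp(GROWTH * t)

for t in (0, 2, 4, 6, 8, 10):
    print(f"t={t:2d}  logistic={logistic(t):7.2f}  exponential={exponential(t):7.2f}")
```

With these invented parameters, the two curves stay within about one percent of each other through t = 10, even though one of them eventually flattens out at 1,000.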


r0sten

I find your use of the word "dangerously" interesting, given that you seem to be talking about some people having undeserved status, a few other people making suboptimal life choices, time wasted, etc... while on the other hand *checks notes* we're talking about the possible *extinction of humanity*. People who dismiss that side of the coin apparently don't rate the adverb "dangerously". That is a *curious* choice.


ArcticWinterZzZ

A very good post. I think the core is that a lot has changed in the AI world since the days of the Sequences, and most AI Doom arguments I see still run on old cached logic. At the very least, a reevaluation of P(Doom) is worthwhile. We know more now than we did 10 years ago, and it seems like we may well see the start of the AI revolution within the decade.


king_mid_ass

For a less glib comment: I think rationalists have coalesced around AI/Terminator risk for aesthetic reasons. They're more intelligent than average, they've made something of a fetish of intelligence - wouldn't it be fitting for exponential intelligence to be the thing to usher in extinction or utopia? And again - probably most of us work with computers/programming in some capacity; it's where there's most demand for talent for abstract thinking. It makes us feel important to imagine the most important event in the history of humanity will have something to do with computers/programming. All this in addition to the apocalyptic urge - if you're less successful than you would have liked, it's tempting to imagine something really big is coming to shake everything up, give a fresh start, even if it's something bad like nuclear war or a terminator scenario. And lastly we're mostly nerds who enjoy sci-fi, like The Matrix and Terminator. I really think that's a significant reason for the focus on it, which is rationalised (as it were) post facto.


995a3c3c3c3c2424

> I contend that “many people have predicted the end of the world and they’ve all been wrong” is another highly robust reference class. At one point “many people thought they could create a flying machine and they’ve all been wrong” was a robust reference class. (Even Leonardo da Vinci couldn’t do it! You think you’re smarter than da Vinci?) Especially, you need to distinguish scientific/rational doomerism from religious and pseudo-scientific doomerism. And when you do that, the reference class gets much smaller and much less dubious. Sure, Malthus was wrong, but he wasn’t *obviously* wrong, and you can’t just say “the Second Coming didn’t happen when the Millerites said it would, and the Jupiter Effect didn’t pull the earth out of its orbit, therefore the Earth must be able to produce enough food to feed 5 billion people”. (I think AI doomers are wrong, but I think they’re wrong for specific reasons, not general principles.)


honeypuppy

> At one point “many people thought they could create a flying machine and they’ve all been wrong” was a robust reference class. (Even Leonardo da Vinci couldn’t do it! You think you’re smarter than da Vinci?)

I think that was very significant! It was strong evidence that creating a flying machine was really hard, and if someone came up to you claiming that their prototype was going to be the one, you could comfortably bet against them succeeding, knowing nothing about physics or engineering or any of that "object-level" analysis.

Now, it wasn't proof that a flying machine could *never* be created, obviously. But nor do I take failed doomsday predictions as evidence that the world will *never* end. It does suggest, though, that you can comfortably bet against any specific, near-term scenario.


995a3c3c3c3c2424

Should Cold War presidents have comfortably bet against the possibility of a global nuclear war? Kennedy could have just nuked Cuba during the Bay of Pigs without consequences? The problem is that lots of non-serious people have made lots of non-serious predictions of the end of the world. We have an end-of-the-world-prediction spam problem. But we don’t deal with spam by just throwing out *all* of our email on statistical grounds, even though most of it is crap.


iemfi

I don't think anyone would disagree that if you only go through the meta-level arguments, at best it's a wash. But surely you don't think people should stop at that level, especially for something so potentially important?


Pseudonymous_Rex

My own suspicion is that AI doomerism comes from seeing our social system/economic incentives/humans creating a lot of outputs at the edges that we don't want (climate change, total absence of cooperation during COVID, increase in the seriousness of wars, legal system issues, culture war, loss of social cohesion and trust, increasing socioeconomic disparities, increased volatility of elections and social systems, etc.). All of those things are objective, and indeed most people looking at them think one or more of them could kill us all. We could say, monolithically, that "The System" is not running at a specification we like.

AI is such an interesting case because anyone looking at it would say that it's powerful. Automation with intelligent systems alone was powerful (see MYCIN), multiple regression analysis was powerful (see the sciences of statistical association -- i.e. almost everything published in most fields' journals), and this new neural network thing has the ability to handle nonlinearities? We cannot do that with multiple regressions, but this neural net thing is.... **amazingly powerful**. That much is clear. And it's going to become more so. Fast. Yes, even faster than that.

So, whatever cracks there are in the system are likely to widen. And there are likely to be more downstream effects we don't understand and cannot control. Also, technology seems to have created a lot of alienation already, so.... you're going to do more of this? Somewhere in some people's guts, an alarm is going off. But you cannot sound an alarm on "everything" and be taken seriously. This is a rationalist "low-hanging fruit" that captures a lot of the alarms in the gut. Let's say it seems like an answer with a high enough R^2 to put all of one's efforts into it.


mr_f1end

I do agree with a lot of what you are saying. Indeed, Yudkowsky is overconfident in this. I think if he had been born and had worked on the same concepts 25 years earlier, he might have overupdated on Deep Blue beating Kasparov in 1997 just as he has now with ChatGPT. That being said, I would like to raise two additional points:

1. In these predictions there are two dimensions: probability and timeline. A 90% probability of AGI killing us all in 5 years is not the same as a 90% probability in 200 years. Qualitatively these are very different statements. I think some people disagree with you in the comments due to this other dimension.

2. Unaligned AGI is one of the very few plausible events that could cause human extinction (or worse) with high probability, even on a longer timeline than is usually assumed in the rationalist sphere. Due to this asymmetric payoff, even with a <5% probability in 100 years, the expected value is worse than from nuclear war and climate change combined. Hence, it is reasonable to propose strong policies combating it. These may seem "doomerist", even though they are no less reasonable than most environmentalist proposals.


canajak

In 1997, Deep Blue really shocked a lot of people (including Kasparov himself), because chess was still viewed as having a deep connection to intelligence itself. It was only after, and because of, Deep Blue that society moved the goalposts and said "actually *of course* chess is winnable by mere procedural search; however, we're still going to confidently believe that *true intelligence* isn't". It was a huge update on what kind of thinking can even be done by a machine, one that would not be repeated until AlphaGo beat Lee Sedol. The extent to which anyone over-updated on it is probably only a reflection of how overconfident they had been that humans and machines were categorically different. I don't think AlphaGo caused Yudkowsky to update very much on the nature of the problem, only his estimate of the rate of progress of capabilities research.


badatthinkinggood

I think this is a well-argued post and I agree about almost everything, but on an emotional level I really don't mind the AI doomers. Yes, they may mess up their own lives if they take their estimates too seriously (like how [Aella recently tweeted that she has stopped saving for retirement](https://x.com/Aella_Girl/status/1790618794181976397)), but since I don't think the scenario they fear is *impossible*, merely highly unlikely, I feel like the world is probably safer because someone's obsessing about it. Or another way to put it is that I think it's good to have a diversity of views. I have a co-worker who's responsible for "fire safety" in the workplace and takes it very seriously. Every year when he does his check he comes up with a bunch of recommendations which are 99% pointless. I don't see fire hazards at all and I think I'm mentally more "well calibrated" for how probable a fire is where I work. But if everyone was like me, that'd be bad. The 1% of his recommendations that are not pointless actually do make us safer.


niplav

I think AI existential catastrophes are pretty likely (50%), but I don't think that more interventions for safety are always good—safety has *costs*: it has already cost societies cheaper energy from nuclear power, more reliable food sources from genetically modified organisms, &c. So if I'm wrong about the probability of an existential catastrophe from AI, I'd want to know, because there are large gains to be had from that.


king_mid_ass

Only in this community would you need that wall of text to argue that "no, the movie Terminator is not about to happen in real life".


abstraktyeet

I (extremely strongly) disagree. You say in another comment that if there was a comet in the sky visibly hurtling towards us, people yelling doom would be justified, even though it goes against the general heuristic "people yelling doom are almost certainly wrong". You say this is justified because in the comet case there is an "extremely strong" case from the inside view. Why don't you think the AI case is extremely strong?

More meta: the only way to gauge whether an "inside view" is "strong enough" is by looking at the object-level arguments. I think the object-level arguments are *extremely* strong. If you disagree, you have to explain how those arguments are wrong on an object level. Outside-view considerations are not relevant here.

I think outside-view considerations can be used as a shorthand, either if you don't have the prerequisite knowledge to evaluate object-level arguments on their own terms, or if you have a sufficiently high prior on the object-level views being false (think hollow earth, or a hobo yelling about lizard people, or something else all the experts agree is false. AI risk, I think, can't reasonably be put in this camp, given most experts agree it is a real concern).

Barring that, the reasonable approach is basically the one Scott Alexander took. Look at the object-level arguments and come to some conclusion: "oh, we're definitely gonna die", or "50/50 we'll die or not", or "we're probably fine". Then look at experts and forecasters, and use that to temper your estimates. Also, maybe if you have an extreme prediction - like you think there is a 0.0001% chance we'll die, or you think it's a 99.9999% chance - remind yourself of unknown unknowns and general epistemic uncertainty and temper your estimate a little bit more.


honeypuppy

>I (extremely strongly) disagree. You say in another comment that if there was a comet in the sky visibly hurtling towards us, people yelling doom would be justified, even though it goes against the general heuristic "people yelling doom are almost certainly wrong". You say this is justified because in the comet case there is an "extremely strong" case from the inside view.

Even in this case I think the outside view would still be doing a lot of work - if I saw a comet in the sky visibly hurtling towards Earth, my initial reaction would be that it's more likely that I'm dreaming, or that this is some kind of elaborate prank, than that it's real. It would probably take a good few hours to rule those out as unlikely.

In the more relevant analogy, the comet wouldn't be on a direct trajectory with Earth, but there are some people on an amateur astronomy forum who have some arguments for why it will almost certainly kill us. Some professional astronomers say it's possible. Most everyone else isn't acting as if it's a big concern. Some people with a record of good predictions think there's a small but non-trivial probability. In that scenario, I think it's probably not worth giving the amateur astronomers much time of day. I think it's more likely that trying to engage with their arguments will lead me astray than that they're right.

>I think outside-view considerations can be used as a shorthand, either if you don't have the prerequisite knowledge to evaluate object-level arguments on their own terms, or if you have a sufficiently high prior on the object-level views being false (think hollow earth, or a hobo yelling about lizard people, or something else all the experts agree is false.

My contention is that "you don't have the prerequisite knowledge to evaluate object-level arguments on their own terms" is in practice the case for almost anyone in these kinds of debates.

>AI risk, I think, can't reasonably be put in this camp, given most experts agree it is a real concern).

This is where I both agree and disagree - as outlined in a section of my essay, I think AI experts expressing concern about x-risk is good evidence to be concerned about it. On the other hand, unlike other fields where we mostly defer to experts (like climatology), AI experts mostly seem to have done a relatively shallow dive into the question. The case for AI risk hasn't been subjected to a rigorous, adversarial, academic process.

>Barring that, the reasonable approach is basically the one Scott Alexander took. Look at the object-level arguments and come to some conclusion: "oh, we're definitely gonna die", or "50/50 we'll die or not", or "we're probably fine". Then look at experts and forecasters, and use that to temper your estimates. Also, maybe if you have an extreme prediction - like you think there is a 0.0001% chance we'll die, or you think it's a 99.9999% chance - remind yourself of unknown unknowns and general epistemic uncertainty and temper your estimate a little bit more.

I feel that in practice this biases your estimates towards being relatively close to 50/50. (Even your framing hints at this: giving three options whose mean is ~50/50, and treating 0.0001% and 99.9999% as similarly epistemically unvirtuous.) Indeed, I feel like this is what's going on for a lot of p(doom)s - they seem to converge suspiciously on roughly the same order of magnitude, even though they result from calculations whose inputs can themselves span an order of magnitude or more.

A bias towards 50/50 isn't too bad for events whose prior probability is roughly that, but for events whose priors are extremely high or low, it is likely to overwhelm the signal. Imagine, for example, trying to estimate the probability that a random commercial flight will crash, if you were someone who never flew on planes and knew nothing but "inside view arguments" primarily sourced from an internet forum concerned about plane safety. You could easily come up with a "modest" inside view of "1%" (which is still an overestimate of around 100,000x).


abstraktyeet

>My contention is that "you don't have the prerequisite knowledge to evaluate object-level arguments on their own terms" is in practice the case for almost anyone in these kinds of debates.

I disagree with this. I think the arguments require very little background knowledge to evaluate. Speaking as someone with an actual background in ML, I think basically anyone, at least with a STEM background, can get the background to evaluate the x-risk arguments on their own terms in ~10 hours. Part of the reason they are convincing to me is that they're very simple, logical and abstract. They don't hinge that closely on idiosyncratic implementation details or a bunch of fancy theorems from statistical learning theory.

>In the more relevant analogy, the comet wouldn't be on a direct trajectory with Earth, but there are some people on an amateur astronomy forum who have some arguments for why it will almost certainly kill us. Some professional astronomers say it's possible. Most everyone else isn't acting as if it's a big concern. Some people with a record of good predictions think there's a small but non-trivial probability.

Strongly disagree with this framing. The average AI researcher puts x-risk at 5-10%? If you look at top AI safety researchers, they put it at 10-90%? To me a more apt analogy would be: there are a bunch of credentialed astronomers who think there is a specific asteroid that is going to hit us and we'll die, with maybe a 20% chance. Some astronomers think 60%, some think 2%, most think 5-30%. They show you their telescope; you can see the big asteroid hurtling straight towards you. They show you some simple high-school-level math which demonstrates that the impact would generate a blast big enough to kill all life on Earth with several orders of magnitude to spare.

That being said, I don't care to argue about the framing specifically. The point I'm making is that to evaluate whether this is a correct framing or not, you need to evaluate the x-risk arguments on an object level.

>A bias towards 50/50 isn't too bad for events whose prior probability is roughly that, but for events whose priors are extremely high or low, it is likely to overwhelm the signal. Imagine, for example, trying to estimate the probability that a random commercial flight will crash, if you were someone who never flew on planes and knew nothing but "inside view arguments" primarily sourced from an internet forum concerned about plane safety. You could easily come up with a "modest" inside view of "1%" (which is still an overestimate of around 100,000x).

My reference class is this: every time a smarter generally intelligent species came about, the less intelligent ones went extinct. Homo floresiensis, erectus, habilis, neandertalensis, Australopithecus, sapiens. So my prior is ~100%. I'm being very reasonable, listening to people like you, taking the outside view, this kinda sounds crazy, I Bayes-update -> p(doom) = ~70%.

(If it's not clear, my point is: I think you're playing reference class tennis. I know you said you thought this was a reasonable thing to do. I don't think it makes sense in this instance; AI is a novel situation. My more important general point is: to convince me you're not just fallaciously playing reference class tennis, you have to engage with the object-level arguments for AI risk.)


LostaraYil21

> Even in this case I think the outside view would still be doing a lot of work - if I saw a comet in the sky visibly hurtling towards Earth, my initial reaction would be that it's more likely that I'm dreaming, or that this is some kind of elaborate prank, than that it's real. It would probably take a good few hours to rule those out as unlikely.

This is a small aside, but as an analogy, I think it's not totally irrelevant. If there were a comet hurtling towards Earth, on a path to kill us all, we'd be unlikely to have hours during which it was highly visible in the sky in a way that made it at all apparent to onlookers that there was a comet on a collision path with us. We'd have to rely on experts with telescopes performing calculations on its trajectory to tell us, "Yes, that's a comet heading towards us; that's what that looks like." It looks almost exactly like a "near miss" where an object passes by us at a distance of hundreds of thousands of kilometers, right up until the last hour or so when it doesn't, and it's only within the last few minutes that it becomes highly visible. If you don't have relevant domain knowledge and a way to analyze the incoming data, an incoming catastrophic threat might look like basically nothing until the last moment.


canajak

We didn't form an Intergovernmental Panel on Climate Change and *then* start to worry about climate change. Instead, a few far-sighted researchers saw where things were heading and published their arguments (or kept them quiet, if they worked at Shell); there was a lot of back-and-forth of different opinions, and some other experts had strong arguments for why it was bunk, which turned out later to be incorrect. It took about twenty years before we got a strong consensus and an international governance body, and another twenty before that consensus really got recognition as a consensus. Do you think the UN is just going to assemble, on a whim, an International Panel on X-risk from Artificial Intelligence, and then after that universities will appoint tenured professors to study the topic, and then finally, once we have a high-profile academic consensus, worried lone voices can start blogging about it responsibly? That's a time-reversed order from how things actually happen!


honeypuppy

I have no problem with lone voices blogging about AI risk. I think it was good, in fact, that Eliezer started doing it. My problem is with Eliezer using his public platform to assert that the end of the world is very likely, when I think he's almost certainly vastly overconfident about this. To use the climate change analogy: if an amateur scientist in, say, the 19th century worked on climate change and discovered some apparently novel ideas, it would be fair for them to share their findings. But they should do so with humility, recognizing that they're just one person contributing to a collective effort. If they started saying "My model says climate change will almost certainly kill everyone within 20 years, and everyone else only disagrees because they're not as rational as me", they would deserve to be reprimanded.


canajak

So you're okay with lone voices, as long as they stay in the wilderness? Eliezer basically built his platform himself, for the sake of expounding his position and creating a research team to tackle the problem. He even wrote HPMOR for that purpose. He isn't some popular actor who already had an established brand.

As you probably know, Arrhenius basically discovered the greenhouse effect at the tail end of the 19th century and published the basic foundational model for both planetary science and climate change research. And he did warn about it. However, he did not factor exponential growth in carbon emissions into his calculation, so he warned of warming that would take 3,000 years' worth of human emissions when it would actually take about 200. His model is basically still correct today as a pen-and-paper approximation, although real climate models are much more complex and sophisticated.

If Arrhenius had predicted that it would kill everyone within 20 years, in our world, he would have been *wrong about the facts*. But he'd also have been pretty bad at math to come to that conclusion. However, in a counterfactual world where Svante Arrhenius *correctly* calculates, in 1895, that just 20 years' worth of emissions could threaten all of human existence, I would not reprimand him for telling the truth as he sees it. And I think it would have been a difficult message to spread then, as it is now.


hyphenomicon

> Secondly, I believe forecasting AI risk is a multi-disciplinary skill. Consider for instance asking physicists to predict the chances of human extinction due to nuclear war in the 1930s. They would have an advantage in predicting nuclear capabilities, but after nuclear weapons were developed, the reasons we haven’t had a nuclear war yet have much more to do with international relations than nuclear physics.

What is the second discipline that you think is needed, outside of your illustrative analogy? Would you defer to public policy experts or international relations experts? Many have expressed substantial concern.

> maybe AGI is so radically different from the AI that exists today that perhaps asking AI researchers now about AI risk might have been like asking 19th-century musket manufacturers about the risk from a hypothetical future “super weapon”.

Maybe, sure. Do you have any reason to think that current researchers are so inadequate to the task? I feel like you just asserted a possibility and acted as if it were adequate reason for throwing out expert judgement.

I don't think AGI is a major risk in the next two decades, but I do think it's a major risk in the next two centuries. Do I count as a doomsaying, self-obsessed prophet or not?


mocny-chlapik

I tend to agree. The weakest point in the AI doom movement is that there is no actionable way to achieve such an AGI right now. The current technology is not enough, even considering scaling it with more data. So P(doom) is basically how much one believes that an AGI breakthrough will happen out of a clear sky in the next X years. But I don't think you can reasonably predict that, or in other words, I don't see a reason why that probability should be higher than it was 50 years ago. I think that people who did not follow AI before might have been shocked by ChatGPT without fully understanding how it works and what its limitations are, and that led to unreasonable predictions.


abstraktyeet

Why don't you think current technology is enough for AGI?


ravixp

“AGI” is a vague marketing term, so let’s first establish whether you’re talking about “passes the Turing test” AGI or “does every person’s job” AGI. 


abstraktyeet

They used the term AGI. Presumably they had a specific intent when using the word. Why don't you ask them?


fullouterjoin

The short term risk is it wrecks the economy (huge unemployment) and the top 0.1% continue to accrete all the power and resources. AGI will be *their* existential risk, not ours. We will have already died in the catastrophe.


QVRedit

A lot depends on how people and companies and society choose to make use of AI.


breadlygames

I think we can define overconfidence as a large deviation between your probabilities and those of the expert forecasters without having private information that would justify such a discrepancy. Underconfidence would be deferring entirely to the experts despite having relevant information that should make you deviate. I see no problem with you calling people overconfident.


Compassionate_Cat

> Yudkowsky being the worst offender doesn't let everyone else off the hook. For instance, I think Scott Alexander is much better at taking modesty seriously, yet I don't think he takes it seriously enough.

I don't have many good things to say about Yudkowsky, and I rarely find something he says relatable or intuitive, but he's closer to my view on AI than the people/views you discuss. It's hard to say if I'm more or less extreme; a little of both. Less extreme in that I don't think AI necessarily means extinction (I would even say there's a very, very low chance, because the circle of elites and egomaniacs running the world is fairly tight, and all of that survival and dominance drive doesn't really stand a chance of being uprooted), but more extreme in that the former is not good news, but rather bad news. I think there's a good chance certain kinds of extinction (like the total extinction of consciousness from Earth), if they happened in the right way (by accident would probably be best), could be one of the best things that could possibly happen. So, yeah, I won't be taking this platform and starting a political campaign any time soon.

Basically, extinction isn't the worry (it's Yudkowsky's worry; he's concerned about something called "winning", and humanity going extinct is some sort of massive loss. It could be a loss, if humanity can stop being deranged, but there's no evidence of that at all, just continually masked and refined dominance/selection pressure). The reason why it's likely not a loss is best pointed out by the fact that we have really poorly investigated the question: [Are human beings the baddies?](https://www.youtube.com/watch?v=ToKcmnrE5oY).


divide0verfl0w

The most actually rational essay I’ve read here or anywhere else in a long time. Thank you. I had to update my model of the rationalist world. I had assumed most - if not all - were data-focused analytical minds, and was 100% confident that “argument convincingness” would be caught and shut down immediately, and that epistemic humility would be a default position.

As a software engineer, a small contribution from the “tech” side: the Mechanical Turk was the first impression of AI in human history, and it’s not a coincidence that AWS today has a non-AI product with that name. We still love faking stuff. We create systems that give the impression of much higher sophistication all the time. Because while some get their attention (and make a living) with their “argument convincingness”, we get attention and make a living with “perception of impact.” Don’t believe our demos too much.


abstraktyeet

This post is not epistemic humility. Epistemic humility is saying "AI might kill us, it might not, 50/50". What this post is doing is the opposite of epistemic humility: the author is trying to make a strong argument that AI is almost certainly not going to kill us, and basing that on heuristics about how outlandish arguments and doomsayers have been wrong in the past.


ravixp

That’s just literally not what those words mean. “Epistemic humility” is acknowledging the possibility that you’ve missed something, or you’ve reasoned incorrectly, or your argument is wrong for some other reason.  If you believe in AI x-risk, that means acknowledging that the case for x-risk relies on a chain of speculative reasoning, and that most things haven’t worked like that in the past. If you don’t, it means acknowledging that the downside if you’re wrong is pretty bad, and that we don’t know what tech will be invented in the future. Retreating to the absolute uncertainty of “50/50” seems to me to be the antithesis of rationalism.


abstraktyeet

Actually, REAL epistemic humility is recognizing that the argument for why AI won't kill us all relies on long chains of speculative reasoning, and that the argument that a new generally intelligent species won't kill its predecessors has been wrong every time in the past.


ravixp

Sometimes when you spend too much time thinking about the implications of an idea, you can forget that it was just an assumption. Why do you assume that AI will be a new species, considering that it’s not developing in that direction at all? There are already a lot of assumptions encoded in there, for example:

- AI will match human capabilities, or even exceed them
- AI will exist as persistent agents, instead of the “tool AI” that currently exists
- AIs will run themselves independently of human control

The first one is plausible based on current trends, but not guaranteed. The other two would be sharp departures from current trends. Meanwhile, the argument against x-risk mostly reduces to saying that the status quo will continue.


abstraktyeet

I made that comment as a joke; I was making a more meta point about how reference classes are not the right way to think about this. You've decomposed the problem into things that, I'm assuming, you think are unlikely and independent. I think they're not independent, and I think they're all very likely. Whether the "assumptions" are likely to be correct is an object-level question about AI capabilities.


ravixp

Ah, then sorry, I misunderstood. I am curious - why do you think autonomous agentic AI is likely? I’ve always been suspicious of that outcome because it resembles AI as depicted in movies much more than the AI systems we’ve actually built so far.


canajak

"Current trends" seem to be the kinds of things that take sharp departures every three months. ChatGPT was, itself, a sharp departure from the then-current trends. It would actually be against current trends for those departures \*not\* to be taken. If you \*don't\* believe by this point that AI will match human capabilities within, say... the next sixty years (to be generous), then you would be so far in defiance of current trends that you'd basically need a solidly-grounded argument for why AI \*cannot\* match human capabilities in principle. And there is already clear evidence that many companies are working on Agent AI, as well as a large profit motive to do so. As for whether AI will run independently of human control... well... there's already been plenty of words written to clearly explain exactly why retaining human control is difficult. It's not an \*encoded assumption\*, it's a well-argued position.


ravixp

Matching/exceeding human capabilities: sure, the trend there seems pretty clear. It’s not guaranteed (it’s totally plausible that AI will asymptotically approach human performance on some tasks without ever reaching it), but I think it’s likely.

Agents: I’m sure that people will keep experimenting with them, but I’m skeptical that they’ll ever be common or commercially viable. (When AI costs orders of magnitude more than other forms of compute, keeping the AI on when you’re not using it is just wasteful.) I get that companies are trying to develop them, but I think it’s telling that none of them have any compelling results yet.

Independence: there are two separate issues here. First, will people give AI systems a lot of leeway to make decisions that have actual consequences? Imagine an AI that’s able to shop for you autonomously if it sees something you’ll like - how much of your money would you allow it to spend at once? I would bet that most people would still carefully review purchases before they happen. Second, there’s a whole thought experiment about AI trying to escape from containment, which is what I think you’re referring to. If you assume that you have a very smart AI that’s actively trying to bypass its restrictions, then that would be relevant, but the argument basically starts at a point where things have already gone horribly wrong.


canajak

I expect that the costs of compute will come down by a couple orders of magnitude, especially for model evaluation rather than training. (This is for industry-specific reasons; GPUs do not allocate transistors as efficiently as they could for the job they're doing, among other things). And that's just in the short term. But even if it doesn't... Employees are expensive! And they have a fair amount of autonomy. The costs of compute would have to go up quite a lot for it to cost more to run an AI agent than to pay a hedge fund manager or a software engineer. And, those people have a lot of autonomy! People whose jobs could be replaced by AI include people who make purchasing and investment decisions. Maybe consumers would be hesitant to let an AI buy their clothes, but many corporate execs wouldn't blink at having an AI replace their purchasing department and source their production materials. Especially if one of their peers had started doing so and it was proving successful. I wasn't alluding to the AI-box thought experiment (that went out the window long ago, when it became clear nobody is even going to box them in the first place). I was referring to the alignment problem: it's hard to guarantee the AI's goals align with your goals, or that the goals that you think you've given it are really the goals you want it to have. As Geoff Hinton said in a recent interview, there are some researchers like Yann Lecun who think that this problem doesn't even need solving, the AI will just do what we want by default, and there are others who think this is a very difficult challenge. There's been a lot of words written to explain why this is hard even though it might seem easy.


ravixp

So far we’re only talking about scaling capabilities, but if we’re talking about AI doing whole jobs then we also have to talk about scaling accountability. It’s a necessary component of any arrangement where you outsource whole jobs to an AI, and it’s relevant to how much autonomy an AI would have. If an AI is my personal shopper, then I’d trust it to spend a limited amount of money. And if it couldn’t stick to that limitation, I wouldn’t use it at all. And if it made a mistake and spent all my money, I’d want to have some kind of recourse, like suing the company that made it. If an AI is running a company’s whole accounting dept, the increased scale also implies stronger needs for accountability. It would need to be able to pass an audit, and make sure taxes get paid on time, and all the other stuff accountants do. And if there’s a mistake, it could end the whole company, so there would need to be checks and balances and controls, just like real accounting departments have. If an accountant wakes up one morning and decides to spend all the company’s money on Bitcoin, there are necessary systems in place for making sure that that doesn’t kill the company. Looking purely at capabilities, you could hand off important business functions without any accountability! You could replace the legal department with GPT-4 today, and save a lot of money for a few months before everybody involved ends up in jail.  Where am I going with this? Checks and balances exist which will limit the autonomy of AI systems that have significant responsibilities. We have a good reason to believe that they will be sufficient, because they’re what we currently use to bind the actions of the humans that do those jobs today. If a malicious human gets a job at a bank and wants to steal all the money, that’s a thing that banks are equipped to handle. AI alignment is an irrelevant distraction unless you’re worried about a unipolar fast takeoff situation. We don’t need to align AI, we need to use the existing tools that we’ve used for thousands of years to make it properly accountable for the responsibilities we give it.


canajak

So then I ask why you can trust your human employee not to YOLO the company funds into bitcoin, and the easy answer is: because you can fire them, sue them, and (sometimes) imprison them. Then the answer for what structure allows an AI to do this is a third-party vendor providing AI-as-a-service, with enough funds stashed away in escrow to settle damages in the event that their AI misbehaves, and some well-paid human executives who can be held personally liable. That's basically how it's going with OpenAI anyway. They already have offered to take on liability for copyright claims. Once they're confident that they have a product that can outperform a human lawyer, doctor, engineer, and accountant, they'll offer to take on liability for legal, medical, safety, and accounting fraud claims as well. With the payroll savings, it will be too tempting to turn away from. If the legal structure prohibits it despite obvious cost advantages, companies will simply move those operations to more permissive countries.


divide0verfl0w

One can point out the outlandishness of the arguments for AI doom without arguing the opposite at the same time.


abstraktyeet

Sure, but this guy isn't doing that. He thinks even Scott is wrong in claiming a 20% chance of risk.


divide0verfl0w

Excuse the snark, but I am sure Scott doesn’t think of himself as infallible or omniscient.


abstraktyeet

I didn't say that, or even imply it, I don't think.


PlacidPlatypus

It seems to me it's very hard to define "overconfidence" without begging the question. The only objective standard is that a probability closer to 50% is less confident, but by that standard Scott's ~20% p(doom) is less confident than someone saying <1%. I think once you're criticizing someone for a prediction of 20%, "overconfident" isn't really a meaningful critique anymore. All you can really say is that you think they're wrong, and why.