Maybe this is because humans aren't real consequentialists, they're perceptual control theory agents [...]
Might gradient descent produce a PCT agent instead of a mesa-optimizer? I don’t know. My guess is maybe, but that optimizers would be more, well, optimal, and we would get one eventually.
I think this idea that "real consequentialists are more optimal" is (sort of) the crux of our disagreement.
But it will be easiest to explain why if I spend some time fleshing out how I think about the situation.
What are these things we're talking about, these "agents" or "intelligences"?
First, they're physical systems. (That much is pretty obvious.) And they are probably pretty complicated ones, to support intelligence. They are structured in a purposeful way, with different parts working together.
And this structure is probably hierarchical, with higher-level parts that are made up of lower-level parts. Like how brains are made of neuroanatomical regions, which are made of cells, etc. Or the nested layers of abstraction in any non-trivial (human-written) computer program.
At some level(s) of the hierarchy, there may be parts that "run optimization algorithms."
But these could live at any level of the hierarchy. They could be very low-level and simple. There may be optimization algorithms at low levels controlled by non-optimization algorithms at higher levels. And those might be controlled by optimization algorithms at even higher levels, which in turn might be controlled by non-optimization ... etc.
Consider my computer. Sometimes, it runs optimization algorithms. But they're not optimizing the same function every time. They don't "have" targets of their own, they're just algorithms.
They blindly optimize whatever function they're given by the next level up, which is part of a long stack of higher levels (such as the programming language and the operating system). Few, if any, of the higher-level routines are optimization algorithms in themselves. They just control lower-level optimization algorithms.
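To make this concrete, here is a toy version of that arrangement (the function names and numbers are mine, purely for illustration): the routine below is generic optimization machinery with no target of its own, and a non-optimizing caller one level up decides what it optimizes.

```python
def minimize_1d(f, x, lr=0.01, steps=1000, eps=1e-6):
    """Blindly minimize whatever one-variable function it is handed,
    via finite-difference gradient descent. It has no target of its own."""
    for _ in range(steps):
        grad = (f(x + eps) - f(x - eps)) / (2 * eps)  # central difference
        x -= lr * grad
    return x

# The level above is not an optimizer; it just hands down targets.
for target in [lambda x: (x - 3) ** 2, lambda x: (x + 5) ** 2]:
    print(round(minimize_1d(target, 0.0), 2))  # 3.0, then -5.0
```

The outer loop plays the role of the higher levels of the stack: it controls which optimization runs, without doing any optimizing itself.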
If I use my computer to, say, make an amusing tumblr bot, I am wielding a lot of optimization power. But most of my computer is not doing optimization.
Python isn't asking itself, "what's the best code to run next if we want to make amusing tumblr bots?" The OS isn't asking itself, "how can I make all the different programs I'm running into the best versions of themselves for making amusing tumblr bots?"
And this is probably a good thing. It's hard to imagine these bizarre behaviors being helpful, or leaving me with a more amusing tumblr bot at the end.
Which is to say, "doing optimization well" (in the sense of hitting the target, sitting on a giant heap of utility) can happen without doing optimization at high abstraction levels.
And indeed, I'd go further, and say that it's generically better (for hitting your target) to put all the optimization at low levels, and control it with non-optimizing wrappers.
Why? The reasons include:
- Goodhart's law, especially its "extremal" variant, where optimization preferentially chooses regions of solution space where the assumptions behind your proxy target break down.
  - This is no less a problem when the thing choosing the target is part of a larger program, rather than a human.
  - Keeping optimization at low levels decreases the blast radius of this effect.
  - If the things you're optimizing are low-level intermediate results in the process of choosing the next action at the agent level, the impacts of Goodharting each one may cancel out. The agent-level actions won't look Goodharted, just slightly noisy/worse.
- Optimization tends to be slow. In a generic sense, it's the "slow, hard, expensive way" to do any given task, and you avoid it if you can. (Think of System 2 vs. System 1, satisficing vs. maximizing, etc.)
  - To press the point: why is there a distinction between "training" and "inference"? Why aren't neural networks training at all times? Because training is high-level optimization, and takes lots of compute, much more than inference.
  - Optimization gets vastly slower at higher levels of abstraction, because the state space gets so much larger (consider optimizing a single number vs. optimizing the entire world model).
  - You still want to get optimal results at the highest level, but searching for improvements at a high level is very expensive in terms of time/etc. In the time it takes to ask "what if the entire way I think were different, like what if it were [X]?", for one single [X], you could instead have run thousands of low-level optimization routines.
  - Optimization tends to take super-linear time, which means that nesting optimization inside of optimization is ultra-slow. So, you have to make tradeoffs and put the optimization at some levels instead of others. You can't just do optimization at every level at once. (Or you can, but it's extremely suboptimal.)
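The nesting point can be made with back-of-the-envelope arithmetic (the numbers here are illustrative, nothing more): if evaluating one candidate at a given level means running a full optimization at the level below, the costs multiply across levels.

```python
def nested_cost(evals_per_level, depth):
    """Total function evaluations if each candidate at one level triggers
    a full optimization run (evals_per_level evaluations) one level down."""
    return evals_per_level ** depth

print(nested_cost(1000, 1))  # one level of optimization: 1,000 evaluations
print(nested_cost(1000, 3))  # three nested levels: 1,000,000,000
```

And this is before accounting for each level's larger state space, which makes the per-level evaluation counts grow as well.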
When is the agent an "optimizer" / "true consequentialist"?
This question asks whether the very highest level of the hierarchy, the outermost wrapper, is an optimization algorithm.
As discussed above, this is not a promising agent design! There is an argument to be had about whether it still could emerge, for some weird reason.
But I want to push back against the intuition that it's a typical result of applying optimization to the design, or that agents sitting on giant heaps of utility will typically have this kind of design.
- "Can my computer make amusing tumblr bots?"
- "Is my computer as a whole, hardware and software, one giant optimizer for amusing tumblr bots?"
have very little to do with one another.
In the LessWrong-adjacent type of AI safety discussion, there's a tendency to overload the word "optimizer" in a misleading way. In casual use, "optimizer" conflates
- "thing that runs an optimization algorithm"
- "thing that has a utility function defined over states of the real world"
- "thing that's good at maximizing a utility function defined over states of the real world"
- "smart thing" (because you have to be smart to do the previous one)
But doing optimization all the way at the top, involving your whole world model and your highest-level objectives, is very slow, and tends to extremal-Goodhart itself into strange and terrible choices of action.
It's also not the only way of applying optimization power to your highest-level objectives.
If I want to make an amusing tumblr bot, the way to do this is not to ponder the world as a whole and ask how to optimize literally everything in it for maximal amusing bot production. Even optimizing just my computer for maximal amusing bot production is way too high-level. (Should I change the hue of my screen? the logic of the background process that builds a search index of my files??? It wastes time to even pose the questions.)
What I actually did was optimize just a few very simple parts of the world, a few collections of bits on my computer or other computers. And even that was very time-intensive and forced me to make tradeoffs about where to spend my GPU/TPU hours. And then of course I had to watch it carefully, applying lots of heuristics to make sure it wasn't Goodharting me (overfitting, etc).
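That watching-it-carefully loop can be sketched as a non-optimizing wrapper around a low-level optimizer, in the generic early-stopping pattern (the function names and loss values below are mine, not from the actual bot):

```python
def train_with_early_stopping(step, val_loss, max_steps=1000, patience=5):
    """`step()` runs one low-level optimization step; `val_loss()` is the
    wrapper's heuristic check. The wrapper optimizes nothing itself: it
    just watches the optimizer below and halts it when the proxy target
    starts to come apart from what we actually want."""
    best, bad = float("inf"), 0
    for _ in range(max_steps):
        step()
        loss = val_loss()
        if loss < best:
            best, bad = loss, 0
        else:
            bad += 1
            if bad >= patience:  # Goodharting (overfitting) suspected
                break
    return best

# Simulated validation losses that improve, then degrade (overfitting):
losses = iter([5, 4, 3, 2, 2.5, 2.6, 2.7, 2.8, 2.9, 3.0])
print(train_with_early_stopping(lambda: None, lambda: next(losses)))  # 2
```

The optimization power lives entirely in `step()`; the wrapper contributes judgment, not search.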
To get back to the original topic, the kind of "mesa-optimizer" we're worried about is an optimizer at a very high level.
It's not dangerous (in the same way) for a machine to run tiny low-level optimizers at a very fast rate. I don't care how many times you run Newton's method to find the roots of a one-variable function -- it's never going to "wake up" and start trying to ensure its goal doesn't change, or engaging in deception, or whatever.
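For concreteness, here is essentially the whole of the optimizer in question, in a textbook sketch: stateless arithmetic, with nowhere for goals, deception, or self-preservation to live.

```python
def newton_root(f, df, x, tol=1e-10, max_iter=50):
    """Find a root of a one-variable function by Newton's method:
    repeatedly slide down the tangent line to the x-axis."""
    for _ in range(max_iter):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            break
    return x

# Root of x^2 - 2 near x = 1: converges to sqrt(2) in a few iterations.
print(newton_root(lambda x: x * x - 2, lambda x: 2 * x, 1.0))
```

Run it a trillion times on a trillion functions; each run is independent and leaves nothing behind.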
And I am doubtful that mesa-optimizers like this will arise, for the same reasons I am doubtful that the agent will do optimization at its highest level.
Once we are pointing at the agent, or a part of it, and saying "that's a superintelligence, and wouldn't a superintelligence do...", we're probably not talking about something that runs optimization.
You don't spend your optimization budget at the level of abstraction where intelligence happens. You spend it at lower levels, and that's what intelligence is made out of.