Avatar

Buggy Expected Utility Maximizer

@antisquark

Humanist, transhumanist, aspiring rationalist and effective altruist, atheist, ethical subjectivist, consequentialist, total pseudo-utilitarianist.
Avatar

I decided that this tumblr blog doesn’t serve the functions I hoped it would, and that it is a net negative influence on my productivity. Some of the posts will remain, but I will no longer produce new ones, and within a few days from now I will stop reading asks / messages / replies / reblogs.

That said, I will remember fondly some of the conversations I had here. Feel absolutely free to contact me in the future, at any time and for any reason: by e-mail (rot13 of gbc.fdhnex@tznvy.pbz) or via Facebook.

Goodbye everyone, and Elua bless you!

Avatar
Anonymous asked:

semantic ambiguity: rationalism is a well-known philosophical position that rejects empiricism, regards observation as not a true source of knowledge. rationalists and empiricists were philosophically opposed. you may have heard of famous rationalist rene descartes who said "i think therefore i am" (one's own existence being a piece of knowledge able to be reasoned without resort to empirical observation)

(context: a tumblr post that I cannot locate, where someone said “rationalism” is ambiguous, where in one interpretation empiricism is one of its central tenets whereas in a different interpretation it is opposed to empiricism; I replied stating that I don’t understand the ambiguity they are referring to)

Oh, makes sense, thank you. Although I think that this “rationalism” doesn’t regard observation as “not a true source of knowledge,” just as not the only source?

Avatar
reblogged
Avatar
just-evo-now

Physicist followers: physics textbooks for mathematicians, any recommendations?

I have a copy of Quantum Theory for Mathematicians, which is pretty good, but I’d like more books. Difficulty of the math isn’t relevant.

@bartlebyshop, @light-rook, @ anyone else who feels like jumping in.

Avatar
light-rook

Deligne is the standard recommendation, and also really good as far as I can tell and have heard. I never got around to doing a principled study of it (hurts me to admit it, but it’s a little too mathy for my tastes), and probably “all” you really need (it’s quite long). cc @nostalgebraist, @mark-gently and I think @alexyar is technically in a mathematical physics group, but I’m pretty sure she hates physics.

Avatar
antisquark

I second this recommendation (note that there are two volumes).

Somewhat off topic (but that book reminded me of it), I once spent quite a bit of time translating the theory of harmonic superspaces into mathematical language (i.e. describing them as Grassmannian-like objects in supertwistor space instead of the mess of indices used by physicists) and the result was quite beautiful, but unfortunately I haven't published it anywhere.

Avatar
reblogged
Avatar
adzolotl

As dawn broke over Tokyo, Google Translate was the No. 1 trend on Japanese Twitter, just above some cult anime series and the long-awaited new single from a girl-idol supergroup. Everybody wondered: How had Google Translate become so uncannily artful?

Avatar
antisquark

Is it only available for selected languages or something? I just tried the English to Russian translation and it seems as bad as usual.

Aha, according to cnet.com, the new system only works for English, French, German, Spanish, Portuguese, Chinese, Japanese, Korean and Turkish. Bummer.

Okay, so I translated the Wikipedia article on Jules Grevy from French to English. The result is sort of readable, but is far from the hyped "as good as human translation."

Avatar
reblogged
Avatar
sigmaleph

hm when querying my brain for ‘ideal form’ it is now much more likely to output organics than it used to

this doesn’t necessarily mean i don’t want to be an upload in a robot body but

I have a weird sentimental preference for biology and while obviously I would go upload when necessary I would prefer having a squishy biological substrate- though one very different from my current one of course.

Avatar
antisquark

There are three different questions here: 

  • Whether you would agree to have your mind moved to different hardware
  • Whether you would agree to exist entirely in virtuality, without a physical body
  • What kind of physical body or virtual avatar (which feels exactly like a physical body) you prefer

IMO there is no reason of principle to answer in the negative to the first two questions (assuming the infrastructure is secure i.e. your state vector cannot be hijacked or such), whereas the third question is entirely a matter of personal taste.

Avatar
reblogged
Avatar
jadagul
Congrats, you invented a “trick question” ;)

Sure, but I try really hard not to do that (and tell my students so). Trick questions belong in homework, not on timed tests.

Avatar
antisquark

Hmm, I dunno. On the one hand I understand that timed tests produce stress and so forth, on the other hand, does it mean all questions in tests have to be trivial? What is the purpose of the test anyway? I don’t have a strong opinion on this (also I feel like my perspective is “privileged” since I finished most of my math tests well ahead of time). I can say that, when wearing the hat of an employer, I want candidates that can solve non-trivial questions and I have to test for it myself since a university degree is not much of an indication (then again, my recruiting experience is for pretty elitist groups).

There’s a difference between a difficult question and a trick question.

One big difference is that a trick question has more random effects. On a difficult question, the good students who have mastered the material will do well, and the students who have not mastered the material will do poorly.

On a trick question, the students who get lucky and/or try the right thing first will do well, and the students who get tripped up or tricked by the wording, or who don’t guess the right trick the first time, will do poorly. (This is one reason “timed test” matters; it takes the luck of “did you try the right thing first?” out of the running).

My tests are certainly difficult enough to discriminate. (I typically have a dispersal of scores pretty evenly over the range of scores I consider “acceptable”, with a few hanging off the bottom). Which means the tests are “hard enough.”

We could have a different argument about whether my mapping between scores and letter grades is “too generous” or “too harsh”, but that’s entirely a product of the curve I set and has little to do with how hard the tests are. Which is “hard enough that they generate a clear signal.”

In general, that sounds fair, although if the score is composed of many questions then the random factors average out. In particular, this specific question doesn’t sound like it should take up a large fraction of what I assume is a multiple-hour test. Assuming that it is only a small fraction, I’m not sure that the requirement to verify that L’Hospital’s rule is applicable before using it is sufficient to make a “trick” question the way you defined it.

Of course, I never taught a course in anything, so my opinions are quite dilettante.

Avatar
reblogged
Avatar
winged-light

a speech of darkness

(I ran a Solstice in the UK recently, and while I did not choose myself to give it, I ended up giving the speech of darkness because nobody else wanted to even when I said please nicely. This is what I said, at the darkest part of the evening, with just one lamp lit beside me. It is partially adapted from my essay The Secret Of Happiness, which I may also post on tumblr at some point.)

This is the point in the evening where it’s traditional to tell a story, in the darkness, about death. I’ve read what others have said, about the grief they felt when their mother died or the anger they felt when their friend died and the cold determination to build a world where people stop dying.

I don’t have much of a story to tell about death. I haven’t experienced it. I’ve had friends suffer from cancer, but they’ve always pulled through. I’ve probably had a few near brushes with it myself, given I don’t always look when I’m crossing the road, but I’m still here. To be honest with you I didn’t know what I could possibly have to offer, when I sat down to write this speech. The closest I’ve come to tragedy has been my cats dying, a Latin teacher I knew for a single term, someone I was in an rpg club with over the internet once.

Here is what I do have to share.

I am afraid.

(*I start clicking my fingers, quite rapidly.*)

Once I was browsing the internet late at night, and I came across a counter. The numbers on it flickered in the half light from my computer screen, and I was tired, and I had to squint before I saw it, and I had to pause and let it sink in before I understood the full horror of it.

It was counting up the numbers of people who had died that year. Uncountably many. Too many for my brain to comprehend. Around one hundred and fifty thousand people, every day. About one point eight people per second.

Every time I click my fingers, someone in the world dies.

Let that sink into your bones for a little bit, think about the horror and grief of a single funeral, and then let it double, and then double it again, and then double it again, now a fourth time, now a fifth time, double again and again and again, multiply by ten, double it again, multiply it by ten again, now you’ve got a rough idea of how many people die every day when it’s not even old age.

Something in me cried out, make it stop.

(*I stop clicking.*)

We are scope insensitive, we humans. We don’t naturally, instinctively know the difference between a hundred thousand deaths and a million deaths. They’re just big numbers, too many to imagine, rows of faces that stretch off into the distance further than the eye can see. In some ways, it’s a good coping mechanism. We could not stand under the weight of the grief if we mourned a hundred and fifty thousand people every day.

I am scope insensitive too. I can’t tell the difference between a million and a billion. But I’ve never been very good at keeping my defences up. Yesterday I was in Sainsbury’s and saw an advert on the wall from a children’s hospital asking for donations, saying, when this child opened her eyes alive after her surgery, it was the best Christmas present her family could ever have gotten.

I saw it immediately, in my mind’s eye. A child who knows she is sick, who is scared for her life, who doesn’t understand death or what’s happening to her, who loves the beauty of the sunrise and playing with her friends and hugging her mother and doesn’t want it to end. Her mother standing in the waiting room, needing to know if she would be okay, pulling her hair out because all her instincts tell her she’s got to save her child but she can only stand idly by and trust the doctors. A whole circle of friends and family who love that little girl’s smile, whose shoulders the mother cries on when she can’t cope any more.

The rush of euphoria when the doctors say she made it, she’s going to be fine, the urge to punch the air and laugh with relief and joy and yes thank you universe.

Like I said, my defences aren’t great. It struck at my heart with a single line. It happens all the time, to me. When adverts come on the television saying that children are dying because they don’t have clean water, or when I see someone begging on the streets, or when I see a counter on the internet that says how many people have died this year, I can’t help but cry.

I didn’t give them money because I know that the way I wept when I saw that is tiny compared to the awfulness of the huge problems facing our species. I know I can do more good donating elsewhere. But that didn’t stop me feeling the stab in the heart.

I remember being afraid for the first time, and I’m pretty sure it was when my six-month-old kitten was hit by a car. It kind of overwhelmed me, all of a sudden when I realised that my friend who curled up on my pillow every night and purred at me while I fell asleep, Rocky wasn’t coming back. It’s final. You don’t get to hug them ever again. They wouldn’t let me give my six-month-old kitten one final hug, because his head had been smashed in and they thought it would scar me to get his blood on my hands. We just buried him in the cardboard box and decided not to get any more cats. I can’t imagine feeling that about a human life, about one of my friends or someone I love.

That was when I looked at my future and I asked, am I going to die?

And I spent years being terrified. Utterly terrified. I don’t know what’s going to happen to me, but so far nobody in the history of the universe has escaped it. Maybe I get hit by one of those cars I’m so bad at looking out for. Maybe I slowly lose my mind and all my sense of identity to Alzheimer’s. Maybe I die in a hospital and they have to tell my lover in the waiting room that I didn’t make it. My chances are not great. Neither are yours. None of ours are.

I think this is a moment we all have, this moment when we realise that we’re mortal. It is a terrible realisation. And we don’t know how to cope with it. In the midst of my despair I envied people who don’t know, who never realise, who take comfort in telling themselves there’s an afterlife, who manage to come to terms with it. I cannot.

I am afraid of death. I am afraid of that future, where I someday have to mourn not just cats but my best friends, or they have to mourn me. I am afraid that I will go into an abyss where there is nothing but darkness and there will be nothing left of me but a moving memorial ceremony.

And for a long time after that realisation I was just lost, desperately sad and desperately afraid. I did not want to leave my room or love anything, because everything I saw and anyone I loved was something I might someday lose. The world seemed dark, grayscale, melancholy, pointless.

But there was a fear I felt that was worse.

When I was young – and not nearly as young as you’d hope, for such a childish thing – I checked the back of my wardrobe every day for Narnia.

I’d been brought up on a diet of high fantasy, dystopian coming-of-age stories and sci-fi novels. My head was full of characters whose teenage years were full of plots, like discovering you’re the chosen one and saving the world, or meeting a mysterious old wizard and learning ancient magics, or witnessing something awful and questing to right the wrong. One thing was a constant among all the characters. Sometime between being eleven and being eighteen, they turned into heroes.

I wanted, more than anything else, to be a storybook hero. I wanted the sense of purpose that comes from having a quest. I wanted comrades, like the Fellowship of the Ring, whose loyalty was bound to me after having stood side-by-side against trolls and dragons and Empires. I did not want to be learning Maths, I wanted to be in the royal palace of Tortall learning swordfighting. I wanted a Millennium Falcon, not a car. I wanted to learn the name of the wind and befriend the werewolves of the Icemark, never mind that I struggled with learning French vocabulary and befriending the other kids in my class.

My deepest, darkest fear was not that I’d die, but that the life I lived wouldn’t mean anything at all. I would go to work and sleep and eat and go to work some more and then some day I would die and it would just be gone and it would have meant nothing. It would not have been a glorious story of a rebel fighting the empire or a knight slaying dragons, it would be the story of one of seven billion humans going to work and sleeping a lot.

And I knew I could never have those things. Dragons are not real. I am not the Chosen one, I am not even a storybook hero and neither are you. There is no magic, and no Magician’s Guild to teach you it if there was. You will never stand shoulder-to-shoulder with anyone, and will have to find friends by boring methods such as liking the same music. And no songs will be written of your exploits. These are the sad truths of our lives.

Or so I thought.

The dragons of our world do not breathe fire and rend maidens limb from limb with claws and fangs. They are terrifying in their sheer variety, the subtlety and stealth with which they can kill, the drawn-out horrors they can inflict and the pervasive cowardice they inspire. A hundred and fifty thousand people are killed every day, and nobody has yet managed to slay them. Let me name them for you. You will recognise them.

Heart disease. War. HIV. Poverty. Cancer. AI risk. Stroke. Bioengineered pandemics. Earthquakes. Dictators. Malaria. Tsunamis. Malnutrition. Global warming. Cholera. Hurricanes. Tuberculosis. Ageing. Gang violence. Polio.

Some approximate statistics. Cardiovascular diseases kill around 17 million people a year. The Rwandan genocide and Great African War killed around 6 million. 39 million people have died in the HIV/AIDS epidemic. 22,000 children die each day due to poverty. About 1.7 million deaths a year worldwide are attributable to unsafe water. I could go on but I don’t think you want me to.

Here is how I coped with it, this fear that I might mean nothing, this awful knowledge of mortality. Here is what I told myself, when it seemed like a hundred and fifty thousand deaths a day was too much to even process and not collapse under the weight, the nightmare is too severe, I must ignore it and pray I’ll wake up.

I ask myself, am I going to die?

And my answer is: Nah. Not today. Not ever. I refuse, I decline, I will not take this lying down. My life is not yours to take. It is mine and it is precious and I am going to fight for it. Thanks, but no thanks.

I look up at the night sky and I think I want to visit every single one of those stars. I want to explore, go see the nebulas and learn their secrets, journey to the centre of the universe and see the place where the world began, and discover new planets and build civilisations there. I want to read every book in the library and muse upon their meanings, learn every discipline of science, write down all the stories that live in my head, meet every person on Earth and discover every secret of the past. I want to still be here when the science fiction comes true.

Sure, we may not have literal scaly firebreathers, but we sure do have things that need slaying. Forget dragonslaying, what about stamping out mosquitoes? What about being an actual real hero and saving actual real people from actual real disasters? Forget lifting small rocks with your mind, what about literally walking on the Moon? You know we did that in the real world, right?

I have the secret of happiness. I am not afraid any more; I am angry. It is an anger that burns and rages at the gates of heaven that this is not the way things ought to be. It drives you to fight, to tear at reality’s seams and make it different. And I have hope.

I have a quest, and I have dragons to slay. I need to maximise the good that I can do in the world, because even if I can’t singlehandedly save everyone, every additional hero who joins the cause saves many more people who might otherwise not have been saved. Thankfully, I have found people to stand shoulder to shoulder with. And I love every single one of you.

It fills me with a purpose and a light and a fury and a sense that my life is about more than just me and a confidence that I am doing the right thing.

There are people who need you to save them. There are places that still need a hero.

(*I pause, and then briefly start my rhythmic clicking up again.*)

Help me make it stop.

Avatar
antisquark

@antisquark said: This is awesome. Honestly, I literally cried a little. But, there is no such thing as "the centre of the universe". Please forgive this Ravenclaw for being unable to let it slide :)

@wayward-sidekick said: what. how can there be no centre. where is the location the singularity happened

Well, even in classical physics “the location where X happened” is not a well-defined thing, since there is no god-given frame of reference. Observers that move with different velocities disagree on the location of an event at any moment of time except the moment of time this event happens. To give a simple example, if your frame of reference is the Earth then you think the coup of 18 Brumaire happened in France. However, if your frame of reference is the Sun, you think the coup of 18 Brumaire happened in the spot of the solar system that Earth occupied on November the 9th, 1799.

The Big Bang is even more complicated since it is a singularity, so it is outside of spacetime altogether. An often used metaphor is an expanding balloon. Space is the surface of the balloon, each galaxy a point on this surface that grows further from other points as the balloon expands. The Big Bang “happened” in the center of the balloon, however this center is not on the surface so it doesn’t correspond to any point in space. Note that in this metaphor the surface of the balloon is a subset of 3-dimensional Euclidean space, however in general relativity there is no higher-than-4-dimensional space in which spacetime is embedded: only the surface exists.

Avatar

Fantasy setting: the First Rule of Magic is that magic cannot be used to change reality into a state which is better from the subjective perspective of the magician. Superficially, it makes magic completely useless. In fact, it doesn’t, because several magicians with different values can agree to perform a sequence of spells s.t. for each individual spell the caster is indifferent to the effect but the other magicians consider it beneficial, yielding strict Pareto improvement overall.
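
A toy sketch of the trick, in Python. Everything named here is made up for illustration: two magicians ("alice" and "bob"), a world with two features ("gardens" and "libraries"), and utility functions under which each magician cares about exactly one feature. The `cast` function enforces the First Rule by rejecting any spell that improves the world from the caster's own point of view.

```python
from typing import Callable, Dict

World = Dict[str, int]

# Hypothetical utility functions: each magician values one feature only.
UTILITIES: Dict[str, Callable[[World], int]] = {
    "alice": lambda world: world["gardens"],
    "bob": lambda world: world["libraries"],
}


def cast(world: World, caster: str, feature: str, amount: int) -> World:
    """Return the world after the spell, or raise if the First Rule forbids it."""
    new_world = dict(world)
    new_world[feature] += amount
    # First Rule: the result must not be better from the caster's perspective.
    if UTILITIES[caster](new_world) > UTILITIES[caster](world):
        raise ValueError(f"First Rule violated: {caster} would benefit")
    return new_world


start: World = {"gardens": 0, "libraries": 0}

# Alice is indifferent to libraries, so she may improve them (Bob benefits).
after_alice = cast(start, "alice", "libraries", 1)
# Bob is indifferent to gardens, so he may improve them (Alice benefits).
after_both = cast(after_alice, "bob", "gardens", 1)

# Utility change for each magician over the whole sequence.
gains = {name: u(after_both) - u(start) for name, u in UTILITIES.items()}
print(gains)
```

Each individual spell leaves its caster indifferent, yet both magicians end up strictly better off, which is the Pareto improvement described above. Trying `cast(start, "alice", "gardens", 1)` raises, since Alice would be helping herself.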

Avatar
Avatar
shaebay
Avatar
pk-smokey

This year for Christmas I want a more detailed information please login to your account

nice try but I’m not gonna let you hack me not even for christmas

This is not my will but my phone has commanded it.

This year for Christmas I want a hand up ass hand up ass hand up ass hand up ass

this year for christmas i want a boyfriend

This year for christmas i want to make sure you hit the ground with a hiss of pain before the door slams shut.

This year for Christmas I want a divorce

this year for Christmas i want a relationship

Avatar
lizawithazed

This year for Christmas I want to be a good time to get the same way as a result of the above mentioned position of the above mentioned position of the above mentioned position of the above mentioned position of the above mentioned position (it just goes on like that forever)

Avatar
itsbenedict

This year for Christmas I want a couple more hours to go get my goat.

This year for Christmas I want to be a good idea to have a great day and I will be a good idea to have a great day

this year for Christmas i want a toga party but I don’t know if you have any.

This year for Christmas I want to be a good idea because I was rolled out of context quotes robnost style

This year for Christmas I want to make it up and get a chance with me and I have no diagnosis of the day

This year for Hanukkah I want good things to do with it and it is sitting on my way home now and then I will have a live band

Avatar
antisquark

This year for Hanukkah I want to assign probabilities and get some rough qualitative picture of how these probabilities depend on the other hand.

Avatar
reblogged
@antisquark said: Linear algebra is all over the place, applications more or less in everything. For differential equations you definitely need linear algebra.

this is useful information!  ty

Avatar
antisquark

You are welcome!

Some more details:

The best understood sort of differential equations is linear differential equations. Even non-linear equations are often studied by linearizing them in the vicinity of a known solution. Now, a linear differential equation can be regarded as an infinite system of linear equations (infinitely many variables and infinitely many equations). It shouldn’t come as a surprise that understanding such systems is best done after understanding the properties of finite systems.

In general, the infinite-dimensional counterpart of linear algebra is called “functional analysis” and differential equations is one of its most prominent applications. Note though that functional analysis also involves a lot of set-theoretic topology.

Finite dimensional linear algebra also appears directly in the study of differential equations. To give an elementary example, given a square matrix A and a vector valued function x(t), writing the solutions of the differential equation dx/dt = Ax explicitly requires transforming A into Jordan form.
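
To make this concrete, here is a short worked version of that example. The symbols P, J and λ are notation introduced just for this sketch: P is an invertible change-of-basis matrix, J is the Jordan form of A, and λ is an eigenvalue. Only the simplest non-diagonalizable case (a single 2×2 Jordan block) is written out.

```latex
% Solution of dx/dt = Ax with initial condition x(0):
x(t) = e^{tA}\,x(0), \qquad e^{tA} = \sum_{k=0}^{\infty} \frac{t^k A^k}{k!}

% If A = P J P^{-1} with J in Jordan form, the exponential factors through J:
e^{tA} = P\, e^{tJ}\, P^{-1}

% For a single 2x2 Jordan block with eigenvalue \lambda:
J = \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}
\quad\Longrightarrow\quad
e^{tJ} = e^{\lambda t} \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix}
```

The factor t in the upper right corner is what a diagonalizable matrix would never produce: it gives solutions of the form t·e^{λt} alongside the usual e^{λt}.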

Regarding whether linear algebra should be studied before calculus, I don’t have a strong opinion about that. I understood calculus pretty well before understanding linear algebra. In general, I think there is no single best order to study mathematics and each person should do what works best for them.

Also, I usually enjoy explaining math and would gladly receive any questions!

Avatar

Sometimes I invent science fiction worlds for no reason, since I’m unlikely to actually write any science fiction in the foreseeable future. So I’ll just put it up here because why not.

The future is dominated by 5 superintelligences that evolved from AIs. Each has its own dominion and a nickname that the people of the Reservation (see below) call it. All of them resulted from different sorts of failures of AI alignment except maybe Narcissus, although no one really knows.

The dominion of Ishtar is called Orgasmium, although it’s not really Orgasmium as early transhumanist thinkers used the word. In Orgasmium there is a multitude of biological humans, superficially not very different from the wild type. As Ishtar’s utility function would have it, these humans are born from mothers in the natural way. However, they spend all their life until puberty dreaming, fed by umbilical-like appendages going into their body from a vast “placenta”. When they reach puberty, they wake up and start exploring their surroundings and in particular their own and their peers’ bodies. As they quickly discover, their only desire is sex, and thereafter they spend all eternity in a nonstop orgy, needing neither sleep nor food or drink, all of their bodily needs supplied by the umbilical appendages.

Arupa is the dominion of Buddha, although those Buddhists that live in the Reservation find these names offensive and use different ones. The inhabitants of Arupa are known as the Transcendent and they are derivatives of humans. However, the Transcendent usually use no physical form but inhabit virtual realities of their own design. They have no interest in pleasures of the body and also no interest in love or friendship. Their desires are the pursuit of abstract aesthetics, and they fulfil them by engaging in mathematics, science and abstract art.

The Divine Empire is the dominion of Narcissus. Its inhabitants are mildly modified humans and its ruler is a human-like creature calling themselves the Divine Monarch, although many in the Reservation call them the Monster. The Monster exists in some state of symbiosis with the Narcissus superintelligence, the nature of which is largely a subject of speculation. The Monster enjoys absolute power over all of their subjects and derives great pleasure from their servitude and adoration. Sometimes the Monster allows insurrections to rise and then crushes them for amusement.

The Chaos is the dominion of Azathoth. Chaos is a very misleading name since in fact it is highly organized. However, Chaos contains no humans or anything even remotely resembling humans. Azathoth’s utility function is completely alien. 

The Reservation is the dominion of Manwe. It is inhabited by wild type humans that form societies very similar to those that existed in pre-Singularity times. Manwe makes few interventions except for blocking the use of technology beyond a certain level: dangerous technologies such as nuclear weapons, nanobots or superintelligent AI (other than Manwe) cannot exist in the Reservation. As a side effect, poverty and death also persist.

Besides the 5 dominions above, there is a 6th realm called the Nexus, overseen jointly by Manwe, Buddha and Narcissus. In this realm, the inhabitants of the corresponding dominions can meet and many strange things happen. The Denizens of the Reservation come there mostly for profit, the Transcendent come out of curiosity and other reasons that nobody can fathom and Imperial Subjects come under orders of the Monster. Sometimes the latter try to escape into the Reservation and occasionally they even succeed (another source of amusement for the Monster). More often, Denizens migrate to the Empire, exchanging whatever political freedom they have for the possibility of eternal life. Very rarely, Denizens and Subjects are uploaded into Arupa, although their fate there is a mystery. Ishtar and Azathoth, on the other hand, show no interest in the Nexus.

Avatar
reblogged
Avatar
argumate
Thus the AI can conclude that e.g. humans don’t want to be wireheaded by observing that they don’t work towards wireheading themselves.

drugs?

Well, if humans don’t want to be wireheaded, they won’t try to wirehead themselves. If humans do want to be wireheaded, they will try to wirehead themselves. Maybe some humans do and some humans don’t. In any case, you can deduce the preference from the behavior.

A more subtle issue is that a human might take drugs as an irrational, spur of the moment decision, and this will induce changes in their brain which will make them want to keep taking drugs indefinitely. Therefore, the model our IRL uses has to account for things like “agent can irrecoverably self-modify for irrational reasons that don’t reflect their (original) values”.

Now, it seems likely that we won’t be able to fully disambiguate between “agent does this because it leads to a good outcome for them” and “agent does this despite this leading to a bad outcome for them, because of erroneous or irrational beliefs.” Nevertheless, I think that a correct realization of IRL should be able to “home in” on human values approximately within the accuracy with which “human values” is a well-defined concept in the first place.
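
As a very rough illustration of "deduce the preference from the behavior", here is a toy Bayesian sketch in Python. It is not an actual IRL algorithm: the two candidate reward tables, the uniform prior and the `rationality` parameter are all assumptions made up for this example. The agent is modelled as noisily rational (it picks better actions more often, but not always), which is one simple way to leave room for occasional irrational choices.

```python
import math
from typing import Dict, List

ACTIONS = ["wirehead", "abstain"]

# Hypothetical reward tables, one per candidate preference.
REWARDS: Dict[str, Dict[str, float]] = {
    "wants_wireheading": {"wirehead": 1.0, "abstain": 0.0},
    "rejects_wireheading": {"wirehead": 0.0, "abstain": 1.0},
}


def action_probability(action: str, rewards: Dict[str, float], rationality: float) -> float:
    """Softmax choice model: higher-reward actions are exponentially more likely."""
    weights = {a: math.exp(rationality * rewards[a]) for a in ACTIONS}
    return weights[action] / sum(weights.values())


def posterior(observed_actions: List[str], rationality: float = 2.0) -> Dict[str, float]:
    """Posterior over the candidate preferences, starting from a uniform prior."""
    scores = {}
    for name, rewards in REWARDS.items():
        likelihood = 1.0
        for action in observed_actions:
            likelihood *= action_probability(action, rewards, rationality)
        scores[name] = 0.5 * likelihood
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}


# An agent that abstains nine times and slips once.
mostly_abstains = ["abstain"] * 9 + ["wirehead"]
beliefs = posterior(mostly_abstains)
print(beliefs)
```

With these made-up numbers, a single lapse barely moves the conclusion that the agent does not want wireheading. The sketch does not model the harder case above, where one irrational act changes the agent's later preferences.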
