Philosophy

Morality, thy discount is hyperbolic

One well-known failure mode of Utilitarian ethics is the “utility monster”: for any value of “benefit” and “suffering”, it’s possible to postulate an entity (Bob) and a course of action (The Plan) such that Bob receives so much benefit that everyone else can suffer arbitrarily great pain and yet you “should” still carry out The Plan.

That this can happen is often used as a reason to not be a Utilitarian. Never mind that there are no known real examples — when something is paraded as a logical universal ethical truth, it’s not allowed to even have theoretical problems, for much the same reason that God doesn’t need to actually “microwave a burrito so hot that even he can’t eat it” for the mere possibility to be a proof that no god is capable of being all-powerful.

I have previously suggested a way to limit this — normalisation — but that was obviously not good enough for a group. What I was looking for then was a way to combine multiple entities in a sensible fashion, and now I’ve found that one already existed: hyperbolic discounting.

Hyperbolic discounting is how we all naturally think about the future: the further into the future a reward is, the less importance we give it. For example, if I ask “would you rather have $15 immediately; or $30 after three months; or $60 after one year; or $100 after three years?”, most people find those options equally desirable, even though nobody’s expecting the US dollar to lose 85% of its value in just three years.
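
As a rough sketch of why those figures look hyperbolic rather than exponential, here are a few lines of Python computing the discount rate each successive pair of options implies. The dollar amounts are the ones above; treating them as exactly equally desirable, and the “implied rate” framing, are my own simplifications.

    import math

    # The four "equally desirable" options from the paragraph above: (dollars, delay in years).
    options = [(15, 0.0), (30, 0.25), (60, 1.0), (100, 3.0)]

    # A constant-rate (exponential) discounter would show the same implied rate
    # between every consecutive pair of options; these figures don't.
    for (a1, t1), (a2, t2) in zip(options, options[1:]):
        implied_rate = math.log(a2 / a1) / (t2 - t1)  # continuous rate implied by indifference
        print(f"${a1} at {t1}y ~ ${a2} at {t2}y -> implied rate of about {implied_rate:.2f}/year")

    # Prints roughly 2.77/year, 0.92/year and 0.26/year: the implied rate falls as the
    # delay grows, which is the signature of hyperbolic rather than exponential discounting.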

Most people do a similar thing with large numbers, although logarithmically rather than hyperbolically. There’s a cliché where presenters convey how big some number X is by saying how many years X seconds amounts to, acting as if it’s surprising that “while a billion seconds is 30 years, a trillion is 30 thousand years”. (Personally I am so used to this truth that it feels weird that they even need to say it.)
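
For what it’s worth, the arithmetic behind the cliché is only a couple of lines; this is just a back-of-envelope check, nothing more.

    SECONDS_PER_YEAR = 60 * 60 * 24 * 365.25  # about 31.6 million

    print(1e9 / SECONDS_PER_YEAR)   # about 31.7: a billion seconds is roughly 30 years
    print(1e12 / SECONDS_PER_YEAR)  # about 31,700: a trillion seconds is roughly 30 thousand years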

Examples of big numbers, in the form of money. Millionaire, Billionaire, Elon Musk, and Bill Gates. Wealth estimates from Wikipedia, early 2020.

So, what’s the point of this? Well, one of the ways the Utility Monster makes a certain category of nerd look like an arse is the long-term future of humanity: the sort of person who (mea culpa) worries about X-risks (eXtinction risks) and says “don’t worry about climate change, worry about non-aligned AI!” to an audience full of people who very much do worry about climate-change-induced extinction, and who think that AI can’t possibly be a real issue when Alexa can’t even figure out which room’s lightbulb it’s supposed to switch on or off.

To put it in concrete terms: if your idea of “long term planning” means you are interested in star-lifting as a way to extend the lifetime of the sun from a few billion years to a few tens of trillions, and you intend to help expand to fill the local supercluster with like-minded people, and your idea of “person” includes a mind running on a computer that’s as power-efficient as a human brain and effectively immortal, then you’re talking about giving 5*10^42 people a multi-trillion-year lifespan.

If you’re talking about giving 5*10^42 people a multi-trillion-year lifespan, you will look like a massive arsehole if you say, for example, that “climate change killing 7 billion people in the next century is a small price to pay for making sure we have an aligned AI to help us with the next bit”. The observation that a non-aligned AI is likely to irreversibly destroy everything it’s not explicitly trying to preserve isn’t going to change anyone’s mind: even if you get past the inferential distance that makes most people hearing about this say “off switch!” (both as a solution to the AI and to whichever channel you’re broadcasting on), at scales like this, utilitarianism feels wrong.

So, where does hyperbolic discounting come in?

We naturally discount the future as a “might not happen”. This is good. No matter how certain we think we are on paper, there is always the risk of an unknown-unknown, and that risk doesn’t go away with more evidence — the classic illustration of the problem of induction is a turkey who observes that every morning the farmer feeds them and so decides that the farmer has their best interests at heart; the longer this goes on, the more certain they become, yet each day brings them closer to being slaughtered for Thanksgiving. The currently-known equivalents in physics are things like false vacuum decay, brane collisions triggering a new big bang, or Boltzmann brains.

Because we don’t know the future, we should discount it. There are not 5*10^42 people. There might never be 5*10^42 people. To the extent that there might be, they might turn out to all be Literally Worse Than Hitler. Sure, there’s a chance of human flourishing on a scale that makes Heaven as described in the Bible seem worthless in comparison — and before you’re tempted to point out that Biblical heaven is supposed to be eternal, which is infinitely longer than any number of trillions of years: that counter-argument presumes Utilitarianism, and therefore doesn’t apply here — but the further away those people and that outcome are from you, the less weight you should put on them ever really existing and on it ever really happening.

Instead: Concentrate on the here-and-now. Concentrate on ending poverty. Fight climate change. Save endangered species. Campaign for nuclear disarmament and peaceful dispute resolution. Not because there can’t be 5*10^42 people in our future light cone, but because we can falter and fail at any and every step on the way from here.

AI, Futurology, Opinion, Philosophy

Memetic monocultures

Brief kernel of an idea:

  1. Societies deem certain ideas “dangerous”.
  2. If it is possible to technologically eliminate perceived dangers, we can be tempted to do so, even when we have perceived wrongly.
  3. Group-think has led to catastrophic misjudgments.
  4. This represents a potential future “great filter” for the Fermi paradox. It does not apply to previous attempts at eliminating dissenting views, as they were social, not technological, in nature, and limited in geographical scope.
  5. This risk has not yet become practical, but we shouldn’t feel complacent just because brain-computer interfaces are basic and indoctrinal viruses are fictional: universal surveillance is already sufficient and affordable, limited only by the sufficiently advanced AI needed to assist human overseers (perfect AI not required).
Philosophy

Dot product morality

The first time I felt confused about morality was as a child. I was about six, and saw a D&D-style role-playing magazine; on the cover, there were two groups preparing to fight, one dressed as barbarians, the other as soldiers or something¹. I asked my brother “Which are the goodies and which are the baddies?”, and I couldn’t understand him when he told me that neither of them were.

Boolean.

When I was 14 or so, in the middle of a Catholic secondary school, I discovered neopaganism; instead of the Bible and the Ten Commandments, I started following the Wiccan Rede (if it doesn’t hurt anyone, do what you like). Initially I still suffered from the hubris of black-and-white thinking, even though I’d witnessed others falling into that trap and thought poorly of them for it, but eventually my exposure to alternative religious and spiritual ideas made me recognise that morality is shades of grey.

Float.

Because of the nature of the UK education system, between school and university I spent two years doing A-levels, and one of the subjects I studied was philosophy. Between repeated failures to prove God exists, we covered ethics, specifically emotivism, AKA the hurrah/boo theory, which claims there are no objective morals and that moral claims are merely expressions of emotional attitudes. The standard response at this point is to claim that “murder is wrong” is objective, at which point someone demonstrates that plenty of people disagree with you about what counts as murder (abortion, execution, deaths in war, death by dangerous driving, meat, that sort of thing). I don’t think I understood it at that age, any more than I understood my brother saying “neither” when I was six; it’s hard to be sure after so much time.

Then I encountered complicated people: people who could be incredibly moral on one axis, and monsters on another. I can’t remember the exact example that first showed me this, but I have plenty to choose from now — on a national scale, the British Empire did a great deal to end slavery, yet acted in appalling ways towards many of the people under its rule; on an individual scale, you can find scandals for Gandhi and Churchill, not just obvious modern examples of formerly-liked celebrities like Kevin Spacey and Rolf Harris. In all cases, saying someone is “evil” or “not evil”, or even “0.5 on the 0-1 evil axis”, is misleading — you can trust Churchill 100% to run 1940 UK while simultaneously refusing to trust him (0% trust) to care about anyone who wasn’t a white Protestant, though obviously your percentages might be different.

I’ve been interested in artificial intelligence and artificial neural networks for longer than I’ve been able to follow the maths. When you, as a natural neural network, try to measure something, you do so with a high-dimensional vector-space of inputs (well, many such spaces, each layered on top of each other, with the outputs of one layer being the inputs of the next layer) and that includes morality.

When you ask how moral someone else is, how moral some behaviour is, what you’re doing is essentially a dot-product of your moral code with their moral code. You may or may not filter that down into a single “good/bad” boolean afterwards — that’s easy for a neural network, and makes no difference.
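
To make the metaphor concrete, here is a toy sketch. The moral axes and the numbers are invented purely for illustration, and I’ve normalised the dot product so that vector length doesn’t dominate; none of this is a claim about how real moral judgements are weighted.

    import numpy as np

    # Hypothetical moral axes, e.g. [honesty, loyalty, care, fairness]; values invented for illustration.
    my_moral_code = np.array([0.9, 0.2, 0.8, 0.7])
    their_moral_code = np.array([0.8, 0.9, 0.1, 0.3])

    # "How moral do they seem to me?" as a (normalised) dot product of the two codes.
    similarity = np.dot(my_moral_code, their_moral_code) / (
        np.linalg.norm(my_moral_code) * np.linalg.norm(their_moral_code)
    )

    print(f"similarity: {similarity:.2f}")  # a float: shades of grey
    print(f"good/bad: {similarity > 0.5}")  # optionally squashed into a boolean afterwards

The Churchill case above then shows up as someone scoring high along one axis and low along another, which a single scalar (let alone a boolean) throws away.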

¹ I can’t remember exactly, but it doesn’t matter.

Philosophy, Psychology

A life’s work

There are 2.5 billion seconds in a lifetime and (as of December 2018) 7.7 billion humans on the planet.

If you fight evil one-on-one, if you refuse to pick your battles, if only 1% of humans are sociopaths, you’ve got 21 waking seconds per opponent — and you’ll be fighting your whole life, from infancy to your dying breath and from when you wake to when you sleep, with no holiday, no weekends, no retirement.

Conversely, if you are a product designer, and five million people use your stuff once per day, every second you save them saves a waking lifetime of waiting per year. If you can relieve a hundred thousand people of just 5 minutes anxiety each day (say, about social media notifications), you’re saving six and a half waking lifetimes of anxiety every year.
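
Here is the arithmetic behind those figures, as a sketch. The only assumption I’m adding is that roughly two-thirds of a lifetime is spent awake, which is what the “21 waking seconds” figure seems to imply.

    LIFETIME_SECONDS = 2.5e9   # roughly 80 years
    HUMANS = 7.7e9             # world population, December 2018
    WAKING_FRACTION = 2 / 3    # assumption: about 16 waking hours a day
    WAKING_LIFETIME = LIFETIME_SECONDS * WAKING_FRACTION

    # Fighting 1% of humanity one-on-one:
    print(WAKING_LIFETIME / (0.01 * HUMANS))     # about 21.6 waking seconds per opponent

    # Saving five million daily users one second each:
    print(5e6 * 1 * 365 / WAKING_LIFETIME)       # about 1.1 waking lifetimes of waiting per year

    # Relieving a hundred thousand people of five minutes of anxiety a day:
    print(1e5 * 5 * 60 * 365 / WAKING_LIFETIME)  # about 6.6 waking lifetimes of anxiety per year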

When people complained about the cost of the Apollo programs, someone said Americans spent more on haircuts in the same time. How many Apollo programs of joy are wasted tapping on small red dots or waiting for them?

Minds, Philosophy, Psychology

One person’s nit is another’s central pillar

If one person believes something is absolutely incontrovertibly true, then my first (and demonstrably unhelpful) reaction is that even the slightest demonstration of error should demolish the argument.

I know this doesn’t work.

People don’t make Boolean-logical arguments; they go with gut feelings that act much like Bayesian-logical inferences. If someone says something is incontrovertible, the incontrovertibility isn’t their central pillar — when I treated it as one, I totally failed to change their minds.

Steel man your arguments. Go for your opponent’s strongest point, but make sure it’s what your opponent is treating as their strongest point, for if you make the mistake I have made, you will fail.

If your Bayesian prior is 99.9%, you might reasonably (in common use of the words) say the evidence is incontrovertible; someone who hears “incontrovertible” and points out a minor edge case isn’t going to shift your posterior odds by much, are they?

They do? Are we thinking of the same things here? I don’t mean things where absolute truth is possible (i.e. maths, although I’ve had someone argue with me about that in a remarkably foolish way too); I mean observations about reality, which are necessarily flawed. Flawed, and sometimes circular.

Concrete example, although I apologise to any religious people in advance if I accidentally nut-pick. Imagine a Bible-literalist Christian called Chris (who thinks only 144,000 will survive the apocalypse, and no I’m not saying Chris is a Jehovah’s Witness, they’re just an example of 144k beliefs) arguing with Atheist Ann, specifically about “can God make a rock so heavy that God cannot move it?”:

P(A) = 0.999 (Bayesian prior: how certain Chris’s belief in God is)
P(B) = 1.0 (Observation: the argument has been made and Ann has not been struck down)
P(B|A) = 0.99979 (Probability that God has not struck down Ann for blasphemy, given that God exists — In the Bible, God has sometimes struck down non-believers, so let’s say about 21 million deaths of the 100 billion humans that have ever lived to cover the flood, noting that most were not in the 144k)

P(A|B) = P(B|A)P(A)/P(B) = 0.99979×0.999/1.0 = 0.99879021

Almost unchanged.

It gets worse; the phrase “I can’t believe what I’m hearing!” means P(B) is less than 1.0. If P(B) is less than 1.0 but all the rest is the same:

P(B) = 0.9 → P(A|B) = P(B|A)P(A)/P(B) = 0.99979×0.999/0.9 = 1.1097669

Oh no, it went up! Also, probability error: a probability can never exceed 1.0! P>1.0 would be a problem if I was discussing real probabilities — if this was a maths test, it would fail (P(B|A) should be reduced correspondingly) — but people demonstrably don’t always update their whole internal model at the same time: if we did, cognitive dissonance would be impossible. Depending on the level of the thinking (I suspect direct processing in synapses won’t do this, but that deliberative conscious thought can), we can sometimes fall into traps like this, so this totally explains another observation: some people can take the mere existence of people who disagree with them as a reason to believe even more strongly.
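
For anyone who wants to poke at the numbers, the whole calculation fits in a few lines of Python. The figures are the ones above (naive_update is just a name I’m using for the blind application of Bayes’ theorem), and the point is the mechanics, not the theology.

    def naive_update(p_a, p_b, p_b_given_a):
        """Bayes' theorem applied blindly: P(A|B) = P(B|A) * P(A) / P(B)."""
        return p_b_given_a * p_a / p_b

    p_a = 0.999            # Chris's prior that God exists
    p_b_given_a = 0.99979  # P(Ann not struck down | God exists)

    print(naive_update(p_a, 1.0, p_b_given_a))  # about 0.99879: almost unchanged
    print(naive_update(p_a, 0.9, p_b_given_a))  # about 1.10977: not a valid probability, because
                                                # P(B|A) wasn't reduced to match the new P(B),
                                                # i.e. the model wasn't updated consistently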

Philosophy

Mathematical Universe v. Boltzmann Brains

I’m a fan of the Mathematical Universe idea. Or rather, I was. I think I came up with the idea independently of (and before) Max Tegmark, based on an old LiveJournal blog post of mine dated “2007-01-12” (from context, I think that’s YYYY-MM-DD, not YYYY-DD-MM).

Here’s what I wrote then, including typos and poor rhetorical choices:

Ouch, my mind hurts. I've been thinking about The Nature of Reality again. This time, what I have is the idea that from the point of view of current science, the universe can be described as a giant equation: each particle obeys the laws of physics, which are just mathematical formula. Add to this that an mathematical system can exist before anyone defines it (9*10 was still equal to 90 before anybody could count that high), and you get reality existing because its underlying definitions do not contradict each-other.

This would mean that there are a lot of very simple, for lack of a better word, "universes" along the lines of the one containing only Bob and Sarah, where Sarah is three times the age of Bob now, and will be twice his age in 5 years' time. But it would also mean that there are an infinite number of universes which are, from the point of view of an external observer looking at the behaviour of those within them, completely indistinguishable from this one; this would be caused by, amongst other things, the gravitational constant being represented by an irrational number, and the difference between the different universes' gravitational constants varies by all possible fractions (in the everyday sense) of one divided by Graham's number.

Our universe contains representations of many more simple ones (I've described a simple one just now, and you get hundreds of others "universes" of this type in the mathematics books you had at school); you cannot, as an outside observer, interfere with such universes, because all you end up with is another universe. The original still exists, and the example Sarah is still 15. In this sense of existence, the Stargate universe is real because it follows fundamental rules which do not contradict themselves. These rules are of course not the rules the characters within it talk about, but the rules of the Canadian TV industry. There may be another universe where the rules the characters talk about do apply, but I'm not enough of a Stargate nerd to know if they are consistent in that way.

The point of this last little diversion, is that there could be (and almost certainly is) a universe much more complex than this one, which contains us as a component. The question, which I am grossly unqualified to contemplate but tried anyway (hence my mind hurting), is what is the most complex equation possible? (Apart from "God" in certain senses of that word). All I feel certain of at the moment, is that it would "simultaneously" (if you can use that word for something outside of time but containing it) contain every possible afterlife for every possible subset of people.

Tomorrow I will be in Cambridge.

Since writing that, I found out about Boltzmann brains. Boltzmann brains are a problem because, if they exist at all, then it is (probably) overwhelmingly likely that you are one; and if you are one, then it’s overwhelmingly likely that you’re wrong about everything leading up to the belief that they exist, so any belief in them has to be irrational even if it’s also correct.

Boltzmann brains appear spontaneously in systems which are in thermal equilibrium for long enough (“long enough” being around 10^10^50 years for quantum fluctuations), but if you have all possible universes then you have a universe (an infinite number of universes, in fact) where Boltzmann brains are the most common form of brain. Therefore, all the problems that apply to Boltzmann brains must also apply to the Mathematical Universe.

AI, Minds, Philosophy, Politics

A.I. safety with Democracy?

Common path of discussion:

Alice: A.I. can already be dangerous, even though it’s currently narrow intelligence only. How do we make it safe before it’s general intelligence?

Bob: Democracy!

Alice: That’s a sentence fragment, not an answer. What do you mean?

Bob: Vote for what you want the A.I. to do 🙂

Alice: But people ask for what they think they want instead of what they really want — this leads to misaligned incentives/paperclip optimisers, or pathological focus on universal instrumental goals like money or power.

Bob: Then let’s give the A.I. to everyone, so we’re all equal and anyone who tells their A.I. to do something daft can be countered by everyone else.

Alice: But that assumes the machines operate at the same speed we do. If we assume that an A.G.I. can be made by duplicating a human brain’s connectome in silicon — mapping synapses to transistors — then even with no more Moore’s Law, an A.G.I. would be out-pacing our thoughts by the same margin a pack of wolves outpaces continental drift (and would fit in the volume of a few dozen grains of sand).

Because we’re much too slow to respond to threats ourselves, any helpful A.G.I. working to stop a harmful A.G.I. would have to know what to do before we told it; yet if we knew how to make them work like that, then we wouldn’t need to, as all A.G.I. would stop themselves from doing anything harmful in the first place.

Bob: Balance of powers, just like governments — no single A.G.I can get too big, because all the other A.G.I. want the same limited resource.

Alice: Keep reading that educational webcomic. Even in the human case (and we can’t trust our intuition about the nature of an arbitrary A.G.I.), separation of powers only works if you can guarantee that those who seek power don’t collude. Since humans do collude, an A.G.I. (even one which seeks power only as an instrumental goal for some other cause) can be expected to collude with other similar A.G.I. (“A.G.I.s”? How do you pluralise an initialism?)


There’s probably something that should follow this, but I don’t know what, as real conversations usually go stale well before my final Alice response (and even that might have been too harsh and conversation-stopping; I’d like to dig deeper and find out what happens next).

I still think we ultimately want “do what I meant, not what I said”, but at the very least that’s really hard to specify, and at worst I’m starting to worry that some (too many?) people may be unable to cope with the possibility that some of the things they want are incoherent or self-contradictory.

Whatever the solution, I suspect that politics and economics both have a lot of lessons available to help the development of safe A.I. — both limited A.I. that currently exists and also potential future tech such as human-level general A.I. (perhaps even super-intelligence, but don’t count on that).
