AI, Minds, Philosophy, Politics

A.I. safety with Democracy?

Common path of discussion:

Alice: A.I. can already be dangerous, even though it’s currently narrow intelligence only. How do we make it safe before it’s general intelligence?

Bob: Democracy!

Alice: That’s a sentence fragment, not an answer. What do you mean?

Bob: Vote for what you want the A.I. to do 🙂

Alice: But people ask for what they think they want instead of what they really want — this leads to misaligned incentives/paperclip optimisers, or pathological focus on universal instrumental goals like money or power.

Bob: Then let’s give the A.I. to everyone, so we’re all equal and anyone who tells their A.I. to do something daft can be countered by everyone else.

Alice: But that assumes the machines operate at the same speed we do. If we assume that an A.G.I. can be made by duplicating a human brain’s connectome in silicon — mapping synapses to transistors — then even with no more Moore’s Law, an A.G.I. would be out-pacing our thoughts by the same margin a pack of wolves outpaces continental drift (while fitting into the volume of a few dozen grains of sand).

Because we’re much too slow to respond to threats ourselves, any helpful A.G.I. working to stop a harmful A.G.I. would have to know what to do before we told it; yet if we knew how to make them work like that, then we wouldn’t need to, as all A.G.I. would stop themselves from doing anything harmful in the first place.

Bob: Balance of powers, just like governments — no single A.G.I. can get too big, because all the other A.G.I. want the same limited resource.

Alice: Keep reading that educational webcomic. Even in the human case (and we can’t trust our intuition about the nature of an arbitrary A.G.I.), separation of powers only works if you can guarantee that those who seek power don’t collude. Since humans do collude, an A.G.I. (even one which seeks power only as an instrumental goal for some other cause) can be expected to collude with other similar A.G.I. (“A.G.I.s”? How do you pluralise an initialism?)


There’s probably something that should follow this, but I don’t know what, as real conversations usually go stale well before my final Alice response (and even that might have been too harsh and conversation-stopping; I’d like to dig deeper and find out what happens next).

I still think we ultimately want “do what I meant, not what I said”, but at the very least that’s really hard to specify, and at worst I’m starting to worry that some (too many?) people may be unable to cope with the possibility that some of the things they want are incoherent or self-contradictory.

Whatever the solution, I suspect that politics and economics both have a lot of lessons available to help the development of safe A.I. — both limited A.I. that currently exists and also potential future tech such as human-level general A.I. (perhaps even super-intelligence, but don’t count on that).

AI, Philosophy

Unfortunate personhood tests for A.I.

What if the only way to tell whether a particular A.I. design is or is not a person is to subject it to all the types of experience — both good and harrowing — that we know impact the behaviour of the only example of personhood we all agree on, and see if it changes in the same way we change?

Is it moral to create a digital hell for a thousand, if that’s the only way to prevent carbon chauvinism/anti-silicon discrimination for a billion?

Futurology, Science, AI, Philosophy, Psychology

How would you know whether an A.I. was a person or not?

I did an A-level in Philosophy. (For non-UK people, A-levels are a two-year course taken after high school and before university).

I did it for fun rather than good grades — I had enough good grades to get into university, and when the other A-levels required my focus, I was fine putting zero further effort into the Philosophy course. (Something which was very clear when my final results came in).

What I didn’t expect at the time was that the rapid development of artificial intelligence in my lifetime would make it absolutely vital that humanity develops a concrete and testable understanding of what counts as a mind, as consciousness, as self-awareness, and as capability to suffer. Yes, we already have that problem in the form of animal suffering and whether meat can ever be ethical, but the problem which already exists, exists only for our consciences: the animals can’t take over the world and treat us the way we treat them. An artificial mind, on the other hand, would be almost totally pointless if it were as limited as an animal, and the general aim is quite a lot higher than that.

Some fear that we will replace ourselves with machines which may be very effective at what they do, but don’t have anything “that it’s like to be”. One of my fears is that we’ll make machines that do “have something that it’s like to be”, but who suffer greatly because humanity fails to recognise their personhood. (A paperclip optimiser doesn’t need to hate us to kill us, but I’m more interested in the sort of mind that can feel what we can feel).

I don’t have a good description of what I mean by any of the normal words. Personhood, consciousness, self awareness, suffering… they all seem to skirt around the core idea, but to the extent that they’re correct, they’re not clearly testable; and to the extent that they’re testable, they’re not clearly correct. A little like the maths-vs.-physics dichotomy.

Consciousness? Versus what, subconscious decision making? Isn’t this distinction merely system 1 vs. system 2 thinking? Even then, the word doesn’t tell us what it means to have it objectively, only subjectively. In some ways, some forms of A.I. look like system 1 — fast but error-prone, based on heuristics — while other forms look like system 2 — slow and careful, deliberately weighing all the options.

Self-awareness? What do we even mean by that? It’s absolutely trivial to make an A.I. aware of its own internal states, even necessary for anything more than a perceptron. Do we mean a mirror test? (Or a non-visual equivalent for non-visual entities, including both blind people and also smell-focused animals such as dogs). That at least can be tested.

Capability to suffer? What does that even mean in an objective sense? Is suffering equal to negative reinforcement? If you have only positive reinforcement, is the absence of reward itself a form of suffering?

Introspection? As I understand it, the human psychology of this is that we don’t really introspect; we use system 2 thinking to confabulate justifications for what system 1 thinking made us feel.

Qualia? Sure, but what is one of these as an objective, measurable, detectable state within a neural network, be it artificial or natural?

Empathy or mirror neurons? I can’t decide how I feel about this one. At first glance, if one mind can feel the same as another mind, that seems like it should capture the general, ill-defined concept I’m after… but then I realised I don’t see why that would follow, and had the temporarily disturbing mental image of an A.I. which can perfectly mimic the behaviour corresponding to the emotional state of someone it’s observing, without actually feeling anything itself.

And then the disturbance went away as I realised this is obviously trivially possible, because even a video recording fits that definition… or, hey, a mirror. A video recording somehow feels like it’s fine: it isn’t “smart” enough to be imitating, merely accurately reproducing. (Now that I think about it, is there an equivalent issue with the mirror test?)

So, no, mirror neurons are not enough to be… to have the qualia of being consciously aware, or whatever you want to call it.

I’m still not closer to having answers, but sometimes it’s good to write down the questions.

AI, Software

Speed of machine intelligence

Every so often, someone tries to boast of human intelligence with the story of Shakuntala Devi — the stories vary, but they generally claim she beat the fastest supercomputer in the world in a feat of arithmetic, finding that the 23rd root of

916,748,676,920,039,158,098,660,927,585,380,162,483,106,680,144,308,622,407,126,516,427,934,657,040,867,096,593,279,205,767,480,806,790,022,783,016,354,924,852,380,335,745,316,935,111,903,596,577,547,340,075,681,688,305,620,821,016,129,132,845,564,805,780,158,806,771

was 546,372,891, taking just 50 seconds to do so compared to the “over a minute” needed by her computer competitor.

Ignoring small details such as the “supercomputer” being named as a UNIVAC 1101, which was wildly obsolete by the time of this event, this story dates to 1977 — and Moore’s Law over the 41 years since has made computers mind-defyingly more powerful (if it were as simple as doubling in power every 18 months, that would be a factor of 2^(41/1.5) ≈ 169,103,740; but Wikipedia shows even greater improvements on even shorter timescales, going from the Cray X-MP in 1984 to standard consumer CPUs and GPUs in 2017, a factor of 1,472,333,333 improvement at fixed cost in only 33 years).
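That doubling figure is easy to check with a one-liner, using the same rule-of-thumb assumption of one doubling every 18 months:

# Compound Moore's-Law growth: one doubling every 18 months over the 41 years since 1977
print(2 ** (41 / 1.5))  # ~169,103,740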

So, how fast are computers now? Well, here’s a small script to find out:

#!python

from datetime import datetime

before = datetime.now()

# The 201-digit number from the Shakuntala Devi record attempt
q = 916748676920039158098660927585380162483106680144308622407126516427934657040867096593279205767480806790022783016354924852380335745316935111903596577547340075681688305620821016129132845564805780158806771

# Repeat the calculation 3,450,000 times, purely so it runs long enough to time accurately
for x in range(0, int(3.45e6)):
    a = q**(1./23)  # 23rd root, via floating-point exponentiation

after = datetime.now()

print after-before

It calculates the 23rd root of that number. It times itself as it does the calculation three million four hundred and fifty thousand times, repeating the calculation just to slow it down enough to make the time reading accurate.

Let’s see how long it takes…

MacBook-Air:python kitsune$ python 201-digit-23rd-root.py 
0:00:01.140248
MacBook-Air:python kitsune$

1.14 seconds — to do the calculation 3,450,000 times.

My MacBook Air is an old model from mid-2013, and I’m already beating, by more than a factor of 150 million, someone who was (despite the oddities of the famous story) in the Guinness Book of Records for her mathematical abilities.

It gets worse, though. The next thing people often say is, paraphrased, “oh, but it’s cheating to program the numbers into the computer when the human had to read it”. Obviously the way to respond to that is to have the computer read for itself:

from sklearn import svm
from sklearn import datasets
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm

# Find out how fast it learns
from datetime import datetime
# When did we start learning?
before = datetime.now()

clf = svm.SVC(gamma=0.001, C=100.)
digits = datasets.load_digits()
size = len(digits.data)/10
clf.fit(digits.data[:-size], digits.target[:-size])

# When did we stop learning?
after = datetime.now()
# Show user how long it took to learn
print "Time spent learning:", after-before

# When did we start reading?
before = datetime.now()
maxRepeats = 100
for repeats in range(0, maxRepeats):
	for x in range(0, size):
		data = digits.data[-x]
		prediction = clf.predict(digits.data[-x])

# When did we stop reading?
after = datetime.now()
print "Number of digits being read:", size*maxRepeats
print "Time spent reading:", after-before

# Show mistakes:
for x in range(0, size):
	data = digits.data[-x]
	target = digits.target[-x]
	prediction = clf.predict(digits.data[-x])
	if (target!=prediction):
		print "Target: "+str(target)+" prediction: "+str(prediction)
		grid = data.reshape(8, 8)
		plt.imshow(grid, cmap = cm.Greys_r)
		plt.show()

This learns to read using a standard dataset of hand-written digits, then reads all the digits in that set a hundred times over, then shows you what mistakes it’s made.

MacBook-Air:AI stuff kitsune$ python digits.py 
Time spent learning: 0:00:00.225301
Number of digits being read: 17900
Time spent reading: 0:00:02.700562
Target: 3 prediction: [5]
Target: 3 prediction: [5]
Target: 3 prediction: [8]
Target: 3 prediction: [8]
Target: 9 prediction: [5]
Target: 9 prediction: [8]
MacBook-Air:AI stuff kitsune$ 

0.225 seconds to learn to read, from scratch; then it reads about 6,600 digits per second. That learning time is comparable with both the duration of a human blink (0.1-0.4 seconds) and with many of the claims* I’ve seen about human visual processing time, from retina to recognising text.

The A.I. is not reading perfectly, but looking at the mistakes it does make, several of them are forgivable even for a human. They are hand-written digits, and some of them look, even to me, more like the number the A.I. saw than the number that was supposed to be there — indeed, the human error rate for similar examples is 2.5%, while this particular A.I. has an error rate of 3.35%.
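For the record, the headline figures above fall straight out of the numbers the script printed; a quick sanity check:

# Sanity check on the figures quoted above: throughput and error rate from the run shown
digits_read = 17900              # "Number of digits being read"
reading_time = 2.700562          # seconds, "Time spent reading"
mistakes, test_digits = 6, 179   # six wrong predictions out of the 179 held-out digits

print(digits_read / reading_time)      # ~6,628 digits per second
print(100.0 * mistakes / test_digits)  # ~3.35% error rate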

* I refuse to assert those claims are entirely correct, because I don’t have any formal qualification in that area, but I do have experience of people saying rubbish about my area of expertise — hence this blog post. I don’t intend to make the same mistake.

AI, Philosophy

Nietzsche, Facebook, and A.I.

“If you stare into The Facebook, The Facebook stares back at you.”

I think this fits the reality of digital surveillance much better than it fits the idea Nietzsche was trying to convey when he wrote the original.

Facebook and Google look at you with an unblinking eye; they look at all of us they can reach, even those without accounts; two billion people on Facebook, their every keystroke recorded, even those they delete; every message analysed, even those never sent; every photo processed, even those kept private; on Google Maps, every step taken or turn missed, every place where you stop, becomes an update for the map.

We’re lucky that A.I. isn’t as smart as a human, because if it were, such incomprehensible breadth and depth of experience would make Sherlock look like an illiterate child raised by wild animals. Even without hypothesising new technologies that a machine intelligence may or may not invent, even just a machine that does exactly what it’s told by its owner… this dataset alone ought to worry anyone who fears the thumb of a totalitarian micro-managing their life.

AI, Futurology

The end of human labour is inevitable, here’s why

OK. So, you might look at state-of-the-art A.I. and say “oh, this uses too much power compared to a human brain” or “this takes too many examples compared to a human brain”.

So far, correct.

But there are 7.6 billion humans: if an A.I. watches all of them all of the time (easy to imagine given around 2 billion of us already have two or three competing A.I. in our pockets all the time, forever listening for an activation keyword), then there is an enormous set of examples with which to train the machine mind.

“But,” you ask, “what about the power consumption?”

Humans cost a bare minimum of $1.25 per day, even if they’re literally slaves and you only pay for food and (minimal) shelter. Solar power can be as cheap as 2.99¢/kWh.

Combined, that means that any A.I. which uses less than 1.742 kilowatts per human-equivalent-part is beating the cheapest possible human. By way of comparison, Google’s first-generation Tensor Processing Unit uses 40 W when busy; in the domain of Go, it’s about 174,969 times as cost-efficient as a minimum-cost human, because four of them working together as one can teach itself to play Go better than the best human in… three days.
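The 1.742 kW figure is just those two prices divided out; a quick sketch of the arithmetic:

# Break-even power budget: the point at which an A.I.'s electricity bill matches
# the cheapest possible human labour cost quoted above
human_cost_per_day = 1.25     # dollars per day
solar_cost_per_kwh = 0.0299   # dollars per kWh

kwh_per_day = human_cost_per_day / solar_cost_per_kwh  # ~41.8 kWh per day for the same money
print(kwh_per_day / 24)  # ~1.742 kW of continuous power per human-equivalent-part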

And don’t forget that it’s reasonable for A.I. to have as many human-equivalent-parts as there are humans performing whichever skill is being fully automated.

Skill. Not sector, not factory, skill.

And when one skill is automated away, when the people who performed that skill go off to retrain on something else, no matter where they are or what they do, there will be an A.I. watching them and learning with them.

Is there a way out?

Sure. All you have to do is make sure you learn a skill nobody else is learning.

Unfortunately, there is a reason why “thinking outside the box” is such a business cliché: humans suck at that style of thinking, even when we know what it is and why it’s important. We’re too social, we copy each other and create by remixing more than by genuinely innovating, even when we think we have something new.

Computers are, ironically, better than humans at thinking outside the box: two of the issues in Concrete Problems in AI Safety are there because machines easily stray outside the boxes we are thinking within when we give them orders. (I suspect that one of the things which forces A.I. to need far more examples to learn things than we humans do is that they have zero preconceived notions, and therefore must be equally open-minded to all possibilities).

Worse, no matter how creative you are, if other humans see you performing a skill that machines have yet to master, those humans will copy you… and then the machines, even today’s machines, will rapidly learn from all the enthusiastic humans who are so gleeful about their new trick for staying one step ahead of the machines, the new skill they can point to and say “look, humans are special, computers can’t do this”, right up until the computers do it.

AI

Would this be a solution to the problem of literal-Genie omniscient AIs?

[stupivalent: neither malevolent nor benevolent, just doing exactly what it was told without awareness that what you said isn’t what you meant]

Imagine an AI that, as per [Robert Miles’ YouTube videos], has a perfect model of reality, that has absolutely no ethical constraints, and that is given the instruction “collect as many stamps as possible”.

Could the bad outcome be prevented if the AI was built to always add the following precondition, regardless of what it was tasked by a human to achieve?

“Your reward function is measured in terms of how well the person who gave you the instruction would have reacted if they had heard, at the moment they gave you the instruction, what you were proposing to do.”
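In code terms, I imagine this as wrapping whatever planner the AI has in a predicted-approval check. The following is only a sketch, and every name in it (ApprovalGatedAgent, propose, predicted_approval and so on) is hypothetical rather than any real API:

# Sketch of the proposed precondition: rank candidate plans not by the task metric
# (stamps collected) but by how the instruction-giver would have reacted to the plan
# at the moment they gave the instruction. All classes and methods here are hypothetical.
class ApprovalGatedAgent:
    def __init__(self, planner, approval_model):
        self.planner = planner                # proposes candidate plans for a task
        self.approval_model = approval_model  # predicts the giver's reaction, 0..1

    def choose_plan(self, instruction, context):
        candidates = self.planner.propose(instruction, context)
        # Reward is predicted approval, judged as if the plan had been described to the
        # instruction-giver at the moment the instruction was given.
        return max(candidates,
                   key=lambda plan: self.approval_model.predicted_approval(instruction, plan, context))

Everything difficult is hidden inside the approval model, of course; that is what the rest of this post worries about.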

One might argue that Robert Miles’ stamp collector AI is a special case, as it is presupposed to model reality perfectly. I think such an objection is unreasonable: models don’t have to be perfect to cause the problems he described, and models don’t have to be perfect to at least try to predict what someone would have wanted.

How do you train an AI to figure out what people will and won’t approve of? I’d conjecture having the AI construct stories, tell those stories to people, and learn through story-telling what people consider to be “happy endings” and “sad endings”. Well, construct and read, but it’s much harder to teach a machine to read than it is to teach it to write — we’ve done the latter, the former might be Turing-complete.
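The learning half of that loop, at least, is easy to sketch with today’s tools. The snippet below is a toy: it assumes sklearn, and a handful of made-up stories standing in for ones written by the machine and labelled by real readers:

# Toy sketch: learn "happy ending" vs "sad ending" judgements from labelled stories,
# then score a newly proposed outcome. The stories and labels are placeholders for
# data gathered from real readers of machine-written stories.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

stories = [
    "The collector asked permission, traded fairly, and everyone was pleased.",
    "The collector turned the whole planet into stamps and nobody was left to object.",
    "The assistant finished the chores early and the family went on holiday.",
    "The assistant locked the family indoors so the house would stay tidy forever.",
]
labels = [1, 0, 1, 0]  # 1 = readers called this a happy ending, 0 = a sad one

approval = make_pipeline(TfidfVectorizer(), LogisticRegression())
approval.fit(stories, labels)

proposal = "The collector dismantled the post office to maximise stamp output."
print(approval.predict_proba([proposal])[0][1])  # predicted probability of a happy ending

With four examples this predicts nothing useful, of course; the point is only the shape of the loop: write, show, collect verdicts, refit.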

Disclaimer: I have an A-level in philosophy, but it’s not a good one. I’m likely to be oblivious to things that proper philosophers consider common knowledge. I’ve also been spending most of the last 18 months writing a novel and only covering recent developments in AI in my spare time.
