AI, Minds, Philosophy, Politics

A.I. safety with Democracy?

Common path of discussion:

Alice: A.I. can already be dangerous, even though it’s currently narrow intelligence only. How do we make it safe before it’s general intelligence?

Bob: Democracy!

Alice: That’s a sentence fragment, not an answer. What do you mean?

Bob: Vote for what you want the A.I. to do 🙂

Alice: But people ask for what they think they want instead of what they really want — this leads to misaligned incentives/paperclip optimisers, or pathological focus on universal instrumental goals like money or power.

Bob: Then let’s give the A.I. to everyone, so we’re all equal and anyone who tells their A.I. to do something daft can be countered by everyone else.

Alice: But that assumes the machines operate at the same speed we do. If we assume that an A.G.I. can be made by duplicating a human brain’s connectome in silicon (mapping synapses to transistors), then even with no more Moore’s Law an A.G.I. would be out-pacing our thoughts by the same margin a pack of wolves outpaces continental drift (and would fit in the volume of a few dozen grains of sand).

Because we’re much too slow to respond to threats ourselves, any helpful A.G.I. working to stop a harmful A.G.I. would have to know what to do before we told it; yet if we knew how to make them work like that, then we wouldn’t need to, as all A.G.I. would stop themselves from doing anything harmful in the first place.

Bob: Balance of powers, just like governments: no single A.G.I. can get too big, because all the other A.G.I. want the same limited resource.

Alice: Keep reading that educational webcomic. Even in the human case (and we can’t trust our intuition about the nature of an arbitrary A.G.I.), separation of powers only works if you can guarantee that those who seek power don’t collude. As humans collude, an A.G.I. (even one which seeks power only as an instrumental goal for some other cause) can be expected to collude with other similar A.G.I. (“A.G.I.s”? How do you pluralise an initialism?)

There’s probably something that should follow this, but I don’t know what, as real conversations usually go stale well before my final Alice response (and even that might have been too harsh and conversation-stopping; I’d like to dig deeper and find out what happens next).

I still think we ultimately want “do what I meant, not what I said”, but at the very least that’s really hard to specify, and at worst I’m starting to worry that some (too many?) people may be unable to cope with the possibility that some of the things they want are incoherent or self-contradictory.

Whatever the solution, I suspect that politics and economics both have a lot of lessons available to help the development of safe A.I. — both limited A.I. that currently exists and also potential future tech such as human-level general A.I. (perhaps even super-intelligence, but don’t count on that).

Scams, Software

Anatomy of a scam, and LiveJournal’s lost passwords

LiveJournal seems to have leaked plain-text passwords.

I found this out because I’ve just received three scam emails that are trying to blackmail me for bitcoin worth [$1600, $1100, $1100].

Here is one of the emails; the others look similar, but each one is phrased slightly differently in a way that suggests a template filled with randomly selected phrases:

It appears that, («REDACTED BUT ACCURATE»), 's your password. You might not know me and you are probably wondering why you are getting this e-mail, right?

in fact, I setup a trojans on the adult vids (adult) web-site and you know what, you visited this website to have fun (you know very well what I mean). When you were watching videos, your internet browser started out functioning like a RDP (Team Viewer) which gave me accessibility of your screen and web cam. and then, my software programs obtained your complete contacts out of your Messenger, Outlook, Facebook, along with emails.

What did I really do?

I made a double-screen video clip. 1st part shows the video you're watching (you have a good taste haha . . .), and 2nd part shows the recording of your web cam.

exactly what should you do?

Well, I think, $1100 is really a fair price for your little hidden secret. You'll make the payment by Bitcoin (if you do not know this, search "how to buy bitcoin" in Google).

(It's case sensitive, so copy and paste it)

Very important:
You've some days to make the payment. (I've a completely unique pixel within this e-mail, and at this moment I am aware that you've read through this email message). If I don't get the BitCoins, I will certainly send out your video recording to all of your contacts including family, coworkers, and so forth. Having said that, if I receive the payment, I'll destroy the video immidiately. If you need evidence, reply with "Yes!" and i'll definitely send out your videos to your 6 contacts. It is a non-negotiable offer, that being said don't waste my personal time and yours by responding to this message.

Here are some of the headers:

X-Spam-Flag: NO
X-Spam-Score: 3.875
X-Spam-Level: ***
X-Spam-Status: No, score=3.875 required=10 tests=[INVALID_MSGID=1.167,
TO_IN_SUBJ=0.1] autolearn=disabled


Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=UTF-8

There are several clues here that it’s a toothless scam, but I suspect some people will fall for it if I don’t blog about it:

  1. There’s duct tape covering my webcam
  2. I don’t use Outlook
  3. There’s duct tape covering my webcam
  4. I block Facebook on my computer, and only use it on my mobile, because it’s a massive time-waster that stops me getting things done
  5. Did I mention there’s duct tape covering my webcam?
  6. I’m not into videos. My teenage years were the dial-up days, where everyone had still pictures or plain text and that was good enough. (Hashtag Four Yorkshiremen Humblebrag). Plus, I’m a furry, so the stuff I like tends to be met with blank stares and the words “I can’t even parse this image”, not “you have a good taste haha”.
  7. See #1
  8. The email is plain text and cannot contain a tracking pixel
  9. There’s still duct tape covering my webcam
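Point 8 can be checked mechanically. A tracking pixel is a remote image, so a mail client needs an HTML part before it will fetch one; a message whose only part is text/plain has nowhere to put it. Here is a minimal sketch using Python's standard `email` module. The function name and the sample message (built from the two headers quoted above plus a phrase from the scam) are illustrative assumptions, and an HTML part is only a precondition for a pixel, not proof of one.

```python
from email import message_from_string


def could_carry_tracking_pixel(raw_email: str) -> bool:
    """Return True if any MIME part is HTML, the precondition for a remote image."""
    msg = message_from_string(raw_email)
    # walk() yields the message itself for single-part mail,
    # and every sub-part for multipart mail.
    return any(part.get_content_type() == "text/html" for part in msg.walk())


# Hypothetical sample: the two headers quoted above, plus a line of body text.
scam = (
    "Content-Transfer-Encoding: 7bit\n"
    "Content-Type: text/plain; charset=UTF-8\n"
    "\n"
    "I've a completely unique pixel within this e-mail\n"
)

# Hypothetical HTML message, for comparison only.
html_mail = (
    "Content-Type: text/html; charset=UTF-8\n"
    "\n"
    "<p>hello</p>\n"
)

print(could_carry_tracking_pixel(scam))
print(could_carry_tracking_pixel(html_mail))
```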

Now, I’m saying LiveJournal in particular is the source of that leaked password, because that password is one I only ever used for LiveJournal. Never anywhere else. (In case you’re wondering, that LiveJournal blog has now been deleted owing to it being totally pointless).

I have confirmed via Troy Hunt’s Have I Been Pwned? that the password is in publicly known databases of leaked passwords. To my surprise, Have I Been Pwned? thinks that password is in use in two places, not one. My own list of personal passwords says I only use it in one place, and the nature of the password does not lend itself to reuse (it’s what you get if you mash a keyboard at random for 13 characters, not anything easily memorised).
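For anyone wanting to run the same check without typing a password into a web page: the Pwned Passwords service offers a range lookup at `https://api.pwnedpasswords.com/range/<first 5 hex characters of the SHA-1>`, which returns lines of the form `SUFFIX:COUNT`, so only a hash prefix ever leaves your machine. The sketch below covers the local half only (hashing and matching) and makes no network call. The function names are my own, and the sample response body is invented for illustration, not real data. As far as I understand it, the count is the number of times the password appears in the loaded breach data.

```python
import hashlib


def split_sha1(password: str) -> tuple:
    """Return (prefix, suffix): the first 5 and remaining 35 hex chars of the SHA-1."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_from_range_response(suffix: str, body: str) -> int:
    """Find our suffix in a 'SUFFIX:COUNT' response body; 0 means not listed."""
    for line in body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0


prefix, suffix = split_sha1("password")
# Only `prefix` would be sent to the service; the match happens locally.
# Invented response body for illustration: one decoy line plus our suffix.
fake_body = "0000000000000000000000000000000000A:1\n" + suffix + ":2\n"
print(prefix, count_from_range_response(suffix, fake_body))
```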

Slightly more worrying is that when I duckduckgo’ed (Google found nothing) for the bitcoin addresses to see if they were known, one gave a single result for the domain, and another gave a single result for — I have no reason to suspect either of those domains wittingly contained these bitcoin addresses, but this may be connected to a recent-ish cryptojacking attack where many reputable websites included a third-party JavaScript library which had itself been hacked to mine bitcoin on the computers of unsuspecting users of unsuspecting websites.

When I’ve figured out the appropriate authorities, I’ll be reporting these emails to them.


Dynamic range of Bayesian thought

We naturally use something close to Bayesian logic when we learn and intuit. Bayesian logic doesn’t update when the prior is 0 or 1. Some people can’t shift their opinions, no matter what evidence they have. This is compatible with them having priors of 0 or 1.

It would be implausible for humans to store neural weights with ℝeal numbers. How many bits (base-2) do we use to store our implicit priors? My gut feeling says it’s a shockingly small number, perhaps 4.

How little evidence do we need to become trapped in certainty? Is it even constant (or close to) for all humans?
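To make the question concrete, here is a toy sketch. It assumes (purely for illustration, not as a claim about brains) that a belief is a single probability updated by Bayes' rule and then rounded to the nearest value representable in a few bits. The 4-bit width is the gut-feeling figure above; the rounding scheme, the function names, and the likelihood ratio of 3 are all my own assumptions.

```python
def bayes_update(prior: float, likelihood_ratio: float) -> float:
    """Posterior P(H|E) from prior P(H) and ratio P(E|H) / P(E|not H).

    Assumes a finite, positive likelihood ratio.
    """
    weighted = prior * likelihood_ratio
    return weighted / (weighted + (1.0 - prior))


def quantise(p: float, bits: int = 4) -> float:
    """Round p to the nearest of 2**bits evenly spaced values in [0, 1]."""
    steps = (1 << bits) - 1
    return round(p * steps) / steps


def updates_until_stuck(start: float, likelihood_ratio: float,
                        bits: int = 4, max_steps: int = 1000) -> int:
    """Count updates before the stored belief becomes exactly 0 or 1."""
    belief = quantise(start, bits)
    for step in range(1, max_steps + 1):
        belief = quantise(bayes_update(belief, likelihood_ratio), bits)
        if belief in (0.0, 1.0):
            return step
    return max_steps


# A prior of exactly 0 or 1 ignores even very strong evidence.
print(bayes_update(0.0, 1000.0), bayes_update(1.0, 0.001))
# How many modest pieces of evidence trap a 4-bit belief in certainty?
print(updates_until_stuck(0.5, 3.0))
```

Under these assumptions, the fewer bits are stored, the fewer pieces of evidence it takes before rounding pushes the belief onto 0 or 1, after which no further evidence can move it.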


Old predictions, and how they’ve been standing up

The following was originally posted to a (long-since deleted) LiveJournal account on 2012-06-05 02:55:27 BST. I have not edited this at all. Some of these predictions from 6 years ago have stood up pretty well; others have been proven impossible.

Predicting the future is, in retrospect, hilarious. Nonetheless, I want to make a guess as to how the world will look in ten years, even if only to have a concrete record of how inaccurate I am. Unless otherwise specified, these predictions are for 2022:

Starting with the least surprising: By 2024, solar power will be the cheapest form of electricity for everyone closer to the equator than the north of France. Peak solar power output will equal current total power output from all sources, while annual average output will be 25%. Further progress relies on developments of large-scale energy storage systems, which may or may not happen depending on electric cars.

By 2022, CPU lithography will either reach 4nm, or everyone will decide it’s too expensive to keep on shrinking and stop sooner. There are signs that the manufacturers may not be able to mass produce 4nm chips due to their cost, even though that feature size is technically possible, so I’m going to say minimum feature size will be larger than you might expect from Moore’s law. One might assume that they can still get cheaper, even if not more powerful per unit area, but there isn’t much incentive to reduce production cost if you don’t also reduce power consumption; currently power consumption is improving slightly faster than Moore’s law, but not by much.

LEDs will be the most efficient light source; they will also be a tenth of their current price, making compact fluorescents totally obsolete. People will claim they take ages to get to full brightness, just because they are still energy-saving.

Bulk storage will probably be spinning magnetic platers, and flash drives will be as obsolete in 2022 as the floppy disk is in 2012. (Memristor based storage is an underdog on this front, at least on the scale of 10 years.)

Western economies won’t go anywhere fast in the next 4 years, but might go back to normal after that; China’s economy will more than double in size by 2022.

In the next couple of years, people will have realised that 3D printers take several hours to produce something the size of a cup and started to dismiss them as a fad. Meanwhile, people who already know the limitations of 3D printers have already, in 2011 used them for organ culture — in 10 to 20 years, “cost an arm and a leg” will fall out of common use in the same way and for the same reason that “you can no more XYZ than you can walk on the Moon” fell out of use in 1969 — if you lose either an arm or a leg, you will be able to print out a replacement. I doubt there will be full self-replication by 2022, but I wouldn’t bet against it.

No strong general A.I., but the problem is software rather than hardware, so if I’m wrong you won’t notice until it’s too late. (A CPU’s transistors change state a hundred million times faster than your neurons, and the minimum feature size of the best 2012 CPUs is 22nm, compared to the 200nm thickness of the smallest dendrite that Google told me about).

Robot cars will be available in many countries by 2020, rapidly displacing human drivers because they are much safer and therefore cheaper to insure; Taxi drivers disappear first, truckers fight harder but still fall. Human drivers may be forbidden from public roads by 2030.

Robot labour will be an even more significant part of the workforce. Foxconn, or their equivalent, will use more robots than there are people in Greater London.

SpaceX and similar companies lower launch costs by at least a factor of 10; these launch costs combine with standardised micro-satellites allow at least one university, 6th form, or school to launch a probe to the moon.

Graphene proves useful, but does not become a wonder material. Cookware coated in synthetic diamond is commonplace, and can be bought in Tesco. Carbon nanotube rope is available in significant lengths from specialist retailers, but still very expensive.

In-vitro meat will have been eaten, but probably still be considered experimental by 2020. There will be large protests and well-signed petitions against it, but these will be ignored.

Full-genome sequencing will cost about a hundred quid and take less than 10 hours.

3D television and films will fail and be revived at least once more.

E-book readers will be physically flexible, with similar resolution to print.

Hydrogen will not be developed significantly; biofuels will look promising, but will probably lose out to electric cars because they go so well with solar power (alternative: genetic engineering makes a crop that can be burned in existing power stations, making photovoltaic and solar-thermal plants redundant while also providing fuel for petrol and diesel car engines); fusion will continue to not be funded properly; too many people will remain too scared of fission for it to improve significantly; lots of people will still be arguing about wind turbines, and others will still be selling snake-oil “people-powered” devices.

Machine vision will be connected to every CCTV system that gets sold in 2020, and it will do a better job than any combination of human operators could possibly manage. The now-redundant human operators will argue loudly that “a computer could never look at someone and know how they are feeling, it could never know if someone is drunk and about to start a fight”; someone will put this to the test, and the machine will win.

High-temperature superconductivity currently seems to be developing at random, so I can’t say if we will have any progress or not. I’m going to err on the side of caution, and say no significant improvements by 2022.

Optical-wavelength cloaking fabric will be available by the mid 2020s, but very expensive and probably legally restricted to military and law enforcement.

Most of Kepler’s exoplanet candidates will be confirmed in the next few years; by 2022, we will have found and confirmed an Earth-like planet in the habitable zone of it’s star (right now, the most Earth-like candidate exoplanet, Gliese 581 g, in unconfirmed while the most Earth-like confirmed exoplanet, Gliese 581 d, is only slightly more habitable than Mars). We will find out if there is life on that world, but the answer will make no difference to most people’s lives.

OpenStreetMap will have replaced all other maps in almost every situation; Facebook will lose it’s crown as The social network; The comments section of most websites will still make people loose faith in humanity; English Wikipedia will be “complete” for some valid definition of the word.

Obama will win 2012, the Republicans will win 2016; The Conservatives will lose control of the UK regardless of when the next UK general election is held, but the Lib Dems might recover if Clegg departs.

Errors and omissions expected. It’s 3am!.



How many things are there which one cannot learn, no matter how much effort is spent trying?

I’m aware that things like conscious control of intestinal peristalsis would fit this question (probably… I mean, who would’ve tried?) but I’m not interested in purely autonomic stuff.

Assuming the stereotypes are correct, I mean stuff like adults not being able to fully cross the Chinese-English language barrier in either direction if they didn’t learn both languages as children (if you read out The Lion-Eating Poet in the Stone Den I can tell that the Shis are different to each other, but I can’t tell if the difference I hear actually conveys a difference-of-meaning or if it is just natural variation of the sort I produce if I say “Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo”, and I’m told this difficulty persists even with practice; in reverse, the ‘R’/‘L’ error is a common stereotype of Chinese people speaking English). Is there something like that for visual recognition? Some people cannot recognise faces; is there something like that but for all humans? Something where no human can recognise which of two things they are looking at, even if we know that a difference exists?

Languages in general seem to be extremely difficult for most adults: personally, I’ve never been able to get my mind around all the tenses of irregular verbs in French… but is that genuinely unlearnable or something I could overcome with perseverance? I found German quite straightforward, so there may be something else going on.

Are there any other possibilities? Some people struggle with maths: is it genuinely unlearnable by the majority, or just difficult and lacking motive? Probability in particular comes to mind, because people can have the Monty Hall problem explained and not get it.
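For anyone who has had the Monty Hall problem explained and still doesn't buy it, simulating it is often more persuasive than the argument. This is a minimal sketch of the standard three-door version, assuming the host always opens a goat door that isn't your pick; the function names, trial count, and seed are arbitrary choices of mine.

```python
import random


def play(switch: bool, rng: random.Random) -> bool:
    """Play one round; return True if the final pick hides the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch: bool, trials: int = 100_000, seed: int = 0) -> float:
    """Fraction of rounds won with the given strategy."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials


print("stay:  ", win_rate(switch=False))
print("switch:", win_rate(switch=True))
```

Switching should win about two thirds of the time and staying about one third, because switching wins exactly when the first pick was wrong.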

One concept I’ve only just encountered, but which suddenly makes sense of a lot of behaviour I’ve seen in politics, is called Morton’s demon by analogy with Maxwell’s demon. A filter at the level of perception which allows people to ignore and reject without consideration facts which ought to change their minds. It feels — and I recognise with amusement the oddity of using system 1 thinking at this point — like a more powerful thing than Cherry picking, Cognitive dissonance, Confirmation bias, etc., and it certainly represents — with regard to system 2 thinking — the sort of “unlearnable” I have in mind.

AI, Philosophy

Unfortunate personhood tests for A.I.

What if the only way to tell whether a particular A.I. design is or is not a person is to subject it to all the types of experience (both good and harrowing) that we know impact the behaviour of the only example of personhood we all agree on, and see if it changes in the same way we change?

Is it moral to create a digital hell for a thousand, if that’s the only way to prevent carbon chauvinism/anti-silicon discrimination for a billion?