Professional, Software

It’s far from a groundbreaking invention, but there’s something very satisfying about this way of finding the smallest value in a Swift array of Integers:

let minimum = someArray.reduce(Int.max, min)
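For comparison, here is a minimal sketch of the same fold in Python, mainly to show what happens with an empty list: the starting value comes straight back. The function name `minimum` and the use of `sys.maxsize` as a stand-in for Swift's `Int.max` are assumptions for illustration. Python integers are unbounded, so the sentinel is only a convenient large number.

```python
from functools import reduce
import sys


def minimum(values, start=sys.maxsize):
    """Fold the list with min(), starting from a large sentinel value."""
    return reduce(min, values, start)


print(minimum([7, 3, 9]))          # 3
print(minimum([]) == sys.maxsize)  # True: an empty list returns the sentinel
```

If an empty input should count as an error rather than quietly returning the sentinel, that has to be checked separately.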

 

Aside
Psychology, Software, Technology

Social media compulsion

He flashed up a slide of a shelf filled with sugary baked goods. “Just as we shouldn’t blame the baker for making such delicious treats, we can’t blame tech makers for making their products so good we want to use them,” he said. “Of course that’s what tech companies will do. And frankly: do we want it any other way?”

The Guardian (website); ‘Our minds can be hijacked’: the tech insiders who fear a smartphone dystopia

I can, in fact, blame bakers. It’s easy: I do it in the same way I blame cigarette manufacturers. In all three cases (sugar/fat/flavour combinations, nicotine, social rewards) they exploit chemical pathways in our brains to get us to do something not in our best interests. They are supernormal stimuli — and given how recent the research is, I can forgive the early tobacconists and confectioners, but tech doesn’t get the luxury of ignorance-as-an-excuse.

I want my technology to be a tool which helps me get stuff done.

A drill is something I pick up, use to make a hole, then put down and forget about until I want to make another hole.

I don’t want a drill which is cursed so that if I ever put it down, I start to feel bad about not making more holes in things, and end up staying up late at night just to find yet one more thing I can drill into.

If I saw in a shop a drill which I knew would do that, I wouldn’t get it even if it was free, never broke, the (included) battery lasted a lifetime, etc. — the cost to the mind wouldn’t be worth it.

The same is true for the addictive elements of social media: I need to be connected to my friends, but I’d rather spend money than risk addiction.

Standard
Scams, Software

Anatomy of a scam, and LiveJournal’s lost passwords

LiveJournal seems to have leaked plain-text passwords.

I found this out because I’ve just received three scam emails that are trying to blackmail me for bitcoin worth [$1600, $1100, $1100].

Here is one of the emails; the others look similar, but each one is phrased slightly differently in a way that suggests a template filled with randomly selected phrases:


It appears that, («REDACTED BUT ACCURATE»), 's your password. You might not know me and you are probably wondering why you are getting this e-mail, right?

in fact, I setup a trojans on the adult vids (adult) web-site and you know what, you visited this website to have fun (you know very well what I mean). When you were watching videos, your internet browser started out functioning like a RDP (Team Viewer) which gave me accessibility of your screen and web cam. and then, my software programs obtained your complete contacts out of your Messenger, Outlook, Facebook, along with emails.

What did I really do?

I made a double-screen video clip. 1st part shows the video you're watching (you have a good taste haha . . .), and 2nd part shows the recording of your web cam.

exactly what should you do?

Well, I think, $1100 is really a fair price for your little hidden secret. You'll make the payment by Bitcoin (if you do not know this, search "how to buy bitcoin" in Google).

Bitcoin Address: «REDACTED BY ME IN CASE PUBLISHING IT AFFECTS REPORTING TO THE AUTHORITIES»
(It's case sensitive, so copy and paste it)

Very important:
You've some days to make the payment. (I've a completely unique pixel within this e-mail, and at this moment I am aware that you've read through this email message). If I don't get the BitCoins, I will certainly send out your video recording to all of your contacts including family, coworkers, and so forth. Having said that, if I receive the payment, I'll destroy the video immidiately. If you need evidence, reply with "Yes!" and i'll definitely send out your videos to your 6 contacts. It is a non-negotiable offer, that being said don't waste my personal time and yours by responding to this message.

Here are some of the headers:


X-Spam-Flag: NO
X-Spam-Score: 3.875
X-Spam-Level: ***
X-Spam-Status: No, score=3.875 required=10 tests=[INVALID_MSGID=1.167,
RCVD_IN_MSPIKE_BL=0.01, RCVD_IN_MSPIKE_L5=2.599, SPF_PASS=-0.001,
TO_IN_SUBJ=0.1] autolearn=disabled

and


Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=UTF-8

There are several clues here that it’s a toothless scam, but I suspect some people will fall for it if I don’t blog about it:

  1. There’s duct tape covering my webcam
  2. I don’t use Outlook
  3. There’s duct tape covering my webcam
  4. I block Facebook on my computer, and only use it on my mobile, because it’s a massive time-waster that stops me getting things done
  5. Did I mention there’s duct tape covering my webcam?
  6. I’m not into videos. My teenage years were the dial-up days, where everyone had still pictures or plain text and that was good enough. (Hashtag Four Yorkshiremen Humblebrag). Plus, I’m a furry, so the stuff I like tends to be met with blank stares and the words “I can’t even parse this image”, not “you have a good taste haha”.
  7. See #1
  8. The email is plain text and cannot contain a tracking pixel
  9. There’s still duct tape covering my webcam
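The tracking-pixel point can be checked mechanically. A pixel that reports back when the message is opened needs an HTML part that loads a remote image, and a message whose only part is `text/plain` has nowhere to put one. The sketch below uses Python's standard `email` module. The function name `has_html_part` and the placeholder body are made up for illustration; only the two headers come from the email quoted above.

```python
from email import message_from_string

# The two headers quoted above, followed by a placeholder body.
RAW = (
    "Content-Transfer-Encoding: 7bit\n"
    "Content-Type: text/plain; charset=UTF-8\n"
    "\n"
    "Placeholder body; the real text is quoted earlier in the post.\n"
)


def has_html_part(raw_message):
    """Return True if any MIME part of the message is text/html."""
    message = message_from_string(raw_message)
    return any(part.get_content_type() == "text/html" for part in message.walk())


print(has_html_part(RAW))  # False
```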

Now, I’m saying LiveJournal in particular is the source of that leaked password, because that password is one I only ever used for LiveJournal. Never anywhere else. (In case you’re wondering, that LiveJournal blog has now been deleted owing to it being totally pointless).

I have confirmed via Troy Hunt’s Have I Been Pwned? that the password is in publicly known databases of leaked passwords. To my surprise, Have I Been Pwned? thinks that password is in use in two places, not one. My own list of personal passwords says I only use it in one place, and the nature of the password does not lend itself to reuse (it’s what you get if you mash a keyboard at random for 13 characters, not anything easily memorised).
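The post doesn't say how the lookup was done, so the following is only a sketch of one way to do it: the Pwned Passwords range endpoint that Have I Been Pwned? provides. Only the first five hex characters of the password's SHA-1 hash are sent, and the service returns every known hash suffix with that prefix alongside a count. The function names and the `User-Agent` string are hypothetical. The count is the number of times the password appears in the collected breach data.

```python
import hashlib
import urllib.request

RANGE_URL = "https://api.pwnedpasswords.com/range/"


def split_hash(password):
    """Return (first 5 hex chars, remaining 35) of the upper-case SHA-1 hash."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_from_range_body(suffix, body):
    """Find our suffix in a response made of 'SUFFIX:COUNT' lines."""
    for line in body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate == suffix:
            return int(count)
    return 0


def pwned_count(password):
    """Query the range endpoint (needs network access; not called here)."""
    prefix, suffix = split_hash(password)
    request = urllib.request.Request(
        RANGE_URL + prefix,
        headers={"User-Agent": "example-password-check"},  # hypothetical name
    )
    with urllib.request.urlopen(request) as response:
        body = response.read().decode("utf-8")
    return count_from_range_body(suffix, body)
```

The password itself never leaves the machine; only the five-character prefix does.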

Slightly more worrying is that when I duckduckgo’ed (Google found nothing) for the bitcoin addresses to see if they were known, one gave a single result for the https://www.sec.gov domain, and another gave a single result for https://www.panasonic.com/. I have no reason to suspect either of those domains wittingly contained these bitcoin addresses, but this may be connected to a recent-ish cryptojacking attack where many reputable websites included a third-party JavaScript library which had itself been hacked to mine cryptocurrency on the computers of unsuspecting users of unsuspecting websites.

When I’ve figured out the appropriate authorities, I’ll be reporting these emails to them.

Standard
AI, Software

Speed of machine intelligence

Every so often, someone tries to boast of human intelligence with the story of Shakuntala Devi — the stories vary, but they generally claim she beat the fastest supercomputer in the world in a feat of arithmetic, finding that the 23rd root of

916,748,676,920,039,158,098,660,927,585,380,162,483,106,680,144,308,622,407,126,516,427,934,657,040,867,096,593,279,205,767,480,806,790,022,783,016,354,924,852,380,335,745,316,935,111,903,596,577,547,340,075,681,688,305,620,821,016,129,132,845,564,805,780,158,806,771

was 546,372,891, and taking just 50 seconds to do so compared to the “over a minute” for her computer competitor.

Ignoring small details such as the “supercomputer” being named as a UNIVAC 1101, which was wildly obsolete by the time of this event, this story dates to 1977, and Moore’s Law over 41 years has made computers mind-defyingly powerful since then (if it was as simple as doubling in power every 18 months, they would be 2^(41/1.5) ≈ 169,103,740 times faster, but Wikipedia shows even greater improvements on even shorter timescales going from the Cray X-MP in 1984 to standard consumer CPUs and GPUs in 2017, a factor of 1,472,333,333 improvement at fixed cost in only 33 years).
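The arithmetic in that parenthesis is quick to reproduce. This sketch computes the speed-up from doubling every 18 months for 41 years, which comes out at roughly 169 million, and then works backwards from the Cray X-MP comparison to the doubling period it implies. The variable names are illustrative; the figures are the ones quoted above.

```python
import math

# Doubling every 1.5 years for 41 years (1977 to 2018).
years = 41
doubling_period_years = 1.5
speedup = 2 ** (years / doubling_period_years)
print(f"Speed-up from 18-month doubling: {speedup:,.0f}")

# The quoted fixed-cost improvement from 1984 to 2017.
cray_factor = 1_472_333_333
cray_years = 33
implied_months = 12 * cray_years / math.log2(cray_factor)
print(f"Implied doubling period: {implied_months:.1f} months")
```

The second figure comes out shorter than 18 months, which is what "even greater improvements on even shorter timescales" means in practice.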

So, how fast are computers now? Well, here’s a small script to find out:

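#!/usr/bin/env python3

from datetime import datetime

# Start the clock
before = datetime.now()

# The number from the story
q = 916748676920039158098660927585380162483106680144308622407126516427934657040867096593279205767480806790022783016354924852380335745316935111903596577547340075681688305620821016129132845564805780158806771

# Repeat the calculation 3.45 million times so the total is long enough to time
for x in range(0, int(3.45e6)):
    # Floating-point 23rd root; the result is approximate, not exact
    a = q ** (1. / 23)

# Stop the clock
after = datetime.now()

# Elapsed wall-clock time
print(after - before)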

It calculates the 23rd root of that number. It times itself as it does the calculation three million four hundred and fifty thousand times, repeating the calculation just to slow it down enough to make the time reading accurate.

Let’s see how long it takes…

MacBook-Air:python kitsune$ python 201-digit-23rd-root.py 
0:00:01.140248
MacBook-Air:python kitsune$

1.14 seconds — to do the calculation 3,450,000 times.

My MacBook Air is an old model from mid-2013, and I’m already beating by more than a factor of 150 million someone who was (despite the oddities of the famous story) in the Guinness Book of Records for her mathematical abilities.
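The "factor of 150 million" follows from the numbers already given: 50 seconds for one root by hand, against 1.140248 seconds for 3,450,000 roots on the laptop. A small sketch of the division, with illustrative variable names:

```python
human_seconds = 50.0      # time claimed for one 23rd root in the story
script_seconds = 1.140248 # measured time for the whole loop
repetitions = 3_450_000   # number of roots calculated in that time

seconds_per_root = script_seconds / repetitions
ratio = human_seconds / seconds_per_root

print(f"Seconds per root: {seconds_per_root:.3e}")
print(f"Speed ratio: {ratio:,.0f}")
```

The ratio comes out a little over 151 million, so "more than a factor of 150 million" holds.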

It gets worse, though. The next thing people often say is, paraphrased, “oh, but it’s cheating to program the numbers into the computer when the human had to read it”. Obviously the way to respond to that is to have the computer read for itself:

from datetime import datetime

import matplotlib.cm as cm
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn import svm

# Find out how fast it learns
# When did we start learning?
before = datetime.now()

clf = svm.SVC(gamma=0.001, C=100.)
digits = datasets.load_digits()
# Hold back the last tenth of the dataset for reading (integer division)
size = len(digits.data) // 10
clf.fit(digits.data[:-size], digits.target[:-size])

# When did we stop learning?
after = datetime.now()
# Show user how long it took to learn
print("Time spent learning:", after - before)

# When did we start reading?
before = datetime.now()
maxRepeats = 100
for repeats in range(0, maxRepeats):
    # Count from 1 so that -x only ever indexes the held-back digits;
    # -0 would be index 0, which is part of the training data
    for x in range(1, size + 1):
        # predict() expects a 2-D array, so reshape the sample to one row
        prediction = clf.predict(digits.data[-x].reshape(1, -1))

# When did we stop reading?
after = datetime.now()
print("Number of digits being read:", size * maxRepeats)
print("Time spent reading:", after - before)

# Show mistakes:
for x in range(1, size + 1):
    data = digits.data[-x]
    target = digits.target[-x]
    prediction = clf.predict(data.reshape(1, -1))
    if target != prediction[0]:
        print("Target: " + str(target) + " prediction: " + str(prediction))
        # Each sample is a flattened 8x8 greyscale image
        grid = data.reshape(8, 8)
        plt.imshow(grid, cmap=cm.Greys_r)
        plt.show()

This learns to read using a standard dataset of hand-written digits, then reads all the digits in that set a hundred times over, then shows you what mistakes it’s made.

MacBook-Air:AI stuff kitsune$ python digits.py 
Time spent learning: 0:00:00.225301
Number of digits being read: 17900
Time spent reading: 0:00:02.700562
Target: 3 prediction: [5]
Target: 3 prediction: [5]
Target: 3 prediction: [8]
Target: 3 prediction: [8]
Target: 9 prediction: [5]
Target: 9 prediction: [8]
MacBook-Air:AI stuff kitsune$ 

0.225 seconds to learn to read, from scratch; then it reads just over 6,628 digits per second. That learning time is comparable with both the speed of a human blink (0.1-0.4 seconds) and also with many of the claims* I’ve seen about human visual processing time, from retina to recognising text.

The A.I. is not reading perfectly, but looking at the mistakes it does make, several of them are forgivable even for a human. They are hand-written digits, and some of them look, even to me, more like the number the A.I. saw than the number that was supposed to be there — indeed, the human error rate for similar examples is 2.5%, while this particular A.I. has an error rate of 3.35%.
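The reading speed and the error rate both fall out of the run shown above: 17,900 digits read in 2.700562 seconds, and six mistakes listed among the 179 held-back digits. A short sketch of the two divisions, with illustrative variable names:

```python
digits_read = 17900         # "Number of digits being read" from the run
reading_seconds = 2.700562  # "Time spent reading" from the run
mistakes = 6                # lines of "Target: ... prediction: ..." output
test_digits = 179           # one tenth of the dataset, held back from training

digits_per_second = digits_read / reading_seconds
error_rate_percent = 100 * mistakes / test_digits

print(f"Digits per second: {digits_per_second:,.0f}")
print(f"Error rate: {error_rate_percent:.2f}%")
```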

* I refuse to assert those claims are entirely correct, because I don’t have any formal qualification in that area, but I do have experience of people saying rubbish about my area of expertise — hence this blog post. I don’t intend to make the same mistake.

Standard
Futurology, Software, Technology

Hyperinflation in the attention economy: what succeeds adverts?

Adverts.

Lots of people block them because they’re really really annoying. (Also a major security risk that slows down your browsing experience, but I doubt that’s the main reason.)

Because adverts are executable (who thought that was a good idea?), they also get used for cryptocurrency mining. Really inefficient cryptocurrency mining, but still.

Because they cost money, there is a financial incentive to systematically defraud advertisers by showing lots of real, paid-for, adverts to lots of fake users. (See also: adverts are executable. Can one advert download ten more? Even sneakily in the background will do, the user doesn’t need to see them.)

Because of the faked consumption (amongst other reasons), advertisers don’t get good value for money, lowering demand; because of lowered demand, websites get less money than they would under an efficient system; because of something which seems analogous to hyperinflation (but affecting the supply of spaces in which to advertise rather than the supply of money), websites are crowded with adverts; because of the excess of adverts, lots of people block them.

What if there was a better way?

Cut out the middle man, explicitly fund your website with your own cryptocurrency mining? Users see no adverts, don’t have their attention syphoned away.

Challenge: the problem I’m calling hyperinflation of attention (probably inaccurately, but it’s a good metaphor) would still apply with cryptocurrency mining resource supply. This is already a separate problem with cryptocurrency mining — way too many people are spending way too many resources on something which is only counting and storing value but without fundamentally adding value to the system.

Potential solution: a better cryptocurrency, one which actually does something useful. Useful work such as SETI@home or folding@home: if it must be a currency, then perhaps one where each unit of useful work gets exchanged for a token which can be traded or redeemed with the organisation which produced it, in much the same way that banknotes could, for a long time, be taken to a central bank and exchanged for gold. And the token could be redeemed for whatever is economically useful: a user may perform 1e9 floating point operations now in exchange for a token which would give them 2e9 floating point operations in five years (by which time floating point operations should be 10 times cheaper); or the user decodes two human genomes now in exchange for a token to decode one of their choice later; or whatever.
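To see why such a token could work for both sides, it helps to price the later work at today's cost. Under the stated assumption that floating point operations become 10 times cheaper in five years, redeeming 2e9 operations later costs the issuer the equivalent of 2e8 operations today, against the 1e9 received now. The function below is a hypothetical illustration of that arithmetic, not a proposed protocol.

```python
def issuer_margin(work_done_now, work_redeemed_later, cost_drop_factor):
    """Return (later cost in today's operations, issuer's net gain in today's operations)."""
    later_cost_in_todays_ops = work_redeemed_later / cost_drop_factor
    return later_cost_in_todays_ops, work_done_now - later_cost_in_todays_ops


later_cost, margin = issuer_margin(
    work_done_now=1e9,        # operations the user performs now
    work_redeemed_later=2e9,  # operations the token buys in five years
    cost_drop_factor=10,      # assumed fall in cost per operation
)
print(f"Later cost in today's operations: {later_cost:.0e}")
print(f"Issuer's margin in today's operations: {margin:.0e}")
```

The user doubles their compute and the issuer still comes out ahead, provided the cost assumption holds.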

A separate, but solvable, issue is that the only things I can think of which are processing-power-limited right now are research (climate forecasts, particle physics, brain simulation, simulated drug testing, AI), or used directly by the consumer (video game graphics), or are a colossal waste of resources (bitcoin, spam) — I’ll freely admit this list may be just down to ignorance on my part — so far as I can see, the only one of those which pairs website visitors with actual income would be the video games… but even then it would be utter insanity for the paid customers to have their image rendering offloaded onto the non-payers. The clear solution to this is the same sort of mechanism that currently “solves” advertising: automated auction by those who want to buy your CPU time and websites that want to sell access to your CPU time.

Downside: this will kill your batteries if you don’t disable JavaScript.

Standard
Software

I’m updating my six-year-old Runestone code. Objective-C has changed, Cocos2d has effectively been replaced with SpriteKit, and my understanding of the language has improved massively. Net result? It’s embarrassing.

Once this thing is running as it should, I may rewrite from scratch just to see how bad a project has to be for rewrites to be worth it.

Aside
AI, Software, Technology

Automated detection of propaganda and cultural bias

The ability of word2vec to detect relationships between words (for example that “man” is to “king” as “woman” is to “queen”) can already be used to detect biases. Indeed, the biases are so easy to find, so blatant, that they are embarrassing.
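The "man is to king as woman is to queen" relationship is plain vector arithmetic: take the vector for "king", subtract "man", add "woman", and look for the nearest remaining word. The sketch below shows the mechanics with hand-picked two-dimensional toy vectors. They are invented for illustration and are not real word2vec output, which would have hundreds of dimensions learned from text.

```python
import math

# Toy vectors, made up for illustration: first axis ~ "royalty", second ~ "gender".
TOY_VECTORS = {
    "man": (1.0, 0.0),
    "woman": (1.0, 1.0),
    "king": (3.0, 0.0),
    "queen": (3.0, 1.0),
    "apple": (0.0, 3.0),
}


def cosine(a, b):
    """Cosine similarity between two 2-D vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def analogy(a, b, c, vectors):
    """Return the word closest to vec(b) - vec(a) + vec(c), excluding the inputs."""
    target = tuple(
        vb - va + vc for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])
    )
    candidates = [word for word in vectors if word not in (a, b, c)]
    return max(candidates, key=lambda word: cosine(vectors[word], target))


print(analogy("man", "king", "woman", TOY_VECTORS))  # queen
```

The same arithmetic applied to a real model is how biased associations show up: the nearest word to the target vector reflects whatever the training text associated with it.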

Can this automated detection of cultural bias be used to detect deliberate bias, such as propaganda? It depends in part on how large the sample set is, and in part on how little data the model needs to become effective.

I suspect that such a tool would work only for long-form propaganda, and for detecting people who start to believe and repeat that propaganda: individual tweets — or even newspaper articles — are likely to be far too short for these tools, but the combined output of all their tweets (or a year of some journalist’s articles) might be sufficient.

If it is at all possible, it would of course be very useful. For a few hours, until the propagandists started using the same tool the way we now all use spell checkers — they’re professionals, after all, who will use the best tools money can buy.

That’s the problem with A.I., as well as the promise: it’s a tool for thinking faster, and it’s a tool which is very evenly distributed throughout society, not just in the hands of those we approve of.

Of course… are we right about who we approve of, or is our hatred of Them just because of propaganda we’ve fallen for ourselves?

(Note: I’ve seen people, call them Bobs, saying “x is propaganda”, but I’ve never been able to convince any of the Bobs that they are just as likely to fall for propaganda as the people they are convinced have fallen for propaganda. If you have any suggestions, please comment).

Standard