Metal Band or My Little Pony?
Neural networks are algorithms that learn by example, rather than by following a programmer’s set of rules. Although on this blog I’ve mostly been using them to generate new examples of things (like paint colors, Halloween costumes, or craft beers), neural networks can do a lot more.
One thing neural networks can do is classify things. Give them a bunch of examples of one kind of thing, and a bunch of examples of another kind of thing, and they will (hopefully) learn to tell the two apart. This is really useful - for identifying obstacles for self-driving cars, for telling diseased tissue from healthy tissue, and even (with mixed success) for identifying spam or troll comments. I wanted to test this kind of algorithm out, so I devised the simplest task I could think of: telling metal bands from My Little Ponies.
I’ve previously trained text-generating algorithms to generate metal bands and My Little Ponies, so I had datasets ready to go. IBM Watson has a very easy-to-use tool for training classifiers (there’s a classroom-friendly version at machinelearningforkids.co.uk). I loaded in all 1,300 of the My Little Pony names I had, and filled the rest of the tool’s memory with metal bands (about 18,700).
Then I entered some new pony names - neural network-generated pony names so they weren’t in the original dataset - to see how it would classify them. The result:
The neural network labeled *everything* as metal. People who have worked with neural network classifiers before will have seen this coming: with a dataset that was 94% metal class and only 6% pony class, I had set myself up with a classic case of something called class imbalance. The neural network found it could achieve 94% accuracy on my training dataset by calling everything metal. Princess Pie? Metal band with 81% confidence. Sweetie Loo? 85% likely to be metal. Sparkle Cheer? 84% sure that’s a metal band. Flutter Buns? So, so metal. 97%. The only names it didn’t label as metal bands were ponies that were in the original dataset. So, Twilight Sparkle? 100% pony. Twilight Sprinkle, though? 83% metal.
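To see why "call everything metal" looks so good on paper, here is a quick back-of-the-envelope check in Python. It uses only the approximate counts mentioned above (1,300 ponies, 18,700 metal bands); the function name `majority_baseline` is made up for this illustration and has nothing to do with the Watson tool.

```python
# Approximate counts from the post: ~1,300 pony names, ~18,700 metal bands.
counts = {"pony": 1300, "metal": 18700}


def majority_baseline(class_counts):
    """Return the most common label and the accuracy of always guessing it."""
    total = sum(class_counts.values())
    label = max(class_counts, key=class_counts.get)
    return label, class_counts[label] / total


label, accuracy = majority_baseline(counts)
print(f"Always guess '{label}': {accuracy:.1%} accuracy")
# prints: Always guess 'metal': 93.5% accuracy
```

That 93.5% rounds to the 94% figure above, and it is the score any classifier has to beat before it has learned anything about ponies at all.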
The fix was easy: I trained the classifier again, this time with equal numbers of ponies and metal bands. The results were a lot more believable, and the classifier mostly agreed with the generator networks about which names were metal bands and which were ponies. There were some surprises, though.
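For anyone who wants to try the same fix, here is a minimal sketch of one common way to get equal class sizes: randomly throw away examples from the bigger class until it matches the smaller one. This is an assumption about the method, not a record of what I actually clicked in the tool, and the names `balance_by_downsampling`, `pony_0`, `band_0` and so on are placeholders rather than real data.

```python
import random


def balance_by_downsampling(examples, seed=0):
    """Take (name, label) pairs and return a shuffled list in which every
    label appears as often as the rarest label did in the input."""
    rng = random.Random(seed)

    # Group the names by their label.
    by_label = {}
    for name, label in examples:
        by_label.setdefault(label, []).append(name)

    # Every class gets cut down to the size of the smallest class.
    smallest = min(len(names) for names in by_label.values())

    balanced = []
    for label, names in by_label.items():
        for name in rng.sample(names, smallest):
            balanced.append((name, label))

    rng.shuffle(balanced)
    return balanced


# Placeholder names standing in for the real datasets.
examples = [(f"pony_{i}", "pony") for i in range(1300)]
examples += [(f"band_{i}", "metal") for i in range(18700)]

balanced = balance_by_downsampling(examples)
print(len(balanced))  # prints: 2600
```

The obvious cost is that most of the metal bands never get used. Other approaches (repeating the pony names, or weighting the classes during training) keep all the data, but they depend on what the training tool allows.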
Generated metal bands:
Dragonred of Blood - 100% metal, 0% pony
Deathhouse - 97% metal, 3% pony
Vermit - 97% metal, 3% pony
Sespessstion Sanicilevus - 97% metal, 3% pony
Stormgarden - 97% metal, 3% pony
Vomberdean - 96% metal, 4% pony
Swiil - 96% metal, 4% pony
Dragorhast - 96% metal, 4% pony
Sun Damage Omen - 96% metal, 4% pony
Squeen - 96% metal, 4% pony
Inhuman Sand - 88% metal, 12% pony
Snapersten - 3% metal, 97% pony
Staggabash - 3% metal, 97% pony
Generated ponies:
Cherry Curls - 0% metal, 100% pony
Starly Star - 1% metal, 99% pony
Cheese Breeze - 0% metal, 100% pony
Agar Swirl - 1% metal, 99% pony
Sob Dancer - 1% metal, 99% pony
Derdy Star - 1% metal, 99% pony
Princess Sweat - 1% metal, 99% pony
Raspberry Turd - 1% metal, 99% pony
Arple Robbler - 3% metal, 97% pony
Pocky Mire - 6% metal, 94% pony
Cold Sting - 10% metal, 90% pony
Pearlicket - 48% metal, 52% pony
Blue Cuss - 79% metal, 21% pony
Sunsrot - 84% metal, 16% pony
Rade Slime - 84% metal, 16% pony
Flustershovel Aoetel Pakeecuand - 96% metal, 4% pony
When I fed the classifier names that were generated by a neural network trained on BOTH metal bands and ponies, it was not as confused as I had expected. Instead, it classified them with high confidence as one or the other.
Pinky Doom - 99% metal, 1% pony
Strike Berry - 0% metal, 100% pony
Cryptic Mane - 1% metal, 99% pony
Bloody Star - 4% metal, 96% pony
Killy Power - 96% metal, 4% pony
Spectral Apple - 1% metal, 99% pony
Of course, this classifier will also give a verdict on any other text I give it, whether or not it's a band or a pony.
Benedict Cumberbatch - 96% metal, 4% pony
Jane Austen - 17% metal, 83% pony
Dora the Explorer - 55% metal, 45% pony
Aluminum - 17% metal, 83% pony
Aluminium - 96% metal, 4% pony
The Earth’s Core - 99% metal, 1% pony
Lobsters - 18% metal, 82% pony
Opossums - 97% metal, 3% pony
Yogurt - 17% metal, 83% pony
Kumquats - 96% metal, 4% pony
According to this neural network, we may need to rethink Star Wars canon.
Leia Organa - 96% metal, 4% pony
Luke Skywalker - 31% metal, 69% pony
Darth Vader - 19% metal, 81% pony
Kylo Ren - 18% metal, 82% pony
Pony pictures created using General Zoi’s Pony Creator