Yahoo Releases Source Code For Its Porn-Addicted Neural Network

By Gizmodo Australia

Search engines have to learn somehow what images are safe for work — and which are not. Yahoo’s approach, which it’s just released for public consumption, relies on bombarding a neural network with inappropriate pixels. The poor thing… I’m not sure if there’s such a thing as rehab for computers.

On a more serious note, Yahoo engineers Jay Mahadeokar and Gerry Pesavento put together a blog post yesterday explaining the company’s decision to open up its efforts to the developer community.

According to Mahadeokar and Pesavento, "no … model or algorithm for identifying NSFW images" exists in open source form, so "in the spirit of collaboration" the company decided to release its work on the subject on GitHub.

The project, aptly named "Open NSFW model", contains the code necessary to run the "(NSFW) classification deep neural network", built on the Caffe framework. Caffe is a deep-learning system developed at UC Berkeley, also freely available on GitHub for those interested.

Now, image recognition is nothing new, but that doesn't make it any less complex. While you or I might easily recognise an NSFW image, it's not something a machine can easily do. Having machines "learn" what's not appropriate is a logical way to approach the problem.

As to how it works specifically, the post provides a few details:

… our [neural network] takes an image as input and outputs a probability (i.e a score between 0-1) which can be used to detect and filter NSFW images. Developers can use this score to filter images below a certain suitable threshold based on a ROC curve for specific use-cases, or use this signal to rank images in search results.
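In practice, that thresholding step is straightforward. Here's a minimal sketch, with made-up image names and scores and an illustrative threshold of 0.2 — the actual cut-off would be tuned per use case from a ROC curve, as the post suggests:

```python
# Hypothetical sketch: filtering images by an NSFW probability score.
# The scores, filenames, and threshold below are illustrative only,
# not Yahoo's actual values.

def filter_sfw(scored_images, threshold=0.2):
    """Keep images whose NSFW probability falls below the threshold.

    scored_images: list of (image_id, nsfw_score) pairs, score in [0, 1].
    threshold: cut-off chosen for the use case (e.g. via a ROC curve).
    """
    return [img for img, score in scored_images if score < threshold]

# Example with invented scores:
images = [("cat.jpg", 0.01), ("beach.jpg", 0.35), ("spam.jpg", 0.97)]
print(filter_sfw(images))  # only "cat.jpg" passes a 0.2 threshold
```

The same score could instead be used as a ranking signal, demoting higher-scoring images in search results rather than dropping them outright.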

Yahoo doesn't pull any punches with the network; in fact, it makes the task harder:

While training, the images were resized to 256×256 pixels, horizontally flipped for data augmentation, and randomly cropped to 224×224 pixels, and were then fed to the network.
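The flip-and-crop part of that pipeline is easy to sketch. Assuming an image already resized to 256×256 (resizing itself would need a library such as Pillow), a NumPy version of the random flip and random 224×224 crop might look like this — names and details are illustrative, not Yahoo's training code:

```python
import numpy as np

def augment(image, crop=224, rng=None):
    """Randomly flip and crop a 256x256 image, per the described pipeline.

    image: H x W x C numpy array (here H = W = 256).
    Returns a crop x crop x C array.
    """
    rng = rng or np.random.default_rng()
    if rng.random() < 0.5:                # horizontal flip half the time
        image = image[:, ::-1, :]
    h, w = image.shape[:2]
    top = rng.integers(0, h - crop + 1)   # random vertical offset
    left = rng.integers(0, w - crop + 1)  # random horizontal offset
    return image[top:top + crop, left:left + crop, :]

img = np.zeros((256, 256, 3), dtype=np.uint8)
print(augment(img).shape)  # (224, 224, 3)
```

Random flips and crops like these are a standard data-augmentation trick: each training pass sees a slightly different version of the same image, which helps the network generalise instead of memorising exact pixels.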

Understandably, while Yahoo is happy to put the source code out there, it’s not so keen to release the training images, even for — literally — research purposes. [Yahoo, via TechCrunch]

