If a trailer distills a two-hour film into its three-minute essentials, what would it look like to distill the film trailer? Strangely, it would look a lot like object recognition software.
Støj, a Copenhagen coding studio, ran the trailer for The Wolf of Wall Street through an object-detection algorithm that identifies and labels everything on screen. In three separate videos, we essentially see how algorithms watch film: They label the essential — a tie, a wine glass, a chair — but leave the specifics out. It’s like visual Mad Libs.
The first video filter uses object masking, so only objects recognised by the software appear. Pretty much every object is classified, although there are a few mistakes. It thinks McConaughey is wearing two ties — which I wouldn’t put past him, but isn’t the case here — and it can’t tell the difference between a wine glass, a water glass, and a martini glass.
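Støj hasn’t published its pipeline, but the masking step itself is simple once a detector has handed back bounding boxes. A minimal sketch, assuming detections arrive as `(x1, y1, x2, y2)` boxes (a hypothetical format — the studio’s actual data structure is unknown):

```python
import numpy as np

def mask_to_detections(frame, boxes):
    """Keep only the pixels inside detected bounding boxes;
    everything else goes black. A guess at the first filter --
    the real pipeline is unpublished."""
    out = np.zeros_like(frame)
    for x1, y1, x2, y2 in boxes:
        out[y1:y2, x1:x2] = frame[y1:y2, x1:x2]
    return out

# Toy 6x6 grayscale "frame" with one detection in the middle.
frame = np.arange(36, dtype=np.uint8).reshape(6, 6)
masked = mask_to_detections(frame, [(2, 2, 4, 4)])
```

In practice the boxes would come from a pretrained detector (YOLO and similar models were the standard choice for this kind of real-time labelling), run frame by frame over the trailer.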
The second version blurs the humans so you only see the description boxes. Leo and Matt are still instantly recognisable by their voices, however.
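The obscuring step works the other way around: instead of keeping detected regions, you degrade the ones labelled as people. A crude stand-in (flattening each person box to its mean colour rather than a proper Gaussian blur), again assuming a hypothetical `(label, box)` detection format:

```python
import numpy as np

def obscure_people(frame, detections):
    """Flatten every region labelled 'person' to its mean value,
    leaving the label boxes to the renderer. A rough approximation
    of the second filter, not Støj's actual method."""
    out = frame.copy()
    for label, (x1, y1, x2, y2) in detections:
        if label == "person":
            region = out[y1:y2, x1:x2]
            region[...] = region.mean(axis=(0, 1)).astype(out.dtype)
    return out

frame = np.arange(36, dtype=np.uint8).reshape(6, 6)
detections = [("person", (0, 0, 3, 3)), ("tie", (3, 3, 6, 6))]
result = obscure_people(frame, detections)
```

A real implementation would blur rather than flatten, but the shape of the operation is the same: the detector's labels decide which pixels survive.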
The final version, and the coolest, removes the visuals entirely, essentially creating a filter of what the software “sees” during analysis.
Imagine if we could train algorithms to recognise tropes or casting patterns. Just think of a future in which trailers, or even full films, are distilled purely into blank screens with text boxes: [AGING ACTOR] in place of Bruce Willis or Nicolas Cage’s face, [SERIES OF EXPLOSIONS] for the trailer of the next Michael Bay movie, [DON’T BOTHER] for any Adam Sandler “comedy”. Perhaps the future doesn’t look so bleak after all. [Prosthetic Knowledge]