Tag Archives: IMDb

Film editors

14 Apr

Another submission to IMDb, another response it would be fair to characterise as “robust”:

Screen Shot 2015-03-31 at 22.51.51

We first encountered the robot editors of the Internet Movie Database last year, attempting to get an episode summary past its stern battery of automatic parsers. Recently, though, another artificial writing assistant, Grammarly, has come to prominence following a high-profile marketing campaign in which the company attempted to grammar-check EL James’s Fifty Shades Of Grey when the film of the book premiered. It ran James’s text and that of several famous historical authors through the system, and presented its findings in a lively press release.

Grammarly is a hugely ambitious undertaking: an algorithm that attempts to read and parse text like a human editor, and check spellings and punctuation in context. Unfortunately, the marketing push didn’t go as well as might have been hoped. As was widely observed, notably by Jonathon Owen at Arrant Pedantry, the worked examples contained several infelicities or mistakes, including some questionable overpunctuating and a suggestion that The Tempest be corrected to “We are such stuff on which dreams are made on”. As Arrant Pedantry concluded, many of the things Grammarly found in its press release weren’t errors and, where it intervened, “the suggested fixes always worsen the writing”.

The IMDb parser is a much less ambitious undertaking. It doesn’t work on free-form text or purport to “read”: instead, it controls inputs tightly by using step-by-step data entry. And yet somehow, it feels so much more like being edited.

Above all, it’s the tone. As ever, the rejection notice at the top is brisk but not wholly discouraging, like a copy editor intercepting a reporter with a question. Then there’s the fact-checking and the resultant queries, and the automatic corrections for house style (surname, first name), done without a song and dance. Then there’s the institutional memory and the ever-so-slight weariness that goes with it: there are 3,304 attributes like this already – are you sure you want to create a new one? Then there’s the encouragement not to touch the type: “If you don’t understand how the ordering should be formatted, please leave it blank.” And, as we saw last time, it enforces word counts ruthlessly and threatens to reassign material elsewhere if it’s not cut to fit.

Maybe Grammarly wouldn’t have stumbled over the cap ‘W’ in “Written” and asked for guidance; IMDb doesn’t do line-by-line context and only really spellchecks the proper nouns already in its database. But this is big-picture, organisational editing for accuracy and factual consistency. Rather than entering the murky, and often highly debatable, world of comma use and the passive voice, it just aims to get things cross-referred and reliable.

In fact, quite a lot of real editors’ work is like the kind of “database editing” IMDb does. Is that how we normally spell it? Haven’t I read that paragraph somewhere else? Someone else has written a piece about this: what does that say? The parser may not be able to write a headline, but it can certainly keep control of a multi-contributor encyclopaedia.

Assertive, detail-oriented, unbending about style, weary but polite – and, as we see from the “override” tickboxes, stoical about the possibility of being ignored: doesn’t that sound just like an editor? These robots are getting more lifelike by the day.

The robots are coming

8 May

I wish I had the nerve to talk like this to the newsdesk:

Screen Shot 2014-05-07 at 10.11.15

If you’ve ever tried submitting anything to the Internet Movie Database, you may recognise this tone. IMDb is a wiki – that is, an aggregation of user contributions – but it has achieved the status of a semi-official reference tool at the Tribune, much more so than Wikipedia ever will. And I think that may be because of its fearsome army of robot editors, which intercept and scan everything you submit, and more often than not sling it back like Jason Robards growling “You haven’t got it” to Redford and Hoffman.

No diffident pencilled queries in the margin for IMDb: for example, if you have a couple of pieces of casting information you want to add to a TV show, you’d better have chapter and verse to hand.

Screen Shot 2014-05-06 at 20.56.06

So you say this person was in the show? Here is a list of actors with similar names: it’s easy to get confused. If you’re uncertain, click here and we’ll sort it out for you. Or perhaps you’d just like to give up the whole idea? Choose an option, please. (And by the way, you formatted the request wrongly. It has already been corrected: this is merely a notification.)

That’s the spirit. And if you submit anything as ambitious as a three-line episode summary, you get pulled apart like a rookie screenwriter at a pitch meeting:

Screen Shot 2014-05-07 at 10.09.24

There are misspellings. You have written too much: if you insist on overfiling, we will simply move your piece to a different slot inside the site (delicious). And, my favourite bit of all:

“The following fixes have been applied automatically: ‘…’ has been replaced with ‘.’ in accordance with IMDb rules.”

No judicious exceptions, no stretching a point. Ellipses are just banned, rather like the way all semicolons were excised for some years on the Tribune’s sport section. It’s a rule. And I suspect that “surveilling”, even if spelt correctly, will turn out to be “not in the dictionary”. I’ll just change it now. They won’t like it.

For the first time in my life, I feel like a writer.