Film editors

14 Apr

Another submission to IMDb, another response it would be fair to characterise as “robust”:

Screen Shot 2015-03-31 at 22.51.51

We first encountered the robot editors of the Internet Movie Database last year, while attempting to get an episode summary past its stern battery of automatic parsers. Recently, though, another artificial writing assistant, Grammarly, has come to prominence following a high-profile marketing campaign in which the company attempted to grammar-check EL James’s Fifty Shades Of Grey when the film of the book premiered. It ran James’s text and that of several famous historical authors through the system, and presented its findings in a lively press release.

Grammarly is a hugely ambitious undertaking: an algorithm that attempts to read and parse text like a human editor, and check spellings and punctuation in context. Unfortunately, the marketing push didn’t go as well as might have been hoped. As was widely observed, notably by Jonathan Owen at Arrant Pedantry, the worked examples contained several infelicities or mistakes, including some questionable overpunctuating and a suggestion that The Tempest be corrected to “We are such stuff on which dreams are made on”. As Arrant Pedantry concluded, many of the things Grammarly found in its press release weren’t errors and, where it intervened, “the suggested fixes always worsen the writing”.

The IMDb parser is a much less ambitious undertaking. It doesn’t work on free-form text or purport to “read”: instead, it controls inputs tightly by using step-by-step data entry. And yet somehow, it feels so much more like being edited.

Above all, it’s the tone. As ever, the rejection notice at the top is brisk but not wholly discouraging, like a copy editor intercepting a reporter with a question. Then there’s the fact-checking and the resultant queries, and the automatic corrections for house style (surname, first name), done without a song and dance. Then there’s the institutional memory and the ever-so-slight weariness that goes with it: there are 3,304 attributes like this already – are you sure you want to create a new one? Then there’s the encouragement not to touch the type: “If you don’t understand how the ordering should be formatted, please leave it blank.” And, as we saw last time, it enforces word counts ruthlessly and threatens to reassign material elsewhere if it’s not cut to fit.

Maybe Grammarly wouldn’t have stumbled over the cap ‘W’ in “Written” and asked for guidance; IMDb doesn’t do line-by-line context and only really spellchecks the proper nouns already in its database. But this is big-picture, organisational editing for accuracy and factual consistency. Rather than entering the murky, and often highly debatable, world of comma use and the passive voice, it just aims to get things cross-referred and reliable.

In fact, quite a lot of real editors’ work is like the kind of “database editing” IMDb does. Is that how we normally spell it? Haven’t I read that paragraph somewhere else? Someone else has written a piece about this: what does that say? The parser may not be able to write a headline, but it can certainly keep control of a multi-contributor encyclopaedia.

Assertive, detail-oriented, unbending about style, weary but polite – and, as we see from the “override” tickboxes, stoical about the possibility of being ignored: doesn’t that sound just like an editor? These robots are getting more lifelike by the day.

Hear me rort

26 Mar

I’m sorry, a what?

Screen Shot 2015-03-19 at 10.15.10

A couple of months ago, we were tripping over a story on the Guardian’s most-read list that gave no clue that it was about Australia until about two paragraphs into the body text. Not much danger of that happening this time: whatever a “rort” is, it brings you up short before you’ve even clicked on the link.

And it turns out that this is a Guardian Australia story too: a “rort” is Australian slang for a “dodge” or a “scam”. As Google puts it:

Screen Shot 2015-03-19 at 10.15.39

Collins suggests that it’s a back-formation from the adjective “rorty” (“boisterous, high-spirited”), which I have come across in the northern hemisphere before (specifically in car magazines); this is a first for me with the noun, though.

It sticks out a long way on a British web front page, but this is the same phenomenon we encountered last time with regard to multi-newsroom operations. This may have sneaked through the content management system into another hemisphere, but it’s an Australian story for an Australian audience about Australian tax affairs: of course it should be written in Australian English.

Language and locality questions aren’t always as simple as that, though. What if this were an Australia-bureau story ordered by and for the US newsdesk about a presidential visit to Canberra? What if it had been a Super Bowl preview from the New York sports editor, produced in the US but written as a primer for viewers in Britain? Or what if – as with an article I worked on last week – it were a feature by a UK-born but US-based writer about a major American corporation, commissioned by the London desk to brief a British audience, but still with half an eye on readers in the States?

The feature was about Starbucks’s controversial RaceTogether campaign – its attempt to stimulate conversations on the subject of race relations between baristas and customers at the counter. In the end, the idea was seriously rethought less than a week after it started, but not before one of our feature writers was sent out to canvass opinions in the district of Brooklyn where Starbucks’s chief executive, Howard Schultz, grew up. And one of the people he spoke to lived in – well, that was the problem.

The expatriate reporter, who now writes largely in American English, called it “public housing”. British readers won’t quite be sure what that is – they’ll almost know, but not quite. In the days before the Tribune had an internationally ambitious website, I would have changed this immediately to “housing estate”. But I’m aware that that term isn’t really used in the US, where a more usual synonym would be “housing project”.

Picture 147

So: what to do? Change it, for the British readers for whom the piece was commissioned? Leave it, for the American readers who may ultimately make up the bulk of the article’s readership?

In stories written in, by and for one country, these choices are easy. Even for articles written in one newsroom at the request of another, they aren’t too difficult, as long as it’s the bureau that ordered the piece that then takes responsibility for editing it. But what about stories written for more than one audience?

And it’s not simply a matter of word choice: the audience you have in mind will also dictate the news angle, presentation, tone and style the article ends up with. It will determine which bits of background or explanation you feel confident to cut and which must be left in.

Imagine, let’s say, a funny-photo story in which the US president takes the prime minister to a baseball game, and the photo shows the entire White House entourage leaping to its feet to cheer a triple play while the PM, the only one still seated in the photo, looks bemused. If you’re writing it up for the UK edition, you’re going to need to explain why a triple play is so rare; but a beginner’s explanation would alienate American readers. Even if you choose to skirt around it for the UK edition and just call it “a spectacular play on the field”, US readers are going to be curious about what the play was.

In the earliest days of the Tribune’s international operations, the flow of overnight stories from Australia was so novel that many of them found their way onto the UK front page, until it was gently pointed out that British readers might be seeing slightly too much of Tony Abbott in the morning. With websites as a whole, that’s easy to fix: you can just serve a different homepage to visitors depending on location, and curate their content accordingly.

But you can’t do that with an article: one piece of text is all you get, visible all across the internet. And, linguistically speaking, there’s no neutral ground: you have to pick sides. Maybe that explains the Guardian’s advertising banner for its expanded Major League Soccer coverage this season: “Follow our MLS stories. Soccer or football, whatever you choose to call it.”

Screen Shot 2015-03-24 at 00.10.45

The uses of formality

6 Mar
Photograph: Acción Ortográfica Quito via the Guardian


If all prescriptivists were this cool, descriptivism wouldn’t stand a chance:

In the dead of night, two men steal through the streets of Quito armed with spray cans and a zeal for reform. They are not political activists or revolutionaries: they are radical grammar pedants on a mission to correctly punctuate Ecuador’s graffiti.

Adding accents, inserting commas and placing question marks at the beginning and end of interrogative sentences scrawled on the city’s walls, the vigilante editors have intervened repeatedly over the past three months to expose the orthographic shortcomings of would-be poets, forlorn lovers and anti-government campaigners.

The first images of this guerrilla nitpicking exploded across social networks in December, but despite their global notoriety, the group – Acción Ortográfica Quito – have kept their identities secret and have never given a media interview until now.

Imagine swooping through the neon-lit urban landscape with a spray can and that firm a grasp of Spanish diacritical marks. Imagine graffitising the graffiti of protest itself. Imagine just belonging to an organisation called “Acción Ortográfica”. These are lawyers in their thirties with punctuation-derived street names, on a mission to educate and entertain – and judging by the photograph at the top of the page, with an attitude to ellipses that’s almost as hostile as IMDb’s.

And yet … in the light of the just-departed National Grammar Day, and its gleeful celebration of nitpicking, this also feels like going a little bit too far. Shorn of the wit and the big-city coolness, is this actually any better than Lynne Truss’s grumpy attempts to assault greengrocer’s signs with a felt-tip pen?

Mindful of the tendency for prescriptivist festivities to get out of hand at this time of year, John McIntyre at You Don’t Say wrote this on the eve of National Grammar Day:

Item: Do not aspire to be a grammar Nazi, and don’t indulge people who use the term. Nazis are not funny unless you are Jerry Seinfeld or Mel Brooks. You are not Jerry Seinfeld or Mel Brooks …

Item: It is not your job to correct misused apostrophes or other errors in signage. Resist the temptation … keep in mind that English has many dialects, each with distinctive properties. Let a hundred flowers bloom.

And that’s the problem. There are, as descriptivists are fond of saying, “many voices”, of which formal English is only one. Indeed, it’s more than one: as You Don’t Say goes on to observe: “Just as there is no one English but a variety of dialects, there is not even one standard written American English, but a spectrum.” Graffiti is written English, but not formal English. It doesn’t need to be entirely correct.

Other things, however, do.

Very formal English – the kind found in our venerated 18th- and 19th-century usage guides – is little more than a collection of antiquated grammar, mistakes, Latinate superstitions and quixotic innovations. But however dubious its antecedents – and they are often shaky or even baseless – it has been, and remains, the English of government, the police, the corporate attorney: the voice of those who have power to command. Fowler’s suggestion on “which” and “that” in restrictive clauses has found its way into more than a dozen state legislature drafting manuals. Copies of Strunk and White are, or were, sent out to those newly admitted to the bar of the 11th Circuit Court of Appeals.

Unsplitting infinitives, moving prepositions away from the end of the sentence, using “whom” in the right place: in the most formal circumstances, these are not just superstitious efforts at “correctness”, but something more: they are a raising of the rhetorical stakes, an appropriation of the register in which the most serious matters are discussed. Relationship therapists teach something called “tone-matching” to help people who struggle to get their point across in arguments. If someone is polite, you are polite. If someone becomes curt, you become curt. If someone raises their voice, you raise it too – not more, not less, but exactly the same amount.

The capacity to detect and respond to changes of tone is an essential part of doing well in disputes, and the same applies in writing as it does in speech. Litigators don’t use slang, and neither do leader writers. The register they use might crumble to dust on close inspection of its antecedents, but it still sends a powerful signal: that the matter is grave, and that gravity is expected in reply.

And this is why people seek guidance from editors, or Fowler, or Strunk and White. Not for advice on informal English: nobody needs help with that. They need help with formal English: they need their tone to match the tone of their interlocutor. They need to sound as forbidding to the solicitor as the solicitor sounds to them, or as authoritative and competent to a new employer as a successful candidate should.

So, if a friend applying for a job asks you whether it should be “whom” in the sentence “My former employer was Joe Dough & Co, from whom you may obtain references”, the correct answer is not “there are many voices”; the correct answer, this time, is “yes”. In this context, in this tone, at this stage in the relationship, formality is advisable and “whom” is the correct choice. What is nitpicking and tin-eared in one register is resonant and appropriate in another. As editors, we can weigh audience, tone, register, changing patterns of usage, and still come to a conclusion. We can make those calculations effortlessly: that is why they are asking us.

In that sense, the zombie rules of the 18th- and 19th-century prescriptivists are almost beyond criticism. They have become embedded in the law and the classroom, and in a generation of usage manuals that have still not been superseded in the common imagination. Like so much language change, they were born out of misapprehension and error, and yet have become part of English nonetheless; they are now as much an inexplicable descriptivist phenomenon as surfer slang or the changing meaning of “iconic”. In an English of many voices, formality is a voice too.

When to delete Luhansk

17 Feb

 

Screen Shot 2015-02-16 at 15.27.20

Friday afternoon, and an email comes in from our stringer in Ukraine, whose article has just gone live:

Hi guys,

I had sent an email earlier about the difference between Luhansk and Luhanske. Sorry for the confusion, but the place where I was today was Luhanske, not Luhansk as it says in the dateline right now.
Also, there is an error in the following graf; it should again be Luhanske, not Luhansk:

Burned-out trucks — some still smoking — lined the cratered highway from Artemivsk to Debaltseve, which remains in contention. Government soldiers who were trying to tow a damaged ambulance out of the partly ruined town of Luhanske admitted that anyone who went further down the highway toward Debaltseve would come under heavy fire from rebel small arms and artillery.

In this graf, however, it should be Luhansk, not Luhanske:

Two people were also killed and six wounded when a shell hit a packed cafe in the Kiev-controlled town of Shchastya near rebel-held Luhansk, a local official said, adding that other shells had struck elsewhere in the town.

In real life, there’s always some inconvenient homophone that would never be allowed to come up in fiction. Luhanske, where the stringer is, is 95 kilometres from Luhansk, right in the heart of the recent fighting around Debaltseve and one transliterated letter away from the much bigger rebel city, itself a scene of conflict in the struggle between east and west in Ukraine. And Luhansk also gives its name to the wider oblast, or province, that has declared itself a People’s Republic alongside Donetsk. (Luhanske itself is in Donetsk oblast, of course, not Luhansk oblast: that would be too easy.)

Saturday afternoon, right on deadline. The level of noise is increasing, the shouted instructions are coming faster and the production editor is handing round the international front page for a rapid press-read. The same stringer has filed a late update on the fighting from nearby Artemivsk, and it’s been hustled through the editing process and onto the page.

Although rebels have been able to virtually surround Debaltseve and pound it with rockets and artillery, the road connecting the city with Ukrainian forces in Artemivsk is not fully under either side’s control. Pro-Russia forces shelled the city 15 times and attempted to storm it early yesterday …

Yesterday a military ambulance delivered the body of a soldier killed in the village of Paschnya, which is in the no-man’s-land between Luhansk and Debaltseve, to the mortuary in Artemivsk.

Hang on. Luhansk. Is that … does he mean Luhansk? If he means the city, it’s miles away. Can there really be a no-man’s-land stretching 95 kilometres into another oblast?

Another hasty skim through the article, and there’s no sign of any reportage or sourcing from that far east: all the quotes and accounts come from forces and officials around Debaltseve. A quick check on Google Maps reveals that, yes, Debaltseve, Luhanske and Artemivsk are all close, linked by the E40 road; on the other hand, there’s absolutely no sign of a village called Paschnya anywhere. And the distraction is increased by the locator map on the page, right next to the paragraph in question: Debaltseve is marked, Donetsk is marked, and so is Luhansk, off to the east; but there’s no sign of Luhanske or Artemivsk. But then a check through the stored revisions of the article reveals that, inadvertently, the ‘e’ was indeed deleted off “Luhanske” at an earlier stage.

The problem with journalism, or at least with newspapers, is that there’s never enough time to sort everything out properly. The fast read, panic over Luhansk, Googling and hasty conferring with a colleague have taken about two minutes. The best thing to do would be to reinstate the “e” in Luhanske, add a few lines to explain away confusion, recut the article to fit, and redraw the map at a slightly larger scale so that the town can be added to it (at its current scale, the blob for Luhanske would be right on top of the blob for Debaltseve).

But there isn’t time for that. All there’s time for is to reinstate the “e”, and, as a prophylactic against possible confusion, hurry over to the graphics desk and ask them to delete Luhansk, the city, off the map altogether, and reoutput it. There’s just enough time for it to auto-update on the page before it’s sent: at least it won’t look like a typo or lead readers astray.

Locator map

And then it was gone: the page was sent and ran like that for the first three editions. Looking back at it now, the single reference to Luhanske is a bit baffling without explanation, and, on the map, I see I completely overlooked that we’d referred to a nearby city as Horlivka in the text (which is correct Tribune style) and Gorlovka on the map (which is not).

But the stringer refiled after midnight, with a new top that explained clearly where Luhanske was: new quotes, new facts, rewritten all the way through. As the story acquired momentum through the night and into the next morning, the online version, updated regularly, was shared more than 500 times and drew more than 3,000 comments. The problems of the initial version were completely swept away.

It was just a first take; just a holding story for the early edition, before the ceasefire agreement took hold and the story really began. Some articles take a lot of effort and then only last for five hours. But you never know which ones will last and which ones will end up on the spike.

And if anyone finds Paschnya on the map, I’d be interested to know.

From our own correspondent

26 Jan

This has always seemed a bit odd of the BBC, don’t you think?

Picture 22

A while ago, we caught the Daily Mail using claim quotes for an assertion that only the writer of the story had made. This is almost the opposite: the quoting of a fellow journalist at the same news organisation as though he were a third party voicing an unproven opinion.

It’s a widespread phenomenon on the BBC website:

Picture 24

Picture 23

Picture 25

A large part of the problem is that, unlike TV news reports with their time-honoured verbal signoffs (“Joshua Rozenberg, at the high court, for BBC News”), most BBC web stories carry no byline. That means there is no implied authorial expertise in the piece itself, so the writers have little choice but to rely on the broadcast arm’s brand-name correspondents. But quoting a journalist that readers recognise does draw attention to the fact that the author of the article is not that journalist, and leaves a puzzle as to whose voice the article is actually speaking in.

In a print environment, the contribution would simply be folded, unattributed, into the body copy and the legal correspondent’s name would be added to the byline. In an anonymous article, that’s not possible. But done like this, the reporter seems reduced to third-party status: just another interviewee like the foreign secretary or the copyright lawyer; just another participant in the debate with a point of view.

The BBC, as a high-profile public body with an intrinsically political source of funding, has a long tradition of having to report on itself. Normally, it does this with fearsome impartiality, even in the most existential of crises, such as the row over the Hutton report. When an official inquiry condemned its reporting on the intelligence dossier that played a central role in the Iraq war, the chairman and director-general both departed within a week. But BBC News, responding to the story, reported on BBC News, the instigator of the incendiary report, as though they had never met.

Carol Malia, who presents the Look North evening news programme, was involved in a protest at the BBC’s studios in Newcastle.

She said: “Any news organisation has to be seen as impartial to be credible and that is what we are fighting for.”

Mike Baker, an education correspondent in London who has worked for the BBC for 24 years, said staff wanted to make a “symbolic” protest.

Any newspaper would have closed ranks and decried injustice in banner headlines. The BBC interviewed colleagues on its own solidarity demonstration as though they were striking factory workers in Pennsylvania.

The problem is, impartiality as acute as that creates echoes even when you don’t want it to. Look at the quote in the story from Clive Coleman, the legal correspondent, in the article about Rihanna. It’s hard to know quite how to take it. It’s a simple statement of fact, but has been attributed to someone other than the author. Does that suggest that it is, in fact, open to doubt? Has it not been checked – or could it not be double-sourced in accordance with the BBC’s rules? (Although if it turns out to be wrong, the corporation could not possibly be hoping to distance itself from someone identified as a BBC correspondent.) Is it true or not?

Or is the identification of Mr Coleman as an employee of the organisation meant as an assurance of quality? In which case, it would be much better to have no attribution at all, and write it into the story as fact, along with his supporting evidence for the statement.

Fairness and balance is the highest of journalistic goals – indeed, for BBC News, funded by every licence-payer in Britain, it’s the only way it can possibly operate. But, as Jay Rosen has pointed out, there is such a thing as too much innocence in journalism. Putting quotes around facts determined by your own specialist correspondents just gives impartiality a bad name.

Dateline unknown

3 Jan

Picture 113

Coming up next on global anglophone news: ‘Welfare-to-work programs have failed to reduce unemployment’.

It was the top article on the business site a few days ago, which was interesting – a slightly wonkish policy analysis doing better than more immediate stories about the collapse of a big parcels firm at Christmas. For readers interested in UK politics, it looked intriguing. The issue has been quiet in Britain of late – it’s been a while since the Labour party’s New Deal or even the coalition’s Work Programme were in the headlines. What’s it about?

Welfare-to-work programs promoted by successive governments have had no impact on unemployment as they fail to take into account the changing labour market, researchers have found.

Well, this looks like bad news for Iain Duncan Smith, the work and pensions secretary. But why is it coming up now? The headline unemployment rate is falling at the moment.

The Australian National University (ANU) research, reported in the Australian on Friday, shows that the proportion of unemployed men aged between 25 and 54 has not changed in almost 15 years, staying at 9-10%.

Ah. Right. This is about Australia.

The first hint you get that this is an antipodean story is here, in the second paragraph of the body text. Nothing alerts you to its provenance before that. The five-most-read counter is a global one that aggregates all Guardian content blindly. The headline and standfirst lack any regional identifiers, and there is no dateline after the byline. And why would there be? The story was, from an Australian point of view, produced by a home reporter about a national report on a domestic topic. You would no more put a dateline on it than you would on a metro-desk story about the city council. Like many articles in the rapidly coalescing global news industry, its international success – or at least its performance relative to stories on two other continents – has taken it rather by surprise.

With British news organisations expanding abroad in the hope of becoming trusted sources of news inside other countries, there are going to be a lot more stories like these: local pieces written in-country as a way of establishing credentials with a local audience, but available globally (and administered, at least for part of the day, from thousands of miles away).

Websites are becoming electronically editionalised to compensate – so much so that some auto-detect your location and make it quite hard to change. But the news editors themselves move back and forth between the offices, taking their old interests out to the satellites and bringing newly learned agendas back to London. And three-newsroom operations throw off so much material that apparently it can’t help but leak across the boundaries – unknown Australian models starring in Britain’s sidebar of shame, Hollywood weddings with dress sizes incompletely altered for UK consumption, or, as here, some parts of a very large website still blind to the technological segregation in other parts.

Perhaps the really alert British reader would have seen that “program” was spelt without its last two letters and realised something was up. But I’ve read so much mid-Atlantic and up-from-Down-Under news that I’m honestly starting not to notice.

This headline has been optimised

20 Dec

Screen Shot 2014-12-20 at 14.58.54

“Boob job scroungers” from Leeds, “va-va-voom” presenters out after hours in Sydney, twerking rappers in Beverly Hills: it’s hard to keep up when ambitious media groups start integrating their American, British and Australian stories into one big anglophone news agenda. And the fact that the stories are published online makes it even more difficult to understand, because headlines for the web are written to communicate with something even more important than the reader: search engines.

Search engine optimisation, or SEO – that is, the practice of ensuring that words likely to be used as search terms on Google are present in the headline and other furniture of a story – is a big deal. Studies at the Tribune suggest that no more than 30% of traffic to our website comes from people manually navigating to our homepage to see what’s going on: the vast majority comes from either social referral (people reposting links on Facebook and Twitter), or from search. In the case of one story I edited recently, about Black Friday, fully 90% of everyone who read it arrived via Google. Website front pages just aren’t pored over in the way that newspaper front pages still are.

What does that mean? It means that, in the limited space of a web headline, there’s very little room for jokes or obliqueness: not only do you have to include the keywords that sum up a story, it’s also best if they appear as close to the start of the headline as possible. But most of all, it means there’s not much room for explanations.

Take a look at the screenshot above from the Guardian website. As an American or British reader, you might find it largely baffling. Who or what are “Walkleys”? Which of the many Mark Scotts in the world is being criticised, and in connection with what – the American Broadcasting Company? The Audit Bureau of Circulations? What does the Duchess of York have to do with it (or perhaps it’s a different Sarah Ferguson)?

If you were writing a print headline for an international audience, you might put something like: “Star Australian Broadcasting Corporation journalist publicly attacks boss at awards ceremony”. But if you were looking for it on Google, you wouldn’t type that. You’d enter something like “sarah ferguson mark scott walkleys”. And that’s what the headline is aimed at capturing. It contains almost every likely search term in 11 words. It’s good SEO.
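For the technically inclined, the difference between the two kinds of headline can be put in numbers. What follows is only a rough sketch in Python, not a description of how Google or any newsroom tool actually scores anything: the function names are invented, the "web headline" is a made-up stand-in rather than the wording in the screenshot, and the only two measures are how many of the search terms turn up in the headline and how near the front they sit.

```python
import re


def tokens(text):
    """Lower-case a string and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def keyword_report(headline, query):
    """Return (coverage, mean_position) for the query terms in a headline.

    coverage: fraction of distinct query terms found in the headline.
    mean_position: average 0-based word index at which the matched terms
    first appear, or None if no term matched at all.
    """
    words = tokens(headline)
    terms = list(dict.fromkeys(tokens(query)))  # de-duplicate, keep order
    positions = [words.index(term) for term in terms if term in words]
    coverage = len(positions) / len(terms) if terms else 0.0
    mean_position = sum(positions) / len(positions) if positions else None
    return coverage, mean_position


# The search string and the explanatory print headline suggested above.
query = "sarah ferguson mark scott walkleys"
print_headline = ("Star Australian Broadcasting Corporation journalist "
                  "publicly attacks boss at awards ceremony")

# Hypothetical web headline, invented purely for illustration.
web_headline = "Sarah Ferguson criticises Mark Scott at the Walkleys"

print(keyword_report(print_headline, query))  # (0.0, None)
print(keyword_report(web_headline, query))    # (1.0, 3.0)
```

On this crude measure the explanatory print headline matches none of the five search terms, while the name-heavy version matches all of them within its first eight words. A real search engine weighs far more than this, of course, but it shows why the names win.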

Of course, the implication of this is interesting. The people who come to your story via Google – in other words, the majority of your audience, in many cases – are already familiar with the people they are searching for, and may even be previously informed about the story you have just published. It may be totally new to the audience coming from Twitter, who have seen a headline in a retweet, thought “what’s this?” and clicked on the link. But a Google audience is already sufficiently engaged with the personalities, or the politics, of the subject to compose a search string that can find a story they already assume must exist.

SEO headlines don’t explain what the story is about because they don’t have to: the audience they are aimed at already know. And that’s why it’s getting so hard to follow what’s going up on multinational news websites: even as the stories go global, the headlines are becoming local.
