Where do we go after SOPA/PIPA? How about an “IP-Free Workshop”

Note: About two months ago, when the SOPA/PIPA debate was still raging, Glenn Nano, the convenor of the “Code Meet Print” Meetup in New York, challenged us to come up with ideas for a discussion about piracy. “A possible Prompt: Isn’t Piracy Missing the Point?” he wrote. I hammered out this screed, which somehow got lost in CMP’s system and just surfaced on its mailing list today. A couple of people have told me it’s still interesting, so here goes.

As one who sympathises with the opposition to SOPA/PIPA but—as a journalist covering the media—spends quite a lot of his time talking to people in the content industry who support the bills, I’ve been thinking about how to get this debate unstuck. It’s stuck because, like many debates about deep-seated issues (I’m thinking Israel-Palestine, which I’ve also covered), it’s rapidly become an argument about who is right, rather than about how to move things forward.

And to move things forward, I think the onus is on the anti-SOPA crowd. Convinced as it is that these bills are based on an outdated conception of how content should be monetised (I am presuming that that’s the implication in Glenn’s “Isn’t Piracy Missing the Point?” prompt), the anti-SOPA movement spends nearly all its energy on trying to prove its point and not nearly enough on suggesting ways for those whose livelihoods have up to now depended on intellectual-property (IP) protection to make a living differently.

Because, just as the Israel-Palestine conflict is not, at its base, an ideological one about religion or historical rights (those are just layers added by each side to make the argument harder to win) but a power struggle about who gets to live on which piece of land henceforth, the SOPA dispute is not really an ideological one about whether piracy is wrong, but a power struggle about who gets to determine the ways in which people will make money from content henceforth. When we see a resource, we compete over it; it’s what we have done since we were bacteria. There is a long-established industry that has an entrenched interest in living off content the old way, and on which many people’s livelihoods depend; and there is a rapidly-growing new industry (I use “industry” here broadly: it includes Google and startups and VCs as well as cyber-activists and NGOs) that has a whole bunch of new ways in mind, and on which an increasing number of livelihoods is also coming to depend. And the outcome of SOPA will determine the fortunes of people in these industries.

So both sides are trying to defend the way they earn, or will earn, their living. And there is nothing wrong with doing so; we all need to protect ourselves from poverty.

So why do I think the onus is on the anti-SOPA movement? Basically, because it’s newer. The IP protectors don’t know any other way; change is scary. To win the argument, the opponents of SOPA need to provide convincing alternative routes for musicians, artists, film-makers, and so on to make a living from their content. (There’s also the giant ecosystem of publishers, broadcasters, and all the other baggage that depends on the IP model too, but I’ll come back to that in a minute.)

Sure, we’ve all heard ideas about how this can happen. Musicians can put out their stuff free online and make money from touring, and film-makers can crowd-fund their movies, and stand-up comedians can upload a DRM-free video and politely ask their fans to pay $5 if they like it and please not share it on Facebook, and all that jazz. But go talk to some of these people. I have. It’s not easy for them. It’s especially not easy if you’ve built a career and a life and a home or two doing things the old way. Basically, unless you are a young new artist who is just starting out and is internet-savvy and doesn’t have much to lose, or a more established artist who has already built up a big online following and knows just how to exploit it, trying to work in a world where the assumption is that you cannot rely on anyone paying to own a copy of your content is Very Fucking Scary. And even if you think people are wrong, you can’t fault them for being scared.

So what I would like to see, and I don’t know if Code Meet Print is the place to do it (maybe team up with some other related Meetup groups?), is a serious discussion about how to create pathways. An easing of the transition for content creators. Actual, practical methods for making a living from content in an IP-free world. Success stories and examples to follow. An IP-Free Workshop, if you will. I’m not convinced, at least not yet, that the future is IP-free, by the way; and if it is, this field is still very much in its infancy. But if you’re going to talk the talk, you have to show people that the walk can in fact be walked.

(An aside: Yes, I’m aware that a lot of people who oppose SOPA don’t oppose IP protection per se—they just want to see a more moderate version of the bill. But I’m assuming that at least a proportion of the people on this list are interested in what a world without IP, or with a radically reformed version of it, would look like.)

Finally, on the small matter of that IP-protection ecosystem: sure, the driving force of the lobbying for SOPA and PIPA isn’t the artists as much as it is the big firms that turn the artists’ work into money and keep a large part for themselves. But if your goal is to defeat those firms, you won’t do it by telling them again and again that they’re greedy dinosaurs. You just need to convince the artists they can live without them. And if that’s true, and the artists leave, the rest will happen by itself.

There, that’s my rant. Reactions welcome.

Why people hate the Curator’s Code

Maria Popova was taken aback by the “venom and meanspirited derision” with which some people greeted the Curator’s Code, her and Kelli Anderson’s system for attributing content on the internet. Andrew Beaujon says she is a bit naïve for not realising that the internet is simply a mean place:

The source of Popova’s chagrin is an immutable Newtonian law: For every action that makes news in The New York Times, there is a swift and merciless opposite reaction. Her good-natured proposal that the Internet should agree on a sort of Esperanto of links, one that will definitively reward the first person to share a piece of content, triggered two sorts of negative reactions: Derision from people who worry about the meaning of curation, and derision from people who need to squeeze out a blog post. Popova says the tenor of the reaction surprises her, which makes me wonder if she’s hooked up to the same Internet I am.

I’m not surprised either, but the reasons are a little more involved than this. We’ve spent years arguing and fighting over how people should get to use stuff that other people made. From early software piracy, to later music and video piracy, and the attempts to stop them using technology (DRM) and law (DMCA, SOPA/PIPA, Hadopi, et al); to the journalist/blogger divide; to the feud over whether news aggregators are parasites or symbionts for the places that actually report the news; to the paywall debate. This has become not only a resource war, over who gets to profit from what, but, like the most intractable conflicts, also a culture war, between the culture of scarcity and the culture of ubiquity. The Curator’s Code is firmly in the camp of the culture of ubiquity. It stirs up echoes of all the debates that have gone before and touches a still-fresh wound.

However, there is a second reason for the kerfuffle, which is that, as tends to happen in culture wars, people talk past each other. Take Popova’s assertion that curation (though she dislikes the word itself, finding it “vacant and inadequate”) is “a form of creative and intellectual labor, and one of increasing importance and urgency”. Marco Arment retorts:

…regardless of how much time it takes to find interesting links every day, I don’t think most intermediaries deserve credit for simply sharing a link to someone else’s work… Discovering something doesn’t transfer any ownership to you. Therefore, I don’t think anyone needs to give you credit for showing them the way to something great, since it’s not yours. Some might as a courtesy, but it shouldn’t be considered an obligation.

I’ve aggregated some links in the course of writing this blog post: do I deserve a hat tip for bringing them to your attention if you now go and write another? Arment says no, and I think he’s right. That aggregation was just a by-product of what I was doing (and I doubt I could reconstruct how I found each of those links myself). Popova spends all day long finding material for her site, Brain Pickings: does her aggregation deserve to be recognised as work? She says yes, and I think she’s right too. In her case it’s not a by-product of what she does; it is what she does.

So if the Curator’s Code has a flaw (other than more practical questions about its usability), I think it’s that it attempts to extend its scope beyond the problem it is really trying to solve. The problem it is really trying to solve is how to give due recognition to those, like Popova, whose work substantially consists of aggregation. But it attempts to systematise attribution for everyone who aggregates, so a lot of people see it as meddling in the untraceable tangle of sharing that is simply everyday life. And the reason they reacted so harshly is that this is just another salvo in the resource and culture war that we’ve been having since before the internet was invented.

What journalism can learn from science

Note: this essay is based on a talk (audio) that NPR’s Matt Thompson and I gave at SXSW Interactive on March 13th, 2012. Matt laid out the groundwork for the talk in this Poynter article last September.

1. Why make journalism more like science?

Journalism and science both try to make sense of the world. They make observations, and test them against theories about how the world actually works. But there the similarity more or less ends.

Science has established standards of measurement and evidence. Journalists make them up as they go along. (The closest thing journalism has to a standard of evidence is the rule that something is true if two independent sources told you it was, hardly a bedrock of epistemology.) Scientists add to the edifice of knowledge piece by piece, citing and building on each other’s work. Journalists have no specific, agreed-on body of knowledge, other than what we call “general knowledge”, that they can refer to. Science makes precise, testable claims and then tries to prove them wrong; it uses the doctrine of falsifiability. Journalists tend to make broad, untestable claims (or cite such claims by others: “Video games will damage children’s brains!”) and then look for factoids to confirm them, avoiding the inconvenient evidence to the contrary. And so on.

All of this is very sweeping, of course; it isn’t to say that there isn’t plenty of good, careful, solid journalism. Nor is real science the serene and lofty pursuit of crystalline certainty that it likes to seem to be; what we’ll describe here is an idealised form. But science does have a methodological basis and rigour that journalism does not.

And that’s fine. If journalists had to work the way scientists do they would publish rarely, interest almost nobody, and be out of business very fast. But that isn’t to say that journalism can’t borrow from some of science’s methods. And in fact, as we will explain, it’s already starting to happen.

First, though, a digression. We’re about to propose a few things—tools—that could make journalism more scientific. These tools mostly involve adding more stuff to the stories journalists produce. This will not happen by fiat. Journalists have plenty to do already. So these tools will catch on only if they meet three conditions:

  • Self-justifying: they are useful—immediately useful, not just long-term useful—to a journalist working on a story, regardless of whether they help anyone else.
  • Easy-to-use: they don’t impose a significant burden or learning curve.
  • Interoperable: they work on any publishing platform or content-management system.

A good example is Storify, a platform for collecting and displaying various kinds of material from the web—tweets, blog posts, pictures, etc—in one place. (You can see the Storify of our session here.) Storify is self-justifying: it helps a journalist collate and organise source material for a story, though it also helps others see where the information came from. It is easy to use (try it). And it is interoperable: a Storify collection can be embedded on any web page. As a result, though no newspaper built Storify and nobody told them to use it, more and more are now doing so.

And so to our thesis.

2. What’s so good about the scientific method?

We’ve identified three broad-brush qualities of science (because journalists always like to have things in threes) that we think journalism ought to have more of.

>> First, science is collaborative. As we’ve said, scientists build on each other’s work. So do journalists. Once something has been well enough reported and established (such as where Barack Obama was born, or what led to the collapse of Lehman Brothers), other journalists shouldn’t need to repeat the reporting in order to take the story further. In this sense, journalism is already collaborative… sort-of. But there’s no agreed way of establishing and pointing to what is considered solid knowledge. There’s only a fuzzy and arbitrary general understanding, which not everyone may agree with. In short, there is no convention for signalling authority.

In science, authority is dealt with by means of citations. A paper cites other papers to establish the starting point for its research; the citations effectively say, “this is what we already know.” A cited paper may in fact turn out to be wrong later on, but in that case the citations act as a paper trail, showing other scientists what other results are dubious as a consequence.

Citations perform another function, as well: they act as measures of reputation. A paper, and a scientist, that is cited often is important and probably reliable (either that, or notoriously wrong). The way this measure is calculated is through the Science Citation Index, which collates all citations. The index serves as a kind of map of the scientific edifice. It shows which bricks rest on which other bricks, and lets you see which ones are most central to the structure.

So what if there were some kind of citation index for news? The first element of this is already, crudely speaking, in place. On blogs and on some newspaper websites, articles carry links to other articles; they show, in other words, what work they are relying on. But we don’t do this systematically. And we don’t have a bigger picture of how these links all relate. Technorati used to act as a kind of primitive citation index for blog posts: you could enter the address of a post and it would show you every page that linked to it. It seems not to any more.

Something of that nature, only more sophisticated, would be enormously helpful in journalism. It would make the starting points and assumptions of a story more transparent. It would help identify particularly useful or ground-breaking pieces of work. And it would be especially useful in tracking down the origins of misinformation.

Ideally, the news citation index would be more sophisticated than Technorati (when it worked) or even than the Science Citation Index. It would need not only to show which other articles linked to a given story, but to track where in those articles the links came from and what in the story they were linking to: to show, in other words, which facts they were attributing to the story in question. This, what we might call “fine linking”, is not something the web is well set up to do right now. But it would be a step towards the “web of data” or “semantic web”, a much-discussed goal of information seers, in which links point to and from facts, rather than pages.
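To make “fine linking” a little more concrete, here is a minimal sketch in Python of what such an index might store. It is an illustration under our own assumptions, not a description of any existing system: the names FineLink and CitationIndex, their fields, and the example.org addresses are all invented for the purpose. Each link records not just which article cites which story, but the passage the link sits in and the fact it attributes to the story.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FineLink:
    """One citation: a passage in one article pointing at a fact in another.

    All field names are hypothetical, chosen only for this sketch.
    """
    citing_url: str      # the article that contains the link
    citing_passage: str  # the sentence in that article where the link sits
    cited_url: str       # the story being relied on
    cited_fact: str      # the specific claim attributed to that story


class CitationIndex:
    """A toy, in-memory index of fine links, keyed by the story being cited."""

    def __init__(self):
        self._by_cited = defaultdict(list)

    def add(self, link):
        self._by_cited[link.cited_url].append(link)

    def cited_by(self, url):
        """Every fine link that points at the given story."""
        return list(self._by_cited.get(url, []))

    def citation_count(self, url):
        """How many distinct articles rely on the given story."""
        return len({link.citing_url for link in self._by_cited.get(url, [])})

    def facts_attributed(self, url):
        """Map each fact in the story to the set of articles relying on it."""
        facts = defaultdict(set)
        for link in self._by_cited.get(url, []):
            facts[link.cited_fact].add(link.citing_url)
        return dict(facts)

    def downstream(self, url):
        """All articles that rely, directly or indirectly, on the story.

        This is the paper trail: if the story turns out to be wrong,
        these are the articles worth re-checking.
        """
        seen, stack = set(), [url]
        while stack:
            current = stack.pop()
            for link in self._by_cited.get(current, []):
                if link.citing_url not in seen:
                    seen.add(link.citing_url)
                    stack.append(link.citing_url)
        return seen


if __name__ == "__main__":
    index = CitationIndex()
    index.add(FineLink("https://example.org/b", "As first reported...",
                       "https://example.org/a", "the budget figure"))
    index.add(FineLink("https://example.org/c", "According to...",
                       "https://example.org/b", "the budget figure"))
    print(index.citation_count("https://example.org/a"))
    print(sorted(index.downstream("https://example.org/a")))
```

The design choice worth noticing is that the unit stored is the link, not the page. Counting citations, listing which facts a story is being relied on for, and tracing everything downstream of a dubious story all fall out of that one record.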

A news citation index would be self-justifying if it helped journalists organise their own research, by giving them a way to keep track of what they already know and where it came from. And it would be interoperable by definition if it were based on hyperlinking, which is the architecture of the web itself.

>> Second, science is replicable. Scientific work is, broadly speaking, transparent: you show your assumptions, your method, your data, and how those led you to your conclusions. But why is transparency important? Not for its own sake, but because it allows others to repeat your work. Scientists must be able to replicate each other’s results: first, so that they can check for errors (some experiments need to be done several times by different groups in different places to make sure they are right); and second, because to see if a hypothesis applies more broadly, they may need to do the same experiment under different sets of conditions. So we’ve chosen replicability to be one of our three traits, as a finer-grained version of the notion of transparency.

Journalists, too, could do with some replicability. An investigation finds cases of police brutality in one town. Is the same pattern being repeated up and down the region or the country, evidence of a systemic problem, or is it just a local anomaly, in which case some bad apples are to blame? Another investigation alleges corruption in the city council, based on some funny-looking numbers in the accounts. But is the conclusion warranted, or is there a more innocent explanation? Other journalists, and the public, need to be able to see how these investigations were done so they can judge the results and repeat them if necessary. It’s also, frankly, good discipline for the reporter who wrote the story: when you have to list your sources, you are more careful about checking them.

What makes a scientific paper replicable is the transparency about its methods, data and reasoning. It might seem at first that this is impossible for journalists. They have to protect sources and guard scoops. They also mustn’t overload their audience with facts. Stories would be unreadable if they looked like academic papers. But in fact, online journalism has become far more transparent in the last three years or so.

Inline links, which we’ve already mentioned, have been around for a while. A slightly newer practice is posting photos, videos, interview recordings or transcripts that can’t go in the print version. But some websites now publish source documents next to their articles using document-sharing platforms such as Scribd and DocumentCloud. ProPublica, an investigative journalistic non-profit, publishes step-by-step guides on its digging and data analysis, as with this story on tracking nurses’ licences. Politifact’s Truth-O-Meter™, a collaboration between several American newspapers that fact-checks statements by public figures, lists all its sources of information in its articles.

And then there is Wikipedia. Sure, many still don’t consider Wikipedia a journalistic outfit. It doesn’t check material for errors before publishing it; its “stories” are constantly evolving and never finished; and when mistakes get in they may, or may not, get caught and cut out later. But the Wikipedia page for a big current event can have far more people at once working on it than at any newspaper, and when errors are caught they disappear far more quickly than they would from a normal online article.

And—here is the point—Wikipedia uses footnotes. When its users are following the guidelines, they give a source for every fact. Journalists and schoolchildren are taught not to rely on a Wikipedia page for accuracy, but what it does do very well is show where purported facts came from, so you can check for yourself. And footnotes are more useful than inline links, because you have to click on each link to see what it leads to, whereas a set of footnotes gives you, in one place, an alternative view of the story as told through its sources.

So: how about footnotes for news? Your first reaction might be: ugh, that would look terrible. Book publishers hate footnotes; they make books look academic and put readers off. But Wikipedia wouldn’t be the 6th most-visited site on the internet if people really hated footnotes. It’s a matter of habit, and also of design: there are ways to make footnotes less ugly. (Here’s an easy one: make them show up only when you hover your mouse over the text.)

In fact, stories would be snappier with footnotes. Instead of “A survey last month by the Whatalongname Foundation for Public Health Research found that 43% of American adults are unhappy with their GP, but according to the Department of Health and Human Services, just 3% of them changed doctors last year”, wouldn’t it be easier to read that “43% of American adults are unhappy with their GP¹ but just 3% of them changed doctors last year²”? Having to state your sources in the text is really just a holdover from the days of print. With footnotes you can separate the story from the scaffolding behind it, and thus make both the story clearer and the scaffolding more visible.
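Separating the story from its scaffolding is simple enough to sketch in a few lines of Python. What follows is only an illustration, assuming a story is held as a list of (text, source) pairs in which the source is None for connecting text; the function name render_with_footnotes and the bracketed [1] marker style are our own inventions, not anyone’s standard.

```python
def render_with_footnotes(parts):
    """Turn (text, source) pairs into a clean body plus a list of notes.

    `parts` is a list of tuples. A source of None means the text needs
    no attribution. This structure is an assumption made for the sketch.
    """
    body = []
    notes = []
    for text, source in parts:
        if source is None:
            body.append(text)
        else:
            notes.append(source)
            # The marker number is the position of the source in the notes.
            body.append(f"{text}[{len(notes)}]")
    numbered = [f"[{i}] {source}" for i, source in enumerate(notes, start=1)]
    return "".join(body), numbered


if __name__ == "__main__":
    story = [
        ("43% of American adults are unhappy with their GP",
         "Whatalongname Foundation for Public Health Research, survey last month"),
        (" but ", None),
        ("just 3% of them changed doctors last year",
         "Department of Health and Human Services"),
        (".", None),
    ]
    body, notes = render_with_footnotes(story)
    print(body)
    for note in notes:
        print(note)
```

Because the sources live in their own list, the same data can be shown as hover text, as a block at the foot of the page, or not at all in print, without touching the sentence itself.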

As for protecting your sources: sure, some have to stay secret. But just as a story can quote “a source close to so-and-so”, the footnote could explain that a particular fact came from a protected source. The point is not to reveal the source so much as it is to reveal the process, so that others can retrace its steps.

Footnotes for news would be self-justifying if, like the citation index, they helped journalists organise their research. (The citation index might, in fact, come about as a by-product or a layer on top of the footnotes.) Some blogging platforms, such as WordPress, already have easy-to-use footnoting plug-ins, as does Wikipedia. To make them interoperable might mean adopting some standards for footnoting. But this could well come about by convergence.

>> Third, science is predictive. As we’ve already said, scientists make precise, testable predictions, whereas journalists and the people they quote tend to make general, untestable ones. And this is bad. But what’s worse is that it’s very hard to hold the makers of these predictions to account when they get it wrong.

An example that has entered the lore is Thomas Friedman of the New York Times. In November 2003, eight months after the beginning of the war in Iraq, he predicted that the following six months would “determine the prospects for democracy-building there”. Over the next three years, he made several more predictions, all stating that the following six months or so would be decisive. The blogger Atrios defined a “Friedman”, later the “Friedman Unit”, as any six-month period starting from the present moment. In the spirit of science we would like to redefine the Friedman Unit more broadly, and time-agnostically, as “that period of time within which a prediction by a public figure is almost certain to be forgotten”.

Friedman is hardly alone in this habit, though. And since the central mission of journalism is to hold the powerful to account, the fact that it is singularly bad at keeping track of predictions they make is one of its most fundamental weaknesses as an institution. On the great issues of our time (stimulus measures, global warming, healthcare, international security policy, the role of the state, and so on), the poverty of public debate is due in no small part to the fact that those who say things that later turn out to be false get away with it, because nobody can remember who said what when.

And so we think journalism needs a prediction tracker. The prediction trackers we have are mainly in the form of journalists’ own, sometimes short and patchy, memories. A more concrete one does exist: Politifact’s “Obameter”, which has been keeping track of promises Barack Obama made, and reporting on whether they are being fulfilled. But despite being a superb thing in itself, it has not become a point of reference for most journalists. It rarely crops up in stories. Readers do not flock to see how well or badly the president has been doing lately. Tracking predictions may be good for a news outlet in the long term, like eating your spinach, but in the short term it just means extra work. So it is not obviously self-justifying.

However, there are steps towards making it interoperable. Dan Schultz’s Truth Goggles project is an experiment in making data from Politifact’s Truth-O-Meter crop up next to a politician’s name in any news article, so that you can see that politician’s overall record of truth-telling. A twist on that technology might be to pull up past statements the politician had made about the topic of the article in question from a database (software already exists for aggregating such quotes from the web), so you could instantly see whether she was contradicting herself. If news organisations decide to include prediction-tracking of this sort for the benefit of their readers, they might start to realise that it would do their journalists good to make use of it too—if only to avoid looking stupid.
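For the sake of argument, here is roughly what the core of such a tracker might look like, as a minimal Python sketch. It is not how Politifact or Truth Goggles work; the Prediction and PredictionTracker names, the fields, and the sample speaker and dates are all hypothetical. The idea is simply to store each prediction with the date on which it can be tested, so that it resurfaces when that date arrives and can be pulled up by speaker and topic.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Prediction:
    """A single public prediction. Field names are invented for this sketch."""
    speaker: str
    topic: str
    claim: str
    made_on: date
    check_on: date                  # when the claim can be tested
    outcome: Optional[bool] = None  # None until someone has checked it


class PredictionTracker:
    """A toy, in-memory store of predictions."""

    def __init__(self):
        self._predictions = []

    def record(self, prediction):
        self._predictions.append(prediction)

    def by_speaker_and_topic(self, speaker, topic):
        """Past statements by this speaker on this topic, oldest first."""
        matches = [p for p in self._predictions
                   if p.speaker == speaker and p.topic == topic]
        return sorted(matches, key=lambda p: p.made_on)

    def due_for_checking(self, today):
        """Unchecked predictions whose test date has arrived."""
        return [p for p in self._predictions
                if p.outcome is None and p.check_on <= today]


if __name__ == "__main__":
    tracker = PredictionTracker()
    tracker.record(Prediction(
        speaker="A. Pundit",  # hypothetical speaker
        topic="economy",
        claim="Unemployment will fall within six months",
        made_on=date(2012, 1, 10),
        check_on=date(2012, 7, 10),
    ))
    for p in tracker.due_for_checking(date(2012, 8, 1)):
        print(f"{p.speaker} ({p.made_on}): {p.claim}")
```

The check_on date is what stops a prediction from quietly expiring: it forces whoever records the claim to say when it could be tested, and gives the tracker something to remind them of.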

3. What else might journalism learn from science?

We have focused on three traits of science that we think news ought to have more of: being collaborative, replicable and predictive. But science has many other qualities. These may or may not be worth trying to adopt. For instance, the doctrine of falsifiability is the very linchpin of what distinguishes science from speculation. But can you imagine making journalists look for proof that their ideas are wrong rather than right?

On the other hand, it’s interesting to try to imagine how the checks and balances imposed by peer review might look in a journalistic context. It already exists, in a way: comment threads let anyone (not usually the journalists’ peers, though) pick a story apart. They’re mostly junk, but there are gems of useful information in them that can make a story better. And an entire industry exists around trying to make comment threads less troll-laden and more useful in order to drive more traffic to sites, so it’s already in the media’s interests to improve them.

There is also a genuine experiment in peer review in the form of NewsTrust, a site that lets experts and journalists rate articles for accuracy, fairness and so on. Websites can choose to display the ratings next to their stories. Not many do so, because the ratings fail the self-justifying test: even if they have a long-term benefit, they don’t make a journalist’s work immediately easier. But maybe they contain the germ of a system that could.

As we’ve said, this isn’t about making journalism into science. But we hope we’ve shown that the two could have a lot more in common, and that this could make journalism better, more reliable, and more valuable to society.