The Colour of Copyright

This post is in honour of the Infinite Cat Project. Its lineage is me reading a post by Martin, who read a post by Seth, who read a post by Matthew Skala.

Matthew’s post attempts to provide a way of thinking about copyright violations, and more particularly about why computer scientists often don’t think much of copyright. He postulates that there’s an invisible “colour” associated with bits, and that computer scientists get into trouble when they try to ignore that colour.

I don’t really think that view’s helpful: colours that you can’t actually see don’t make things easier to reason about; and while sometimes you have to come up with terms to describe things because there’s no more meaningful way to look at things, this isn’t one of those cases.

The main mistake is thinking that copyright is about restricting what bits can exist. It’s not; it’s about restricting what you can do with bits. Fundamentally it’s about stopping you from making copies of bits, though sometimes it’s also about stopping you from looking at them in certain ways. As far as copyright is generally concerned, the only “colours” (which we might as well just call “attributes”) that a set of bits has are an “owner” (whoever bought them, or was given them) and a “copyright holder” (whoever created them originally, or whoever received the copyright title from that person).

Those attributes aren’t determined by the bits at all — if you sell your computer, then the owner of the software on it is changed without any of the bits that make up that software changing at all. And similarly, copies of the same set of bits, laid out in the same order with the same meaning, can be owned by a whole range of different people simultaneously — and the fact that I might own a copy of something doesn’t mean that I can steal, change, or even use your copy, even if I could do all of those things with my copy. The “copyright holder” attribute is similar — it can be changed just by signing some papers, none of the actual bits have to be flipped at all.
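To make that concrete, here is a minimal sketch in Python. The `Copy` class and the names in it ("alice", "bob", "studio" and so on) are made up for illustration; the only point is that the owner and copyright holder live alongside the bits, not inside them.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Copy:
    bits: bytes             # the content itself
    owner: str              # whoever bought, or was given, this copy
    copyright_holder: str   # whoever holds the copyright title


# Two copies of the same bits, owned by different (hypothetical) people.
mine = Copy(bits=b"\x01\x02\x03", owner="alice", copyright_holder="studio")
yours = Copy(bits=b"\x01\x02\x03", owner="bob", copyright_holder="studio")

# Selling my copy changes the owner; no bit is flipped.
sold = replace(mine, owner="carol")

# Assigning the copyright is also just paperwork.
reassigned = replace(sold, copyright_holder="publisher")

print(mine.bits == yours.bits, mine.owner == yours.owner)
print(sold.bits == mine.bits, reassigned.bits == mine.bits)
```

Comparing the `bits` fields can never tell you which copy is which; you have to look at the other fields.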

Another thing computer scientists will try to do is to treat Colour as a function (in the strict mathematical sense of “function”) of the bits — maybe an uncomputable function (in the strict mathematical sense of “uncomputable”), maybe intractable, but a function nevertheless. We either do that because we mistakenly believe that Colour really is a function, or because we’re a little more sophisticated, we know that it’s not a function, but we think that we can fake it closely enough with a function to get the lawyers off our backs. Either way, the idea is that we should be able to look at bits and somehow determine, from the bits themselves, what Colour they ought to be.

Functions, in the mathematical sense, are pretty general things. They’re just a way of saying that given a particular question (like “Am I allowed to do this?”), and an appropriate amount of information and context, there’s a single, definite answer. In mathematics, this is usually written like q(I) = A; that is, given the necessary information about the situation, I, the answer to the question, q, is always A. If the question is “What is the sum of these two numbers”, then the information you need takes the form of “The two numbers are __ and __”. If you don’t have that information (or something essentially equivalent), you don’t have a function: if you’re only told one of the numbers, you can’t give a single answer, since the sum of “1” and “some other number” can still be anything. If you’ve got more than that information (“the numbers are 3 and 4, and James Gleick writes cool books”), you do have a function, but you’re being redundant, which is frowned upon.
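The sum example can be sketched in a few lines of Python. The names `total`, `one_plus` and `possible` are just for illustration; `functools.partial` is used to stand in for "being told only one of the numbers".

```python
from functools import partial


def total(a: int, b: int) -> int:
    """q(I) = A with I = (a, b): both numbers in, one definite answer out."""
    return a + b


# Full information gives a single answer.
answer = total(3, 4)

# With only one number known, what is left is not an answer but
# another question, still waiting for the missing information.
one_plus = partial(total, 1)

# Each choice of the missing number gives a different sum.
possible = {one_plus(b) for b in range(5)}

print(answer)
print(sorted(possible))
```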

So what does that mean? Fundamentally, it means that “colour”, that is, the copyright status, isn’t a function of the bits themselves; it’s a function of how the bits were obtained, and of the legal agreements signed by the copyright holder. But that’s not the end of the story. There’s a reason why computer scientists try to make copyright status a function of bits, and it isn’t that they’re stupid, nor that they’re trying to come up with an excuse to ignore copyright (well, that’s not the reason in all cases anyway).

The ultimate reason is that we want to be able to enforce copyright in software rather than in courts, both because software’s a lot cheaper and more efficient, and because it’s potentially a lot more effective. The RIAA gets a lot of flak for suing kids for copyright violations, and they’d love it if they could just stop the kids from violating copyright in the first place, so they didn’t have to worry about enforcing their rights in the traditional way. But fundamentally, software, whether it’s trying to enforce copyright or do anything else, only gets to look at the bits, and can only come up with one answer, so a function is exactly what’s needed.

But, as we’ve established, that’s simply not possible to do accurately, and doing it inaccurately screws up the copyright balance by definition — either the copyright holders get screwed, the users get screwed, or both.
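Here is a rough sketch of why a bits-only check has to get something wrong. Everything in it is hypothetical: `bits_only_check` is an invented enforcement rule, and the two situations are assumed to differ only in whether a licence was bought.

```python
def bits_only_check(bits: bytes) -> bool:
    """Hypothetical enforcement rule that sees nothing but the bits.

    Returns True if copying is allowed.
    """
    return not bits.startswith(b"SONG")  # block anything that looks like the song


song = b"SONG...same bytes in both cases..."

# Identical bits, different histories (assumed for the example).
situations = [
    {"bits": song, "licensed": True},    # the user bought a licence
    {"bits": song, "licensed": False},   # the user copied without permission
]

# The check sees the same input twice, so it must answer the same way twice.
verdicts = [bits_only_check(s["bits"]) for s in situations]

# A mistake is any case where the verdict disagrees with the real status.
mistakes = [v != s["licensed"] for v, s in zip(verdicts, situations)]

print(verdicts)
print(mistakes)
```

This particular rule blocks both cases, so the licensed user loses out. Flip the rule to allow the song and the mistake moves to the other case instead, at the copyright holder’s expense; no rule that looks only at `bits` can be right for both.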

It’s not irrecoverable though — there’s no reason why you can’t just provide the software with all the information it actually needs: working out who the current copyright holder is could be made as easy as querying the Library of Congress’s website, or some similar body, governmental or private as appropriate. As long as you have the information your function actually needs, determining the copyright status of some bits is straightforward.
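As a sketch of what “giving the function the information it needs” might look like: the `REGISTRY` and `LICENCES` tables below are invented stand-ins for whatever a real registry lookup would return, and `may_copy` is a hypothetical name. No real registry’s interface is being described here.

```python
from typing import NamedTuple


class Situation(NamedTuple):
    work_id: str   # identifies the work, e.g. a registration number
    user: str      # who wants to make the copy


# Hypothetical stand-ins for a copyright registry and the licences on record.
REGISTRY = {"work-001": "studio"}
LICENCES = {("work-001", "alice")}


def may_copy(s: Situation) -> bool:
    """Answer "may this user copy this work?" from the records, not the bits."""
    holder = REGISTRY.get(s.work_id)
    if holder is None:
        # Unknown work: this sketch assumes "not permitted"; that is a
        # design choice here, not a statement about the law.
        return False
    return s.user == holder or (s.work_id, s.user) in LICENCES


print(may_copy(Situation("work-001", "alice")))
print(may_copy(Situation("work-001", "bob")))
```

Note that the bits themselves never appear as an input; the answer comes entirely from who holds the copyright and what has been agreed.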

What’s not straightforward is going to the next step and actually preventing copyright infringement. The above lets you inform the user that they can or can’t copy (or otherwise use) whatever they’re looking at, but it doesn’t actually prevent them from doing it, which is a whole other matter.

The computer science applications of Colour seem to be mostly specific to security. Suppose your computer is infected with a worm or virus. You want to disinfect it. What do you do? You boot it up from original write-protected install media. Sure, you have a copy of the operating system on the drive already, but you can’t use that copy – it’s the wrong Colour. Then you go through a process of replacing files, maybe examining files, swapping disks around and carefully write-protecting them; throughout, you’re maintaining information on the Colour of each part of the system and each disk until you’ve isolated the questionable files and everything else is known to be the “not infected with virus” Colour.

This is a different sort of “colour”. Whether you can trust software, which is to say, whether it does what you expect it to, or how much unexpected damage it’s likely to do, is purely a function of the bits (well, and your expectations). The problem is that it’s not one you can usually (or efficiently) work out from the bits; who wants to pore over a printout of 1s and 0s that goes on for thousands of pages, anyway? It’s similar only in the outcome: when we want to work out the correct answer, we don’t look at the bits, we look at other information. But that’s not as strong a case as we can make for copyright: in establishing trust, we could look at the bits, but it’s easier to look elsewhere. For establishing copyright, we have to look elsewhere.

Compare this with another sort of activity courts look at. Murder’s a crime which should be punished. But we don’t punish possession of a bloody knife — after all, that could just indicate you cut yourself peeling a potato and haven’t yet rinsed it. It’s not the results of a murder that indicate punishment’s warranted: not the bloody knife, not the body, not even the confession — it’s the act of murder itself that requires punishment, and the rest are just evidence that indicates the act did (or didn’t) actually happen.

The same’s the case for copyright violations: it’s the act that’s the problem, not the end product; but that doesn’t mean you shouldn’t look at the end product as (possible) evidence of an infringing act being committed.

UPDATE 2004/06/24:

Oh, I suppose I may as well note it before someone else does. Advantage: Inchoate.
