On Network-based Copyright Enforcement

Terry Gray
Written: 24 Apr 2007
Revised: 10 May 2007 and 04 June 2007

INTRODUCTION

On April 19-20, 2007, a workshop was held in Washington, D.C., with the goal of defining requirements for network-based appliances/tools aimed at reducing copyright infringement (primarily music/movies/videos).

Attendees included reps from Higher-Ed, RIAA, Movielabs, Warner, NBC/Universal, and several enforcement product vendors.

Most of the requirements identified tended to be pretty general, and several industry reps felt they were "not too helpful" for shaping a product. On the other hand, our moderator (Mark Luker of EDUCAUSE) pointed out that many existing products could not meet some of the newly-stated requirements. There were, however, some very positive vibes flowing about some of the ideas exchanged toward the end of the meeting.

My thoughts below are not limited to the stated agenda of product requirements specification; rather, they summarize my sense of where things are in this space, based on what I heard at the workshop (both in the scheduled sessions and private conversations) plus my own reading and thoughts about the subject.

I. Premise #1: The problem space

We all agree that:

Rights-holders should be compensated for their intellectual property (and, hopefully, artists too :)
Sharing copyrighted files without authorization and beyond any reasonable definition of fair-use is and should be illegal.
Higher Ed can and should try to help rights-holders, up to a point --but that point varies widely from one institution to another, for reasons outlined in section III below.
Infringement methods and scenarios constantly evolve, e.g. moving from wired to wireless-connected computers, and thence to handheld devices, and from open protocols to encrypted ones.
Constraining massive sharing of infringing material to/from "the Internet" is the highest priority, although local sharing is a growing concern.
This is a really hard problem, and not specifically or uniquely a Higher-Ed problem, except that HE has lots of bandwidth and, many would agree, a civics/ethics education role, although not necessarily in-loco-parentis accountability for the ethical behavior of students who happen to live on campus, i.e. for whom the university just happens to be the landlord.

We do not all agree on whether:

The infringement problem on campus is getting better or worse (though most agree there are improvements in some areas).
There are good reasons not to block P2P apps known only for infringing (e.g. eDonkey, Limewire).
Any technical enforcement method can make a significant difference (vs. being quickly circumvented.)
Privacy concerns can be mitigated sufficiently to allow widespread deployment of enforcement technology on campuses.

II. Premise #2: The solution space

A three-legged stool:

a. Making sure individuals know/agree-on what is right vs. wrong (and "society" too.)
b. Making it easy to do the right thing.
c. Making it hard to do the wrong thing. This includes both technical and non-technical policy enforcement strategies (although the workshop was specifically focused on requirements for *technical* enforcement.)

Part (a) is the responsibility of many, including parents, the entertainment industry, legislators, and educators.

Part (b) is the responsibility of the entertainment and computer/mobile device industry.

Part (c) is the responsibility of ... ??

This is a philosophical debate point which some would like to resolve legislatively, but as a practical matter, every HE representative at the meeting was present because their institution was interested in being helpful, and exploring the feasibility of technical means for making it harder for students (and others) to "do the wrong thing".

(Section added: 2007 May 10)

From a technical "systems" perspective, there seem to be two general approaches to the part (c) "make it hard to do the wrong thing" problem. The first involves strategically placed intercept boxes that (with varying degrees of accuracy) can identify and immediately block potentially infringing material. These have the advantage of not needing any mapping between network data and user identity. They have multiple disadvantages, however, including potentially undesired impact on the network, susceptibility to encryption work-arounds, and the general assumption that "copyrighted == infringing".

A second approach includes:

a component to identify potentially infringing material.
a component to map available information from the above (typically an IP address --possibly dynamic-- and a reliable timestamp) into a user identity.
a component to take action, e.g. notification or blocking.
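
As a minimal sketch of the second component above, the mapping from an (IP address, timestamp) pair to a user identity might look like the following. It assumes the institution keeps address-assignment records (for example, from DHCP or login logs); the `Lease` record format, the `user_for` helper, and the sample data are hypothetical and only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Lease:
    """One address assignment (hypothetical record format)."""
    ip: str
    user: str
    start: datetime  # assignment begins (inclusive)
    end: datetime    # assignment ends (exclusive)


def user_for(ip: str, seen_at: datetime, leases: list[Lease]) -> Optional[str]:
    """Return the user who held `ip` at `seen_at`, or None if no lease covers it."""
    for lease in leases:
        if lease.ip == ip and lease.start <= seen_at < lease.end:
            return lease.user
    return None


# Illustrative data: the same dynamic address is held by two users on one day.
leases = [
    Lease("10.0.0.5", "alice", datetime(2007, 5, 1, 8, 0), datetime(2007, 5, 1, 12, 0)),
    Lease("10.0.0.5", "bob", datetime(2007, 5, 1, 12, 0), datetime(2007, 5, 1, 18, 0)),
]

print(user_for("10.0.0.5", datetime(2007, 5, 1, 9, 30), leases))  # alice
print(user_for("10.0.0.5", datetime(2007, 5, 1, 13, 0), leases))  # bob
print(user_for("10.0.0.5", datetime(2007, 5, 2, 9, 0), leases))   # None
```

The same address resolves to different users at different times, which is why the timestamp from the detection component has to be reliable and synchronized with the clock used for the assignment records.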

One challenge with both approaches concerns where the sensor(s) live, with respect to traffic flows in the network. Choices include:

many sensors in many places in the institution's network (widely perceived as impractical).
sensors placed at institutional borders, which need to accommodate very high-speed network links, and which will not see local/intra-institutional traffic (much less, intra-subnet traffic).
sensors which operate at the application level, directly participating in the file sharing community. These can be placed anywhere in the network. They have the advantage of working even when traffic flows are encrypted, and regardless of network topology.

III. Concerns and Constraints

Imagine a magic box that could somehow identify infringing material so that action could be taken. Its output would include synchronized date/time and the address (IP and/or MAC) of the device receiving or sending potentially infringing material, although desired responses and outputs might vary (as noted below).

What are the barriers to developing such a device, and what are the barriers to deployment of such a device? I suggest a taxonomy of four categories of concerns/constraints:

1. Philosophical concerns ("Is this a dagger I see before me?")

a. Concern about why this is Higher-Ed's problem to solve, beyond quickly responding to complaints, and trying to help students learn about proper behavior in a civil society. (Usual metaphor: networks as highways; Departments of Transportation aren't responsible for policing the contents of the trucks.)
b. Concern about setting precedent for becoming "content police" with implications in other domains, e.g. content that might be perceived by some as lacking any socially redeeming value :)

2. Legal concerns ("Stuck in the middle again")

a. Concern about loss of DMCA safe-harbor defense.
b. Concern about student (and staff) privacy rights, especially when mandated by state and/or federal laws, and especially when we're talking about activities going on in people's *homes*.

3. Cost/Benefit concerns (questionable ROI)

a. Concern about deployment costs, which in large schools could be huge if intra-LAN traffic needs to be monitored.
b. Concern about administrative costs associated with operations and adjudication.
c. Whack-a-mole phenomenon, which tends to significantly limit the lifetime of any technical solution.
d. Encryption, which makes deep packet inspection impossible, thus making *network* level identification moot (although it may not preclude *application* level identification from the "edge" of the network.)
e. Device-to-device sharing via ad hoc networking is a growing concern for rights holders, but is not a problem higher-ed can solve via technical means.

4. Operational concerns ("Messing up the network")

a. Performance: few "traffic disruption appliances" can cope with the performance requirements often found within research universities (e.g. uncompressed HD videoconferencing).
b. MTTG --Mean Time To Glitch: unintended collateral damage, or even intentional consequences that lead to complaint calls to the Network Operations Center (NOC).
c. MTTD --Mean Time to Diagnosis: some content monitoring/blocking designs make it difficult to know what has happened or diagnose the problem when a customer calls the NOC... the user experience might be similar to that resulting from other potential network faults.
d. Violating "Principle of Least Surprise" --policy enforcement points that disrupt network traffic should tell end-users what they are doing and why their action may be failing.
e. Impact on innocents: our residence halls have some non-students in them (sometimes spouses, sometimes staff, sometimes others), and technical constraints to inhibit copyright infringement by the target class might unreasonably constrain (non-infringing) activities of others (e.g. at UWashington our "no servers in dorms" policy enforcement via ACLs is problematic for some who need certain protocols to work a certain way for access to their employer's network).

IV. Possibilities

Although nothing was presented that invalidates Bruce Schneier's wisdom ("trying to make digital media uncopyable is like trying to make water unwet"), no one argued that *nothing* could be done. Most industry voices tended to focus on stool-leg #3 (making it hard to do the wrong thing, i.e. enforcement); this was not a surprise, especially since that was the subject of the workshop.

In contrast, I suspect HE folks tend to focus on stool-legs #1 and #2 (ethics education and making it easy to do the right thing). Several industry reps also expressed the view that making it easier for people to do the right thing was a key to success, although there were lively debates over dinner as to whether the education establishment or the entertainment establishment bears more responsibility for (some/many?) young people not knowing right from wrong.

Re objection=Loss of DMCA Safe harbor: Bruce Block's view is that industry and legislators can solve this.

Re objection=Legal/Invasion of Privacy: Craig Seidel's idea of having a magic box that does not disclose info to university staff, but communicates directly with the user, might mitigate many privacy concerns. The school would have to provide an IP-addr-to-email-ident mapping service for the monitoring box to use, however. Different institutions will have different thresholds-of-pain on privacy, so this would be but one approach; perhaps most suitable for an automated "first notice" scenario.

Re objection=Operational/Messing up the network: Use of out-of-band detection/identification, allowing for a range of institutional responses (from direct/automated communication with the user to support for progressive discipline, e.g. a "3 strikes" model), seemed promising to many at the meeting. It avoids all of the operational concerns about messing up the network (although there is still the issue of where to locate a monitoring device, and how much potentially infringing traffic it can see.)

The observation that a non-inline ("out-of-band") monitoring scheme avoids huge network operational problems and supports flexible response led to considerable enthusiasm for that approach, and may have been the most important part of the meeting.

Although the trend toward encryption will not stop, and will be accelerated by countermeasures against non-encrypted file-sharing, there are still some application-level possibilities for identifying potentially infringing end-points.

It seems that use of watermarking and serialization is likely to increase, especially since this provides opportunities for the industry to add value for customers (e.g. discounts, fan stuff, access to concert or movie tickets.) Such strategies need to accommodate the moral equivalent of buying a CD and giving it as a gift, presumably by transferring a key of some sort to the giftee.

Use of AV-like software on user devices is another approach, and the key technical alternative to network-based ID and/or intercept of potentially infringing material. The big advantage of this approach is that it moves enforcement to the edge, thus avoiding the "mess up the network" problem, and is potentially effective even when the network traffic is encrypted. The big disadvantage is that software on general purpose computers can usually be circumvented, especially when everyone has administrator privileges.

The industry really, really doesn't like the idea of compulsory licensing --although many outsiders have argued that this will ultimately be inevitable, because the whack-a-mole and encryption issues make it the lesser evil (compared to leaving *all* the money on the table), and they point out that compulsory licensing need not mean all artists receive the same royalty. The technical problem of providing popularity statistics for divvying up a revenue stream is far easier than stopping the data streams themselves. (Counter-argument from industry: compulsory licensing might work for most popular artists, but isn't so good for "the long tail".)

Solution stool-leg #1 (getting a shared vision of right and wrong --a battle for the hearts and minds of our young people!) was clearly outside the scope of the workshop, but nevertheless a fertile ground for discussion over dinner, with inevitable questions about the effectiveness of DRM, lawsuits, P2P blocking, etc. (Several of us have ideas on how the RIAA might shed its mantle of "Most hated 'corporation' in America", and believe that doing so is very important in this hearts-and-minds battle.)

A strong sub-text in the enforcement area is "be careful what you ask for". As one workshop participant, representing a product vendor, described it to me (approximately): "Don't try to completely block the P2P traffic; if you do, it will become even harder-to-detect encrypted traffic."

On the other hand, in response to the view that technology is agnostic and we should regulate activities, not specific protocols or apps, industry is still waiting to see any evidence that Ares, eDonkey, and Limewire are actually being used for any legitimate, non-infringing purpose (in contrast to BitTorrent, which clearly is being used for legit purposes.)

V. A way forward?

If I were charged with solving this problem, I would certainly put most of my effort in the first two solution "stool-legs", specifically on reaching cultural consensus about what constitutes fair use, and incentives to do the right thing (this translates to making the legitimate media services more attractive than the illegitimate ones, for all but the hardened sociopaths.) But we also need to come to terms with the irony (pointed out by Howie Singer of Warner) that people who steal music will often/willingly pay $2 for a 30-second ring-tone on their cell phone --so it's hard to argue that "making it harder to do the wrong thing" isn't an important part of the overall solution.

On that point, making it harder via technical means, I would do the following:

a. Recognize that the trends toward encryption are not likely to be reversed; indeed my hypothesis that the Internet is becoming a collection of enterprise and home networks linked by port 443 continues to gain supporting evidence. Thus identification of infringing material must be done by hosts operating at the application level, rather than via deep packet inspection in the network core. (Yes, there are network intercept products available that will work for a while longer, but the more fully they are deployed, the more quickly they will be rendered obsolete.)

b. Recognize the enormous benefit of a strategy (enunciated eloquently on Friday afternoon by Steve Wallace of I2 and IU) that avoids HE's operational ("messing up the network") concerns by making the inspection/identification devices work out-of-band, via mirror or span-ports. (Update: or via application-level "participant" sensors, which can be placed anywhere in the Internet.)

c. Recognize the enormous attractiveness of a solution that affords the institution a rich set of enforcement actions, ranging from disintermediated direct notification of users, to notification of officials, to direct network action (blocking), perhaps as the final stage of a progressive adjudication regime. Systems that preserve a presumption of innocence will be very attractive for many HE institutions, both philosophically and because they lead to "teachable moments" with the students.

d. Recognize that partial visibility is better than no visibility. Thus, try not to accelerate more rapid adoption of encryption and "dark net" strategies by those seeking to avoid paying for their songs/movies.

e. Investigate technical and legal feasibility of Craig Seidel's idea about devices that communicate directly with the user (at least for the first notice) as a way of mitigating privacy concerns.

This would lead to a "magic box" that would operate on two levels:

1. It would be a participant (a faux-client) in popular P2P file-sharing arrangements, where that is technically possible. The modified client code would look for sources or destinations that were within the institution's IPv4 and IPv6 address ranges.

2. It could accept mirrored traffic feeds looking for unencrypted P2P or flash video traffic (for as long as such exists), or other signatures that would point to potential infringement. As above, it would be up to the vendor to distinguish between infringing and non-infringing sessions.
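
The address-range test in item 1 can be sketched with Python's standard `ipaddress` module. This is only an illustration under stated assumptions: the prefixes below are reserved documentation ranges standing in for a real campus allocation, and `is_campus_address` and `campus_peers` are hypothetical helper names, not part of any existing P2P client.

```python
import ipaddress

# Hypothetical campus prefixes (reserved documentation ranges, not real allocations).
CAMPUS_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_campus_address(addr: str) -> bool:
    """True if `addr` (IPv4 or IPv6 text form) falls inside any campus prefix."""
    ip = ipaddress.ip_address(addr)
    # A v4 address tested against a v6 network (or vice versa) is simply False.
    return any(ip in net for net in CAMPUS_NETWORKS)


def campus_peers(peers: list[str]) -> list[str]:
    """Keep only the peer addresses that belong to the institution."""
    return [p for p in peers if is_campus_address(p)]


# Peer addresses as a faux-client might learn them from a swarm (made-up values).
peers = ["192.0.2.17", "198.51.100.4", "2001:db8:1::9", "2001:4860::1"]
print(campus_peers(peers))  # ['192.0.2.17', '2001:db8:1::9']
```

A filter like this only narrows the peer list to institutional addresses; deciding whether a given session is actually infringing remains a separate problem.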

The range of actions that could be configured by such a magic box would be as noted above, including:

Option for direct communication with user (assumes institution provides IP to email mapping), perhaps for first notice.
Delivery of records to school officials for human follow-up and "re-education" or explanation of why traffic was not infringing.
Support for progressive discipline; i.e. at some point a repeat offender gets shut down automatically.
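
One way to picture the configurable range of actions listed above is as a simple escalation policy. The sketch below is an assumption-laden illustration, not a description of any product: the `Action` names, the `ProgressiveDiscipline` class, and the default threshold of three incidents are all hypothetical, and a real institution would choose its own stages and thresholds.

```python
from collections import Counter
from enum import Enum


class Action(Enum):
    NOTIFY_USER = "direct first notice to the user"
    REFER_TO_OFFICIALS = "records delivered to school officials for follow-up"
    BLOCK = "automatic shutdown of a repeat offender"


class ProgressiveDiscipline:
    """Escalate per user: first notice, then human follow-up, then blocking."""

    def __init__(self, block_after: int = 3):
        # Number of recorded incidents at which blocking begins (assumed default).
        self.block_after = block_after
        self.incidents = Counter()

    def record(self, user: str) -> Action:
        """Record one potential-infringement incident and return the response."""
        self.incidents[user] += 1
        count = self.incidents[user]
        if count == 1:
            return Action.NOTIFY_USER
        if count < self.block_after:
            return Action.REFER_TO_OFFICIALS
        return Action.BLOCK


policy = ProgressiveDiscipline(block_after=3)
for _ in range(3):
    print(policy.record("student-a").name)
# NOTIFY_USER
# REFER_TO_OFFICIALS
# BLOCK
```

Keeping the policy separate from detection leaves room for the human step in the middle, where an official can conclude the traffic was not infringing before anything is blocked.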

I would also be working with my library and vendors to see how instructional fair-use requirements (e.g. for film classes) could be addressed in an orderly way. This might mean, among other things, a white-list feed into the magic box above.

Finally, and before spending a lot of effort on magic boxes, I'd want to get a much better handle on current infringement trends and plausible success metrics. For example:

Claim: reduction in DMCA notices would be a good success metric. Counter-claim: Lack of DMCA notices does not imply lack of infringement.

Claim: reduction in P2P traffic would be a good success metric. Counter-claim: Lack of (visible) P2P traffic does not imply lack of infringement, and legitimate P2P use is increasing.

At UWashington, RIAA complaints are down, but complaints about TV clips (via BitTorrent) are up. Recidivism after a first complaint is essentially nil (and that was not the case two years ago.) Should we feel good or bad about that story?

Certainly our university, like many/most others, is processing complaints without objection, and is in general attempting to work with the rights-holders in good faith, even when the machine in question is not owned by the university.

Questions I'd want to ask include:

How much infringing is going on that isn't being detected already by the rights-holders' current methods, and which could plausibly be detected by a "magic box"? How do we find out?
How soon before deep-packet inspection of network streams is essentially impossible due to encryption?
How soon before wired network connections are irrelevant, as compared to wireless laptops or handheld devices? (They are nearly so already at UWashington.)
How soon before campus nets are largely irrelevant to sharing, as wireless devices exchange data via ad hoc nets or cell carrier nets, and are therefore totally outside the control of the institution?
As legal services evolve and improve, how much effort is warranted in the network-based technical enforcement realm? (i.e. ROI?)
What are the industry associations doing to win the hearts-and-minds battle, and what are the prospects?

That's all for now....

p.s. "just one more thing" --here's some recent data on what UW's residence hall Packeteer sees coming in:

HTTP 34%
Unknown 20%
FlashVideo 16%
ClubBox P2P 6%
iTunes 3%
Winmedia 2%
Real Media 2%
BitTorrent 2%
NNTP 1%
MPEG-Audio 1%
GRE tunnels 1%
SSL 1%
Skype 1%
MPEG-video 1%
FTP 1%
QuickTime 1%
port 8080 1%
WorldOfWarcraft 1%

IRC, Winamp, AOL-IM-ICQ, Half-life, MSN-Messenger and everything else the Packeteer recognizes add up to the remaining 5%
