Tor, Libraries, and the Department of Homeland Security

During an appearance on the LITA Top Technology Trends panel in 2014, I was discussing privacy of patron data, and mentioned that I thought it was a good idea for libraries to run Tor nodes on library servers. So when the Library Freedom Project launched their Tor in Libraries project, I was totally behind them…I even ran a Tor session for librarians as part of their workshop at ALA Annual in San Francisco.

If you aren’t familiar with Tor, I recommend reading the Wikipedia article. The TL;DR version is that Tor is a protocol and a network that is currently the best mechanism that we have for accessing information on the Internet anonymously. There are a few ways that one can use Tor, ranging from using an operating system that routes all your Internet traffic over the Tor network to just using the Tor browser, which anonymizes only your web traffic.

The way that Tor anonymizes your traffic is through a combination of encryption and blind routing. When you initially connect to the Tor network, the connection is encrypted in much the same way that the connection to your bank would be, via a public key encryption system. When you make a request for a website through the network, the Tor protocol bounces the request from one network node to the next, with a separate layer of encryption for each hop. Once the traffic gets a couple of hops away from the originating computer, no single node can tell where the request came from. Eventually the traffic exits the Tor network, back onto the regular old Internet, and gathers what you asked for, then reverses the process to get back to you.
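The layering idea can be sketched in a few lines of Python. This is a toy illustration only: the XOR "cipher", the relay names, and the keys are all my own stand-ins, and none of it is Tor's actual protocol or cryptography. The point is just the shape of it: the sender wraps the request once per relay, and each relay can remove only its own layer.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Stretch a key into `length` bytes (toy construction, NOT secure)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def xor_layer(data: bytes, key: bytes) -> bytes:
    """Add or remove one layer; XOR with the same keystream undoes itself."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

# Hypothetical per-relay keys, in path order: entry, middle, exit.
relay_keys = [b"entry-key", b"middle-key", b"exit-key"]

def wrap(message: bytes, keys: list[bytes]) -> bytes:
    """Sender side: apply the exit layer first so the entry layer is outermost.
    (With this toy XOR cipher the order is cosmetic; with a real cipher it matters.)"""
    for key in reversed(keys):
        message = xor_layer(message, key)
    return message

def route(cell: bytes, keys: list[bytes]) -> list[bytes]:
    """Each relay in turn peels its own layer; return what each relay passes on."""
    passed_on = []
    for key in keys:
        cell = xor_layer(cell, key)
        passed_on.append(cell)
    return passed_on

request = b"GET http://example.org/"
hops = route(wrap(request, relay_keys), relay_keys)
```

Here `hops[0]` and `hops[1]` are what the entry and middle relays forward, which is still scrambled, and only `hops[-1]`, what comes out of the exit relay, is the readable request.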

The result is that, under ideal conditions, it is completely impossible to track or trace what’s being transmitted via Tor. For Tor to continue to operate, it needs two sorts of nodes…relay nodes that act as the “bouncing” nodes inside the network, and exit nodes that are the places where the traffic goes out of the encrypted Tor network and back onto the regular Internet. You need both, although a ratio of more relay nodes to fewer exit nodes is fine. The traffic that goes across relay nodes is completely anonymous…from the perspective of both the network and the individual server, it is just a random string of binary code. Only at the exit nodes is the traffic decrypted, and thus exit nodes bear the brunt of all of the requests going across the network. The traffic for the broader network all has to squeeze itself through exit nodes, and the fewer exit nodes there are, the easier it is for them to be monitored…although you can’t tell where the requests for the information came from without advanced knowledge.

So why am I talking about Tor? Because I wanted to set up the story that broke last week about the first library in the US to publicly go live with a Tor relay (a middle relay) getting pressured by their local police to turn it off. The police were, in turn, pressured by the US Department of Homeland Security. From the original article on the event:

In July, the Kilton Public Library in Lebanon, New Hampshire, was the first library in the country to become part of the anonymous Web surfing service Tor. The library allowed Tor users around the world to bounce their Internet traffic through the library, thus masking users’ locations.

Soon after state authorities received an email about it from an agent at the Department of Homeland Security.

“The Department of Homeland Security got in touch with our Police Department,” said Sean Fleming, the library director of the Lebanon Public Libraries.

After a meeting at which local police and city officials discussed how Tor could be exploited by criminals, the library pulled the plug on the project.

“Right now we’re on pause,” said Fleming. “We really weren’t anticipating that there would be any controversy at all.”

He said that the library board of trustees will vote on whether to turn the service back on at its meeting on Sept. 15.

Why do I think that libraries should be running Tor nodes? I had a long discussion about this on Twitter recently, but let me use the freedom of more than 140 characters to try and talk through my thinking on this. Tor is, currently, the best option that people have for anonymous speech on the Internet. It is possible to create accounts without using your real name, it’s possible to use wifi at coffeeshops and your local library to prevent your IP from being recorded…but for real anonymity of network traffic, nothing beats using Tor.

Anonymous speech is important because it is a necessary component of the freedom of speech. The US Supreme Court has ruled again and again that the right to anonymous speech is a protected part of the First Amendment, saying in McIntyre v. Ohio Elections Commission:

Anonymity is a shield from the tyranny of the majority…It thus exemplifies the purpose behind the Bill of Rights and of the First Amendment in particular: to protect unpopular individuals from retaliation…at the hand of an intolerant society.

Libraries have been concerned over time with the Freedom to Read, but to doubt the role of the library in the Freedom of Speech in the US is to fundamentally misunderstand the Library (and possibly speech itself). Speech is a necessary precursor to Reading, as creation is a necessary precursor to consumption. Libraries are and should be cornerstones of free expression in the United States, and have worked to provide access to the tools of speech for years and years.

For the Department of Homeland Security to use the bogeyman of “bad things happen on Tor” as a lever to get the relay turned off is the worst sort of fear mongering. Any tool can be a weapon, and any communications mechanism can and probably will be used to enable illegal activity. There is enormously more illegal activity on the open Internet, and yet libraries everywhere provide open and robust access to the Internet via both terminal and wifi. To paint Tor as a haven for thieves and drugs and child pornography is not only to misunderstand the Tor network but also, in my opinion, to miss the forest for the trees. Yes, tools can be used for immoral and illegal things. But that does not make the tool either immoral or illegal.

The only rational explanation for the DHS pressuring the library to shut down their Tor relay node is that the DHS doesn’t want individuals, including US citizens, to have more robust mechanisms for anonymous speech. Per the US Supreme Court’s rulings on the links between anonymity and freedom of speech, this indicates to me that the DHS is actively attempting to prevent free and open speech on the Internet.

That is not ok with me, and it absolutely should not be ok with libraries.  

If you have made it this far, please visit the EFF’s Take Action page on this effort and sign.

Creative Commons NC clause and 3D printing

I was browsing through some 3D printing files today, STLs both that I produced and that others produced. For example, I designed and uploaded a 3D case for a LibraryBox that others have downloaded and printed. It is CC licensed, specifically CC BY-NC. I was looking at other STL files that had a CC NC license applied to them, and it made me wonder what that NC is really protecting.

Obviously, at the very least, the license prevents others from selling the STL files. Does it, however, prevent someone from using the files to create the physical object (that is, using a 3D printer to print the box itself out) and then selling the object? My instinct says yes, as the physical object is an instantiation of the digital file. But let’s scale the example up…what if someone built a house based on CC NC licensed plans? Could they ever legally sell the house?

This is purely theoretical. To my knowledge, no one is selling my designs, and I’m not planning to sell anyone else’s. But I am curious where the line between licensing a digital file and the resultant legal implications of the physical instantiation of that file might be.

The only real discussion I can find online is of a case that was written up by Make, US Legal Lessons from Canada’s First STL IP Infringement Case. The discussion there indicates that Make’s author, Michael Weinberg, doesn’t believe that, once printed, there is any protection for a utilitarian object under copyright law (and since that’s what underpins Creative Commons, nothing there either).

Anyone want to weigh in on this?

Heresy and Patron Data

I’ve spent a lot of time over the last several years thinking, writing, and speaking about ebooks. I’m on the Board of Directors of Library Renewal, a group dedicated to finding ways to make the ebook experience a good one for libraries, publishers, and authors. And I’ve spoken all over the US and Internationally about eReaders and how digital content changes libraries. So what I am about to suggest is something that has been rattling around in my head for some time now, and I feel like it’s something that I’d love to hear other thoughts about.

When we look at how libraries, publishers, and authors all interrelate vis-à-vis electronic content, specifically ebooks, the models that are largely being forwarded are straightforward economic models. The rights-holders have content, we want content, we pay them for content. Most of the disagreement comes down to the details: how much are we paying, and what rights do we have to the content that we are paying for. The majority of “new” models that are being trumpeted in libraryland, like the Douglas County Public ebook model, are just differently-arranged ways of doing exactly the same thing…which, admittedly, gives different outcomes on the two contentious fronts (cost and rights) but isn’t actually new in any significant way.

In an economic system, when one side of an equation (libraries) wants something from another side (rights-holders), there is an exchange of value that takes place wherein both sides agree that said value exchange is fair in both directions. Libraries pay money for content…this is, at its base, just a value exchange between libraries and publishers.

Libraries don’t want a free ride as far as ebooks are concerned. Every single librarian that I have spoken with is perfectly willing to continue to pay for content. Unfortunately, the economics of libraries are such that when we want more rights (the ability to check out ebooks to any number of patrons simultaneously, or the right to ILL ebooks, etc) we don’t have the ability to exchange our typical economic instrument (money) for them. Think about Amazon and their ability to put the Harry Potter books into their Lending Library…freely available to anyone with an Amazon Prime membership. Libraries would kill for the right to do this, but Amazon is the one that can write the check. If we had tens or hundreds of millions of dollars to throw at publishers, we could dictate any rights we wished. But we don’t.

So the question that’s been bugging me is: what else do we have, besides cash, that is of value to the rights holders and could be traded for more of what we want? Libraries generate value in enormous numbers of ways, but what do we have that publishers might want that would give us some bartering ability?

Some librarians have started looking at these value-exchanges in a new way. Toby Greenwalt, a librarian at the Skokie Public Library, started asking what the value was to the publishers of the awards that the American Library Association gives out for children’s and young adult titles, and Andromeda Yelton followed up with a look at how those awards related to the ability for libraries to lend those books electronically. Here’s something that the ALA does, which appears to be of significant value to publishers, with no visible complementary exchange of value going the other direction.

Finally we get to what I’ve been thinking of as my heretical idea. Because when I think about what other thing of value libraries have that could potentially be traded to publishers in order to get equivalent value back from them in the way of ebook rights, I keep coming back to one thing:

Information. Information about our patrons, information about our circulations of individual books, and demographic information about our users and what books they read.

I know. A lot of librarians just stopped reading, or perhaps began clutching the arms of their chairs a bit too tightly. Patron information! The holiest of holies in library land, the Thing Which Must Not Be Shared! One of the core tenets of librarianship is that the borrowing history of the individual is sacrosanct. And for very, very good reasons…it doesn’t take a paranoid person to see the ways in which reading histories should be kept private, from the teenager looking for information about sexuality to the individual checking out a book about chronic illness (you wouldn’t want your insurance company to know that, now would you?). As the saying goes, “show me what you read and I’ll tell you who you are”.

But this information is valuable. Publishers would love to know more about their readers, as it helps them to make better decisions about what to publish, how to market, and what sorts of books a given population is more likely to buy. The amount of data that libraries could have in this realm is enormous, and could be a huge lever with which to move the playing field that we are all currently on regarding ebooks.

I am very aware that there are huge problems with this idea. The data in many cases is actually non-existent (libraries are very good about dumping this data so that it can’t be used by law enforcement or others in negative ways against readers). In order to maintain any sort of patron trust, there would have to be serious thought given to sanitization of the data, stripping of individually identifying information, and more (and yes, I am aware that stripping of individually identifying information has been shown to be basically useless…I retain some hope that there is a way to do it that isn’t). It is also the case that, with the rise of cloud-based ILS systems, this information is going to be more available than ever, and centralized on servers that are out of libraries’ control.
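To make “sanitization” a little more concrete, here is a rough sketch of one possible approach: drop patron identifiers entirely, report only counts per group, and suppress any group too small to hide in. The field names (`patron_id`, `title`, `age_band`) and the threshold of 5 are hypothetical choices of mine, not anything a real ILS exports, and this sort of aggregation is not by itself a guarantee against re-identification.

```python
from collections import Counter

MIN_GROUP_SIZE = 5  # hypothetical threshold; smaller groups are withheld

def aggregate_circulation(records, min_group_size=MIN_GROUP_SIZE):
    """Return checkout counts per (title, age_band), without patron IDs.

    Groups with fewer than `min_group_size` checkouts are dropped, since a
    count of 1 or 2 can point straight at an individual reader.
    """
    counts = Counter((r["title"], r["age_band"]) for r in records)
    return {group: n for group, n in counts.items() if n >= min_group_size}

# Made-up sample data: six checkouts of one title, a single checkout of another.
records = (
    [{"patron_id": i, "title": "Book A", "age_band": "18-29"} for i in range(6)]
    + [{"patron_id": 99, "title": "Book B", "age_band": "60+"}]
)
report = aggregate_circulation(records)
```

In this sample the lone “Book B” checkout never appears in `report`, and no patron ID survives the aggregation.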

But if we want the next decade to be a good one for us, libraries and librarians need to put some serious thought into what our other value-creation areas are, and how we can begin to identify and trade on those against the rights-holders. Because our money is getting thin, our prices are going up, digital is likely to kill our existing model completely, and we need new ways to think about these things.

What else do we have? What sort of leverage do we have that we aren’t using? What can we bring to the negotiating table that we haven’t yet?

Library of Congress blocks access to Wikileaks

This is evidence of the insane world we’re currently living in…the Library of Congress, ostensibly the Library of Record for the United States, is currently blocking access to Wikileaks on its staff computers as well as its guest wireless network.

From the above story, the Library issued a statement, saying:

The Library decided to block Wikileaks because applicable law obligates federal agencies to protect classified information. Unauthorized disclosures of classified documents do not alter the documents’ classified status or automatically result in declassification of the documents.

Oh, really? Is that so?

Anyone online realizes this is a senseless act, and that anyone with any knowledge of the Internet will be able to get around this sort of filter trivially…this does absolutely nothing to protect classified information. As far as I can tell, it does nothing except make the Library of Congress look asinine. Perhaps the librarians running the LoC should take another gander at the Library Bill of Rights to remind themselves what exactly it is that they should be doing.

I hope that there is serious fallout for those who made this decision. ALA Council…here’s a discussion worth having.

There's an app for that – OITP Brief on Mobile

The American Library Association Office of Information Technology Policy, better known as ALA-OITP, just released their Policy Brief on Mobile Tech, There’s an App for That! Libraries and Mobile Technology: An Introduction to Public Policy Considerations. Written by Timothy Vollmer, formerly of OITP and now working for Creative Commons, it’s a great “state of the union” brief on Mobile tech, and how it affects the library world in the current and near-future time frame.

I was honored to have been an early reader on this piece, and to have been able to give feedback to Timothy as he worked it up. If you have any interest at all in the future of libraries and the mobile world, this is a must read.

Disney, Libraries, and Copyright

Anyone who is even remotely familiar with Copyright Law in the United States has probably heard of the Mickey Mouse Protection Act, passed in 1998 largely due to the lobbying power of the Disney corporation. Anyone who has children, or is just a fan of the Disney oeuvre, is likewise aware of their “Disney Vault” system, wherein Disney releases one of their animated films on DVD for a limited time, and then withdraws it from retail sale for between 7 and 10 years.

The tension between copyright, libraries, and Disney’s Vault process is one that was brought to light for me in a series of tweets this past week from Gretchen Caserotti. She was struggling to replace films in their collection due to damage, loss, etc, and discovered that there are 22 Disney films that they can’t replace now due to Disney’s Vault. I hadn’t thought about this as a result of the Vault, and my first thought when the issue came up was Section 108 of the US Copyright Code…otherwise known as the Reproduction by Libraries and Archives section.

Librarians should all be familiar with Section 107, the Fair Use provision of US Copyright law. But I’m consistently surprised how few librarians know Section 108. It gives libraries specific abilities regarding copyright, and with regard to the Disney issue, I immediately thought of this section:

(h) (1) For purposes of this section, during the last 20 years of any term of copyright of a published work, a library or archives, including a nonprofit educational institution that functions as such, may reproduce, distribute, display, or perform in facsimile or digital form a copy or phonorecord of such work, or portions thereof, for purposes of preservation, scholarship, or research, if such library or archives has first determined, on the basis of a reasonable investigation, that none of the conditions set forth in subparagraphs (A), (B), and (C) of paragraph (2) apply.
(2) No reproduction, distribution, display, or performance is authorized under this subsection if —
(A) the work is subject to normal commercial exploitation;
(B) a copy or phonorecord of the work can be obtained at a reasonable price; or
(C) the copyright owner or its agent provides notice pursuant to regulations promulgated by the Register of Copyrights that either of the conditions set forth in subparagraphs (A) and (B) applies.

There is also the section that allows for:

(c) The right of reproduction under this section applies to three copies or phonorecords of a published work duplicated solely for the purpose of replacement of a copy or phonorecord that is damaged, deteriorating, lost, or stolen, or if the existing format in which the work is stored has become obsolete, if —
(1) the library or archives has, after a reasonable effort, determined that an unused replacement cannot be obtained at a fair price; and
(2) any such copy or phonorecord that is reproduced in digital format is not made available to the public in that format outside the premises of the library or archives in lawful possession of such copy.

This section isn’t as useful, as (c)(2) prevents the circulation of copied DVDs, although it does appear to allow for patrons to view the DVD inside the library. And section (h) is limited to works in the last 20 years of their copyright term. The original Mickey Mouse cartoon, Steamboat Willie, was published in 1928, and Wikipedia reports that it should fall into the public domain in 2023. This would mean that, roughly, Disney media published between 1928 and 1935 would be subject to this rule. That range does not, unfortunately, cover any Disney films, as Snow White was released in 1937, and was the first major motion picture put out by Disney. But it would mean that in just 2 more years, in 2012, Snow White _should_ fall into this category.
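The arithmetic behind those dates, as a quick sketch. It assumes a flat 95-year term counted from the year of publication, which is my simplification to match the 1928-to-2023 figure above; real terms depend on notice, renewal, and the type of work, so treat this as back-of-the-envelope math and not legal advice.

```python
TERM_YEARS = 95    # assumed term from publication (simplification)
WINDOW_YEARS = 20  # Section 108(h) covers the last 20 years of the term

def window_start(pub_year: int) -> int:
    """First year in which a work is in the last 20 years of its term."""
    return pub_year + TERM_YEARS - WINDOW_YEARS

def in_last_20_years(pub_year: int, current_year: int) -> bool:
    """True if current_year falls within the final 20 years of the assumed term."""
    return window_start(pub_year) <= current_year <= pub_year + TERM_YEARS

steamboat_willie_expiry = 1928 + TERM_YEARS   # year the assumed term runs out
snow_white_window_opens = window_start(1937)  # when 108(h) could start to apply
```

Under that assumption, a work from 1935 is just inside the window in 2010, while Snow White (1937) has to wait until 2012.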

So, copyright nerds and librarians: is it legal for libraries to back up their DVD purchases, in a situation where it is a known fact that they will not be able to re-purchase them when they are needed? Here’s a situation where a legal copy of a DVD is owned, it is damaged or stolen, and the legal owner wants to buy a replacement…but can’t. Shouldn’t it be legal for a library to do this? Does Fair Use (section 107) give a library the right to make a copy in this situation?

Shut up and get out of the way

Google, on the Book Settlement, from arstechnica:

“Approval of the settlement will open the virtual doors to the greatest library in history, without costing authors a dime they now receive or are likely to receive if the settlement is not approved,” Google’s filing reads. “Nor does anyone seriously dispute, though few objectors admit, that to deny the settlement will keep those library doors locked while inviting costly, fragmented litigation that could clog dockets around the country for years.”

Or, in other words: Shut up, and get out of the way.

Copyright Clearance Center = FAIL

Sometimes, it’s just nice to laugh at industries that are desperately attempting to hang on to their relevancy in a changing world. Exhibit A for today is the Copyright Clearance Center, and their interesting attempt to educate users about copyright via their Copyright Basics video. Let’s examine the ways in which CCC fails at modern web usage.

First: here’s the opening screen of the video


I think that’s enough said, yes? Among the nearly-unreadable text is the prohibition to “distribute copies of the Program to persons outside your company, or post copies of the Program on any public website (including any video sharing or social networking site).”  Yep, that’s the CCC…all about education. Wouldn’t want those non-paying people to easily get your content that explains why they should pay for your content. 

Second: To get a copy of the video to use internally, on a non-public server that is limited to only your employees, you have to fill out a form on this page. Or, you know, just look at the page source:


Where the FLV file is handily linked for anyone who might want to use it. 

If ever there was a direct example of how the modern web breaks copyright, the CCC just gave it to us. The answer, of course, isn’t to ignore the de facto standards for the distribution of video on the web, to limit the ability to share and distribute content, and to generally treat people who want to use your content like criminals. The way to make yourself valuable and heard is to share what you make as widely as you possibly can…something that the CCC can’t bring itself to do.  It’s really hard to participate in the modern conversation when your very business model is tied to archaic and irrelevant legalese.

Ebooks, copyright, and the University of Virginia

I’m in the middle of writing a book about Mobile Technologies and Libraries, and am researching libraries providing mobile-specific services of all sorts. I came across the University of Virginia’s Ebook Library, and decided to take a look at what they are offering. It’s a very old ebook collection, with the original Etext division starting in 1992. Here’s the part that made me scratch my head…it’s in their Access and Conditions of Use:

While many of these items are made publicly-accessible, they are not all public domain — the vast majority of the images, and a number of the texts, including all of those from the University of Virginia Special Collections Department, are copyrighted to the University of Virginia Library, for example, and a number of other texts are still copyrighted to their original print publishers and made available here with permission.

I have no qualms with the texts that are copyrighted by their original publishers, and that UVA got permission to use. My eyebrows raise at the bit about “including all those from the University of Virginia Special Collections Department, are copyrighted to the University of Virginia Library…”


I had my suspicions here…it’s not like the UVA Special Collections Department are writing books, right? After looking around, I found this text: Po’ Sandy by Charles W. Chesnutt. Published in 1888 in the Atlantic Monthly in New York, it is clearly in the public domain in the United States. But there it is, in the front matter:

Copyright 1999, by the Rector and Visitors of the University of Virginia

Looking around just a bit, it looks like this shows up on all sorts of texts that UVA digitized. My favorite is The Autobiography of Benjamin Franklin, which Franklin completed in 1788; the particular version republished by UVA was published in 1909 by P. F. Collier & Son Company in New York. It is, without any doubt, in the Public Domain in the US as well, and it carries the same note:

Copyright 1999, by the Rector and Visitors of the University of Virginia

What gives UVA the right to claim copyright on these texts? They couldn’t have legally digitized them if they weren’t in the Public Domain at the time of their digitization, and changing the form of something doesn’t give you the right to claim a copyright, especially on the bits that make the work up. Even stranger, they aren’t just claiming copyright, but including a EULA!

By their use of these ebooks, texts and images, users agree to follow these conditions of use:

  • These ebooks, texts and images may not be used for any commercial purpose without permission from the Electronic Text Center.
  • These ebooks, texts and images may not be re-published in print or electronic form without permission from the Electronic Text Center. However, educators are welcome to print out items and hand them to their students.
  • Users are not permitted to download our ebooks, texts, and images in order to mount them on their own servers for public use or for use by a set of subscribers. Individuals and institutions can, of course, make a link to the copies at UVa, subject to our conditions of use.

Really? Is UVA asserting rights here that they just do not have? Not permitted to republish? Only if there is a copyright concern…which I think that UVA is asserting incorrectly here. It’s possible that there is some piece of copyright law that they are leaning on for these claims, but on the face of it, this seems like overreaching. Can anyone explain to me how they could possibly have legitimate copyright claims on things that they didn’t create and that are beyond the time limit for copyright protection in the US?