
Apple, the FBI, and Libraries

I’m sure most people who might read this blog are at least aware that there is currently a battle between Apple and the FBI over access to information on a phone that was used by the San Bernardino terrorist. The details of that case are fascinating and nuanced, and can be summarized very roughly as:

The FBI has obtained a court order that compels Apple to create a new version of iOS that is different from the existing version that lives on the phone in question in three ways: one, that the new version will bypass the time-delay between password attempts that is standard for iOS; two, that the new version will be able to enter password attempts in a programmatic fashion instead of through finger presses on the screen; and three, that the new version of iOS will disable the security setting that may be active that erases the phone unrecoverably if 10 password attempts are incorrect. The reason that the FBI needs this to access the information that is stored on the phone is that iOS uses encryption to secure the information on the phone when it is, in the parlance of computer security types, “at rest.” The FBI could make a bit-for-bit copy of the software that is on the phone, and examine it until the heat death of the universe, and not be able to decrypt the information into a readable form.
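To get a feel for why those three changes matter, here is a rough back-of-the-envelope sketch in Python. The 0.08 seconds per guess is an assumed figure for illustration only, not something taken from the court order, and the function names are made up for this example. It compares how long it would take to try every numeric passcode when guesses are entered programmatically with no added delay against the odds of guessing correctly within the 10 attempts allowed before an erase.

```python
# Assumed cost of a single passcode guess, in seconds, when guesses are
# submitted programmatically with no added delay. Illustrative value only.
ASSUMED_SECONDS_PER_ATTEMPT = 0.08


def keyspace(digits: int) -> int:
    """Number of possible numeric passcodes of the given length."""
    return 10 ** digits


def worst_case_seconds(digits: int,
                       seconds_per_attempt: float = ASSUMED_SECONDS_PER_ATTEMPT) -> float:
    """Time needed to try every passcode once, with no delays and no erase."""
    return keyspace(digits) * seconds_per_attempt


def chance_before_erase(digits: int, max_attempts: int = 10) -> float:
    """Odds of guessing a uniformly random passcode before the erase limit."""
    return min(max_attempts, keyspace(digits)) / keyspace(digits)


for digits in (4, 6):
    hours = worst_case_seconds(digits) / 3600
    odds = chance_before_erase(digits)
    print(f"{digits} digits: {hours:.2f} hours to exhaust, "
          f"{odds:.4%} chance within 10 guesses")
```

Under that assumed per-guess cost, a 4-digit passcode falls in roughly 13 minutes and a 6-digit one in under a day once the delay and the erase setting are out of the way. With the erase setting active, ten guesses cover only a sliver of the possibilities, which is why the order asks for all three changes together.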

While the court order and the responses on both sides are not directly about encryption, the reason that this is a question at all is encryption…if the FBI could dump the contents and read them, there would be no need for them to access the phone at all. Indeed, the information from the phone that they do have, given to them by Apple, is from a 6-week-old iCloud backup of the device that Apple could read (currently, iCloud backups are encrypted, but with a key that Apple holds).

Why is this relevant to libraries? I think it’s past time that we start paying very close attention to the details of our data in ways that we have, at best, hand-waved as a vendor responsibility in the past. There have been amazing strides lately in libraryland with regard to the security of our data connections via SSL (LetsEncrypt) as well as a resurgence in anonymization and privacy tools for our patrons (Tor and the like, thank you very much Library Freedom Project).

Data about our patrons and their interactions that isn’t encrypted at rest in either the local database or the vendor database hosted on their servers (and our electronic resource access, and our proxy logins, and, and, and…) is data that is subject to subpoena and could be accessed in ways that we would not want. It is the job of the librarian to protect the data about the information seeking process of their patrons. And while it’s been talked about before in library circles (Peter Murray’s 2011 article is a good example of past discussions) this court case brings into focus the lengths that some aspects of the law enforcement community will go to in order to have the power to collect data about individuals.

For a great article on the insanity associated with the government’s position on this, please take a moment and read James Allworth’s The US has gone F&*%ing Mad. Also take a look at the wonderful article by Barbara Fister from Inside Higher Ed, wherein she boils the case down and does some deft analysis of the situation (sidenote: I’m a massive fan of Barbara’s writing, if you do not regularly read her stuff, fix that).

It’s fairly clear, I think, that the FBI is using this case to set a precedent that would allow for future access to information on iOS devices. The case was chosen specifically to have the right public relations spin for them; what they are asking for is technically possible (unlike a request to “break the encryption,” which may not actually be technically possible); and they have asked for a tool to be created that is easily generalizable to other iOS devices. I back Apple on this, and believe that strong security measures (including but not limited to strong encryption) make us safer.

And I would feel lots, lots better about the state of data in libraries if I knew we were using strong encryption that protects our data. I would love to see an architecture for a truly secure (from a data standpoint) ILS, because I’m pretty certain that none of the ones in use right now are even close. In the same way that I’m certain that Apple is working on producing a version of iOS that they cannot access at all…we need to architect and insist on the implementation of data storage that even we can’t get directly into. If patrons want us to keep their lending history (and we have some evidence that opting in to such a system is something that patrons do want), then let’s insist that our ILS treat that data like toxic waste: locked in vaults that neither we nor the vendor can access.

Heresy and Patron Data

I’ve spent a lot of time over the last several years thinking, writing, and speaking about ebooks. I’m on the Board of Directors of Library Renewal, a group dedicated to finding ways to make the ebook experience a good one for libraries, publishers, and authors. And I’ve spoken all over the US and internationally about eReaders and how digital content changes libraries. What I am about to suggest has been rattling around in my head for some time now, and it’s something that I’d love to hear other thoughts about.

So as the Joker said in The Dark Knight Returns:

When we look at how libraries, publishers, and authors all interrelate vis a vis electronic content, specifically ebooks, the models that are largely being forwarded are straightforward economic models. The rights-holders have content, we want content, we pay them for content. Most of the disagreement comes down to the details: how much are we paying, and what rights do we have to the content that we are paying for. The majority of “new” models that are being trumpeted in libraryland, like the Douglas County Public ebook model, are just differently-arranged ways of doing exactly the same thing…which, admittedly, gives different outcomes on the two contentious fronts (cost and rights) but isn’t actually new in any significant way.

In an economic system, when one side of an equation (libraries) wants something from the other side (rights-holders), there is an exchange of value that takes place wherein both sides agree that said value exchange is fair in both directions. Libraries pay money for content…this is, at its base, just a value exchange between libraries and publishers.

Libraries don’t want a free ride as far as ebooks are concerned. Every single librarian that I have spoken with is perfectly willing to continue to pay for content. Unfortunately, the economics of libraries are such that when we want more rights (the ability to check out ebooks to any number of patrons simultaneously, or the right to ILL ebooks, etc) we don’t have the ability to exchange our typical economic instrument (money) for them. Think about Amazon and their ability to put the Harry Potter books into their Lending Library…freely available to anyone with an Amazon Prime membership. Libraries would kill for the right to do this, but Amazon is the one that can write the check. If we had tens or hundreds of millions of dollars to throw at publishers, we could dictate any rights we wished. But we don’t.

So the question that’s been bugging me is: what else do we have, besides cash, that is of value to the rights holders and could be traded for more of what we want? Libraries generate value in enormous numbers of ways, but what do we have that publishers might want that would give us some bartering ability?

Some librarians have started looking at these value-exchanges in a new way. Toby Greenwalt, a librarian at the Skokie Public Library, started asking what the value was to the publishers of the awards that the American Library Association gives out for children’s and young adult titles, and Andromeda Yelton followed up with a look at how those awards related to the ability for libraries to lend those books electronically. Here’s something that the ALA does, which appears to be of significant value to publishers, with no visible complementary exchange of value going the other direction.

Finally we get to what I’ve been thinking of as my heretical idea. Because when I think about what else of value libraries have that could potentially be traded to publishers in order to get an equivalent amount of value back from them in the way of ebook rights, I keep coming back to one thing:

Information. Information about our patrons, information about our circulations of individual books, and demographic information about our users and what books they read.

I know. A lot of librarians just stopped reading, or perhaps began clutching the arms of their chairs a bit too tightly. Patron information! The holiest of holies in library land, the Thing Which Must Not Be Shared! One of the core tenets of librarianship is that the borrowing history of the individual is sacrosanct. And for very, very good reasons…it doesn’t take a paranoid person to see the ways in which reading histories should be kept private, from the teenager looking for information about sexuality to the individual checking out a book about chronic illness (you wouldn’t want your insurance company to know that, now would you). As the saying goes, “show me what you read and I’ll tell you who you are”.

But this information is valuable. Publishers would love to know more about their readers, as it helps them to make better decisions about what to publish, how to market, and what sorts of books a given population is more likely to buy. The amount of data that libraries could have in this realm is enormous, and could be a huge lever with which to move the playing field that we are all currently on regarding ebooks.

I am very aware that there are huge problems with this idea. The data in many cases is actually non-existent (libraries are very good about dumping this data so that it can’t be used by law enforcement or others in negative ways against readers). In order to maintain any sort of patron trust, there would have to be serious thought given to sanitization of the data, stripping of individually identifying information, and more (and yes, I am aware that stripping of individually identifying information has been shown to be basically useless…I retain some hope that there is a way to do it that isn’t). It is also the case that with the rise of cloud-based ILS systems, this information is going to be more available than ever, and centralized on servers that are out of the library’s control.
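For the sake of discussion, here is a minimal Python sketch of what one such sanitization pass might look like. Everything in it is hypothetical: the record fields, the function names, the secret key, and the threshold of five patrons are assumptions for illustration, not features of any real ILS. It swaps patron IDs for keyed hashes and then reports only how many distinct patrons in each group borrowed a title, dropping any group too small to hide in.

```python
import hashlib
import hmac
from collections import defaultdict


def pseudonymize(patron_id: str, secret_key: bytes) -> str:
    """Replace a patron ID with a keyed hash (HMAC-SHA256)."""
    return hmac.new(secret_key, patron_id.encode("utf-8"), hashlib.sha256).hexdigest()


def aggregate_with_suppression(loans, secret_key, min_patrons=5):
    """Count distinct patrons per (title, age band), dropping small groups."""
    patrons_by_group = defaultdict(set)
    for loan in loans:
        group = (loan["title"], loan["age_band"])
        patrons_by_group[group].add(pseudonymize(loan["patron_id"], secret_key))
    return {
        group: len(patrons)
        for group, patrons in patrons_by_group.items()
        if len(patrons) >= min_patrons
    }


# Hypothetical key; in practice it would be random and never leave the library.
SECRET_KEY = b"replace-with-a-random-key-kept-by-the-library"

# Made-up sample records: six patrons borrow one title, one patron another.
loans = [
    {"patron_id": f"patron-{n}", "title": "Title A", "age_band": "30-39"}
    for n in range(6)
]
loans.append({"patron_id": "patron-99", "title": "Title B", "age_band": "13-17"})

report = aggregate_with_suppression(loans, SECRET_KEY)
print(report)
```

Only the group of six survives in the report; the lone teenage borrower is suppressed. This is not a guarantee against re-identification, which is exactly the weakness noted above, but it shows the kind of processing that would have to happen before any data left the building.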

But if we want the next decade to be a good one for us, libraries and librarians need to put some serious thought into what our other value-creation areas are, and how we can begin to identify and trade on those against the rights-holders. Because our money is getting thin, our prices are going up, digital is likely to kill our existing model completely, and we need new ways to think about these things.

What else do we have? What sort of leverage do we have that we aren’t using? What can we bring to the negotiating table that we haven’t yet?