Heresy and Patron Data

I’ve spent a lot of time over the last several years thinking, writing, and speaking about ebooks. I’m on the Board of Directors of Library Renewal, a group dedicated to finding ways to make the ebook experience a good one for libraries, publishers, and authors. And I’ve spoken all over the US and internationally about eReaders and how digital content changes libraries. So what I am about to suggest has been rattling around in my head for some time now, and I’d love to hear other thoughts about it.

So as the Joker said in The Dark Knight Returns:

When we look at how libraries, publishers, and authors all interrelate vis a vis electronic content, specifically ebooks, the models that are largely being forwarded are straightforward economic models. The rights-holders have content, we want content, we pay them for content. Most of the disagreement comes down to the details: how much are we paying, and what rights do we have to the content that we are paying for. The majority of “new” models that are being trumpeted in libraryland, like the Douglas County Public ebook model, are just differently-arranged ways of doing exactly the same thing…which, admittedly, gives different outcomes on the two contentious fronts (cost and rights) but isn’t actually new in any significant way.

In an economic system, when one side of an equation (libraries) wants something from another side (rights-holders), there is an exchange of value that takes place wherein both sides agree that said value exchange is fair in both directions. Libraries pay money for content…this is, at its base, just a value exchange between libraries and publishers.

Libraries don’t want a free ride as far as ebooks are concerned. Every single librarian that I have spoken with is perfectly willing to continue to pay for content. Unfortunately, the economics of libraries are such that when we want more rights (the ability to check out ebooks to any number of patrons simultaneously, or the right to ILL ebooks, etc) we don’t have the ability to exchange our typical economic instrument (money) for them. Think about Amazon and their ability to put the Harry Potter books into their Lending Library…freely available to anyone with an Amazon Prime membership. Libraries would kill for the right to do this, but Amazon is the one that can write the check. If we had tens or hundreds of millions of dollars to throw at publishers, we could dictate any rights we wished. But we don’t.

So the question that’s been bugging me is: what else do we have, besides cash, that is of value to the rights holders and could be traded for more of what we want? Libraries generate value in enormous numbers of ways, but what do we have that publishers might want that would give us some bartering ability?

Some librarians have started looking at these value-exchanges in a new way. Toby Greenwalt, a librarian at the Skokie Public Library, started asking what the value was to the publishers of the awards that the American Library Association gives out for children’s and young adult titles, and Andromeda Yelton followed up with a look at how those awards related to the ability for libraries to lend those books electronically. Here’s something that the ALA does, which appears to be of significant value to publishers, with no visible complementary exchange of value going the other direction.

Finally we get to what I’ve been thinking of as my heretical idea. Because when I think about what other thing of value that libraries have that could potentially be traded to publishers in order to get an equivalent set of value back from them in the way of ebook rights, I keep coming back to one thing:

Information. Information about our patrons, information about our circulations of individual books, and demographic information about our users and what books they read.

I know. A lot of librarians just stopped reading, or perhaps began clutching the arms of their chairs a bit too tightly. Patron information! The holiest of holies in library land, the Thing Which Must Not Be Shared! One of the core tenets of librarianship is that the borrowing history of the individual is sacrosanct. And for very, very good reasons…it doesn’t take a paranoid person to see the ways in which reading histories should be kept private, from the teenager looking for information about sexuality to the individual checking out a book about chronic illness (you wouldn’t want your insurance company to know that, now would you). As the saying goes, “show me what you read and I’ll tell you who you are”.

But this information is valuable. Publishers would love to know more about their readers, as it helps them to make better decisions about what to publish, how to market, and what sorts of books a given population is more likely to buy. The amount of data that libraries could have in this realm is enormous, and could be a huge lever with which to move the playing field that we are all currently on regarding ebooks.

I am very aware that there are huge problems with this idea. The data in many cases is actually non-existent (libraries are very good about dumping this data so that it can’t be used by law enforcement or others in negative ways against readers). In order to maintain any sort of patron trust, there would have to be serious thought given to sanitization of the data, stripping of individually identifying information, and more (and yes, I am aware that stripping of individually identifying information has been shown to be basically useless…I retain some hope that there is a way to do it that isn’t). It is also the case that with the rise of cloud-based ILS systems, this information is going to be more available than ever, and centralized on servers that are out of libraries’ control.

But if we want the next decade to be a good one for us, libraries and librarians need to put some serious thought into what our other value-creation areas are, and how we can begin to identify and trade on those against the rights-holders. Because our money is getting thin, our prices are going up, digital is likely to kill our existing model completely, and we need new ways to think about these things.

What else do we have? What sort of leverage do we have that we aren’t using? What can we bring to the negotiating table that we haven’t yet?

Tags data, ebooks, libraries, patron, publishers

By griffey

Jason Griffey is the Executive Director of the Open Science Hardware Foundation. Prior to joining OSHF, he was the Director of Strategic Initiatives at NISO, where he worked to identify new areas of the information ecosystem where standards expertise was useful and needed. Prior to joining NISO in 2019, Jason ran his own technology consulting company for libraries, has been both an Affiliate at metaLAB and a Fellow and Affiliate at the Berkman Klein Center for Internet & Society at Harvard University, and was an academic librarian in roles ranging from reference and instruction to Head of Library IT and a tenured professor at the University of TN at Chattanooga.

Jason has written extensively on technology and libraries, including multiple books and a series of full-periodical issues on technology topics, most recently a chapter in Library 2035 - Imagining the Next Generation of Libraries by Rowman & Littlefield. His latest full-length work Standards - Essential Knowledge, co-authored with Jeffery Pomerantz, was published by MIT Press in March 2025.

He has spoken internationally on topics such as artificial intelligence & machine learning, the future of technology and libraries, decentralization and the Blockchain, privacy, copyright, and intellectual property. A full list of his publications and presentations can be found on his CV.
He is one of eight winners of the Knight Foundation News Challenge for Libraries for the Measure the Future project (http://measurethefuture.net), an open hardware project designed to provide actionable use metrics for library spaces. He is also the creator and director of The LibraryBox Project (http://librarybox.us), an open source portable digital file distribution system.

7 replies on “Heresy and Patron Data”

I agree with you about the data. Data is one of the reasons I was so unhappy about the deal with Kindle & Amazon – they got a LOT of data from that deal that we should have had access to. We have data, probably good data and you’re right it is valuable. My biggest concern, and you touched on this, would be that there is no safe way to strip the data of identifying information. Remember the Netflix contest?

In a way, at least in the children’s/YA book world, that information is informally shared with the school/library marketing folks in terms of the conversations we have at conferences. “What is popular, what is going off the shelf, what works, what didn’t, etc.” all happens in an anecdotal way that doesn’t touch any individual’s privacy.

Personally, I advocate for the reader and the reader having choices and connecting the readers to books, and I hope that librarians don’t give up that role in their patrons’ lives.

How about all the tech support librarians are providing Kindle, Nook, and iPad customers, at no cost to Apple, Amazon, or Barnes and Noble?

A couple of things leap to mind, but these are my knee jerk reactions to your post.

1) Don’t publishers do a lot of sponsorships at the ALA conferences? Whether it’s a coffee break or paying for another professional award or a speaker, it’s a sponsorship that happens regardless of whether they have an award-winning book or audiobook or whatever. I could be wrong on this, but that’s the impression I’ve gotten from reading through the regular conference planner.

2) In New Jersey, patron records are confidential and protected via statute. I’m not sure how it would be possible to share any data from that record as part of an information exchange.

Although, on the other hand, there are circulation numbers that could potentially be shared without coming into contact with patron records. However, I have a feeling that publishers would be interested in more personal data than our circulation numbers.

I think this is exactly what Overdrive is doing with their big data reports. They have the added advantage of being able to aggregate a whole boatload of library data. Unless libraries figure out a way to aggregate their sanitized data quickly, that ship may sail without us.

Publishers are also interested in discovery and libraries have pretty keen insight into how to promote discovery of even the most obscure titles. Making that knowledge quantifiable (and therefore tradable) will be key, but it’s something else we have to offer.

Regarding Toby & ALA Awards: I just want to point out that was mainly educational to librarians or those without access to Bookscan numbers. That data and the value re sales, reprints, etc. is known to publishers.

I agree that we have excellent marketing data that we can use as a bargaining chip with publishers, and I do think that we can find ways to anonymize this data efficiently. I think we can also use it to improve our own services, if we just stopped being so darn afraid of it.

And Kate, you’re right, if we don’t figure out how to do this, the ship will sail without us.

I’m glad conversations like this are happening. It’s important to talk about our sacred cows and question whether they really need to be on those pedestals.
