The online library world exploded today over the revelation that Adobe Digital Editions, software that is required for many library-focused eBook services, evidently leaks like a sieve when it comes to our users’ information. The TL;DR version of the story is that ADE appears to be sending information to Adobe’s servers in plain text, including: the book you are reading, its title and publisher, which pages you have read, and which page you are currently on. Much longer discussions about the leak and potential fallout are here:
- Nate Hoffelder at the Digital Reader broke the story
- Ars Technica followed up
- Liza Daly confirmed the leak
- As did several librarians, including Andromeda Yelton and Galen Charlton
Andromeda and Galen then both went on to touch on some of the core problems with this leak, focusing on the conflict between Adobe’s action and the ethics of librarianship, and the possible role that ALA may have in bridging the gaps in libraries’ knowledge of these actions.
There are a few things I wanted to emphasize about this situation. The first is that several of the reports have noted that earlier versions of Adobe Digital Editions didn’t seem to “spy on its users” in the way that the most recent version (version 4) does, and recommend using earlier versions. The truth of the matter is that of course the earlier versions are spying on users…they just aren’t doing it in as transparent a manner as the current version. We need to decide whether we are angry at Adobe for failing technically (for not encrypting the information or otherwise anonymizing the data) or for failing ethically (for the collection of data about what someone is reading). It’s possible to be angry at both, but here’s a horrible truth: If they had gotten the former right and encrypted the information appropriately, we’d have no idea about the latter at all.
I think that Andromeda has it right, that we need to insist that the providers of our digital information act in a way that upholds the ethical beliefs of our profession. It is possible, technically, to provide these services (digital downloads to multiple devices with reading position syncing) without sacrificing the privacy of the reader. For example (and this is just off the top of my head) you could architect the sync engine to key off a hash of UserID + BookID that is computed locally, so that the raw identifiers never leave the device, and transmit only the hash and the location information in a standardized format. This would give you anonymous page syncing between devices without having to even worry about encryption of the traffic, as long as you used an appropriate hash function. I would prefer this approach, because (as mentioned above), if the entire communications stack is encrypted, it’s a black box for anyone attempting to see inside and verify what the vendor is actually collecting. There are answers to this as well (for example, encryption keys that the vendor never sees at all and that are totally local to the user’s device, a la Apple’s latest security enhancements).
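To make the hashing idea a little more concrete, here is a minimal sketch in Python. It is not how ADE or any real sync service works, just one way the scheme could look. Everything named here is an assumption for illustration: the functions `sync_key` and `sync_message`, the `user_secret` (something only the reader's devices know, used to derive a salt), the sample identifiers, and the use of an EPUB CFI string as the "standardized" location. I've used PBKDF2 from the standard library as a stand-in for "an appropriate hash function"; a real design would need proper review.

```python
import hashlib
import json

# Assumption: a deliberately slow derivation, as with password hashing.
ITERATIONS = 100_000


def sync_key(user_id: str, book_id: str, user_secret: str) -> str:
    """Derive an opaque sync key on the device.

    user_id and book_id are only ever used locally; the hex digest is
    the only thing that would be sent to the sync server.
    """
    # Encode the pair unambiguously so ("ab", "c") != ("a", "bc").
    material = json.dumps([user_id, book_id]).encode("utf-8")
    # Hypothetical: the salt comes from a secret shared by the reader's
    # own devices, so each device derives the same key without the
    # vendor ever seeing the secret.
    salt = hashlib.sha256(user_secret.encode("utf-8")).digest()
    return hashlib.pbkdf2_hmac("sha256", material, salt, ITERATIONS).hex()


def sync_message(user_id: str, book_id: str, user_secret: str,
                 location: str) -> str:
    """Build the payload a device would transmit: key + location only."""
    payload = {
        "key": sync_key(user_id, book_id, user_secret),
        "location": location,
    }
    return json.dumps(payload, sort_keys=True)


if __name__ == "__main__":
    # Sample values are made up for the example.
    print(sync_message("patron-1234", "isbn:9780000000000",
                       "a secret only my devices know",
                       "epubcfi(/6/4!/4/10/2:3)"))
```

The server in this sketch only ever stores a mapping from an opaque key to a reading position. Two devices that hold the same secret derive the same key and can sync, while someone watching the traffic sees neither the reader nor the title.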
There are technical solutions that satisfy our ethical concerns. We need to insist that our vendors care enough about our ethics that the technical answers become a market differentiator. We need to insist that this is important and then we need to make them listen.
8 replies on “Adobe Digital Editions and infoleaks”
It’s Liz Daly, actually. There’s a z and not an s.
…but there IS an a at the end 😉 Liza.
ARRRRGH. I’m sorry, Liza. I’ve butchered your name twice. FIXING THAT NOW.
Thanks for the notes, Nate and Andromeda.
The local hashing of userid and titleid is a good idea. Worth trying to build a prototype to see if it would work in practice. But a fragment stream would likely have a fingerprint that could be used to crack the title unless you’re really careful.
Eric: you’d have to treat it like an actual password hash, so: use a good hash engine (bcrypt or the like) and use a salt. If done properly (always the hard part) there’s very little risk of exposing the title of the book in question.
/me *goes off to see if anyone is doing this*