Categories
Artificial Intelligence presentation Technology

Facial Recognition is Broken & Racist

A few months ago, I was asked to fill in and present at the virtual Computers in Libraries and Internet Librarian conference on the topic of Artificial Intelligence. The description for the presentation was already written and published, and I was asked whether I wanted to step in and create a presentation based around it. That description was:

Should face recognition change the way we interact with our customers? What if, for example, I can greet a person by using their last name as soon as he/she gets to the lobby because I have an iPad that will immediately show me the customer’s name, reservation, or even current fees? What near-future technologies will be enabled by AI, and which of them will be useful to libraries? Join us and learn how to make decisions about the good and bad aspects of AI technologies.

When I initially read this description, my first thought was “Say What?”. Given what we know about the realities of the racist and sexist inequities built into facial recognition, it seemed extremely odd to me to suggest libraries should be using it.

So, I decided to make that thought explicit, and the result is this presentation.

Below you’ll find video of my talk, as well as the PDF of my slides. If you’d like to download the slides, you can do so here.

Categories
Library Issues Technology

Using TagTeam

When starting the design work on the Blockchain & Decentralization course, I knew that I would have many many more resources that students might find useful than I could possibly assign to them. I wanted to find a way to make those resources easily findable by the students that wanted to dive deeper on any particular piece of the admittedly very complex subject.

Enter TagTeam.

A tool designed originally for the Harvard Open Access Project, and written and supported by a superb group of developers, TagTeam is a librarian’s dream of a web resource collection tool. It allows for, as the documentation so pithily says, “folksonomy in, ontology out.” You can add a website to the Hub and allow folksonomy-style tagging when adding…but then, on the backend of the tool, turn those arbitrary tags into a controlled vocabulary. You can even set up automatic replacements for unwanted tags.
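
To make the “folksonomy in, ontology out” idea concrete, here is a minimal sketch in Python of what tag replacement could look like. This is not TagTeam’s actual code or API; the rule table, function names, and example URLs are all hypothetical and only illustrate the concept.

```python
from collections import defaultdict

# Hypothetical replacement rules: free-form tag -> controlled term.
# A value of None means "drop this tag entirely".
TAG_RULES = {
    "block chain": "blockchain",
    "blockchains": "blockchain",
    "btc": "bitcoin",
    "todo": None,
}


def normalize_tags(raw_tags, rules=TAG_RULES):
    """Map free-form tags to controlled terms, dropping unwanted ones."""
    cleaned = []
    for tag in raw_tags:
        key = " ".join(tag.lower().split())  # case-fold, collapse whitespace
        term = rules.get(key, key)  # unknown tags pass through unchanged
        if term is not None and term not in cleaned:
            cleaned.append(term)
    return cleaned


def build_index(items, rules=TAG_RULES):
    """Group item URLs under each controlled term."""
    index = defaultdict(list)
    for url, raw_tags in items:
        for term in normalize_tags(raw_tags, rules):
            index[term].append(url)
    return dict(index)
```

Taggers can type whatever they like, and the rule table quietly folds the variants into one vocabulary, so a lookup by controlled term finds everything regardless of how it was originally tagged.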

One of my favorite built-in functions is the ability to craft URLs that will drill down to any level of the set of resources you might want: tagger, tag, both, set of tags, in any combination. You can subscribe to RSS feeds that will automatically feed your Hub, and TagTeam also provides the ability to extract resources via RSS or JSON, and to remix feeds while doing so.

There are a few things I’d love to see added to TagTeam. The biggest would be that it would be fantastic to be able to integrate the tagging of a resource with the ability to cache it in some way. The ability to combine TagTeam with a tool like Amber or ArchiveBox would be a fantastic way to ensure the continued availability of webpages, especially for educational use. It would also make TagTeam an amazing curricular tool for Academic Libraries to offer for their campuses (hint, hint).

Overall, I’ve been thrilled with using TagTeam in my course, and can see so many uses for academic libraries to provide an instance for their campuses. If you haven’t seen TagTeam, explore some of the public hubs, and see if it fits in your (or your library’s) toolchain. And if you want to see what sort of resource can be put together using it, take a look at the Hub for the Blockchain course.

Categories
Blockchain Library Issues Technology

Blockchain & Decentralization for the Information Industries

Announcing the launch of the first Massive Open Online Course on Blockchain & Decentralization specifically focused around libraries, museums, archives, publishers, and the rest of the information ecosystem! Registration is now open and the course itself begins March 11th and runs for 6 weeks. Did I mention that the course is free?

I am the course designer and instructor for this MOOC, which is my first time designing a learning experience like this. Along with 4 very talented San Jose State University i.School students who will be acting as TAs for the course, I will be monitoring the course and participating in the discussion boards to make sure that everyone progresses through the following outcomes.

  • Describe and explain the early uses of distributed ledger technology and the design of current blockchain systems.
  • Recognize the differences and similarities among various decentralized systems, and determine the most appropriate blockchain applications.
  • Compare and evaluate the advantages/disadvantages of using blockchain or other types of technologies for different applications.
  • Identify the ways blockchain can be applied in the information industries.

This course is the penultimate outcome of an IMLS grant given to San Jose State for the Blockchain National Forum, which was held in 2018. The final outcome will be a book which will be published this year, with chapters written by attendees and experts, summarizing and expanding on the lessons from the Forum (full disclosure: I wrote one of the chapters for the book as well).

The course is designed without any expectation that participants know anything about blockchain or decentralized technologies before beginning the course. It will walk you through details and introductions to the technology, all the way through existing services and systems and finally to what a decentralized future might look like. The full course breakdown looks like this:

  • Week 1 – March 11-17
    • Overview and History of Blockchain
  • Week 2 – March 18-24
    • Issues, Considerations, Problems
  • Week 3 – March 25-31
    • Decentralization
  • Week 4 – April 1-7
    • Systems & Services
  • Week 5 – April 8-14
    • Use Cases – Public Libraries, Academic Libraries, Museums, Archives
  • Week 6 – April 15-21
    • Future Directions & Next Steps

The course is a combination of mini-lectures that set up each week’s content, a selection of content relating to the topic (including readings, video, and audio), and then a discussion board where people can ask questions and talk about each week’s topic. At the end of each week there is a short quiz, and successful completion of the quiz will earn badges for each week, as well as a cumulative course badge and certificate at the end.

Please share this announcement widely! I’d like everyone who is even remotely interested in learning about Blockchain and decentralized tech to sign up and work through the course.

I’ll see you March 11th in the course!

Categories
ALA Personal Technology

Artificial Intelligence and Machine Learning in Libraries

Cover image of Library Technology Report

Now available is a publication I’m particularly proud of, “Artificial Intelligence and Machine Learning in Libraries” from ALA Techsource. I edited the volume, as well as authoring two of the chapters. The real stars are the three other librarians who contributed: Bohyun Kim, Andromeda Yelton, and Craig Boman. Bohyun wrote up her experience at the University of Rhode Island in setting up the first library-based multidisciplinary Artificial Intelligence lab, Andromeda talked about the development and possible future of AI-based library search as illustrated by her fantastic service HAMLET, and finally Craig talked about his experience in attempting AI-driven subject assignment to materials.

I wrote the Introduction, where I try to give a summary of the current state of AI and Machine Learning systems, and show some examples of how they work and are structured in practice. I also am particularly proud of drawing a line from Mary Shelley to the Google Assistant…you’ll have to read it to get the full effect, but here’s a different section to whet your appetite for more AI talk:

What changes in our world when these nonhuman intelligences are no longer unique, or special, or even particularly rare? …. AI and machine learning are becoming so much a part of modern technological experience that often people don’t realize what they are experiencing is a machine learning system. Everyone who owns a smartphone, which in 2018 is 77 percent of the US population, has an AI system in their pocket, because both Google and Apple use AI and machine learning extensively in their mobile devices. AI is used in everything from giving driving directions to identifying objects and scenery in photographs, not to mention the systems behind each company’s artificial agent systems (Google Assistant and Siri, respectively). While we are admittedly still far from strong AI, the ubiquity of weak AI, machine learning, and other new human-like decision-making systems is both deeply concerning and wonderful.

I also wrote the Conclusion and suggested some further reading if people are really interested in diving deeper into the world of AI and ML. In the conclusion, I try to talk about some of the likely near-future aspects of AI, and the impact it is likely to have on the information professions, from individualized AI assistants to intelligent search. From the conclusion:

As with much of the modern world, automating the interaction between humans is often the most difficult challenge, while the interactions between humans and systems are less difficult and are the first to be automated away. In areas where human judgment is needed, we will instead be moving into a world where machine learning systems will abstract human judgment from a training set of many such judgments and learn how to apply a generalized rubric across any new decision point. This change will not require new systems short term, but in the longer term a move to entirely new types of search and discovery that have yet to be invented is very likely.

I hope this work is useful for librarians, libraries, library students, and any other information professional who is trying to wrap their heads around the possibilities and potential for Artificial Intelligence and the world of information creation, consumption, organization, and use.

If your organization would like to talk to me about AI or Machine Learning and how it might make a difference to your business or operations, please get in touch. I’d love to work with you.

Categories
Library Issues Maker Personal Technology

Joining MetaLAB

I am beyond thrilled to announce I’ll be working with the outstanding group of scholars and artists at Harvard’s MetaLab this upcoming academic year as an affiliate, working mainly on their Library Test Kitchen project. I’m joining a team with some of my favorite makers and doers, people like Matthew Battles, Sarah Newman, and Jessica Yurkofsky, and many more that I am looking forward to meeting. I’ll still be in TN, working with them remotely and joining the team in Cambridge whenever possible.

I’ve been inspired by their work for years now, especially projects like Electric Campfire, which are right in my sweet spot of making with a goal of increased social connectivity. If you’ve not taken a look at the stuff that LTK has done, browse through and see what might inspire you.

Personally, I’m super excited to stretch my own knowledge of design and making through working with MetaLab. I’ve been consciously paying more attention to the design and making side of my brain recently, and while my instincts are not always to the artistic (I tend toward the more functional) I do have some aesthetic opinions that I like to embed in the work I do. I’m looking forward to expanding this bit of my brain.

Thank you to the gang for inviting me onboard. I’m excited to see what we can do together!

And lastly: MetaLab and Library Test Kitchen will be making an appearance at the 2018 LITA Forum in Minneapolis in November, so watch for more information about that very soon!

Categories
Apple Digital Culture Technology

About FaceID

I’ve seen the hottest of terrible hot-takes over the last couple of days about Apple’s announcement this past Tuesday (although leaked a few days before) that their new flagship iPhone, the iPhone X, will use a biometric system involving facial identification as the secure authentication mechanism for the phone. No more TouchID, which uses your fingerprint as your “key” to unlock the phone, we are now in the world of FaceID.

Let’s get this out of the way early in this essay: biometrics are for convenience, passcodes are for security. This doesn’t mean that biometrics aren’t secure, but they are secure in a different way, against different threats, for different reasons. The swap of FaceID for TouchID does nothing to lessen the security of your device, nor does it somehow give law enforcement or government actors increased magical access to the information on your phone.

You’d have thought, from the crazed reactions I’ve seen on Twitter and in the media, that Apple had somehow neglected to think of all of the most obvious ways this can be cheated.

 

and my personal favorite

The Wired article above, by Jake Laperuque, includes the breathless passage:

And this could in theory make Apple an irresistible target for a new type of mass surveillance order. The government could issue an order to Apple with a set of targets and instructions to scan iPhones, iPads, and Macs to search for specific targets based on FaceID, and then provide the government with those targets’ location based on the GPS data of devices’ that receive a match.

If we’re throwing out possibilities…any smartphone could do that right now based on photo libraries. If there was a legal order to do so. And IF the technology company in question (either Google or Apple, if we’re sticking to mobile phones as the vector) did indeed build that functionality (which would take a long, long time) and then did employ it on their millions and millions of phones (also: long time), it would involve an enormous amount of engineering resources. Coordination of the “real” target vs family members who just happened to have photos on their phones of Target X should be fairly easy to do via behavioral profiling and secondary image analysis.

But that, like the FaceID supposition above, is bonkers to believe. If anything, FaceID is more secure in every way than the equivalent attack via standard photo libraries. If a nation-state with the power to compel Apple or Google into doing something this complicated and strange really wanted to know where you were…they wouldn’t need Apple or Google’s help to do so.

The truth of the matter is that FaceID is no less secure than the systems we have now on Apple devices (here I am not including Android devices as there are simply too many hardware makers to be certain of the security). TouchID, the fingerprint authentication process that is available for use on every current iPhone (and the new iPhone 8 and 8 Plus), every current iPad, and multiple models of MacBook, uses your fingerprint as the “key” to a hash that is stored on a hardware chip known as the Secure Enclave on the phone. When you place your finger on the TouchID sensor, it isn’t taking a picture of your print, or storing your print in any way. The information that is stored in the Secure Enclave isn’t retrievable by anything except your phone. Your fingerprints aren’t being stored at Apple Headquarters on some server. There is no “master database” of the fingerprints of all iPhone users. The authentication is entirely local, as witnessed by the fact that you have to enroll your print on every iOS device separately.

FaceID appears to be exactly the same setup, with exactly the same security oversight as TouchID. It’s entirely local to the phone, and all of the information (a “hash” of information about your face…it’s really not fair to call it a “picture”) is stored on the Secure Enclave within the iPhone. We haven’t seen the full security report on FaceID and iOS 11 yet, but I am certain it will be available soon (iOS 10 and TouchID is available here). Given the other well-considered aspects of security on iOS 11 that we have seen, such as requiring a passcode before trusting an untrusted computer, I am confident that iOS 11 and FaceID will be at least as secure as their previous iterations.

Is it possible that Apple, the most valuable technology company in the world in large part due to their ability to develop hardware and software in concert with each other, completely missed something in making FaceID? Of course it’s possible. But all of the ways that technology of this sort has failed from other companies (racial bias, poor security models, data leakage) have not yet been true for TouchID. I do not believe they will be true for FaceID either.

Even setting aside the purely technical aspects, legally there is no difference in the risks of using FaceID over using TouchID. In the tweet above about police holding your phone up to your face to unlock it, it would be important to note that they can compel a fingerprint now. It is entirely legal (with a lot of “if”s and “but”s) for a police officer to force your finger onto your phone to unlock it. No warrant is necessary for that to happen. FaceID is exactly the same, as far as legal allowances and burden of proof and such, as TouchID is now. In the case of preventing law enforcement access to your phone, the only answer is a strong password and your refusal to give it to someone.

It isn’t clear to me if FaceID is going to be a good user experience…without devices in users’ hands, we have no idea. But the knee-jerk response that somehow Apple is building a massive catalog of faces is neither true, nor possible given the architectures of their hardware and software.

This isn’t to say that there isn’t some real danger somewhere:

I think Zeynep has this (as most things) exactly right. This technical implementation is really quite good. The normalization of the technology in our culture may well not be…but this is why I am so vehement about defending this positive implementation as positive. Let Apple’s method of doing this be the baseline, the absolute minimum amount of care and thought that we will accept for a system that watches us. They are doing it well and thoughtfully, so let’s understand that and not let anyone else do it poorly. And for goodness sake don’t cry wolf when technologies understand their risks and are built securely. Because just like the story, when the real wolves show up, it will be that much harder for those of us paying attention to raise the alarm.

EDIT: After writing this entire thing, I found Troy Hunt’s excellent analysis, which says many of these same things in a much better way than I. Go read that if you want further explication of my take on this, as I agree with his essay entirely.

Categories
Personal Technology

Personal International Infosec

This year I have a small number of international speaking engagements, and I just returned from the first of those in 2017…which means it was the first since the recent spate of increased DHS and Customs enforcement. It was also my first trip to a Muslim-majority country, and while not one on the magic list, it still made me consider my re-entry into the US and the possible attention therein. These things combined to make me far more attentive to and aware of my personal information security (infosec) than ever before. This post will be an attempt to catalog the choices I made and the process I used, as well as details of what actual technological precautions I took prior to leaving and when actively crossing the border.

This trip was to the SLA Arabian Gulf Library Conference, held this year in Manama, Bahrain, where I was on a panel discussing future tech. This means flying internationally through a major city, which for me meant flights from Nashville to JFK to Doha International Airport in Qatar, then finally to Manama, Bahrain. The return was the same, with the exception of flying back into the US via O’Hare in Chicago rather than JFK. This meant crossing into at least 2 foreign countries physically on each leg of the trip, although in Qatar I remained in the international section of the airport and didn’t go through customs and enter the country proper. Still, there were LOTS of checkpoints, which meant lots of potential checks of my luggage and technology.

Threat Model

What was my concern, and why was I thinking so hard about this prior to the trip? After all, I’m a law-abiding US citizen, and as the saying goes, if you’ve nothing to hide, why worry? First off, the “if you’ve nothing to hide” argument is dismissible, especially given the last 6 weeks of evidence of harassment and aggression at the US border. I am a citizen of the US, but I have also been very outspoken online regarding my feelings about the actions of the current administration. On top of that, information security isn’t just about the individual…it’s about everyone I’ve exchanged email with, texted, messaged on Facebook, sent a Twitter DM, and the like…the total extent of my communications and connections could, if dumped to DHS computers, theoretically harm someone that isn’t me, and that was not ok in my book. A primary goal was to prevent any data about my communications or contacts from being obtained by DHS.

DHS and Border Control have very, very broad powers when it comes to searching electronic devices at the border. I was not certain of the power granted to Border Agents in Qatar and Bahrain, but my working assumption was they had at least the powers that the US Agents did. I also assumed that the US agents would probably have better technological tools for intrusion, so if I could protect my data against that threat, I was safe for the other locations as well.

A secondary goal in my particular model was to attempt to limit the possibility for delay in my travels. If I could comply with requests up to a certain point without breaking my primary goal of data protection, that would likely result in less delay. When considering these levels of access, I thought about questions like: could I power on my devices without any data leakage? Could I unlock my devices if requested and allow the Agent to handle my phone, for instance, without risking data leakage? Could I answer questions about my device and the apps on it (or other apps in question, for instance social media accounts such as Facebook or Twitter) honestly without risking data leakage?

With all of that in mind, here’s how I secured my technology for border crossing. Your mileage may vary, as your threat model may be very different, and the manner in which you choose to answer the various questions above may be different. If everything had gone south and my devices were impounded, I’d be writing a very different post (and contacting the EFF). But for this particular trip, this is my story.

What to Take

First off, I decided quickly that I wasn’t going to travel with my MacBook Pro. I was lucky enough that I didn’t need it for this trip, because there wasn’t any work that I would be doing on the road that necessitated a general purpose computer. I had work to do, but it all involved writing…some email, some writing text for a project, some viewing of spreadsheets and analysis of them. Simple and straightforward things that luckily could easily be done with a tablet and a decent keyboard. I already had an iPad with the Apple keyboard case, which made for an easily-carried and totally capable computing device for the trip. I could load some movies and music on it, fire up a text editor, answer email, and generally communicate without issue. It’s also iOS based, which makes it enormously more secure than Mac OS from first principles.

Since both my main computing device and my phone ran the same OS, I was able to also double-up any planning and efforts in security, as any decision I made could be equally applied to both devices. This turned out to be very, very convenient, and saved me time and effort.

The first thing that I did was back up both the iPad and iPhone to a local computer here at my house (not iCloud) and ensure that those backups were successful. I stored those backups on my home network to ensure their safety…if anything went wrong later, these would be my “clean” images that I could revert to upon returning home. Then I used Apple Configurator 2 to “pair lock” my devices to my laptop, which would remain at home.

Pair Locking

This process was best described back in 2014 by security researcher Jonathan Zdziarski. While his instructions are fairly out of date, the general idea is still there and still works in iOS 10 and Apple Configurator 2. Basically, pair-locking an iOS device is a method by which the device is flashed with a cryptographic security certificate that prevents it from allowing a connection to any computer that doesn’t have the other half of the cryptographic pair on it. This means that once locked to my laptop (which, again, wasn’t in my possession and was still at my home), my iPhone and iPad would simply refuse to connect to any other computer in the world…whether that was someone that stole it from me and attempted to reflash it using iTunes on their computer, or whether that was a diagnostic device being used by law enforcement.

This process is designed with the concept of using it for enterprise installation of iOS devices that need high security procedures to prevent employees from being able to connect their home computer to their work phone and retrieve any information. But it works very well for the purposes of preventing any possible attacker from accessing the phone’s memory directly through its Lightning port. This process ensures that even if the phone is unlocked and taken from my possession, DHS or another attacker cannot dump the memory directly or examine it using typical forensic information gathering devices.
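
If it helps to picture what “refuses to connect to any other computer” means, here is a toy model in Python. It is only a sketch of the general idea and not Apple’s actual pairing protocol: real pairing relies on certificates and key pairs, while this example assumes a single shared secret and a simple challenge-response for brevity. The class and method names are hypothetical.

```python
import hashlib
import hmac
import secrets


class PairLockedDevice:
    """Toy device that trusts exactly one host, identified by a shared secret."""

    def __init__(self, pairing_secret: bytes):
        self._pairing_secret = pairing_secret

    def new_challenge(self) -> bytes:
        # A fresh random challenge means an old response cannot be replayed.
        return secrets.token_bytes(32)

    def accept_connection(self, challenge: bytes, response: bytes) -> bool:
        expected = hmac.new(self._pairing_secret, challenge, hashlib.sha256).digest()
        # Constant-time comparison avoids leaking how many bytes matched.
        return hmac.compare_digest(expected, response)


class Host:
    """Toy computer that may or may not hold the matching pairing secret."""

    def __init__(self, pairing_secret: bytes):
        self._pairing_secret = pairing_secret

    def respond(self, challenge: bytes) -> bytes:
        return hmac.new(self._pairing_secret, challenge, hashlib.sha256).digest()


# Usage: the laptop left at home holds the secret; any other machine does not.
secret = secrets.token_bytes(32)
phone = PairLockedDevice(secret)
home_laptop = Host(secret)
forensic_box = Host(secrets.token_bytes(32))

challenge = phone.new_challenge()
home_ok = phone.accept_connection(challenge, home_laptop.respond(challenge))
stranger_ok = phone.accept_connection(challenge, forensic_box.respond(challenge))
```

The point of the model is that the decision rests on key material the other computer either has or does not have, not on whether the phone happens to be unlocked at the time.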

Password Manager

Once both devices were pair-locked, I was left with two freshly installed iOS devices that I needed to reload with apps and content that would be useful for me. After loading a set of games and apps that would allow me to pass the time and still get some work done, as well as media I might want to consume on the road, I loaded my password manager (I use and am very happy with 1Password) and created a very, very long and complicated vault password that there was no possibility I could remember. I recorded that password on paper (left at home in a fireproof safe) and gave it to a trusted person that had instructions not to give the password to me until I had cleared the border and only over a secured channel.

I then changed the 1Password vault password to be that password plus a phrase that I knew and could remember (a sort of salt). 1Password was set up to allow me to login with TouchID, so I could still operate normally (logging into services and such) until such a time as that TouchID credential was revoked. Once revoked, I would be completely locked out of my passwords, with no ability to access them, until through a pre-arranged time and secure channel I got the vault password from either of the mentioned trusted sources. Those trusted sources, meanwhile, couldn’t access my password vault either, since the salt was resident only in my head.
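
Here is a minimal sketch in Python of that split-knowledge arrangement, assuming a generic password check rather than 1Password’s real unlock mechanism (which I am not reproducing here). All function names are hypothetical, and the `kdf_salt` in the code is the ordinary random salt for key derivation, not the memorized phrase I called a “salt” above.

```python
import hashlib
import hmac
import secrets


def make_escrowed_part(num_bytes: int = 32) -> str:
    """Long random string to write down and hand to trusted people."""
    return secrets.token_urlsafe(num_bytes)


def vault_password(escrowed_part: str, memorized_phrase: str) -> str:
    """The real vault password needs both halves."""
    return escrowed_part + memorized_phrase


def derive_key(password: str, kdf_salt: bytes) -> bytes:
    """Stretch a password into a key (PBKDF2 stands in for the real vault KDF)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), kdf_salt, 100_000)


def unlocks(candidate: str, kdf_salt: bytes, stored_key: bytes) -> bool:
    """True only if the candidate password derives the stored key."""
    return hmac.compare_digest(derive_key(candidate, kdf_salt), stored_key)


# Usage: the traveler knows only the phrase, the trusted people only the random part.
escrowed = make_escrowed_part()
phrase = "a phrase only I know"
kdf_salt = secrets.token_bytes(16)
stored = derive_key(vault_password(escrowed, phrase), kdf_salt)

both_halves = unlocks(vault_password(escrowed, phrase), kdf_salt, stored)
trusted_person_alone = unlocks(escrowed, kdf_salt, stored)
traveler_alone = unlocks(phrase, kdf_salt, stored)
```

Neither half opens the vault by itself, which is what lets the traveler say truthfully that they cannot produce the password and keeps the trusted people out as well.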

It may be obvious, but I also ensured that everything in my life that was accessed with a password had a very strong one that was held by 1Password, and that I didn’t know and couldn’t memorize even if I tried. My bank, social media, Dropbox…everything that could get a password, had a very, very secure one. Any service that supported 2-factor authentication had said 2-factor turned on, with the second factor set to an authentication app that supports a PIN (or, in the case of Very Important Accounts, a physical Yubikey that was left in TN as well). This is security 101, and not directly related to my border crossing…but if you don’t have the basics covered, nothing else really matters.
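
For the curious, most authentication apps generate their six-digit codes with the time-based one-time password scheme standardized in RFC 6238, which builds on HOTP from RFC 4226. The sketch below is a plain Python rendering of those published algorithms, not the code of any particular app; the secret used in the usage lines is the test secret from the RFCs, not a real credential.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226)."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: low nibble picks the window
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, at_time: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238, SHA-1 variant)."""
    if at_time is None:
        at_time = time.time()
    return hotp(secret, int(at_time // step), digits)


# Usage with the RFC test secret; a real secret comes from the service's QR code.
rfc_secret = b"12345678901234567890"
code_now = totp(rfc_secret)
```

Because the code depends on a secret stored on the device plus the current time, a password copied from somewhere else is not enough to log in without the device.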

Sanitization

I made sure that iOS had most iCloud sync services off…no contact syncing, no calendar syncing, really the only thing I left syncing was my photo gallery. I did not install any social media apps (no Facebook app, no Twitter app, etc) and only logged in and out on the websites in question. The browser on both devices was set to not remember passwords, and I clear cache and history regularly when traveling. As far as I could, I eliminated anything that stored conversations or messages between myself and others…no Facebook Messenger app, etc. I deleted my email app, and didn’t enter my account information for email into the standard iOS mail app.

This was, keep in mind, just for the transit period. Once in country and across borders, I could use a VPN to connect to the ‘net and download any apps needed, log into them after retrieving the password from one of the trusted sources, and effectively use both devices normally (with basic security measures in place all the time, of course).

Crossing Borders

At this point, I had a device that couldn’t be memory dumped, that had very little personal information on it, and even less information about my contacts on it. It mostly acted normally for me, because 1Password handled all of my logins and I used TouchID during daily usage…right up until I needed to cross a border. Before I did so, I deleted my TouchID credentials via Settings (by deleting the fingerprint credential), and power-cycled my phone. Those two actions did several things all at once:

The first was that it prevented me from being able to know or retrieve any passwords for anything in my life. That’s a pretty scary situation, but I knew it was fixable in the future (this wasn’t a permanent state). It also meant that if I were asked to unlock my phone, I could do so pretty much without anything of interest being capable of access. Without the ability to dump the phone forensically, officers could ask me for passwords for accounts and I could truthfully say that I had no way of telling them, because the password manager knew them all and I didn’t. And I couldn’t give them the password vault login because I literally didn’t know it.

The idea with all of this was to create a boundary of information access beyond which, if DHS wanted to try and access, they would need to impound the phone and potentially subpoena the information from me with a warrant. My guess (which turned out to be correct) was that they would ask to have it powered on, and maybe they would ask to see it unlocked, but that would be it. If they pried further, well…I was prepared to tell them truthfully that I didn’t know, that I couldn’t know. And I would call a lawyer if detained, and proceed from there.

The worst case scenario for me was minimal delay and discomfort. I am enormously privileged in my position to be able to think about this sort of passive resistance without actual fear for bodily harm or other forms of retribution. For me, the likely worst case, even if things had escalated to asking for social media passwords, would have been the confiscation of my devices and my being detained for a time. This is assuredly not the worst case for many, and it is extraordinarily important that each person judge their own risks when deciding on security practices.

For some, it is far better to simply not carry anything. Or to carry a completely blank device. Or purchase an inexpensive device when you arrive in the country of your destination. For me, I had the ability to prepare and be ready for resistance if needed. Your mileage may, and should, vary.

Conclusion

The results of all this thought and effort? Nothing at all. Not a single bit of attention was paid to me at the various border crossings, by either US or foreign agents. On the leg of my flight leaving Qatar, I went through no fewer than 4 security checkpoints from the time I landed until getting onto the plane taking me to O’Hare, and at each one there was a baggage scanner and metal detector, agents pulling people out of line for additional screening, and the like. When I finally got to my gate, it had its own private security apparatus,  again with metal detector and baggage X-ray. At this security checkpoint, I was randomly selected for additional screening, but the agent in question (a Qatar security agent) was incredibly professional, thorough, and neither invasive nor abusive. I got a pat down (much less severe than those I’ve been given at US airports), and they asked to look inside my carryon…they even asked me to power on my iPhone and iPad. But they didn’t ask to unlock them, and they didn’t ask for passwords of any type.

When entering into the US at O’Hare, the plane was greeted by DHS agents at the gate, who asked to check passports upon exiting the plane. The agent I was greeted by barely had time to glance at my US Passport before waving me through…again, the privilege of my appearance and nationality was evidenced by the fact that several of my fellow passengers were not waved through so easily. The last thing I heard as I walked up the jetway towards Customs was a DHS Agent saying to the robed gentleman behind me “So you don’t speak very much English, huh….”

The current state of our country cannot stand. We are a nation of ~~immigrants~~ many peoples [1], and a nation that believes in the privacy of our affairs and effects. This concern I had for my own and my friends’ information shouldn’t have been necessary. We should be able to be secure in our possessions, even and especially when those possessions are information about ourselves and our relationships to others. I do not want to be in a position where I have to threat model crossing the border of my own country. And yet, here we are.

I’d love any thoughts about the process described above, especially from security types or lawyers. Any holes or issues, any thoughts about what was useless, anything at all would be great to hear. I hope, as I so often hope these days, that all of this information never becomes applicable to you and that you never need to use it. But if you do, I hope this helped in some way.

[1] I was called out on Twitter for my use of “immigrant” as an inclusive term for people in the US, when, of course, many US citizens’ ancestry is far more complicated and difficult than “they chose to come here”. It was written in haste and while it works for the emotion I was attempting to convey, it definitely undercuts the violent and difficult history of many people in the US. I’ve edited the text to reflect the meaning more clearly and left the original to indicate my change.

Categories
Personal Technology

Disaster Scenario Part One

I was honored to give the opening keynote for the SEFLIN 2016 Virtual Conference, entitled “Innovation & Disruption: Past, Present, Future”, where I talked about why innovation is important in libraries, how structures disempower innovation, and what technologies I am watching for their capacity for disruption. It was this last topic that garnered the most comments during the talk, and even afterwards via email and Twitter.

I have come to believe that we’re on the cusp of some truly weird societal changes due to the exponential growth of technology. AI/Machine Learning, Robotics, ubiquitous presence and sensornets via the Internet of Things, decentralization….all of these things are beginning to turn the corner from interesting ideas into realized technologies in the world. A couple of them in particular that I spoke about have what I think are truly frightening outcomes over the next decade, and I’m hoping to expand my thinking on why and how here. Let’s start with robots, in the form of autonomous automobiles.

Robots, in the form of self-driving or autonomous vehicles, are going to transform the US economy in ways that, for certain populations, may be disastrous. I think it’s fairly clear at this point that we are moving towards autonomous vehicles at breakneck speeds, and there seems to be a pretty clear map that gets us from the current state of somewhat-partial autonomy to autonomous-on-interstates and finally to fully doesn’t-need-a-human vehicles. The consensus among people who do this stuff is that the easiest problem to solve is that of long-distance interstate or highway travel, and the largest target for disruption to this type of driving is that of commercial trucking.

When it comes to automation, commercial trucking has a lot of things going for it. From the perspective of the companies doing the movement of goods around the country, fewer drivers is better in almost every way: fewer accidents almost assuredly, but also lower fuel costs (as computers are very, very good at optimization algorithms), fewer delays (same reason), and over time huge cost savings…robots do not yet require health coverage and retirement plans. There are benefits of partial autonomization as well…we don’t have to have fully self-driving trucks for there to be huge benefits for the companies involved, since the reduction of humans in the equation will garner cost savings immediately, and one can easily imagine a pathway that begins reducing drivers gradually: instead of needing 3 drivers for 3 trucks heading across country, 1 driver in the first truck acting as “lead” could be followed by 2 robotic trucks in a sort of pseudo-autonomous caravan.

This move from One-Human-per-Truck to One-Human-per-X-Trucks to No-Human-At-All is going to happen over the course of the next 5-10 years. Currently, one of the most common middle-income jobs in the entire US is that of a truck driver…not always over-the-road, but again as we move from pseudo-autonomous to autonomous the disruption will happen at ever-more-local levels. As this job is displaced by automation, there will be larger and larger numbers of workers that go from middle-income to greatly reduced or no income over the course of the next decade. These workers disproportionately live in rural areas of the country, and are the most vulnerable economically as there are fewer secondary labor options for them.

The people in rural areas often also have higher than average relocation burdens to overcome. Simply “moving to where the jobs are” isn’t really an option at all, for both emotional and practical reasons. In my areas of interest (KY, TN, the rural South) there is a huge emotional and psychological connection to the place and the community…getting out has a huge cost and those that do move to more economically vibrant areas are seen as deserters or traitors. More practically, there is a cost-of-living gap between the rural US and cities/suburbs that is a barrier for movement for many. When you sell your $50,000-$100,000 home and the land that your great-grandfather settled and was passed down to you, the move to any city is simply impossible financially. The math just doesn’t work to be able to reasonably move your family into a home even in the suburbs for that much, and trading the stability of a mortgage for renting an apartment when the entire reason you are moving is wage depression and loss…well, it just isn’t possible.

We have a situation where, over the course of the next decade, one of the most common middle-income jobs in the rural US could disappear, and it could mostly affect areas where the secondary job market for these workers is very constrained. The social services for everything from information about re-skilling to job application fulfillment will fall to the public libraries in their communities, as they are very often the only easily accessible and well-trusted governmental program in rural areas.

In addition to largely helping to deal with this crisis on an individual level, libraries will be stuck with ever-decreasing budgets in areas where said budgets are based on local taxes. The slow-motion economic collapse of rural America that has played out in the areas that I care about the most (the rural South, Eastern Kentucky, Middle Tennessee) will accelerate, and as these jobs collapse, families will be devastated and the tax base for library support will dwindle.

Libraries will be in a situation where they are asked, yet again, to do more for their local communities when the very communities that they are trying to save can’t possibly contribute to their budgets.

Since we love to argue with each other, when I pointed out on Twitter that I thought this round of economic upheaval due to automation was different, Tim Spalding of LibraryThing said:

 

Tim points out a common refrain from people who are skeptical of the ability of automation to “take jobs” from humans. He’s right to be skeptical, as every previous time this has happened, the overall economy has grown and individuals have re-skilled and found new jobs. Automation hasn’t, in the past, actually ended in a removal of jobs on average from a country, nor has it decreased average earnings.

The problem with that argument is that it generalizes from large-scale to small-scale. On average, the numbers for the US might still look ok…but the small towns, the places that are only still places at all because of their ready access to an interstate, those places and the people in them are going to have a very rough time of it. There are more jobs in the energy sector than ever before, but that doesn’t help the coal miners in Appalachia.

This highly localized effect will disproportionately affect the rural parts of the US, and thus will also disproportionately affect the libraries in those areas….libraries that are often already vulnerable to small changes in budget. My concern is that as this change begins, we will see a sort of wave of challenges: first the trickle of job loss, which begins to put pressure on local economies, and as the trickle becomes a swell and then a wave the combination of decreasing wages and localized economic depression will increase the movement out of rural areas. This will further depress the tax base, and rural library systems will see an ongoing downwards slope of budget.

There is an admitted problem of what the timeline looks like for this entire process. The automation of trucking is likely to begin affecting local economies in the next 5-10 years, but the rest of my prediction (the increased exodus of youth from rural areas, the mobility of those that can move quickly as opposed to the generational resettling that this movement begets) will take perhaps decades to fully unfold.

Is it possible that there will be a counter-balancing effect of some type that maintains the economy of these areas? Some form of job replacement that offsets, even partially, the jobs lost to autonomous trucks? For the country as a whole, of course there will be. There are going to be yet-unimagined new opportunities. As depressing as this possible-future is for rural America, I am overall a technological utopianist. I think that big-picture, we are moving in positive directions. If nothing else, I would absolutely be willing to trade localized economic disruption for the massive savings of human life that we will see as humans are replaced as drivers.

But the places that I love are going to be hurt. And even when I know that the good outweighs the bad, the bad is still bad.

How can libraries make a difference? The first step is simply being aware of this possible future and watching for leading indicators in their communities. Strategically, libraries could get in front of the job loss wave by preparing for re-skilling and educational opportunities, and by making connections with other community resources in those arenas as well as the other governmental offices that will be needed. That preparation could put them in a position to be of the most use to the community. Rural libraries should have a relationship with their nearest community colleges or other formalized higher education options and should have strategies in place that help people move into formalized training or other economic recovery options.


In the next installment of Disaster Scenarios, I hope to take a look at AI/Machine Learning and see if there’s a similar story to tell about the way it is going to change not only how people interact with information, but how they are able to interact with information and the risks therein. I think information professionals might be in for some real weirdness in the next decade.

 

Categories
Gadgets Maker Technology

Sexism, meeting dynamics, attention analysis: who talks during meetings

Yesterday, Andromeda Yelton posted this excellent blog entry, Be Bold, Be Humble: Wikipedia, libraries, and who spoke. It’s about the well-known social sexism dynamic of meetings, where in a meeting that has both women and men, men speak more frequently, use fewer self-undercutting remarks (“I’m not sure….” or “Just…” or “Well, maybe…”), and interrupt others’ speech at a much higher rate than women in the same meeting.

The post got passed around the social nets (as it should, it’s wonderfully written and you should go read it now) and one of the results was this great exchange:

 

Which prompted me to reply:

I couldn’t get the idea out of my head, which basically means that it needs to show up here on the blog. I thought all night about how to architect something like that in hardware/software as a stand alone unit. There is always Are Men Talking Too Much?, which Andromeda linked to in her essay, but it has the downside of requiring someone to manually press the buttons in order to track the meeting.

I’ve been basically obsessing over attention metrics for the last couple of years as a part of bringing Measure the Future to life. The entire point of Measure the Future is to collect and analyze information about the environment that is currently difficult to capture…movement of patrons in space. The concept of capturing and analyzing speakers during a meeting isn’t far off, just with audio instead of video signal. How could we build a thing that would sit on a table in a meeting, listen and count men’s vs women’s speaking, including interruptions, and track and graph/visualize the meeting for analysis?

Here’s how I’d architect such a thing, if I were going to build it. Which I’m not right now, because Measure the Future is eating every second that I have, but…if I were to start tinkering on this after MtF gives me some breathing room, here’s how I might go about it.

We are at the point in the progress of Moore’s Law that even the cheapest possible microcomputer can handle audio analysis without much difficulty. The Raspberry Pi 3 is my latest object of obsession…the built-in wifi and BTLE change the game when it comes to hardware implementations of tools. It’s fast, easy to work with, runs a variety of Linux installs, and can support both GPIO and USB sensors. After that, it would just be a matter of selecting a good omnidirectional microphone to ensure even coverage of vocal capture.

I’d start with that for hardware, and then take a look at the variety of open source audio analysis tools out there. There’s a ton of open source code that’s available for speech recognition, because audio interfaces are the new hotness, but that’s actually overcomplicated for what we would need.

What we would want is something more akin to voice analysis software rather than recognition…we don’t care what words are being said, specifically, we just care about recognizing male vs female voices. This is difficult…it would be nearly impossible to get to a 100% success rate in identification, as the complicating factors are many (multiple voices, echo in meeting rooms, etc). But there is work being done in this area: the voice-gender project on GitHub has pre-trained software that appears to be exactly the sort of thing we’d need. Some good discussion about difficulty and strategies here as well.

If we weren’t concerned about absolute measures and instead were comfortable with generalized averages and rounding errors, we could probably get away with this suggestion, which involves fairly simple frequency averaging. These suggestions are from a few years ago, which means that the hardware power available to throw at the problem is 8x or better what it was at that point.
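To make “frequency averaging” a little more concrete, here is a minimal sketch of one way to read it, and not necessarily the method in the linked suggestion: estimate a dominant frequency for each short frame of audio by counting zero crossings, average those estimates, and compare the average to a cutoff. Everything named here is my own assumption for illustration: the 8 kHz sample rate, the 50 ms frames, the 165 Hz cutoff, and the `tone` helper that stands in for a real recording. Zero-crossing counts are a crude proxy that real speech (with harmonics, noise, and silence) will easily fool, and pitch is only a rough indicator of anything about the speaker.

```python
import math

SAMPLE_RATE = 8000        # samples per second (assumed)
FRAME_SIZE = 400          # 50 ms frames at 8 kHz (assumed)
PITCH_CUTOFF_HZ = 165.0   # hypothetical boundary between "lower" and "higher" voices


def frame_frequency(frame, sample_rate=SAMPLE_RATE):
    """Estimate one frame's dominant frequency from its zero-crossing count."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    duration = len(frame) / sample_rate
    # A periodic signal crosses zero twice per cycle.
    return crossings / (2 * duration)


def average_frequency(samples, frame_size=FRAME_SIZE, sample_rate=SAMPLE_RATE):
    """Average the per-frame estimates over all complete frames."""
    starts = range(0, len(samples) - frame_size + 1, frame_size)
    estimates = [frame_frequency(samples[i:i + frame_size], sample_rate) for i in starts]
    return sum(estimates) / len(estimates)


def label_voice(samples):
    """Label a clip "lower" or "higher" relative to the assumed cutoff."""
    return "lower" if average_frequency(samples) < PITCH_CUTOFF_HZ else "higher"


def tone(freq_hz, seconds=1.0, sample_rate=SAMPLE_RATE):
    """Synthetic stand-in for a recorded voice: a pure sine wave."""
    count = int(seconds * sample_rate)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate + 0.3) for i in range(count)]


print(label_voice(tone(120.0)), label_voice(tone(220.0)))
```

A real version would need to skip silent frames and would probably want a sturdier pitch estimate than zero crossings, but the shape of the idea (estimate per frame, average, threshold) stays the same.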

And if we have network connectivity, we could even harness the power of machine learning at scale and push audio to something like the Microsoft Speaker Recognition API, which has the ability to do much of what we’d ask. Even Google’s TensorFlow and Parsey McParseface might be tools to look at for this.

Given the state of cloud architectures, it may even be possible to build our gender meeting speech analysis engine entirely web-based, using Chrome as the user interface. The browser can do streaming audio to the cloud, where it would be analyzed and then returned for visualization. I have a particular bias towards instantiating things in hardware that can be used without connectivity, but in this case, going purely cloud architecture might be equally useful.

Besides gender, the other aspect that I had considered analyzing was interruptions, which I think could be roughly modeled by analyzing overlap of voices and ordering of speech actors. You could mark an “interruption event” by the lack of time between speakers, or actual overlap of voices, and you could determine the actor/interrupter by ordering of voices.
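As a rough sketch of that interruption model, assume some earlier step has already turned the audio into a list of speech segments, each with a speaker label and start and end times in seconds. The `Segment` class, the `find_interruptions` function, and the 0.25 second gap threshold are all hypothetical names and values chosen for illustration. An event is recorded whenever a different speaker starts before the previous segment ends, or after a pause shorter than the threshold, and the later voice is treated as the interrupter.

```python
from collections import Counter
from dataclasses import dataclass

MAX_GAP_SECONDS = 0.25  # hypothetical: a pause shorter than this counts as cutting in


@dataclass
class Segment:
    speaker: str
    start: float  # seconds from the start of the meeting
    end: float


def find_interruptions(segments, max_gap=MAX_GAP_SECONDS):
    """Return (interrupter, interrupted) pairs from consecutive segments."""
    ordered = sorted(segments, key=lambda s: s.start)
    events = []
    for previous, current in zip(ordered, ordered[1:]):
        if current.speaker == previous.speaker:
            continue  # the same person carrying on is not an interruption
        # A negative gap means the voices actually overlapped.
        if current.start - previous.end < max_gap:
            events.append((current.speaker, previous.speaker))
    return events


meeting = [
    Segment("A", 0.0, 5.0),
    Segment("B", 4.5, 9.0),    # overlaps A
    Segment("A", 10.0, 12.0),  # a full second of silence first
    Segment("B", 12.1, 13.0),  # almost no pause after A
]
events = find_interruptions(meeting)
counts = Counter(interrupter for interrupter, _ in events)
print(events, counts)
```

This only compares each segment with the one that started just before it, so a short interjection in the middle of a long turn would hide that longer turn from the next comparison. It also can’t tell a quick, friendly hand-off from a genuine interruption, so the threshold would need tuning against real meetings.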

Once you have your audio analysis, visualizing it on the web would be straightforward. There are JavaScript libraries that do great things with charts, like Chart.js or Canvas, or if working in the cloud you could use Google Chart Tools.
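For the hand-off to the browser, a small sketch of what the analysis side might emit: total speaking time per group, packed into the `type` / `data` / `labels` / `datasets` shape that Chart.js accepts for a bar chart. The `talk_time_chart` function, the tuple format, and the sample numbers are all made up for illustration; only the overall config shape comes from Chart.js.

```python
import json
from collections import defaultdict


def talk_time_chart(segments):
    """Build a Chart.js-style bar chart config from (label, start, end) tuples."""
    totals = defaultdict(float)
    for label, start, end in segments:
        totals[label] += end - start
    labels = sorted(totals)
    return {
        "type": "bar",
        "data": {
            "labels": labels,
            "datasets": [
                {
                    "label": "Seconds spoken",
                    "data": [round(totals[name], 2) for name in labels],
                }
            ],
        },
    }


# Hypothetical output of the audio analysis: who spoke, and when (in seconds).
segments = [
    ("women", 0.0, 12.5),
    ("men", 12.5, 40.0),
    ("women", 40.0, 47.5),
    ("men", 47.5, 60.0),
]
config = talk_time_chart(segments)
print(json.dumps(config, indent=2))
```

The printed JSON could be served to the page and passed to the charting library as its configuration object, which keeps all of the audio work on the device or server and leaves the browser with nothing to do but draw.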

If any enterprising developer wants to work on something like this, I’d love to help manage the project. I think it could be a fun hackathon project, especially if going the cloud route. All it needs is a great name, which I’m not clever enough to think of right now. Taking suggestions over on Twitter @griffey.

Categories
Berkman Digital Culture Images Technology

OpenArchive

Sitting in the Internet Archive Great Room (see photo above for reference…yes, it’s in an old church….) I’m reminded that I never pushed out the link to the amazing new app that was created in part by my friend Nathan. Available now for Android and coming soon for iOS, it allows you to use the Internet Archive like your own personal Instagram:

OpenArchive

and because Nathan and his group are awesome, the app is also open source:

Github repo for OpenArchive

and finally, direct link to the Google Play store for the app.

I’ve not seen an easier way to add photos to the Internet Archive directly than this app, and it’s got some really fantastic side benefits…the primary one being that it works transparently over Orbot if you’d like, so that uploads and connections can be driven over the Tor network without any extra effort on the user’s part.

UPDATE

The Guardian Project just posted their own announcement for the app. Their take on it is also timely since I’m spending this week at the Decentralized Web Summit:

We see this as a first step towards a more distributed, decentralized way of managing and sharing your personal media, and publishing it and synchronizing it to different places and people, in different ways.