There’s been much recent attention paid to the addressability of book content on the web, with a “Publishing Hackathon” in New York, and HarperCollins’ creation of an API-fueled hackathon “Programming Challenge”, both of which received a mix of criticism and praise; nonetheless they are a good start. But in the rush to try to entice a more technically savvy element, I think publishers are missing a more elemental approach – borrowing simple and well-established web standards.
I just spent my first day at ALA Midwinter in Dallas, locked in a room with far greater experts than I in an ALCTS session on linked data. Linked data, in a nutshell, is the use of RDF to characterize metadata as sets of ordered relationships (subject, predicate, object). This permits data from disparate sources to be combined in useful ways via machine processing. My counterparts, Eric Miller, Ross Singer, Corey Harper, and Karen Coyle, are all linked data enthusiasts, and have done much to advance the concepts both in the U.S. and abroad. The hope is that exposing datasets in linked data format will enable users to combine data and make new associations in innovative ways.
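To make the "ordered relationships" idea concrete, here is a minimal sketch in plain Python rather than a real RDF toolkit. Every identifier in it (`book:moby-dick`, `person:melville`, the prefixed predicate names, and the helper functions) is made up for illustration. It only shows why statements from two separate sources can be combined once they share an identifier.

```python
# Each statement is a (subject, predicate, object) triple, the basic shape of RDF.
# All identifiers and values below are hypothetical, chosen only for illustration.
library_catalog = {
    ("book:moby-dick", "dc:title", "Moby-Dick"),
    ("book:moby-dick", "dc:creator", "person:melville"),
}
authority_file = {
    ("person:melville", "foaf:name", "Herman Melville"),
}


def objects(graph, subject, predicate):
    """Return every object attached to a subject by a given predicate."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def creator_names(graph, book):
    """Follow book -> creator -> name, which may span more than one source."""
    names = []
    for creator in objects(graph, book, "dc:creator"):
        names.extend(objects(graph, creator, "foaf:name"))
    return names


# Combining the two sources is a set union, because both use the same
# identifier ("person:melville") for the same person.
merged = library_catalog | authority_file

print(creator_names(merged, "book:moby-dick"))
```

Neither source can answer "who wrote this book, by name" on its own; the merged set can, which is the kind of machine-driven combination the session was about.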
I think linked data is an appealing concept, but I’m not sold on it as billed, because I do not see how linked data actually assists the discovery process in meaningful ways. As I describe in my talk, I suspect linked data approaches are most attractive as linked closed data, in large aggregations of content where the owning platform can control the descriptions, tools, and associations to best meet their own needs. I could readily imagine Amazon having a tremendous linked data system, just not available at the data layer for external use. Using linked data in closed systems also obviates some of the very tricky rights issues that might emerge through CC-SA or copyleft licenses, or mandatory inclusion of commercial derivatives on cultural data, that would otherwise hinder downstream data use.
More than anything else, I think the best user experiences in the discovery of well-characterized content, for now, come through aggregations of content rather than through distributed searching. This is in contrast to open-ended discovery, where web search remains paramount; to some extent my issue with linked data reprises the battle of 10 years ago between metasearch library systems and centralized databases. Linked data can’t do a good job with discovery because it doesn’t know the intent of the general user; the only way to guess at that effectively is by observing a great deal of search activity and user behavior, and you don’t see it by sitting on one small corner of a network. That’s why Amazon, Apple, and Facebook can provide compelling user experiences.
The corollary of this arises through the observation that both cataloging of books and their digital delivery are moving to platforms removed entirely from libraries, via the Library of Congress, OCLC, Bowker, Overdrive, 3M, and maybe DPLA in the future. In other words, library cataloging departments are not likely to be touching linked data for books in any direct way, although they may manipulate it and use it through other interfaces. Eric Miller of Zepheira really crystallized this when he (paraphrasing) said, “This is not about linked data for libraries, but linked data for the web.” I think that’s mostly right, although I think it is probably more about linked data for archives and museums, which have the complex objects that arguably most benefit from linked data.
What did make more sense to me, and emerged at the end of our day, is envisioning how linked data might well be invisibly integrated into the workflow of libraries. In the same way that a blogger need not know HTML (or much of it) to write a web page, a library documenting upcoming community lectures, educational outings, author talks, and literacy programs might well be describing these events in a linked data format through software tools that remove the intricacy of the RDF syntax. In that way, I could envision libraries building webs of data that really would be useful, creating living calendars and information resources that spring directly out of the community well.
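As a rough sketch of what such a tool might do behind the scenes, the function below turns a few plain form fields into a schema.org `Event` expressed as JSON-LD, one common serialization of linked data. The function name, the field names passed in, and the sample event are all hypothetical; a real library tool would likely capture far more detail.

```python
import json


def event_to_jsonld(title, start, venue, organizer):
    """Turn plain form fields into a schema.org Event expressed as JSON-LD.

    The staff member filling in the form never sees this structure.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Event",
        "name": title,
        "startDate": start,  # expected to be an ISO 8601 date-time string
        "location": {"@type": "Place", "name": venue},
        "organizer": {"@type": "Organization", "name": organizer},
    }


# Hypothetical event, entered through an imagined calendar form.
record = event_to_jsonld(
    title="Author Talk: Local History",
    start="2012-03-15T18:30",
    venue="Central Library Auditorium",
    organizer="Example Public Library",
)

print(json.dumps(record, indent=2))
```

The point of the sketch is the division of labor: the person describing the event supplies ordinary fields, and the software supplies the vocabulary and syntax.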
In other words, linked data might be one of the tools that libraries use to enter the data-driven age, at the same time they leave the world of print books they’ve historically known; an unexpected but wonderful path.
We’ve wrapped up the meetings of the National Digital Public Library in Los Angeles. It was an intense three days, and I felt lucky to be surrounded by so many insightful people driven by the passion of getting more information out into the world. Thanks to the LA Public Library under Martin Gomez, the LA Library Foundation, IMLS, and the Sloan Foundation, among so many others, for helping to make the meeting possible. Kudos as well to all of the staff at LAPL who made navigating around the library and downtown LA so effortless.
In a lot of ways, however, the final half-day was frustrating. It was an attempt to recapitulate prior sessions via conceptual silos such as “content”, “communication”, and so forth. After a compelling opening by Ken Brecher, the head of the Library Foundation of Los Angeles, discussion was led primarily by Gary Strong, the University Librarian of UCLA. The tenor was far more conservative than what the audience wished. Core principles such as “free to all” were questioned; people would have preferred to have been roused.
Earlier this year, the National Information Standards Organization (NISO) received funding from the Andrew W. Mellon Foundation for two meetings, coordinated with the Internet Archive, that would encourage standards discussions around bookmarks and annotations. The aim of the meetings was to bring together as many stakeholders as possible – social reading startups, standards groups like the IDPF and EDItEUR, academic and research initiatives, and large ebook retailers – to discuss the challenges in creating standard-format, portable bookmarks and annotations.
As we described in the grant application materials, “The ability to accurately refer to a specific location within a digital text is fundamental for bookmarking and annotations in a digital environment. For both casual readers as well as professional and academic researchers, such pointers must be recognized across reading systems to enable social uses of books, articles and grey literature that range from personal memory aids to citations and critical analysis, as well as deep inter-linking. At present, no standards exist in this space.”