VIVO: A Year of Innovation
by Barbara Brynko
[During the past year, IT has been following the development of the VIVO project, a National Institutes of Health-funded project to build the National Networking of Scientists. At the end of Year 1, the VIVO team reports that it has reached, and often surpassed, its goals to build a network of scientists to promote research collaboration. Here are a few highlights. —Ed.]
Twelve months, seven participating universities and institutions, four teams, 120 members, and three major software releases. While the numbers may sound ordinary, the work generated is anything but. Within its first year, VIVO has morphed from an innovative project based in a single institution to an open source software initiative that is encompassing a growing community on local, national, and international levels.
The concept for the national network actually began after several University of Florida (UF) librarians saw the first version of VIVO at Cornell University. The faculty profile database, created by Cornell’s Mann Library, was generating plenty of buzz. The UF librarians returned south and modified VIVO into Gator Scholar. As a core group of other universities and institutions saw the value in such a profile database, the vision of building a network of researchers and scientists became clearer. This collaborative vision was enough to secure a $12-million grant from the National Institutes of Health to establish a national network of scientists by building profile systems in each of the seven institutions and eventually linking the systems together. Once the grant was in place, key players in each institution were selected for four teams (software development, ontology development, implementation, and outreach). The game plan was for each of the seven institutions in the pilot (Cornell University, Indiana University, Ponce School of Medicine, The Scripps Research Institute, University of Florida, Washington University School of Medicine, and Weill Cornell Medical College) to implement a VIVO system, load profile data for the respective institution’s faculty into VIVO, and then create discovery and visualization tools across all the institutions as the first phase of the network.
In theory, it worked well. But developing an ontology that met the needs of each of the diverse institutions wasn’t easy. During Year 1, one of the most important new features for VIVO was its linked data compatibility. Each institution’s VIVO data can be accessed as Resource Description Framework (RDF) so information can be exchanged, aggregated, and searched by others through standard protocols. VIVO’s ontology integrates data from human resource systems, grants databases, faculty annual reporting systems, and publication databases within a common framework that can be shared. Developers are busy creating applications that can display the power behind the rich structured data as it conforms to a published ontology.
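To make the linked data idea concrete, here is a minimal sketch of why a shared vocabulary makes aggregation simple. It is not VIVO code: the site URIs and names are hypothetical, and the FOAF terms stand in for whatever terms the published VIVO ontology actually uses.

```python
# Hypothetical example data: these URIs and names are illustrative only and
# do not come from any real VIVO installation.
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
FOAF_PERSON = "http://xmlns.com/foaf/0.1/Person"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def person_triples(uri, name):
    """Return the RDF statements describing one person as (s, p, o) tuples."""
    return {
        (uri, RDF_TYPE, ("uri", FOAF_PERSON)),
        (uri, FOAF_NAME, ("literal", name)),
    }


def to_ntriples(triples):
    """Serialize triples as N-Triples, one statement per line, sorted."""
    lines = []
    for s, p, (kind, value) in sorted(triples):
        if kind == "uri":
            obj = f"<{value}>"
        else:
            # Escape backslashes first, then double quotes.
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


# Two assumed institutions, each publishing its own profile data.
site_a = person_triples("http://vivo.a.example.edu/individual/n1", "Ada Example")
site_b = person_triples("http://vivo.b.example.edu/individual/n7", "Ben Sample")

# Because both sites use the same predicates, aggregation is a set union,
# and one query pattern works across the merged data.
merged = site_a | site_b
names = sorted(o[1] for (_, p, o) in merged if p == FOAF_NAME)

print(to_ntriples(merged))
print(names)
```

The point of the sketch is the last few lines: once every site describes people with the same terms, combining and searching their data needs no per-institution translation step.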
All seven of the VIVO-implementation sites finalized their faculty profiles for at least one department to showcase at the inaugural National VIVO Conference in August in New York City. Each of the profile collections was proof that the VIVO system not only worked but that it worked well and exhibited remarkable depth. The profiles included educational background, professional service, webpages, images, biographical information, research activity, selected publications, awards and distinctions, and more. Going forward, the sites will be upgraded with the latest VIVO software releases and loaded with updated data from local grant and publication information, courses, and supplementary sources, including PubMed.
Visualization in Action
Katy Börner, head of VIVO visualization and the Indiana University team leader at the School of Library and Information Science at Indiana University, has spent much of the past year tending to the details. She’s been tracking web visits to the VIVO website (http://vivoweb.org) from countries across the world, cultivating VIVO People Profiles for all known global installations, emailing communications about VIVO, and monitoring the number of VIVO code downloads. She says activity across the board has continued to increase, with a noticeable flurry of activity around the VIVO conference, which can be viewed on an animated map of the U.S. and the world or as a static version at http://vivo.slis.indiana.edu/gallery.html.
“The visualizations show the growing interest in VIVO,” she says, “and the correlations among email contacts, code downloads, new installation sites, and new people profiles.” With visualization technology, she can view the adoption of a core science infrastructure as it is mapped in near real time.
The Outreach Team
Kristi Holmes, national outreach coordinator and bioinformaticist at the Becker Medical Library at Washington University School of Medicine, has been getting the word out about VIVO, from internal adoption efforts at the seven VIVO installation sites to national and international channels. Holmes says that outreach is as deep as it is broad: The work includes creating VIVO support materials, collaborating on the project site, publicizing the VIVO brand, acquiring additional data for VIVO, lining up speaking opportunities, and coordinating the annual VIVO conference.
“We have enjoyed a successful effort thus far, thanks to the hard work, dedication, and collegial spirit of team members across the project,” she says. The outreach team is busy getting the word out via Facebook (www.facebook.com/VIVOcollaboration), Twitter (http://twitter.com/VIVOcollab), and LinkedIn. In addition, work on the VIVO project has generated hundreds of blog posts, FriendFeed posts, tweets, and plenty of articles in print.
The first annual National VIVO Conference, Enabling National Networking of Scientists, held Aug. 12–13 in New York City, was one of the biggest events for the VIVO team. The 2-day conference featured three workshops, several tutorials, and keynotes from Jim Hendler, Tetherless World Professor of Web Science at Rensselaer Polytechnic Institute, and Noshir Contractor, the Jane S. & William J. White Professor at Northwestern University. “Reviews from the sold-out conference have been overwhelmingly positive,” says Holmes, “and the VIVO team is already preparing for the 2011 VIVO Conference,” set for Aug. 24–26 at the Gaylord Hotel and Conference Center in Washington, D.C.
The Development Team
Jon Corson-Rikert, VIVO national development lead and head of Information Technology Services at Cornell’s Albert R. Mann Library, works on the nuts and bolts of the project. Despite the diversity in the seven core institutions that range from an independent research institution to major universities, his team has developed a semantic web-based software system using a flexible ontology that works for all seven partners.
The differences in the VIVO institutions also proved that people without any previous exposure to semantic web technologies could successfully load institutional and national data into VIVO. Sure, “there is something of a learning curve to become comfortable,” he says, “but the experience we’ve gained at one institution seems to translate well to others.”
Corson-Rikert sees the interplay of policy and technical questions as among the most difficult challenges. “Sometimes this becomes a chicken-and-egg story in trying to articulate specifications,” he says. “It becomes[,] ‘We can’t tell you what we need until you can tell us what you can do’ vs. ‘Until you tell us what your requirements are, we can’t figure out the right technical approach.’” To speed up the process, his team started user testing on a sampling of scientists and on members from the implementation sites to get feedback about technical solutions before the official release.
The high volume of data being loaded at the larger universities initially presented a few challenges, but the development team created strategies to boost scalability for large and small institutions that will work going forward. “We now have enough data in VIVO at most of our implementation sites that VIVO’s contribution to the research endeavor can begin to be evaluated,” he says, noting that visualizations and SPARQL-based reporting are helping the cause. “Many issues that have arisen can’t be solved by VIVO alone, but we feel VIVO is becoming an important part of the landscape at our partner institutions.”
Christopher Barnes, VIVO University of Florida (UF) development lead (Harvester development) and associate director of software engineering at the Clinical and Translational Research Informatics Program at the University of Florida, is concentrating his efforts in security and authentication, software tools deployment, packaging, and data acquisition from local and national systems. During the first year, the UF team integrated Shibboleth authentication into the VIVO system. Anyone with the appropriate UF credentials can use single sign-on capabilities for VIVO administration and VIVO profile self-editing. These authentication options will let other institutions customize their sign-ins accordingly.
Once a virtual appliance for VirtualBox and VMware was developed, anyone in the world could test out the VIVO system with an easy, single software download (http://vivo.sourceforge.net). Once deployed, the virtual appliance lets users log into the VIVO web interface and begin using it.
The UF team has also developed a set of free tools so institutions can acquire data from remote sources, match key data to a profile in VIVO, and then insert and link the data into the VIVO system profiles. These Harvester tools can upload grant data from the PeopleSoft Enterprise ERP system as well as grant and contracts data from the UF Division of Sponsored Research. On a national level, the Harvester can automatically download and match the National Library of Medicine’s PubMed citations to authors in VIVO and then link faculty members with their publications on the VIVO profiles.
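The acquire, match, and link steps can be illustrated with a small sketch. It does not reproduce the Harvester's actual matching logic, which the article does not describe. The profile IDs, citation records, and the rule of matching on last name plus first initial are all assumptions made for illustration.

```python
def profile_key(name):
    """Reduce a 'First [Middle] Last' profile name to (last name, first initial)."""
    parts = name.split()
    return (parts[-1].lower(), parts[0][0].lower())


def citation_key(author):
    """Reduce an assumed 'Lastname Initials' author string to the same key."""
    last, _, initials = author.rpartition(" ")
    return (last.lower(), initials[:1].lower())


def link_publications(profiles, citations):
    """Map each profile ID to the citation IDs whose authors match it."""
    # Index profiles by key so each author lookup is a dictionary access.
    index = {}
    for profile_id, name in profiles.items():
        index.setdefault(profile_key(name), []).append(profile_id)

    links = {profile_id: [] for profile_id in profiles}
    for citation in citations:
        for author in citation["authors"]:
            candidates = index.get(citation_key(author), [])
            # Link only unambiguous matches; anything else needs human review.
            if len(candidates) == 1:
                links[candidates[0]].append(citation["id"])
    return links


# Hypothetical profiles and citation records, not real data.
profiles = {"n1": "Ada Example", "n2": "Ben Sample"}
citations = [
    {"id": "cit-1", "authors": ["Example A", "Other Q"]},
    {"id": "cit-2", "authors": ["Sample BK", "Example A"]},
]

links = link_publications(profiles, citations)
print(links)
```

Matching on names alone is fragile, since two faculty members can share a last name and initial. The sketch therefore skips ambiguous matches rather than guessing, and a production tool would presumably draw on more evidence such as affiliations or identifiers.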
At the end of Year 1, the VIVO team is on track with all project deliverables and calls its efforts a “work in progress.” Major institutions, including the U.S. Department of Agriculture, are already committed to VIVO in Year 2. Other interested parties have continued to approach VIVO as if it were a commercial-ready product, but there’s plenty of work ahead. The VIVO team is just getting started.