Sustaining the VIVO Network
by Barbara Brynko
First steps are not always easy. With the National Network for Scientists project now passing its 6-month mark, the VIVO platform has been tweaked and reshaped, and the seven participating institutions and universities are busy pumping data and profiles into local VIVO repositories for harvest.
Now that much of the fine-tuning and modification has turned the original Facebook for Scientists concept into a drive for a scalable national network, the VIVO team is concentrating its efforts for the next 1.5 years of the National Institutes of Health-funded project on ensuring its growth, marketability, and sustainability.
“The goal in the next 18 months is to make VIVO an indispensable tool for scientists,” says Mike Conlon, director of data infrastructure at the University of Florida. “And for VIVO to be indispensable, it needs to be creating value for not only the scientists, but the institutions and the funding organizations as well.” He says once something of value is created, then an entire community begins to emerge around it since the participants are motivated to work out sustainability in detail.
The Broader Brushstrokes
The early adopters understand the value of the VIVO platform; they understand how a website containing scientists’ profiles can be consumed in a semantic context and connected to a larger community. “That said, the institutions on the VIVO platform are expected to take ownership of their respective VIVO installations,” he says. The seven original schools in the project (University of Florida, Cornell University, the Weill Cornell Medical College, Indiana University, Washington University–St. Louis, the Scripps Research Institute, and the Ponce School of Medicine in Puerto Rico) implemented VIVO because it was good for their scientists, and by association with those scientists, it is also good for their institutions. He draws the comparison between the schools today implementing VIVO and those in 1993 trying to find the value in putting up a website. “You can wrestle with it for a bit,” he says, “but in the end, it will be evident.”
On the technology side, there are “challenges related to scalability, but making the institution the fundamental level of operation helps to contain growth to what we believe can be addressed by off-the-shelf hardware solutions, database clustering, and load balancing,” says Jon Corson-Rikert, VIVO creator and head of information technology services at Cornell University’s Mann Library. “By sticking firmly to an open source solution with a public data model, and by making the data itself the fundamental unit of sharing, we are both at the mercy of and empowered by the adoption of linked data in general. There are complications and challenges, but the fundamental principles have been demonstrated to scale quite well.”
Building a Community
As the VIVO platform is being built and expanded, the issues of maintenance, growth, and updates are still being resolved. Corson-Rikert sees the team’s focus in the next few months on building software, gaining participation, and populating software with data to get the network started. “We know it will take resources to continue,” he says, “but if the effort benefits the participating individuals, institutions, and the research community, we feel confident resources can be found to continue, very likely with different partnerships and evolving goals.” Both Corson-Rikert and Conlon agree that the main efforts will continue to come from the teams at Cornell University, Indiana University, and the University of Florida.
The VIVO developers have already streamlined the workflow, so there’s much less effort required from the scientists than in the original plan. “We’re trying to drive down the cost significantly and drive up the benefits by creating applications and collaboration with a wide range of partners,” says Conlon. Data is now coming from standardized sources, institutions, publishers, and grant-funding agencies, which is ultimately reducing the effort it takes for scientists to take care of their profiles, says Conlon.
Corson-Rikert believes that VIVO should continue to be free and open source, while being financed by a combination of institutional support and grant funding targeted at demonstrating network-driven benefits for long enough to become self-sustaining or support related value-added services. Conlon has already been contacted by open source software consortia and fee-based hosting companies, so models charging for support services and/or hosting at some point in the future are always a possibility, says Corson-Rikert.
Ensuring the sustainability of activity output depends on several factors, says Conlon. “Like any open source project, it doesn’t take long before you create an open source community around it,” he says. “We already have several vendors that are making code, and once you have code, then you create a community.” He says that participants all have vested interests in particular features for VIVO and want to see that their individual needs are being met. “An open source community can contribute to those features and then you create sustainable activity,” he says. As soon as a central hub for the work activity is established, the code changes will roll out in standardized releases.
When people see the opportunity that VIVO can provide, Conlon says you have it all. “People will have to ‘take stock’ and realize that VIVO is important,” he says. “They will want to participate in the advisory boards, in the software development, and in maintaining the software.”
Conlon finds it “gratifying to see the response so far.” But different people see different kinds of value, and that’s OK. “It’s a beautiful thing,” he says. “Technologists are excited because they see this as the next big thing to create a common data platform that their applications can use. They understand the concept of a ‘national network’ and that the applications that have been built and those that will be built in the future can maintain the data. That sets them free.”
A Positive Upward Spiral
Conlon emphasizes the importance of creating a positive upward spiral to lock in sustainability: If the institutions like it, the application providers like it; and the more the application providers like it, the better it is for the scientists. The more the institutions like it, the more data there is for the application providers to use. Making the tool consume the data automatically creates an ease of use that benefits everyone, from the scientists to the institutions to the grant funders.
But for Corson-Rikert, sustainability is just the tip of the iceberg. “I feel that communication and coordination will be bigger issues as participation increases,” he says. The team will face “coordination around needed evolution in the ontology, coming to agreement on how to detect and handle duplicate information such as publications co-authored from different universities, and developing policies on issues such as unaffiliated individuals.”
And the VIVO community is expanding exponentially to help with the work that still needs to be done. Conlon and the VIVO team are looking forward to the first annual National VIVO Conference (http://vivoweb.org) on Aug. 12 and 13 at the New York Hall of Science, where developers, scientists, publishers, funding agencies, research officers, and students will converge to brainstorm about the National Network of Scientists. The 2-day conference will feature workshops, tutorials, and keynoters, who will discuss the semantic web, Linked Open Data, and the role that VIVO will play to support team science.