Douglas Kessel / Senior Staff Photographer

From left, Gerhard Klimeck of the School of Electrical and Computer Engineering at Purdue University, Mark Hahnel of Figshare, Micah Altman of Massachusetts Institute of Technology Libraries, Dennis Tenen, a Columbia English professor, and Kenneth Crews, director of the Copyright Advisory Office at Columbia, speak at Wednesday's panel.

With data, presentation is everything.

Professors, researchers, publishers, and journalists discussed how to make data more accessible at the Research Data Symposium on Wednesday. The series of panels, organized by Columbia's Center for Digital Research and Scholarship and Columbia University Libraries, addressed how best to present the statistics and figures gathered through research.

"We talked a lot about preserving scholarship and preserving the academic journal and the fact that universities, librarians, and researchers are having a hard time collecting research data, standardizing it, and making it accessible," said Kelechi Okere, regional sales director of Elsevier, a publishing company for medical and scientific research, and one of the event's directors. 

The symposium focused on the four stages of the "data life cycle": collecting data through research, preserving the information, analyzing the findings, and sharing the results with the public. Experts from each stage discussed how to present data in the most comprehensible way.

Panelist Dennis Tenen, a Columbia English professor who plans to launch a humanities-based data lab on the impact of piracy and culture, noted the importance of data accessibility.

"Everyone would agree that data sharing, discovering, and impacting the world is a good thing, and I think that very few people will say no to open access and sources of data," Tenen said.

"It is very difficult to train our students and faculty and to get a whole discipline moving in the direction towards data sharing because people are unsure whether they should use one source or the other," he said. "So there is this tension, that is even in this room, among the proliferation of platforms."

Tenen said successful data sharing can be streamlined by creating a social network of researchers.

"It works best when there is only one platform," Tenen said. "When we're all on Facebook, we can all easily access one another. The solution is that there must either be huge integration or widening of the market until there are only several clear sources."

The panelists also discussed how data should be reproduced and presented in publications.

Because so much data is shared, "it makes sense now to have linking to specific papers and results," which is currently discouraged, panelist Victoria Stodden, a professor of statistics, said. "We need to advocate for changes in the research process."

The panelists also discussed how research in the humanities can be coordinated with scientific research and the importance of approaching data with a holistic view.

Robert Hilliker, the digital repository manager at the Center for Digital Research and Scholarship, said that the key to sharing research findings is to make people outside the field understand what the numbers mean.

"The reality is that you need to be thinking about the whole process by which data regenerates, processes, and then find a way through which people can discover and reuse them," Hilliker said. "If you don't, you end up with a bunch of numbers that nobody understands because there is no context to them."

Bob Chen, the director of the Center for International Earth Science Information Network at the Earth Institute, agreed with Hilliker.

"I'm very supportive that open data is key to being more efficient, but it is still a means to an end," Chen said. "We do have to keep the perspective that the goal of science is to benefit people, and to use this science in the best way possible is to make it most efficient." | @ColumbiaSpec
