Thursday, April 30, 2015

The Best Laid Data Plans

Sometimes researchers have data that simply gets away from them. The project started small, grew over time, and the data that grew along with it became unwieldy. Often when researchers receive a grant they just want to get started on the project. The data that is collected gets stored in files on someone's computer, and the researcher assumes they will remember how things are organized. Time rolls on and the data gets bigger and bigger. Looking at the data several years down the road, the research team realizes that it was never really organized or labelled very well. This underscores the need to organize the data and decide on access points at the beginning of a project.
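
For anyone wondering what "organizing at the beginning" might look like in practice, here is a minimal sketch in Python, offered as one possible starting point rather than a recommended standard. It walks a project's data folder and writes a simple inventory file listing each data file, its size, and a checksum, with a blank description column to fill in by hand. The function name `build_manifest`, the file name `MANIFEST.csv`, and the column names are all my own assumptions for illustration; they do not come from DataONE, NSF, or any particular data management plan template.

```python
import csv
import hashlib
from pathlib import Path


def build_manifest(data_dir, manifest_name="MANIFEST.csv"):
    """Write a CSV inventory of every file under data_dir and return its rows.

    Hypothetical helper for illustration: the manifest name and the columns
    are assumptions, not part of any funder's requirements.
    """
    root = Path(data_dir)
    manifest_path = root / manifest_name
    rows = []
    for path in sorted(root.rglob("*")):
        # Skip folders and the manifest itself so it is never listed.
        if not path.is_file() or path == manifest_path:
            continue
        rows.append({
            "file": path.relative_to(root).as_posix(),
            "bytes": path.stat().st_size,
            # The checksum lets you notice later if a file has changed.
            "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
            # Left blank on purpose: a person describes the file by hand.
            "description": "",
        })
    with manifest_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(
            handle, fieldnames=["file", "bytes", "sha256", "description"]
        )
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

Calling `build_manifest("my_project/data")` (a made-up folder name) would create `my_project/data/MANIFEST.csv`. Be aware that this sketch rewrites the manifest from scratch each time, so any descriptions typed into an earlier copy would be lost; a real workflow would need to carry those forward.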

It is known that by far the largest hazard to the sustainability of data is human error. I was reminded this week of a project where many of these stories have been documented for the community; they could all be tagged #avoidthesepitfalls. DataONE has collected these stories, which they describe as "cautionary tales from researchers." You can find them, some accompanied by video, at https://www.dataone.org/data-stories

I think these stories would be very useful as mobilization tools for the librarians and IT personnel charged with helping to solve these issues for researchers, if not for the researchers themselves.


Friday, March 20, 2015

NSF's OSTP response released: "Today's Data, Tomorrow's Discoveries"

The community has been waiting with bated breath to see what NSF would do regarding the White House Office of Science and Technology Policy's mandate for public access to results of federally funded research. The memorandum from OSTP, "Expanding Public Access to the Results of Federally Funded Research", was issued on February 22, 2013. It directs Federal agencies with more than $100M in R&D expenditures to develop plans to make the published results of federally funded research available to the public within one year of publication, and to require researchers to better account for and manage the digital data resulting from federally funded scientific research.

The National Science Foundation has now issued a plan: "Today's Data, Tomorrow's Discoveries."   The requirement will apply to new awards resulting from proposals submitted or due, on or after the effective date of the Proposal & Award Policies & Procedures Guide that will be issued in January 2016. 

Section 3.1 provides the following detail:

NSF will require that either the version of record or the final accepted peer-reviewed manuscript in peer-reviewed scholarly journals and papers in juried conference proceedings or transactions described in the scope above (Section 2.0) and resulting from new awards resulting from proposals submitted, or due, on or after the January 2016 effective date must: 
  •  Be deposited in a public access compliant repository designated by NSF;
  •  Be available for download, reading, and analysis free of charge no later than 12 months after initial publication;
  •  Possess a minimum set of machine-readable metadata elements in a metadata record to be made available free of charge upon initial publication (Section 7.3.1); 
  •  Be managed to ensure long-term preservation (Section 7.7); and 
  •  Be reported in annual and final reports during the period of the award with a unique persistent identifier that provides links to the full text of the publication as well as other metadata elements. 

UK Libraries has been preparing for this since the OSTP memorandum was released and we are developing services and training to help our University of Kentucky researchers comply with this mandate.  Stay tuned for more information in the coming weeks!  



Thursday, March 12, 2015

Article of the Week: On the importance of being negative

You may have missed this article from The Guardian on Sunday, March 8, 2015. I believe that more attention should be paid to the topic it discusses: the fact that not all research is successful. Hypotheses are not always proven, experiments don't always work out, things fail. In a climate where success and tenure depend on being published in the most prestigious journals, we rarely hear about the research that doesn't pan out. In my experience (just in life, mind you - I am no scientist!) one learns as much or more from failure as from success.

The article in The Guardian addresses this issue and is well worth reading. You can find it here: "On the importance of being negative". Happy reading!

Tuesday, March 10, 2015

Open Data as Open Educational Resources

I came across an interesting blog post by Marieke Guy on the Open Education Working Group blog that addresses the idea of using open data as a form of OER.  Open data is gaining rapid traction as essential to good research practice as funding agencies are demanding that research data be made available for verification, transparency, and reuse. This blog post addresses how using open research data can improve student learning, research and literacy skills.  This is an article that I will be bookmarking!

Read the post here:
The 21st Century’s Raw Material: Using Open Data as Open Educational Resources

Friday, February 27, 2015

Article of the Week: It's Good to Share: Why Environmental Scientists’ Ethics Are Out of Date


Abstract


    Although there have been many recent calls for increased data sharing, the majority of environmental scientists do not make their individual data sets publicly available in online repositories. Current data-sharing conversations are focused on overcoming the technological challenges associated with data sharing and the lack of rewards and incentives for individuals to share data. We argue that the most important conversation has yet to take place: There has not been a strong ethical impetus for sharing data within the current culture, behaviors, and practices of environmental scientists. In this article, we describe a critical shift that is happening in both society and the environmental science community that makes data sharing not just good but ethically obligatory. This is a shift toward the ethical value of promoting inclusivity within and beyond science. An essential element of a truly inclusionary and democratic approach to science is to share data through publicly accessible data sets.


To read the entire article go to:

http://bioscience.oxfordjournals.org/content/65/1/69.full



______________________________________________

BioScience 65 (1): 69-73. doi: 10.1093/biosci/biu169

PLOS Clarifies its Publication Fee Assistance Policy


There has been quite a bit of confusion regarding the publication fees that the Public Library of Science (PLOS) charges to publish in its journals. I have heard several researchers say that the fees are a barrier to publication and that they can't afford them. On January 16, 2015 PLOS clarified the policy and stated that it does not intend for authors to pay publication fees from their personal funds.

The complete posting that clarifies the issue may be found at:

http://www.plos.org/plos-clarifies-its-publication-fee-assistance-policy/

Friday, February 13, 2015

Article of the Week: Sharing Detailed Research Data Is Associated with Increased Citation Rate

This is not a new article, but one that resonates with me as I help researchers understand that it is in their best interest to share their data. 

Abstract

Background

Sharing research data provides benefit to the general scientific community, but the benefit is less obvious for the investigator who makes his or her data available.

Principal Findings

We examined the citation history of 85 cancer microarray clinical trial publications with respect to the availability of their data. The 48% of trials with publicly available microarray data received 85% of the aggregate citations. Publicly available data was significantly (p = 0.006) associated with a 69% increase in citations, independently of journal impact factor, date of publication, and author country of origin using linear regression.

Significance


This correlation between publicly available data and increased literature impact may further motivate investigators to share their detailed research data.

Introduction

Sharing information facilitates science. Publicly sharing detailed research data–sample attributes, clinical factors, patient outcomes, DNA sequences, raw mRNA microarray measurements–with other researchers allows these valuable resources to contribute far beyond their original analysis[1]. In addition to being used to confirm original results, raw data can be used to explore related or new hypotheses, particularly when combined with other publicly available data sets. Real data is indispensable when investigating and developing study methods, analysis techniques, and software implementations. The larger scientific community also benefits: sharing data encourages multiple perspectives, helps to identify errors, discourages fraud, is useful for training new researchers, and increases efficient use of funding and patient population resources by avoiding duplicate data collection.