To celebrate the one-year anniversary of my blog, I want to finish the data-in-projects series I began earlier in August. I’ve talked about using data throughout all stages of a project. Today I want to look at how data can be used as you’re finishing a project.
An article by Parallel Project Training points out that ending a project properly is often overlooked because of all the planning and execution that happens earlier in the project lifecycle. A data management plan is a big part of a great exit strategy when the project is concluding. As described by NC State University Libraries, having a plan includes thinking about the roles and responsibilities of the team as the project ends, costs, types of data, data standards, access, policies for re-use, and data preservation.
Thinking about how data will be used and archived in the future is also important. Without foresight into how to handle the data generated during a project, data handling is likely to be haphazard. A great example is the University of California – San Diego’s Data Management Plan by Rob Pinkel. Pinkel describes what data will be collected during the project, where the data will be stored and accessed, what formats the data will take, and how all of this will affect future archiving activities. The data will be shared with NSF and with interested parties upon request. He also recognizes and documents archiving shortfalls, which will help any follow-on activities or new project team members in the future.
I want to finish today’s discussion by looking at another example of a Data Management Plan, one that is far too common and that lacks foresight into future data needs. UCSD’s plan by Sameer Shah starts out by listing some project background information and categories of data, and even notes some of the data that won’t be published. (Whether all data, good and bad, should be published is a topic for another discussion.) Due to funding and data infrastructure limitations, they plan to archive the data on hard drives and servers accessible only within their School of Engineering. Shah then says the data will be stored for a minimum of 5 years in case the data collection is ‘terminated or transitioned.’ This type of Data Management Plan is counterproductive to furthering research and innovation, but it is probably more the rule than the exception.