Data in the humanities: storage, backup, and publication
This is the final post in a trio of posts about research data in the humanities.
Final tips and tricks for our overview of data in the humanities! As with the previous post, each of the topics below could appear as a consideration in a data management plan (DMP) for a grant application (or, if you’re a PhD student, in a DMP for your first-year review).
Storage options endorsed by the University include:
- OneDrive
- Teams
- Centrally-managed storage
For humanities datasets of less than 1 TB, OneDrive is an excellent option. You can work in it online, it supports most file types, and it is shareable. Plenty of information and support is available online, in addition to support from the IT Service Desk at the University. It’s important to plan ahead, though: the file storage will be purged 18 months after a member of staff leaves the institution, or 6 months after a student leaves the institution, even if someone else has access to the files.
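To make the planning step concrete, here is a minimal sketch that works out when OneDrive files would be purged, using the 18-month and 6-month periods mentioned above. The function names are hypothetical, and the handling of month ends (clamping to the last day of a shorter month) is an assumption; check the exact cut-off with IT Services.

```python
import calendar
from datetime import date

# Retention periods taken from the post; the role labels are illustrative.
RETENTION_MONTHS = {"staff": 18, "student": 6}


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    Assumption: if the target month is shorter, clamp to its last day.
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def onedrive_purge_date(leaving_date: date, role: str) -> date:
    """Estimate the purge date for someone leaving on `leaving_date`."""
    return add_months(leaving_date, RETENTION_MONTHS[role])
```

For example, `onedrive_purge_date(date(2025, 1, 15), "staff")` gives 15 July 2026, which is the latest point by which the data should have been moved or deposited elsewhere.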
Your University Teams account provides 100 GB of storage by default. Requests to expand a team to up to 1 TB can be made to IT Services and will be considered on a case-by-case basis. If any sensitive data is involved, however (including any data collected from humans: interviews, surveys, focus groups, etc.), there are limitations on how Teams may be used to store data. If you have sensitive data, get in touch with [email protected] or IT Services for guidance. The files will remain stored on Teams even if the staff member or student leaves the University.
Centrally-managed storage is the best primary storage for sensitive data. Each PI or member of staff engaged at the level of lecturer or above can apply for 0.5 TB of centrally-managed storage for free through the Unidesk Research Data Storage Request form. Supervisors of PhD students may allow their supervisees to store data in the 0.5 TB allowance. Additional centrally-managed storage can be purchased at a rate of £200/TB, or £400/TB with backups.
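When budgeting for a DMP, it can help to sketch the storage cost in advance. The example below is a rough estimate only and makes several assumptions that the figures above do not confirm: that the free 0.5 TB allowance is used before any purchased storage, that storage is charged pro rata for part-terabytes, and that each rate is a single charge. The function name is hypothetical; confirm the actual terms with IT Services.

```python
FREE_ALLOWANCE_TB = 0.5        # free allocation per eligible member of staff
RATE_PER_TB = 200.0            # GBP per TB, without backups
RATE_PER_TB_BACKED_UP = 400.0  # GBP per TB, with backups


def estimate_storage_cost(required_tb: float, with_backups: bool = False) -> float:
    """Estimate the cost in GBP of centrally-managed storage.

    Assumptions (not confirmed by the source): the free allowance is
    deducted first and the remainder is billed pro rata.
    """
    if required_tb < 0:
        raise ValueError("required_tb must not be negative")
    billable_tb = max(0.0, required_tb - FREE_ALLOWANCE_TB)
    rate = RATE_PER_TB_BACKED_UP if with_backups else RATE_PER_TB
    return billable_tb * rate
```

Under these assumptions, a 2.5 TB project would need 2 TB of purchased storage: `estimate_storage_cost(2.5)` for storage alone, or `estimate_storage_cost(2.5, with_backups=True)` with backups included.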
Regular backups help protect you against data loss. Some of the most common causes of data loss that backups mitigate are:
- Hardware failure
- Software or media faults
- Virus infection or malicious hacking
- Power failure
- Human error
A backup plan should take into account your circumstances and the potential risks to your specific data. Generally, good practice for backing up research data follows the 3-2-1 rule: at least three copies of your data, stored on at least two different types of storage media (such as on OneDrive and on an encrypted hard drive), with one of the three copies stored in a different location to the other two.
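The 3-2-1 rule can be written down as a simple check against a list of where your copies live. This is only an illustrative sketch: the `BackupCopy` class, its field names, and the example labels are hypothetical, and "a different location" is interpreted here as the copies spanning at least two distinct locations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    medium: str    # kind of storage, e.g. "cloud" or "external_drive"
    location: str  # where the copy physically or logically lives


def satisfies_3_2_1(copies: list[BackupCopy]) -> bool:
    """Check a backup plan against the 3-2-1 rule."""
    enough_copies = len(copies) >= 3
    # At least two different kinds of storage across all copies.
    enough_media = len({c.medium for c in copies}) >= 2
    # At least one copy kept somewhere other than the rest.
    separate_location = len({c.location for c in copies}) >= 2
    return enough_copies and enough_media and separate_location


plan = [
    BackupCopy(medium="cloud", location="onedrive"),
    BackupCopy(medium="external_drive", location="office"),
    BackupCopy(medium="laptop", location="office"),
]
print(satisfies_3_2_1(plan))
```

Listing your copies in this way, even on paper, is a quick way to spot a plan in which everything sits on one device or in one room.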
Other considerations for research data backup can include:
- Backing up individual files vs. the whole computer system
- Frequency of backup
- Which backup tools and strategies are available for automated backup (such as working online in OneDrive)
- How to organise and label backup files and media
- Where to store master copies of your data
- How to back up sensitive or confidential data (this should be considered as part of the ethics application)
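Several of the considerations above (labelling backup files, automating the process, and knowing that a copy is intact) can be illustrated with a short script. This is a minimal sketch using only the Python standard library. The function names and the timestamp-in-filename convention are assumptions rather than University guidance, and it does not encrypt anything, so it is not sufficient on its own for sensitive or confidential data.

```python
import hashlib
import shutil
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_file(source: Path, backup_dir: Path) -> Path:
    """Copy `source` into `backup_dir` under a timestamped name and verify it."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    # Label each backup with a UTC timestamp so versions sort by name.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target = backup_dir / f"{source.stem}_{stamp}{source.suffix}"
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    # Compare checksums to confirm the copy matches the original.
    if sha256_of(source) != sha256_of(target):
        raise OSError(f"Checksum mismatch for backup of {source}")
    return target
```

Calling `backup_file(Path("interviews.csv"), Path("backups"))` would produce a file such as `backups/interviews_20250101T120000Z.csv`, where the file names here are purely illustrative.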
If you have questions about the technical process of backups or the tools available for backing up, IT Services can provide support. For other backup questions or considerations, contact your RDM team: [email protected].
The best form of data publication is in a recognised repository or data centre. The RDM team can help you choose a suitable repository if you don’t already have one in mind, and we are happy to discuss licensing and embargoes to help you understand what best suits your data. The University also has its own research repository for open-access datasets, under the research information system Pure.
The benefits of publishing and sharing data include increased opportunities for future research, higher impact of your work, and a stronger research reputation. The ongoing availability of your research data allows others both to validate existing results and to build upon them. Securing open access to an important dataset also increases the likelihood that it will be used in future research, which raises the impact and visibility of your original research. This, of course, is a great benefit not only to you as a researcher but also to your affiliated institution.
If you found this helpful, check out the previous posts on data for the humanities.