A longitudinal data journey
The first in our series of posts to mark Love Data Week 2018 comes from Dr Fiona Cox, Research Fellow in the School of Geography & Sustainable Development. We asked Fiona about her data journey:
What type or types of data do you use in your research?
The key dataset I am currently using is the Scottish Longitudinal Study (SLS). This is a large-scale linkage study containing data for a large, representative sample of the Scottish population (approximately 5%), drawing on the Censuses from 1991 to 2011 plus additional linkages to vital events, health, education and other data. It is a very rich resource that can be used for everything from psychology to geography to economics!
Are data usually shared in your discipline? If so, how?
Many other datasets used in my discipline, such as Understanding Society or the Scottish Household Survey, are available for researchers to download. However, the Scottish Longitudinal Study and its sister studies, the ONS LS and the Northern Ireland Longitudinal Study, contain such a rich range of individual-level data that access is only allowed in designated “safe-settings” by approved researchers, and data cannot be shared beyond the immediate research team.
Do you often re-use data that’s out there already? Do you find this easy?
Yes, in the past I have carried out secondary data analysis on a number of sources such as the Scottish Household Survey, Scottish Census Online, and data available from NHS Scotland. I am also about to explore data from the Health Behaviour in School-aged Children study (HBSC) to complement my SLS research findings. These datasets are mostly available online, usually requiring just a simple application/disclaimer, and can be downloaded so that I can work on them at my desk, so they are easy to access.
What do you see as the main reasons for sharing data in your field or keeping it private?
Widely available data such as Understanding Society, SARs and data held by the UK Data Archive are important in allowing researchers to replicate and extend existing research. The Census-based Longitudinal Studies are also invaluable because of the additional components of time, scale and context that they provide. However, because of the additional potential risks to privacy that this richness creates, it is understandable – and reassuring! – that their use is carefully protected.
What do you think are the major issues/gaps with regards to data in your field?
The main issue in finding the data required is that often there is no single data source that contains all of the information needed. This leads to trying to harmonise and combine questions from separate studies. There are few, if any, datasets that can rival the breadth of data in the Census Longitudinal Studies, but even then there are highly desired linkages that are missing – e.g. the ONS LS for England and Wales cannot currently be linked to NHS data. Hopefully that is something that will change, though.
How would you like the data landscape (anything from data collection to data sharing – processes, tools, software, metrics, etc.) to look in 10 years?
It would be great to see the Longitudinal Studies going from strength to strength, both in increasing their linkages to other administrative data sources and in securing funding to expand the Research Support Unit teams at UCL, QUB and Edinburgh University and the excellent work that they do. More approved safe-settings would make the data easier to access and more attractive to researchers. The good news is that exploratory work is already under way to add new data linkages, and to make the LSs accessible in more locations – possibly using the new Micro Safe Settings Network (MSSN) – so in 10 years this might actually be a reality!
Posted on behalf of Dr Fiona M Cox