BigSurv18 program
Getting Aggressive about Passive Data Collection
Chair: Dr Stephanie Eckman (RTI International)
Time: Friday 26th October, 16:00 - 17:30
Room: 40.250
Combining Survey and Wearable Data on Exercise and Sleep
Dr Stephanie Eckman (RTI International) - Presenting Author
Dr Ashley Amaya (RTI International)
Dr Robert Furberg (RTI International)
High-quality data on physical activity is difficult to collect via surveys. Respondents tend to overreport physical activity due to social desirability bias. Consumer health wearables, such as Fitbits, may offer a method of collecting higher-quality data. To explore whether wearable devices provide a reasonable alternative to survey data collection, we collected survey data on exercise and sleep patterns from 500 respondents. These questions were modelled after the Behavioral Risk Factor Surveillance System (BRFSS) in the US. The respondents also provided their previous month’s Fitbit data. These data capture activity, steps, heart rate and sleep every minute. In addition to these survey and passive data, we also have the nationwide BRFSS data and the nationwide data on Fitbit users. Combining these data sources, we will comment on the relative quality of survey and wearable reports of physical activity. Our results will be informative for all researchers thinking of integrating consumer wearables into their studies, as well as anyone who collects or analyzes data on physical activity.
Using Call Detail Records to Conduct a Commuting Survey in Poland
Mr Piotr Kaluzny (Poznan University of Economics and Business) - Presenting Author
Dr Maciej Beresewicz (Poznan University of Economics and Business / Statistical Office in Poznan)
Dr Agata Filipowska (Poznan University of Economics and Business)
Big data and the Internet as a data source have become an important issue in statistics, in particular in official statistics. Current research focuses on the quality and suitability of estimates based on new data sources to support, enhance or even replace officially published indicators. Mobile phone data are therefore considered a potential source for measuring mobility and commuting. In the paper we present a methodology for using telco data (Call Detail Records, CDR) as a proxy for commuting surveys.
Using a trajectory-based stay-extraction method that estimates the stay time at a given location for a particular user, the data are aggregated to obtain two characteristics of users: the spatial distribution of homes and workplaces, and the most frequent paths along which users commute. The model is implemented and evaluated on a large telecom dataset for Poland (from 2013), spanning over 7 billion records of users’ telco activity, with geographical information approximated by the nearest BTS (Base Transceiver Station).
The estimates obtained by the model are analysed and compared with the register-based Commuting Survey 2011, conducted during the National Census of Population and Housing 2011 in Poland, to test how far the CDR proxy can be used to derive insights consistent with the outcomes of the commuting survey.
The paper also provides a visual representation and analysis of commuting statistics for chosen geographical regions, with a focus on large metropolitan areas. The advantages, drawbacks and potential use cases for the analysis of mobility utilizing CDRs are also provided.
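The home and workplace aggregation described in this abstract can be illustrated with a minimal sketch. It is not the authors' trajectory-based stay-extraction method: it swaps in a much simpler frequency heuristic (most frequent BTS at night as home, most frequent BTS during weekday working hours as workplace). The record layout `(user_id, timestamp, bts_id)`, the hour windows and the function names are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Assumed hour windows; the abstract does not specify any thresholds.
NIGHT_HOURS = set(range(0, 6)) | {22, 23}
WORK_HOURS = set(range(9, 17))


def infer_home_work(records):
    """Guess a home and a work BTS per user from CDR-like records.

    records: iterable of (user_id, timestamp: datetime, bts_id) tuples
    (a hypothetical layout). Returns {user_id: (home_bts, work_bts)},
    with None where no record falls in the relevant window.
    """
    night = defaultdict(Counter)
    day = defaultdict(Counter)
    for user, ts, bts in records:
        if ts.weekday() < 5 and ts.hour in WORK_HOURS:
            day[user][bts] += 1
        elif ts.hour in NIGHT_HOURS:
            night[user][bts] += 1

    result = {}
    for user in set(night) | set(day):
        night_counts = night.get(user)
        day_counts = day.get(user)
        home = night_counts.most_common(1)[0][0] if night_counts else None
        work = day_counts.most_common(1)[0][0] if day_counts else None
        result[user] = (home, work)
    return result


def commuting_flows(home_work):
    """Count users per (home_bts, work_bts) pair, skipping incomplete pairs."""
    flows = Counter()
    for home, work in home_work.values():
        if home is not None and work is not None:
            flows[(home, work)] += 1
    return flows


# Made-up example records (2013-03-04 was a Monday).
example = [
    ("u1", datetime(2013, 3, 4, 23), "A"),
    ("u1", datetime(2013, 3, 5, 2), "A"),
    ("u1", datetime(2013, 3, 4, 10), "B"),
    ("u1", datetime(2013, 3, 4, 14), "B"),
    ("u1", datetime(2013, 3, 4, 12), "A"),
    ("u2", datetime(2013, 3, 4, 23), "C"),
]
home_work = infer_home_work(example)
flows = commuting_flows(home_work)
```

In a comparison with a commuting survey, the BTS-level flows would still need to be mapped to administrative regions, a step this sketch leaves out.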
Capture-Recapture Techniques for Transport Survey Estimate Adjustment Using Road Sensor Data
In review process for the special issue
Mr Jonas Klingwort (University of Duisburg-Essen) - Presenting Author
Dr Bart Buelens (Statistics Netherlands)
Professor Rainer Schnell (University of Duisburg-Essen)
Winner of the Student Paper Competition
In contrast to traditional social surveys, big data is often collected by sensors. This process-generated data is highly valuable for social research and official statistics if it can be linked with survey and administrative data. The application discussed here is such an example of multisource statistics, in which survey, sensor and administrative data are combined. We apply capture-recapture methods in this setting, which is a new approach to multisource estimation in official statistics. We use survey data of the Dutch Road Freight Transport Survey (RFTS) and road sensor data produced by the weigh-in-motion sensor network (WIM) operated by the Dutch national road administration. In the RFTS, a probability sample of registered truck owners report the trips and the weight of the cargo for the sampled truck in a specified week. Nine WIM stations on Dutch highways continuously weigh every passing truck and use a camera system scanning the license plates to identify trucks. Since the empty weight of each truck is known, the weight of the load can be computed. Thus, the sensor system and the survey independently measure the same target variable: the weight of the cargo. Through the license plate numbers, each individual truck in the RFTS sample can be linked one-to-one with the corresponding WIM observation. Additional variables are available from administrative registers, such as technical specifications of the trucks and administrative details of the truck owners. An important aim of the survey is to provide estimates of transported weight at quarterly and annual intervals. Due to survey non-response and underreporting, the RFTS-based estimates are downward biased. We use WIM sensor data to assess, quantify and correct this bias. The correction is based on an application of capture-recapture techniques. The RFTS survey and the WIM-sensor observations are considered as two captures. We start with a Lincoln-Petersen estimator to obtain baseline estimates. 
In the next step, conditional inclusion probabilities for RFTS and WIM are modelled using information from auxiliary variables. Differences between conditional and unconditional approaches are discussed.
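The Lincoln-Petersen baseline mentioned in this abstract can be sketched as follows. With n1 units captured by the first source, n2 by the second, and m captured by both, the estimator of the population size is n1 * n2 / m. The function names and the counts below are hypothetical and are not results from the paper; the Chapman variant is included only as a commonly used small-sample correction, not as something the authors state they use.

```python
def lincoln_petersen(n_survey: int, n_sensor: int, n_both: int) -> float:
    """Lincoln-Petersen estimate of the total number of units.

    n_survey: units reported in the survey (first capture)
    n_sensor: units observed by the sensors (second capture)
    n_both:   units found in both sources after linkage
    """
    if n_both <= 0:
        raise ValueError("the estimator is undefined without any overlap")
    return n_survey * n_sensor / n_both


def chapman(n_survey: int, n_sensor: int, n_both: int) -> float:
    """Chapman's bias-corrected variant, defined even when n_both is 0."""
    return (n_survey + 1) * (n_sensor + 1) / (n_both + 1) - 1


# Made-up counts, for illustration only.
estimate = lincoln_petersen(400, 300, 120)
corrected = chapman(400, 300, 120)
```

The estimator assumes that the two captures are independent and that every unit has the same capture probability within each source. Modelling conditional inclusion probabilities with auxiliary variables, as the abstract describes for the next step, is one way of relaxing the equal-probability assumption.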
A New Smart Meter Research Portal
Mr Simon Elam (University College London (UCL)) - Presenting Author
Background
The Smart Metering Implementation Programme (SMIP) aims to install approximately 53 million smart electricity and gas meters in around 27 million domestic properties in Great Britain by 2020. Smart meters provide high-resolution (e.g. half-hourly) electricity and gas consumption data, which creates opportunities for energy consumers to benefit from energy saving and energy shifting services.
In order to leverage the investment in the SMIP and provide tangible benefit to the research community, the Engineering and Physical Sciences Research Council (EPSRC) have provided a £6m grant for a 5-year, multi-partner project to develop a Smart Meter Research Portal (SMRP) to provide vital access to energy data for the UK research community.
SMRP Vision
Our vision is to deliver a world leading multi/inter-disciplinary research programme, facilitated by a smart meter data portal. The portal will transform GB energy research through the long-term provision of high quality, high-resolution energy data that will support the development of a reliable evidence base for intervention, observational and longitudinal studies across the socio-technical spectrum.
The goals of the portal are to provide:
- a consistent, trusted, and sustainable channel for researchers to access large-scale, high-resolution energy data, thereby providing a reliable empirical dataset for research;
- an effective mechanism for collecting energy data alongside other variables from national surveys (e.g. English Housing Survey) or individual research projects;
- a confidential, ongoing repository of smart meter data enhanced with contextual dwelling, household and neighbourhood attributes for use in secondary data analysis;
- an Energy Advice Service for participants who want their smart meter data to be used for this purpose.
The ambition of the research programme is to undertake research that will:
- support government policy;
- kick-start the development of new products, services and energy markets;
- help provide solutions to the energy trilemma (security, affordability and environmental sustainability);
- facilitate better research by developing best practice guidelines and methods to improve data security and enable innovative uses of smart meter data.
Conclusion
This paper will discuss the benefits, challenges and methods in developing a national data resource that aims to support a wide range of research across the energy sector. Issues discussed will include:
- benefits to government, industry and charities/NGOs
- benefits to the general public, e.g. through the provision of tailored, data-driven energy advice enabled by the use of smart meter data
- handling confidentiality, privacy and informed consent
- defining and describing data quality across multiple data sources
- discussing methods for combining Big Data with traditional data sources
- assessing new data processing and analytical tools for Big Data
- implementing data visualization methods for Big Data.