Big data set to increase the efficiency of stroke outcomes research


A topical review published in Stroke investigates new approaches for the collection of outcomes data after stroke. In the context of global technological growth, the authors, David Ung, Monique F Kilkenny (senior author), Monash University colleagues (Melbourne, Australia) and international colleagues from the USA and Canada, propose: “Continued advancements in streamlining data linkage, without compromising privacy, will not only lead to an explosive increase in the use of data linkage for stroke outcomes research, but will also fill research gaps as we work towards understanding other patient groups.” They further conclude: “These real-world data offer a promising opportunity for evaluations to improve evidence-based practice and policy decision-making.”

The authors acknowledge that currently “the quality of outcomes data such as mortality, adverse events, and disability can vary by type and timing of the collected data, such as the inhospital, short-term and long-term periods”. Proposing that there are three primary strategies for collecting outcomes data after stroke: clinical studies, clinical stroke registries, and administrative data, the authors weigh up their advantages and shortcomings against the utility of linking datasets.


For example, while Ung et al suggest that administrative databases were not designed for research or clinical purposes, they propose that such databases “reflect real-world practice, are large and comprehensive, population-based, have potentially indefinite follow-up, and are less expensive and resource intensive than clinical studies”.

However, the authors also point to the limitations of such databases, chief among them selection bias. “Selection biases can make it difficult to appropriately define the population, [as] patients with better access to healthcare or who present with more severe and non-fatal symptoms may be more likely to be captured.”

Furthermore, Ung and colleagues speculate that these databases may include inaccurate or incomplete data that may not encompass the entire population. They also add: “Another limitation arises when codes that lead to greater reimbursement of funds to hospitals are favoured over the principal reason a patient is hospitalised.”


According to the authors, the latter phenomenon is termed “code creep”, and has been estimated in an American study to increase payments for physician visits by 2.2% every year. In light of the aforementioned factors, Ung et al surmise: “The combination of these factors not only impacts the ability to identify stroke cases, but also affects the ability to adjust outcomes by stroke severity and compare performance across health centres or hospitals.”

The expansion of data linkage

In the context of patient records, the process of data linkage can be defined as the merger of administrative databases with registries or clinical studies. “It can be particularly valuable in stroke research,” say the authors, “as, in developing countries at least, most people attend healthcare-related services before, during and after their stroke event.”
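As a rough illustration of what such a merger can involve, the sketch below links a hypothetical stroke registry to a hypothetical hospital administrative dataset through a shared person identifier. The field names, the records and the `link_records` helper are all invented for illustration and are not taken from the review. Exact matching on a single identifier is also an assumption; in practice, linkage is often probabilistic and carried out on de-identified keys by a dedicated linkage unit.

```python
from collections import defaultdict

# Hypothetical registry records: one row per index stroke (invented data).
registry = [
    {"person_id": "A01", "stroke_type": "ischaemic"},
    {"person_id": "B02", "stroke_type": "haemorrhagic"},
    {"person_id": "C03", "stroke_type": "ischaemic"},
]

# Hypothetical administrative records: one row per hospital admission.
admissions = [
    {"person_id": "A01", "admit_date": "2019-03-02", "diagnosis": "pneumonia"},
    {"person_id": "A01", "admit_date": "2019-07-15", "diagnosis": "hip fracture"},
    {"person_id": "C03", "admit_date": "2019-05-20", "diagnosis": "recurrent stroke"},
    {"person_id": "Z99", "admit_date": "2019-01-11", "diagnosis": "asthma"},
]


def link_records(registry_rows, admin_rows, key="person_id"):
    """Attach to each registry row the administrative rows with the same key."""
    # Index the administrative rows by identifier for fast lookup.
    by_key = defaultdict(list)
    for row in admin_rows:
        by_key[row[key]].append(row)

    linked = []
    for row in registry_rows:
        combined = dict(row)  # copy so the input rows are left unmodified
        combined["admissions"] = by_key.get(row[key], [])
        linked.append(combined)
    return linked


linked = link_records(registry, admissions)
for person in linked:
    print(person["person_id"], person["stroke_type"], len(person["admissions"]))
```

Registry patients with no matching admission keep an empty list rather than being dropped, so the registry population stays intact after linkage. Administrative records for people outside the registry (here `Z99`) are simply ignored.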

Ung and colleagues further put forward that as excessive resources are required to track outcomes after discharge, linkages with administrative data can provide “an adjunct approach to cost-effectively and longitudinally capture longer-term outcomes in both clinical studies and registries”.

Referring to the key benefits of data linkage, the authors postulate that cost-effectiveness and efficiency come out on top. “Not only does it reduce the overall time and financial cost associated with collecting these data, but it also reduces responder burden.”

Moreover, alluding to examples of successful linkage between prospective stroke research cohorts and registries, and administrative data, Ung and colleagues acknowledge: “These linkages have provided the opportunity to determine outcomes after patients are discharged from hospital, validate the diagnosis of the index stroke, and enable capture of comorbidities.” Additionally, the use of data linkage in prospective studies can minimise the effect of patients lost to follow-up, say the authors.

Lastly, they posit that a further area where this technological asset may prove useful is in evaluating the translation of research knowledge, or an evidence-based treatment, into routine practice by enabling the observation of trends in treatment over time.

Yet data linkage has limitations, and its application poses an array of challenges. “The availability and quality of databases for linkage are country-dependent and can be inconsistent,” say Ung and his team. They also highlight the issue of public approval and governance, claiming that these are “impediments to linking data in some countries, such as Taiwan and Australia”.

For instance, they cite a study published last year by Cheng-Yang Hsieh (Department of Neurology, Tainan Sin Lau Hospital, Tainan, Taiwan) and colleagues, in Epidemiol Health, confirming that researchers in Taiwan are prevented from linking clinical registry databases with administrative databases due to the public’s concern for privacy, protests from human rights organisations, and the amount of red tape necessary to get permission to link data.

Ung and colleagues agree that, in this increasingly digitalised world, patients have increasing rights to control their own digital fingerprint. Yet, they write: “By understanding the nature, historic context and limitations of individual administration databases, researchers can unpack these large data and use these to enrich comprehensive assessments in clinical studies for outcomes of stroke.”

The future

The authors recognise that, over the past two decades, only a handful of countries, such as Sweden, Denmark, Finland, Scotland, New Zealand, Canada, and Australia, can claim to have realised the utility of such “high-quality, complete, and linked health data”.

The use of data linkage in the context of stroke-related outcomes remains in its infancy, state the authors, as currently many population-based studies only link with a national death registry to ascertain mortality. Yet, Ung et al argue that after successful examples of data linkage in the research of stroke, “there is a renewed impetus to use this approach around the world”.
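To make the death-registry case concrete, the following minimal sketch shows how mortality within a fixed window might be ascertained once a cohort has been linked to a death registry. The identifiers, dates and the `died_within` helper are hypothetical and not drawn from the review. The sketch also assumes that anyone absent from the registry extract is alive, which only holds if registry coverage and linkage are complete.

```python
from datetime import date

# Hypothetical cohort: person identifier -> date of the index stroke.
cohort = {
    "A01": date(2019, 1, 10),
    "B02": date(2019, 2, 1),
    "C03": date(2019, 3, 5),
}

# Hypothetical death registry extract: person identifier -> date of death.
death_registry = {
    "A01": date(2019, 2, 20),  # 41 days after the index stroke
    "C03": date(2020, 6, 1),   # more than a year after the index stroke
}


def died_within(cohort_rows, deaths, days):
    """Return {person_id: bool} flagging death within `days` of the index stroke."""
    status = {}
    for person_id, stroke_date in cohort_rows.items():
        death_date = deaths.get(person_id)
        # Assumption: no registry entry means the person is treated as alive.
        status[person_id] = (
            death_date is not None
            and 0 <= (death_date - stroke_date).days <= days
        )
    return status


status_90 = died_within(cohort, death_registry, days=90)
mortality_90 = sum(status_90.values()) / len(status_90)
print(status_90)
print(f"90-day mortality: {mortality_90:.1%}")
```

Changing the `days` argument gives other follow-up windows from the same linked data, which is one reason registry linkage is attractive for longer-term outcomes.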

Ultimately, Ung and his team propose: “Many countries that do not currently use linked data in stroke research already possess systems capable of collecting routine health information at local and national levels, especially with the increasing adoption of electronic medical records. Unlocking this wealth of data could help generate new hypotheses in stroke, minimise data waste, and justify future comprehensive research into important and untested outcomes.”

