Talking About Data in the Era of Big Data: The FDA’s Perspective
Notes, 2015-12-11, FDA Voice

[Editor’s note] The FDA digs into the meaning of data, information, and evidence and the relationships among them: What is real-world data? What is real-world evidence? What does “real-world” mean? In the era of big data, people tend to dwell on the word “big” and easily overlook a deeper understanding of the data themselves. This article is a rare exception and reflects the FDA’s commitment to science.

(Excerpted from FDA Voice, “What We Mean When We Talk About Data,” December 10, 2015. Authors: Robert Califf, M.D., FDA Deputy Commissioner; Rachel Sherman, M.D., FDA Associate Deputy Commissioner. Dr. Califf has been nominated by President Obama to become FDA Commissioner and is awaiting Senate confirmation.)

Compiled by: 识林-椒

What We Mean When We Talk About Data

Medical care and biomedical research are in the midst of a data revolution.
Networked systems, electronic health records, electronic insurance claims databases, social media, patient registries, and smartphones and other personal devices together comprise an immense new set of sources for data about health and healthcare. In addition, these “real-world” sources can provide data about patients in the setting of their environments, whether at home or at work, and in the social context of their lives. Many researchers are eager to tap into these streams to provide more accurate and nuanced answers to questions about patient health and the safety and effectiveness of medical products, and to do so quickly, efficiently, and at a lower cost than has previously been possible.

But before we can realize the dramatic potential of the healthcare data revolution, a number of practical, logistical, and scientific challenges must be overcome. One of the first that must be tackled is the issue of terminology.

Defining Terms

Although “data,” “information,” and “evidence” are often used as if they were interchangeable terms, they are not. Data are best understood as raw measurements of some thing or process. By themselves they are meaningless; only when we add critical context about what is being measured and how do they become information. That information can then be analyzed and combined to yield evidence, which, in turn, can be used to guide decision-making. In other words, it’s not enough merely to have data, even very large amounts of them. What we need, ultimately, is evidence that can be applied to answering scientific and clinical questions.

So far, so good. But what do we mean when we talk about “real-world data” or “real-world evidence”?

Clinical research often takes place in highly controlled settings that may not reflect the day-to-day realities of typical patient care or the life of a patient outside of the medical care system.
Further, those who enroll in clinical trials are carefully selected according to criteria that may exclude many patients, especially those who have other diseases, are taking other drugs, or cannot travel to the investigation site. In other words, the data gathered from such studies may not actually depict the “real world” that many patients and care providers will experience, and this could lead to important limitations in our understanding of the effectiveness and safety of medical treatments. Clinicians and patients must be able to relate the results of clinical trials to their own professional and personal experiences, even though those trials are done in controlled environments with certain patient populations excluded and may therefore be challenging to generalize. It seems straightforward, then, to think that studies including a much fuller and more diverse range of individuals and clinical circumstances could ultimately lead to better scientific evidence for decisions about the use of medical products and about healthcare.

But “real-world evidence” has its own issues that must be understood and dealt with carefully. First of all, the vague term “real-world” may imply a closer relationship with the truth: that the real-world measurement is preferable to one taken in a controlled environment. For example, are “real-world” blood pressure data gathered from an individual’s personal device or health app better (e.g., more reliable and accurate) than a blood pressure measurement from a doctor’s office? They could be, because a patient’s blood pressure might be uncharacteristically elevated during a visit to the physician. But at the same time, do we know enough about the data gathered from the patient’s personal device to use them for generating evidence? How accurate are they? Is the patient taking their own blood pressure correctly? What other factors might be affecting the readings?
Already we are being reminded of the complexities of potentially relying on data for purposes other than the ones for which they were originally gathered.

In most cases “real-world evidence” is thought of as reflecting data already collected, i.e., epidemiologic or cohort data that researchers review and analyze retrospectively. Also of interest is whether randomized trials can be conducted in these “real-world” environments. In comparing treatments, one must always consider the possibility that the treatments were not assigned randomly, but instead reflected some relevant patient characteristic. This is, of course, the reason for doing randomized clinical trials.

Better Terms for Complex Subjects

There is little doubt that the new sources of data now being opened to researchers, clinicians, and patients hold enormous potential for improving the quality, safety, and efficiency of medical care. But as we work to understand both the promise and pitfalls of far-reaching technological changes, we need a more functional vocabulary for talking about these complex subjects, one that allows us to think about data, information, and evidence in ways that capture multiple dimensions of quality and fitness for purpose (e.g., for appropriate use in regulatory decision making). The incorporation of “real-world evidence” (that is, evidence derived from data gathered from actual patient experiences, in all their diversity) in many ways represents an important step toward a fundamentally better understanding of states of disease and health. As we begin to integrate “real-world data” into our processes for creating scientific evidence, and as we begin to recognize and effectively address their challenges, we are likely to find that the quality of the answers we receive will depend in large part on whether we can frame the questions in a meaningful way.

Robert M. Califf, M.D., is FDA’s Deputy Commissioner for Medical Products and Tobacco.