Fourteen of the nation’s largest health systems announced this week that they have joined together to form a new, for-profit data company aimed at aggregating and mining their clinical data. Called Truveta, the company will draw on the de-identified health records of millions of patients from thousands of care sites across 40 states, allowing researchers, physicians, biopharma companies, and others to draw insights aimed at “improving the lives of those they serve.”
Health system participants include the multi-state Catholic systems CommonSpirit Health, Trinity Health, Providence, and Bon Secours Mercy, the for-profit system Tenet Healthcare, and a number of regional systems. The new company will be led by former Microsoft executive Terry Myerson, who has been working on the project since March of last year. As large technology companies like Amazon and Google continue to build out healthcare offerings, and national insurers like UnitedHealth Group and Aetna continue to grow their analytical capabilities based on physician, hospital, and pharmacy encounters, it’s surprising that hospital systems are only now mobilizing in a concerted way to monetize the clinical data they generate.
Like Civica, an earlier health system collaboration around pharmaceutical manufacturing, Truveta’s launch signals that large national and regional systems are waking up to the value of scale they’ve amassed over time, moving beyond pricing leverage to capture other benefits from the size of their clinical operations—and exploring non-merger partnerships to create value from collaboration. There will inevitably be questions about how patient data is used by Truveta and its eventual customers, but we believe the venture holds real promise for harnessing the power of massive clinical datasets to drive improvement in how care is delivered.
The order to reroute CDC hospitalization figures raised accuracy concerns. But that’s just one of the problems with how the country collects health data.
TWO WEEKS AGO, the Department of Health and Human Services stripped the Centers for Disease Control and Prevention of control of national data on Covid-19 infections in hospitalized patients. Instead of sending the data to the CDC’s public National Healthcare Safety Network (NHSN), the department ordered hospitals to send it to a new data system, run for the agency by a little-known firm in Tennessee.
The change took effect immediately. First, the hospitalization data collected up until July 13 vanished from the CDC’s site. One day later, it was republished—but topped by a note that the NHSN Covid-19 dashboard would no longer be updated.
Fury over the move was immediate. All the major organizations that represent US public health professionals objected vociferously. A quickly written protest letter addressed to Vice President Mike Pence, HHS secretary Alex Azar, and Deborah Birx, the coordinator of the White House’s Coronavirus Task Force, garnered signatures from more than 100 health associations and research groups. The reactions made visible the groups’ concerns that data could be lost or duplicated, and underlined their continual worry that the CDC is being undercut and sidelined. But it had no other effect. The new HHS portal, called HHS Protect, is up and running.
Behind the crisis lies a difficult reality: Covid-19 data in the US—in fact, almost all public health data—is chaotic: not one pipe, but a tangle. If the nation had a single, seamless system for collecting, storing, and analyzing health data, HHS and the Coronavirus Task Force would have had a much harder time prying the CDC’s Covid-19 data loose. Not having a comprehensive system made the HHS move possible, and however well or badly the department handles the data it will now receive, the lack of a comprehensive data system is harming the US coronavirus response.
“Every health system, every public health department, every jurisdiction really has their own ways of going about things,” says Caitlin Rivers, a senior scholar at the Johns Hopkins Center for Health Security. “It’s very difficult to get an accurate and timely and geographically resolved picture of what’s happening in the US, because there’s such a jumble of data.”
Data systems are wonky objects, so it may help to step back and explain a little history. First, there’s a reason why hospitalization data is important: Knowing whether the demand for beds is rising or falling can help illuminate how hard-hit any area is, and whether reopening in that region is safe.
Second, what the NHSN does is important too. It’s a 15-year-old database, organized in 2005 out of several streams of information that were already flowing to the CDC. It receives data from hospitals and other health care facilities about anything that affects the occurrence of infections once someone is admitted. That includes rates of pneumonia from use of ventilators, infections after surgery, and urinary tract infections from catheters, for instance, but also statistics about usage of antibiotics, adherence to hand hygiene, complications from dialysis, occurrence of the ravaging intestinal infection C. difficile, and rates of health care workers getting flu shots. Broadly, it assembles a portrait of the safety of hospitals, nursing homes, and chronic care institutions in the US, and it shares that data with researchers and with other statistical dashboards published by other HHS agencies such as the Centers for Medicare and Medicaid Services.
Because NHSN only collects institutional data, and Covid-19 infections occur both inside institutions such as nursing homes and hospitals, and in the outside world, HHS officials claimed the database was a bad fit for the coronavirus pandemic. But people who have worked with it argue that since the network had already devised channels for receiving all that data from health care systems, it ought to continue to do so—especially since that data isn’t easy to abstract.
“If you are lucky enough to work in a large health care system that has a sophisticated electronic medical record, then possibly you can push one button and have all the data flow up to NHSN,” says Angela Vassallo, an epidemiologist who formerly worked at HHS and is now chief clinical adviser to the infection-prevention firm Covid Smart. “But that’s a rare experience. Most hospitals have an infection preventionist, usually an entire team, responsible for transferring that data by hand.”
There lies the core problem. Despite big efforts back during the Obama administration to funnel all US health care data into one large-bore pipeline, what exists now resembles what you’d find behind the walls of an old house: pipes going everywhere, patched at improbable angles, some of them leaky, and some of them dead ends. To take some examples from the coronavirus response: Covid-19 hospital admissions were measured by the NHSN (before HHS intervened), but cases coming to emergency departments were reported in a different database, and test results were reported first to local or state health departments, and then sent up to the CDC.
Covid-19 data in particular has been so messy that volunteer efforts have sprung up to fix it. These include the COVID Tracking Project—compiled from multiple sources and currently the most comprehensive set of statistics, used by media organizations and apparently by the White House—and Covid Exit Strategy, which uses data from the COVID Tracking Project and the CDC.
Last week, the American Public Health Association, the Johns Hopkins Center, and Resolve to Save Lives, a nonprofit led by former CDC director Tom Frieden, released a comprehensive report on Covid-19 data collection. Pulling no punches, they called the current situation an “information catastrophe.”
The US, they found, does not have national-, state-, county-, or city-level standards for Covid-19 data. Every state maintains some form of coronavirus dashboard (and some have several), but every dashboard is different; no two states present the same data categories, nor visualize them the same way. The data presented by states is “inconsistent, incomplete, and inaccessible,” the group found: Out of 15 key pieces of data that each state should be presenting—things such as new confirmed and probable cases, new tests performed, and percentage of tests that are positive—only 38 percent of the indicators are reported in some way, with limitations, and 60 percent are not reported at all.
“This is not the fault of the states—there was no federal leadership,” Frieden emphasized in an interview with WIRED. “And this is legitimately difficult. But it’s not impossible. It just requires commitment.”
But the problem of incomplete, messy data is older and deeper than this pandemic. Four scholars from the health-policy think tank the Commonwealth Fund called out the broader problem just last week in an essay in The New England Journal of Medicine, naming health data as one of four interlocking health care crises exposed by Covid-19. (The others were reliance on employer-provided health care, financial losses in rural and primary-care practices, and the effect of the pandemic on racial and ethnic minorities.)
“There is no national public health information system—electronic or otherwise—that enables authorities to identify regional variation in the demand for, and supply of, resources critical to managing Covid-19,” they wrote. The fix they recommended: a national public health information system that would record diagnoses in real time, monitor the materials hospitals need, and link hospitals and outpatient care, state and local health departments, and laboratories and manufacturers to maintain real-time reporting on disease occurrence, preventive measures, and equipment production.
They are not the first to say this is needed. In February, 2019, the Council of State and Territorial Epidemiologists launched a campaign to get Congress to appropriate $1 billion in new federal funding over 10 years specifically to improve data flows. “The nation’s public health data systems are antiquated, rely on obsolete surveillance methods, and are in dire need of security upgrades,” the group wrote in its launch statement. “Sluggish, manual processes—paper records, spreadsheets, faxes, and phone calls—still in widespread use, have consequences, most notably delayed detection and response to public health threats.”
Defenders of the HHS decision to switch data away from the CDC say that fixing problems like that is what the department was aiming for. (“The CDC’s old hospital data-gathering operation once worked well monitoring hospital information across the country, but it’s an inadequate system today,” HHS assistant secretary for public affairs Michael Caputo told CNN.) Even if that claim is accurate, the middle of a global pandemic is a challenging time to attempt such an overhaul.
“We were opposed to this, because trying to do this in the middle of a disaster is not the time,” says Georges Benjamin, a physician and executive director of the American Public Health Association, which was a signatory to the letter protesting moving data from the NHSN. “It was just clearly done without a lot of foresight. I don’t think they understand the way data moves into and through the system.”
The past week has shown how correct that concern was. Immediately after the switch, according to CNBC, states were blacked out from receiving data on their own hospitals, because the hospitals were not able to manage the changeover from the CDC to the HHS system. On Tuesday, Ryan Panchadsaram, cofounder of Covid Exit Strategy and former deputy chief technology officer for the US, highlighted on Twitter that data on the HHS dashboard, advertised as updating daily, was five days old. And Tuesday night, the COVID Tracking Project staff warned in a long analysis: “Hospitalization data from states that was highly stable a few weeks ago is currently fragmented, and appears to be a significant undercount.”
When the Covid-19 crisis is over, as everyone hopes it will be someday, the US will still have to wrestle with the questions it raised. One of those will be how the richest country on the planet, with some of the best clinical care in the world, was content with a health information system that left it so uninformed about a disease affecting so many of its citizens. The answer could involve tearing the public-health data system down and building it again from scratch.
“This is a deeply entrenched problem, where there is no single person who has not done their job,” Rivers says. “Our systems are old. They were not updated. We haven’t invested in them. If you’re trying to imagine a system where everyone reports the same information in the same way and we can push a button and have all the information we might want, that will take a complete overhaul of what we have.”
The image of scientists standing beside governors, mayors or the president has become common during the pandemic. Even the most cynical politician knows this public health emergency cannot be properly addressed without relying on the scientific knowledge possessed by these experts.
Yet, ultimately, U.S. government health experts have limited power. They work at the discretion of the White House, which leaves their guidance subject to the whims of politicians and leaves them less able to take urgent action to contain the pandemic.
The Centers for Disease Control and Prevention has issued guidelines only to later revise them after the White House intervened. The administration has also undermined its top infectious disease expert, Dr. Anthony Fauci, over his blunt warnings that the pandemic is getting worse – a view that contradicts White House talking points.
And most recently, the White House stripped the CDC of control of coronavirus data, alarming health experts who fear it will be politicized or withheld.
In the realm of monetary policy, however, there is an agency with experts trusted to make decisions on their own in the best interests of the U.S. economy: the Federal Reserve. As I describe in my recent book, “Stewards of the Market,” the Fed’s independence allowed it to take politically risky actions that helped rescue the economy during the financial crisis of 2008.
That’s why I believe we should give the CDC the same type of authority as the Fed so that it can effectively guide the public through health emergencies without fear of running afoul of politicians.
The paradox of expertise
There is a paradox inherent in the relationship between political leaders and technical experts in government.
Experts have the training and skill to apply scientific knowledge in complex biological and economic systems, yet democratically elected political leaders may overrule or ignore their advice for ill or good.
This happened in May when the CDC, the federal agency charged with controlling the spread of disease, removed advice regarding the dangers of singing in church choirs from its website. It did not do so because of new evidence. Rather, it was because of political pressure from the White House to water down the guidance for religious groups.
The ability of elected leaders to ignore scientists – or the scientists’ acquiescence to policies they believe are detrimental to public welfare – is facilitated by many politicians’ penchant for confident assertion of knowledge and the scientist’s trained reluctance to do so.
Experts with independence
Given these constraints on technical expertise, the performance of the Fed in the financial crisis of 2008 offers an informative example that may be usefully applied to the CDC today.
The Federal Reserve is not an executive agency under the president, though it is chartered and overseen by Congress. It was created in 1913 to provide economic stability, and its powers have expanded to guard against both depression and crippling inflation.
At its founding, the structure of the Fed was a political compromise designed to make it independent within the government in order to de-politicize its economic policy decisions. Today its decisions are made by a seven-member board of governors and a 12-member Federal Open Market Committee. The members, almost all Ph.D. economists, have had careers in academia, business and government. They come together to analyze economic data, develop a common understanding of what they believe is happening and create policy that matches their shared analysis. This group policymaking is optimal when circumstances are highly uncertain, such as in 2008 when the global financial system was melting down.
The Fed was the lead actor in preventing the system’s collapse and spent several trillion dollars buying risky financial assets and lending to foreign central banks – decisions that were pivotal in calming financial markets but would have been much harder or may not have happened at all without its independent authority.
Putting experts at the wheel
A health crisis needs trusted experts to guide decision-making no less than an economic one does. This suggests the CDC or some re-imagined version of it should be made into an independent agency.
Like the Fed, the CDC is run by technical experts who are often among the best minds in their fields. Like the Fed, the CDC is responsible for both analysis and crisis response. Like the Fed, the domain of the CDC is prone to politicization that may interfere with rational response. And like the Fed, the CDC is responsible for decisions that affect fundamental aspects of the quality of life in the United States.
Were the CDC independent right now, we would likely see a centralized crisis management effort that relies on the best science, as opposed to the current patchwork approach that has failed to contain the outbreak nationally. We would also likely see stronger and consistent recommendations on masks, social distancing and the safest way to reopen the economy and schools.
Independence will not eliminate the paradox of technical expertise in government. The Fed itself has at times succumbed to political pressure. And Trump would likely try to undermine an independent CDC’s legitimacy if its policies conflicted with his political agenda – as he has tried to do with the central bank.
But independence provides a strong shield that would make it much more likely that when political calculations are at odds with science, science wins.
The Trump administration told hospitals to stop reporting data to the CDC, and report it to HHS instead. Vice President Mike Pence said the information would continue to be released publicly. It hasn’t worked out as promised.
In mid-July, the Trump administration instructed hospitals to change the way they reported data on their coronavirus patients, promising the new approach would provide better, more up-to-the-minute information about the virus’s toll and allow resources and supplies to be quickly dispatched across the country.
Instead, the move has created widespread confusion, leaving some states in the dark about their hospitals’ remaining bed and intensive care capacity and, at least temporarily, removing this information from public view. As a result, it has been unclear how many people are in hospitals being treated for COVID-19 at a time when the number of infected patients nationally has been soaring.
Hospitalizations for COVID-19 have been seen as a key metric of both the coronavirus’s toll and the health care system’s ability to deal with it.
Since early in the pandemic, hospitals had been reporting data on COVID-19 patients to the U.S. Centers for Disease Control and Prevention through its National Healthcare Safety Network, which traditionally tracks hospital-acquired infections.
In a memo dated July 10, the U.S. Department of Health and Human Services told hospitals to abruptly change course — to stop reporting their data to the CDC and instead to submit it to HHS through a new portal run by a company called TeleTracking. The change took effect within days. Vice President Mike Pence said the administration would continue releasing the data publicly, as the CDC had done.
Almost immediately, the CDC pulled its historical data offline, only to repost it under pressure a couple days later. Meanwhile the website for the administration’s new portal promised to update numbers on a daily basis, but, as of Friday morning, the site hadn’t been updated since July 23. (HHS is posting some data daily on a different federal website but not representative estimates for each state.)
“The most pernicious portion of it is that at the state level and at the regional level we lost our situational awareness,” said Dave Dillon, spokesman for the Missouri Hospital Association. “At the end of this, we may have a fantastic data product out of HHS. I will not beat them up for trying to do something positive about the data, but the rollout of this has been absolutely a catastrophe.”
The Missouri Hospital Association had taken the daily data submitted by its hospitals to the CDC and created a state dashboard. The transition knocked that offline. The dashboard came back online this week, but Dillon said in a follow-up email, “the data is only as good as our ability to know that everyone is reporting the same data, in the correct way, for tracking and comparison purposes at the state level.”
Other states, including Idaho and South Carolina, also experienced temporary information blackouts. And The COVID Tracking Project, which has been following the pandemic’s toll across the country based on state data, noted issues with its figures. “These problems mean that our hospitalization data — a crucial metric of the COVID-19 pandemic — is, for now, unreliable, and likely an undercount. We do not think that either the state-level hospitalization data or the new federal data is reliable in isolation,” according to a blog post Tuesday on the group’s website.
Making matters more complicated, the administration has changed the information that it is requiring hospitals to report, adding many elements, such as the age range of admitted COVID-19 patients, and removing others. As of this week, for instance, HHS told hospitals to stop reporting the total number of deaths they’ve had since Jan. 1, the total number of COVID-19 deaths and the total number of COVID-19 admissions. (Hospitals still report daily figures, just not historical ones.)
“Massachusetts hospitals are continuing to navigate the dramatic increase of daily data requirements,” the Massachusetts Health and Hospital Association said in a newsletter on Monday. “MHA and other state health officials continue to raise concerns about the administrative burden and questionable usefulness of some of the data.”
“Hospitals across the country were given little time to adjust to the unnecessary and seismic changes put forth by the U.S. Department of Health and Human Services, which fundamentally shift both the volume of data and the platforms through which data is submitted,” the association’s CEO, Steve Walsh, said in the newsletter.
A number of state websites also noted problems with hospital data. For days, the Texas Department of State Health Services included a note on its dashboard that it was “reporting incomplete hospitalization numbers … due to a transition in reporting to comply with new federal requirements.” That came just as the state was experiencing a peak in COVID-19 hospitalizations.
California likewise noted problems.
A spokesperson for HHS acknowledged some bumps in the transition but said in an email: “We are pleased with the progress we have made during this transition and the actionable data it is providing. We have had some states and hospital associations report difficulty with the new collection system. When HHS identifies errors in the data submissions, we work directly with the state or hospital association to quickly resolve them.
“Our objective with this new approach is to collaborate with the states and the healthcare system. The goal of full transparency is to acknowledge when we find discrepancies in the data and correct them.”
Last week, HHS noted, 93% of its prioritized list of hospitals, excluding psychiatric, rehabilitation and religious nonmedical facilities, reported data at least once during the week. (The guidance to hospitals asks them to report every day.)
Asked about the lack of timely data on its public website, HHS said it will update the site to “make it clear that the estimates are only updated weekly.” HHS is now posting a data file each day on healthdata.gov with aggregate information on hospitalizations by state.
But unlike the prior releases from CDC, which provided estimates on hospital capacity based on the responses, this file only gives totals for the hospitals that reported data. It’s unclear which hospitals did not report, how large they are, or whether the reported data is representative.
It’s also unclear if it’s accurate. New York state, for instance, reported that fewer than 600 people were currently hospitalized with COVID-19, as of Friday. Federal data released the same day pegged the number of suspected and confirmed COVID-19 hospitalizations at around 1,800.
Louisiana says more than 1,500 people are currently hospitalized with COVID-19. The federal data puts the figure at fewer than 700.
Nationally, The COVID Tracking Project reports that more than 56,000 people were hospitalized around the country with the virus, as of Thursday.
The data released by HHS on Friday puts the figure at more than 70,000.
NPR reported this week that it had found irregularities in the process used by the Trump administration to award the contract to manage the hospital data. Among other things, HHS directly contacted TeleTracking about the contract and the agency used a process that is more often used for innovative scientific research, NPR reported.
An HHS spokesperson told NPR that the contract process it used is a “common mechanism … for areas of research interest,” and said that the system used by the CDC was “fraught with challenges.”
Ryan Panchadsaram, co-founder of the tracking website CovidExitStrategy.org, has been critical of the problems created by the hospital data changeover.
“Without real-time accurate monitoring, you can’t make quick and fast and accurate decisions in a crisis,” he said in an interview. “This is just so important. This indicator that’s gone shows how the health system in a state is doing.”
Dillon of the Missouri Hospital Association said the administration could have handled this differently. For big technology projects, he noted, there is often a well-publicized transition with information sessions, an educational program and, perhaps, running the old system and the new one in parallel.
This “was extremely abrupt,” he said. “That is not akin to anything you would expect from HHS about how you would implement a program.”
Governors join calls for delay of administration plan to shift control from the CDC as Trump administration pledges to make data available to the public.
And on Thursday, the nation’s governors joined the chorus of objections over the abruptness of the change to the reporting protocols for hospitals, asking the administration to delay the shift for 30 days. In a statement, the National Governors Association said hospitals need the time to learn a new system, as they continue to deal with this pandemic.
The governors also urged the administration to keep the information publicly available.
The disappearance of the real-time data from the CDC dashboard, which was taken down Tuesday night before resurfacing Thursday morning, was a ripple effect of the administration’s new hospital reporting protocol that took effect Wednesday, according to a federal health official who spoke on the condition of anonymity to discuss internal deliberations.
Without receiving the data firsthand, CDC officials were reluctant to maintain the dashboard — which shows the number of patients with covid-19, the disease caused by the virus, and hospital bed capacity — and took it down, the federal health official said. The CDC dashboard states that its information comes directly from hospitals and does not include data submitted to “other entities contracted by or within the federal government.” It also says the dashboard will not be updated after July 14.
The dashboard “was taken down in a fit of pique,” said Michael R. Caputo, the assistant secretary for public affairs at the Department of Health and Human Services. “The idea CDC scientists cannot rely upon their colleagues in the same department for data collection, or any other scientific work, is preposterous.”
This week, the CDC, the government’s premier public health agency whose medical epidemiologists analyze the hospital data, also stopped producing reports about trends in the pandemic that had gone twice a week to states, and six days a week to officials at multiple federal agencies. Adm. Brett Giroir, an assistant secretary in the HHS who oversees coronavirus testing, was unhappy that the CDC hospital report stopped Wednesday and Thursday mornings, according to the federal health official.
Caputo said that the administration’s goal is to maintain transparency, adding that conversations were still taking place between HHS officials and the CDC on a plan to keep producing the dashboard updates and the reports. “We expect a resolution,” he said.
Another HHS spokesperson said the CDC might create a new dashboard, based on a wider set of information.
During a conference call for journalists Thursday on coronavirus testing, Giroir did not acknowledge his displeasure with the reports’ discontinuation. But he said: “Those data are really critical to all of us. … I wake up in the morning and first thing I do, I look at the data. I look at midday. I look at it at night before I go to bed. … We drive the response based on that.”
The CDC site had been one of the few public sources of granular information about hospitalizations and ICU bed capacity. About 3,000 hospitals, or about 60 percent of U.S. hospitals, reported their data to the CDC’s system.
The president of the American Medical Association, Susan R. Bailey, spoke out Thursday on the uncertainties about access to data. “[W]e urge and expect that the scientists at the CDC will continue to have timely, comprehensive access to data critical to inform response efforts,” she said.
Governors, hospital officials and state health officers were given scant notice of the change in the reporting system. Two top administration health officials said in a letter to governors early this week that some hospitals were not complying with the previous protocols, suggesting that states might want to consider bringing in the National Guard to help gather the information. Hospital industry leaders vehemently protested that characterization, as well as the idea that they should be assisted by the National Guard in the midst of a pandemic.
HHS and CDC officials have said the protocol was changed to streamline reporting of data that is used, among other things, to determine the federal allocation of therapeutics, testing supplies and protective gear. Instead of reporting to the long-standing CDC system, hospitals must send data about covid-19 patients and other metrics to a recently hired federal contractor, called TeleTracking, or to their state health departments.
At least some state health departments that have been collecting data for their hospitals and sending it to Washington have already said the switch will make it impossible for them to continue, at least for now. The changed protocol includes a requirement that hospitals send several additional types of data that some state systems are not equipped to handle, state health officials said.
The Pennsylvania Department of Health sent a notice to hospitals Tuesday night saying that its platform was not ready to accommodate the new federal requirements, so that hospitals needed to report every day to both the state and to TeleTracking.
Charles L. Gischlar, spokesman for the Maryland Department of Health, said the reporting change “is a heavy lift for hospitals.”
The new system “exceeds the capacity of the current statewide system” to which hospitals had been reporting, he said, so the state no longer can send consolidated information to the federal government. As a result, he said in a statement, hospitals must provide data individually to the government.