How AI and analytics could solve healthcare’s big data problems
What’s bigger than one trillion? The number of data points the healthcare industry is producing globally. The industry currently generates 30% of the world’s data volume, and the International Data Corporation (IDC) estimates that healthcare data now exceeds 2,314,000,000 terabytes, a staggering 110X increase since 2013.
By 2025, healthcare will have become the fastest-growing source of data worldwide. The problem is that while bad data in enterprise means a lost sale, bad data in healthcare could mean a lost life. But that’s also why we at SignalFire see huge opportunities for startups to build better data infrastructure for the healthcare industry.
Why the sudden data explosion within U.S. healthcare? We’re seeing increased adoption of electronic health records (EHRs), regulatory enforcement, and a growing popularity of wearables and other health tracking devices. These are producing a breadth of data types, including patient information, clinical notes, test results, imaging data, and claims data.
However, managing and analyzing this massive amount of data presents significant challenges. To get anything done with the data, you first have to solve for interoperability (getting systems to communicate effectively with each other), data normalization, privacy, and security, all before you even get to the sophisticated applications of data science.
Now’s the time, though. The world just got equipped with fresh AI tooling that can make sense of the seas of data flowing out of healthcare. And with everyone thinking about AI, many slower-moving incumbents will feel a sense of urgency to modernize their data stack.
In this post we’ll lay out some of the biggest technology and regulatory shifts affecting the healthcare data space, and a dozen specific opportunities where SignalFire is looking to invest. We’re going deep on healthcare data and the AI space given our bread and butter—we spent the past decade building our own proprietary AI data platform, Beacon, which tracks more than half a trillion data points, giving our portfolio companies unique insights into market intelligence. Our in-house expertise on data and machine learning gives us a unique lens into the power of data in healthcare and specific areas where we’re excited to back founders.
Now let’s dive into the trends and opportunities around AI and analytics for hospitals, payors, pharma, and patients.
Part one: The data infrastructure layer
With AI, it’s garbage in, garbage out, so the industry first needs infrastructure to improve data quality. Before a model or analytics can be built on top of a data set, we need to address the following questions:
- Where do we get access to raw data?
- How do we cleanse and structure the data?
- How do we accurately join different datasets to create a full data record on a single patient?
- How do we store this data in a way that protects the patient’s privacy?
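As a rough illustration of the third question, here is a minimal Python sketch of deterministic record linkage: records from two hypothetical sources are normalized and grouped under a shared match key. The field names (`first_name`, `last_name`, `dob`) and the helper functions are assumptions made up for this example, not any vendor's schema. Production patient matching typically layers probabilistic scoring and many more attributes on top of something like this.

```python
import unicodedata
from collections import defaultdict


def normalize_name(name: str) -> str:
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    kept = "".join(ch for ch in without_accents.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def match_key(record: dict) -> tuple:
    """Hypothetical match key: normalized first name, last name, and date of birth."""
    return (
        normalize_name(record["first_name"]),
        normalize_name(record["last_name"]),
        record["dob"],
    )


def link_records(*sources):
    """Group records from every source under the same match key."""
    patients = defaultdict(list)
    for source in sources:
        for record in source:
            patients[match_key(record)].append(record)
    return dict(patients)


# Illustrative, made-up records from an EHR export and a claims feed.
ehr = [{"first_name": "José", "last_name": "O'Neil", "dob": "1980-02-14", "source": "ehr"}]
claims = [
    {"first_name": "jose ", "last_name": "ONeil", "dob": "1980-02-14", "source": "claims"},
    {"first_name": "Ana", "last_name": "Lee", "dob": "1975-07-01", "source": "claims"},
]
linked = link_records(ehr, claims)
```

Exact-key matching like this is easy to audit but misses typos and name changes, which is one reason accurate joining remains an open infrastructure problem.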
Until recently, raw healthcare data was becoming increasingly commoditized. But beginning in 2016, major new regulations emerged, starting with the 21st Century Cures Act.
The Cures Act mandated the bi-directional exchange of patient clinical data through the Trusted Exchange Framework and Common Agreement (TEFCA), with a growing number of approved use cases for data sharing. Essentially, TEFCA required every healthcare organization to make their data more accessible across states, hospitals, and provider networks so patients’ care teams always had the information they needed.
Now, in order to be eligible for access to this shared data, entities must receive the Qualified Health Information Network (QHIN) designation. After QHIN networks are fully established, only QHIN designees will be permitted to access the broader network of U.S. healthcare data, thereby raising the bar for other companies trying to solve the challenges at the data infrastructure layer.
To be awarded the QHIN license, businesses need to build a highly compliant platform that can scale to enormous volumes of data. Among the first six entities to get a QHIN license alongside incumbents like Epic and CommonWell was SignalFire portfolio company Health Gorilla.
With a data lake that has access to the full longitudinal medical records of more than 90% of the U.S. patient population, Health Gorilla is opening up an extremely powerful data source for healthcare software developers and modeling how newer companies can work in tandem with regulators. They’ve solved a lot of the raw data access, cleanliness, integration, and privacy-safe storage issues to build a technical foundation for the next generation of solutions.
Part two: Building analytics and AI models on top of data
With improved infrastructure, companies can build unique analytics and AI models in highly verticalized categories within healthcare. These use cases often require specific data sets, allowing startups in this space to build data moats as a core part of their defensibility. Given this is a highly regulated space involving sensitive patient data, solutions here can distinguish themselves with top-grade privacy and security practices.
1. Analytics and AI applications for providers and hospitals
Providers are one of the major contributors to healthcare data: every time someone in this country completes a doctor’s visit, a medical record is generated. This data set, called clinical data, is one of the most valuable because it captures the essence of what we need in order to practice healthcare, including the patient’s symptoms, blood test results, and medication history. Here are a few areas where SignalFire is particularly excited:
- Personalized patient engagement: Given a patient’s medical history, demographic information, and consumer preferences, how do we proactively engage with them in a way that encourages them to come in for preventive visits, learn more about conditions they may be at higher risk for, discover the offerings available to them, and ultimately achieve better outcomes? Solving this would help providers engage with their patients over the long term, increasing the hospital’s brand loyalty while reducing costs compared with reactively seeing patients only when they need care.
- Clinical intake intelligence: How many times have you sat at a doctor’s office with a clipboard and pen in hand, already five minutes late to your appointment but still needing to fill out a basic questionnaire? There’s been an effort to digitize this experience, but Health Note takes it to the next level by sending patients a digitally delivered (i.e., via SMS) dynamic questionnaire before their visit. Each question changes based on the response to the previous one, mirroring what a doctor would ask in the first five minutes of the actual visit. The solution saves time not only for the front desk administrator but also for the doctor, whose clinical note is already halfway auto-generated by the time of the visit.
- Clinical decision support: Having access to the entire patient medical record plus an AI tool enables more precise diagnoses and real-time intervention at a higher accuracy than humans can accomplish alone. The overall adoption of these models is relatively limited today, and they typically need continuous data from outside the four walls of the hospital through an integrated continuous management system like all.Health. Where we see great potential is in tools that assist clinicians rather than replace them (e.g., providing a second set of eyes), speeding up diagnosis by providing an assessment that a clinician can review. For example, Recora Health’s virtual cardiac platform can show providers which patients are more likely to have another heart attack after only several virtual visits.
- Coding automation: AI models that autogenerate a billing code from an unstructured doctor’s note help providers get paid faster and bill more accurately. SignalFire led the Series A in CodaMetrix, which has a unique competitive advantage in this space: having spun out of Mass General Brigham, it has a data moat built on high-quality training data (read more about our investment here).
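The dynamic questionnaire described under clinical intake intelligence can be pictured as a small branching graph, where each answer selects the next question. The Python sketch below is purely illustrative: the questions, the `QUESTIONS` table, and `run_intake` are hypothetical and say nothing about how Health Note or any other product is actually implemented.

```python
# Hypothetical question graph: each node holds its text and a rule that
# picks the next question id (or None to stop) from the patient's answer.
QUESTIONS = {
    "chief_complaint": {
        "text": "What brings you in today?",
        "next": lambda answer: (
            "chest_pain_duration" if "chest" in answer.lower() else "medications"
        ),
    },
    "chest_pain_duration": {
        "text": "How long have you had the chest pain?",
        "next": lambda answer: "medications",
    },
    "medications": {
        "text": "Are you currently taking any medications?",
        "next": lambda answer: None,
    },
}


def run_intake(answers_by_question, start="chief_complaint"):
    """Walk the question graph, using each answer to choose the next question."""
    transcript = []
    current = start
    while current is not None:
        question = QUESTIONS[current]
        answer = answers_by_question[current]
        transcript.append((question["text"], answer))
        current = question["next"](answer)
    return transcript


# A chest-related complaint triggers the follow-up; a sore throat skips it.
cardiac_visit = run_intake({
    "chief_complaint": "Chest tightness when climbing stairs",
    "chest_pain_duration": "About two days",
    "medications": "None",
})
routine_visit = run_intake({
    "chief_complaint": "Sore throat",
    "medications": "Ibuprofen",
})
```

The resulting transcript is the kind of structured input that could pre-populate a draft clinical note before the visit.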
2. Analytics and AI applications for payors
Payors’ business models—effectively an insurance business—inherently create incentive alignment with solutions that are using AI and analytics to drive down the cost of care while improving outcomes. Below are several examples of problem statements solved by companies using data and AI:
- Medication adherence and management: The entire payor ecosystem pays an estimated $300 billion annually for: medications that don’t get consumed; more expensive medications vs. generic equivalents; and medications that patients no longer need. Better data can help create a fuller picture of a patient’s existing conditions and engage with them in a highly personalized way, using behavioral economics principles to nudge them to take the right medicine at the right time. It’s why we invested in Wellth.
- Population health management: Every payor typically manages hundreds of thousands to millions of lives. Because they’re ultimately responsible for paying the bill, it’s important they understand how healthy their population is and which segments would benefit from proactive management of their health. A data-driven solution like Color would review the entire patient population data across all attributes and help patients navigate to the appropriate care they need.
- Payment integrity: Annually, $200–300 billion is spent on claims waste, fraud, and abuse. Ninety percent of the time, the reason payors overspend on claims comes down to human error—the person on the provider side has made a mistake and asked for more money than they should collect for a visit. The autonomous coding solution from CodaMetrix not only directly addresses this problem, but—with increased adoption—could establish the common language that would allow payors and providers to transact in an equal and fair manner.
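To make the medication adherence problem concrete, one widely used measure is the proportion of days covered (PDC): the share of days in a period on which a patient had medication on hand according to pharmacy fills. The sketch below is a simplified Python version under stated assumptions. The fill data is made up, overlapping fills are counted once rather than shifted forward, and the function name is our own.

```python
from datetime import date, timedelta


def proportion_of_days_covered(fills, period_start, period_end):
    """Fraction of days in [period_start, period_end] covered by at least one fill.

    `fills` is a list of (fill_date, days_supply) tuples. Simplification:
    overlapping supply is counted once, not carried forward.
    """
    covered = set()
    for fill_date, days_supply in fills:
        for offset in range(days_supply):
            day = fill_date + timedelta(days=offset)
            if period_start <= day <= period_end:
                covered.add(day)
    total_days = (period_end - period_start).days + 1
    return len(covered) / total_days


# Made-up example: two 30-day fills with a two-week gap in a 90-day window.
fills = [(date(2024, 1, 1), 30), (date(2024, 2, 15), 30)]
pdc = proportion_of_days_covered(fills, date(2024, 1, 1), date(2024, 3, 30))
```

A score like this could be one input for deciding which patients to nudge; the refill gap here leaves the patient covered on 60 of 90 days.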
3. Analytics and AI applications for pharma
Pharma spends, on average, over $1 billion and 10 years to bring a successful drug to market. Any data-driven or AI solution that can shorten the drug development timeline or reduce costs is highly attractive to pharma:
- Drug discovery: AI algorithms can analyze vast amounts of biological data, such as genomics, proteomics, and metabolomics, to identify potential therapeutic targets. By integrating diverse data sources and applying machine learning techniques, AI can predict target-drug interactions and prioritize targets with the highest probability of success. These models can also analyze molecular structures, predict their interactions with target proteins, and propose modifications to enhance drug efficacy, safety, and pharmacokinetics (the branch of pharmacology concerned with the movement of drugs within the body). A strong, valuable dataset like that of Ovation.io can help more quickly identify which approaches to pursue—a key benefit when considering it can take years to get a new drug to the market.
- Synthetic control arm for clinical trials: One innovative clinical design approach made increasingly feasible by burgeoning digital data and enhanced analytic tools is the use of synthetic control arms. Instead of collecting data from recruited patients who have been assigned to the control group, as in a traditional randomized controlled trial, synthetic control arms model the comparator group using existing data, such as historical trial or real-world patient records. Pharmaceutical companies can save substantial money, shorten trial timelines, and inform development decisions. Synthetic control arms can also benefit patients who may be leery of landing in an arm that requires a placebo or an ineffective standard of care. Because all trial participants receive active treatment, an important patient concern is removed, which could increase patient recruitment and retention.
- Post-approval targeting: After a new drug has been approved—allowing it to be marketed—there is an opportunity for a pharmaceutical company to harness predictive analytics and machine learning to enable precise physician targeting. The approach might allow a company to identify physicians caring for patients with the highest need for a given therapy, and whose prescribing patterns indicate potential openness to a novel mechanistic approach.
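The synthetic control arm idea above can be sketched in a few lines of Python: each treated patient is paired with the most similar patient from a pool of historical records, and outcomes are then compared between the two groups. Everything here is an assumption for illustration, including the covariates (`age`, `baseline`), the unscaled distance measure, and the made-up numbers. Real studies rely on more rigorous methods such as propensity score matching and regulatory-grade data.

```python
def build_synthetic_control(treated, historical):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    Assumes there are at least as many historical patients as treated ones.
    """
    available = list(historical)
    matched = []
    for patient in treated:
        # Hypothetical similarity: absolute differences in age and baseline score.
        best = min(
            available,
            key=lambda h: abs(h["age"] - patient["age"])
            + abs(h["baseline"] - patient["baseline"]),
        )
        matched.append(best)
        available.remove(best)
    return matched


def mean_outcome(patients):
    return sum(p["outcome"] for p in patients) / len(patients)


# Made-up data: lower outcome scores are better in this toy example.
treated = [
    {"age": 60, "baseline": 5.0, "outcome": 3.0},
    {"age": 45, "baseline": 4.0, "outcome": 2.0},
]
historical = [
    {"age": 44, "baseline": 4.1, "outcome": 3.0},
    {"age": 61, "baseline": 5.2, "outcome": 4.5},
    {"age": 30, "baseline": 2.0, "outcome": 1.0},
]
control = build_synthetic_control(treated, historical)
estimated_effect = mean_outcome(treated) - mean_outcome(control)
```

The quality of such a comparison depends almost entirely on how rich and representative the historical data is, which ties back to the value of strong datasets.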
4. Analytics and AI applications for patients
At the end of the day, all these solutions above that work with providers, payors, and pharma will always benefit the patient downstream in one way or another, as the patient is the center of our healthcare ecosystem. However, here are several other ways in which data insights and availability can help us directly:
- Individual medical record access: Patients with chronic and rare diseases are currently tasked with manually assembling their information to get the best treatment possible. Under TEFCA today, only certain data-sharing use cases are approved: a provider can pull information if they’re treating a patient, but a patient cannot directly pull information on themselves. We think an individual use case is going to be unlocked in the next year, helping everyone from the overburdened patient with chronic illness to the person who’s simply trying to keep track of their immunization records.
- Patient payments: Better data can help patients afford their healthcare. Payzen uses large amounts of patient data—spanning medical history, demographics, frequency of visits, and more—to provide patients with a personalized medical bill payment plan that has a 0% interest rate.
Building for healthcare? We want to hear from you
If you’re working on a startup in this space, we’d like to chat. Cold emails are welcome at [email protected] to connect with Yuanling Yuan (she goes by YY) from our healthcare investment team. You can also subscribe to our email updates for more on healthcare startup trends and opportunities.
At SignalFire, we like to say, “Think of us as an extension of your team that scales with you.” Beyond our in-house Beacon AI for help with recruiting, we built our full-time Portfolio Experience team with world-class operators across a variety of functions, including the former chief people officer at Netflix for developing an engineer hiring strategy, the chief marketing officer at Stripe to optimize your sales process, and the former editor-at-large at TechCrunch to help you convert the value you deliver into a persuasive story. Our XIR program, meanwhile, pairs top industry leaders with high-potential companies as they scale and includes healthcare luminaries like Evolent Health ($EVH) founders Frank Williams and Tom Peterson.
We love helping healthcare companies solve their internal problems so they can heal the world. That approach of providing value far beyond our capital is why we have a net promoter score of 85 among founders, with 85% saying we are the most valuable investor on their cap table.
If you’re working on a company in the healthcare data, analytics, and AI space, come talk with us. We’ll share our full research and connections, and hope to earn the chance to hear about your next fundraise. By unlocking the secrets trapped within our medical data, we can build a healthier future for everyone. We can’t wait to see what you’re building.
SignalFire may engage Affiliate Advisors, Retained Advisors, and other consultants as listed above to provide their expertise on a formal or ad hoc basis. They are not employed by SignalFire and do not provide investment advisory services to clients on behalf of SignalFire. For more information on their specific roles, please contact us. Portfolio Company Endorsements: Certain portfolio company founders or Affiliate Advisors listed above may or may not be current investors in a SF fund in which they receive a fee reduction. Such fee reductions were not provided in exchange for or an incentive for their feedback, nor contingent upon the individual’s approval for SignalFire’s continued use. Please refer to our website for additional disclosures.