Published on
February 18, 2025
·
Written by
Dylan Klein

Healthcare Providers for developers

The challenges of referencing providers in FHIR data and how Flexpa solves them.


Help! Is anybody here a doctor?

Imagine this. You are a software developer who works at a startup called “ImmuniLink”. The startup has developed a care coordination platform specifically tailored for individuals with autoimmune diseases. It is critical that ImmuniLink can gather information on which doctors, hospitals and healthcare specialists that patients have previously visited.

Luckily your CTO has figured out a way to ingest data on patients’ past medical history via their insurance companies. Your task is to comb through the ingested data and create a map between patients and the healthcare providers they’ve visited. The data is delivered in the developer-friendly FHIR format, so in theory your task should be relatively straightforward.

You’re also aware of the National Provider Identifier (NPI) standard, so uniquely identifying providers by NPI seems like the correct approach. A map like the following could work, right?

{
 "patient_a": [
   {
     "name": "Dr ABC",
     "npi": "1111111111",
     "address": "...",
     "contact": { "phone": "...", "fax": "...", "email": "..." }
   },
   {
     "name": "Dr DEF",
     "npi": "2222222222",
     "address": "...",
     "contact": { "phone": "...", "fax": "...", "email": "..." }
   }
 ],
 "patient_b": [
   {
     "name": "Hospital GHI",
     "npi": "3333333333",
     "address": "...",
     "contact": { "phone": "...", "fax": "...", "email": "..." }
   },
   {
     "name": "Dr ABC",
     "npi": "1111111111",
     "address": "...",
     "contact": { "phone": "...", "fax": "...", "email": "..." }
   }
 ]
}

But then… you run into problems. You find that the shape of the FHIR data is not consistent between payers. Inconsistencies between FHIR implementations have you rummaging through pages and pages of FHIR documentation. Moreover, some data feeds are missing important details like addresses, contact information, names, or even NPIs. It seems this task isn’t so straightforward after all. Ugh.

Zooming in on the problems

Let’s get specific. There are two major problems with aggregating data sources when it comes to handling healthcare provider records.

A. Reference type variance

B. Missing and out of date information

Problem A: Reference type variance

One of the most common challenges when working with real-world FHIR data is dealing with provider references. The FHIR specification is flexible about how providers can be referenced, and that flexibility leads to significant variance in how different systems implement these references. This post explores how we tackled this challenge at Flexpa by building a normalization pipeline that converts diverse provider reference patterns into consistent, resolvable references.

In our experience integrating with over 200 different FHIR API implementations, we've encountered four main types of provider reference. Each of these examples has appeared inside ExplanationOfBenefit resources:

1. Literal (Relative) References

"provider": {
 "reference": "Practitioner/123"
}

Literal relative references are what we consider to be the easiest type of reference to work with in RESTful environments. Each reference points to another FHIR resource on the same FHIR server. It is easy for API consumers to follow literal relative references. All you need to do is append the reference string to the base URL of the FHIR server. For example:

GET https://fhir.example.com/Practitioner/123

Referenced resources are also decoupled from the resources that referenced them. This means that the Practitioner in our example will show up in a Practitioner search query:

GET https://fhir.example.com/Practitioner
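As a minimal sketch of the "append the reference to the base URL" step, the helper below builds the absolute URL for a literal relative reference. The function name `resolve_relative_reference` is hypothetical, and `https://fhir.example.com` is the example base URL used above, not a real server.

```python
from urllib.parse import urljoin


def resolve_relative_reference(base_url: str, reference: str) -> str:
    """Build the absolute URL for a literal relative reference."""
    # Normalize the base so it ends with exactly one slash; otherwise
    # urljoin would drop the last path segment of a base like ".../r4".
    return urljoin(base_url.rstrip("/") + "/", reference)


url = resolve_relative_reference("https://fhir.example.com", "Practitioner/123")
print(url)
```

The trailing-slash handling matters for servers whose base URL includes a path, which is common in practice.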

2. Literal (Contained) References

"provider": {
 "reference": "#id-1"
},
"contained": [
 {
   "id": "id-1",
   "resourceType": "Practitioner",
   "name": [{ "text": "Dr. John Doe" }]
 }
]

Literal contained references are another valid FHIR pattern. This is when a FHIR system hides a resource inside another resource. It’s kind of like stapling an appendix to a document. You can’t look for the appendix directly; rather, you have to first find the document that references it.

This has pros and cons for implementers working in RESTful environments. A pro is that the consumer need not make an additional API request to resolve the reference. All you need to do is look in the contained array and match the fragment ID (in this example “#id-1”).

The big con is that this implementation is difficult to work with if your system needs to reference a Practitioner in contexts outside of the base resource (e.g. the ExplanationOfBenefit) that referenced it. This means that the Practitioner in our example will NOT show up in a Practitioner search query:

GET https://fhir.example.com/Practitioner
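Resolving a contained reference needs no network call, only a scan of the parent's `contained` array. The following is a rough sketch of that matching step; `resolve_contained` is a hypothetical helper name, and the sample ExplanationOfBenefit is trimmed down to the fields used here.

```python
from typing import Optional


def resolve_contained(resource: dict, reference: str) -> Optional[dict]:
    """Return the contained resource that a fragment reference points to."""
    if not reference.startswith("#"):
        return None  # not a contained (fragment) reference
    fragment_id = reference[1:]  # "#id-1" -> "id-1"
    for contained in resource.get("contained", []):
        if contained.get("id") == fragment_id:
            return contained
    return None


eob = {
    "resourceType": "ExplanationOfBenefit",
    "provider": {"reference": "#id-1"},
    "contained": [{"id": "id-1", "resourceType": "Practitioner"}],
}
practitioner = resolve_contained(eob, eob["provider"]["reference"])
```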

3. Logical References

"provider": {
 "identifier": {
   "system": "http://hl7.org/fhir/sid/us-npi",
   "value": "1234567890"
 }
}

FHIR also allows for logical references that refer to a healthcare provider by their identifier. In the US, this means by their NPI number. Like in the literal contained reference example above, logical references bury providers’ information inside the base resource that referenced it, making it difficult to reference in another context.

Logical references pose an additional challenge for RESTful systems. NPIs can represent both individuals and organizations, but FHIR requires that healthcare providers be expressed explicitly as either a Practitioner or an Organization resource. So if your system attempts to convert a logical reference to a standalone resource, it first needs to look up the NPI's entity type. There are two types as defined by the NPPES:

  1. Individual
  2. Organization

Doing this lookup would require an external request to the NPPES registry API. While the API is free and public, making requests at scale to a third party has privacy and reliability drawbacks.
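To illustrate the entity-type step, here is a small sketch that pulls the NPI out of a logical reference and picks the FHIR resource type from the NPPES entity type (1 for individuals, 2 for organizations). The `ENTITY_TYPE_BY_NPI` dictionary is a hypothetical stand-in for whatever lookup you use (a local directory or the NPPES registry), and the NPIs in it are placeholder values.

```python
from typing import Optional

NPI_SYSTEM = "http://hl7.org/fhir/sid/us-npi"

# Hypothetical lookup table standing in for an NPPES-derived directory.
ENTITY_TYPE_BY_NPI = {"1234567890": 1, "3333333333": 2}

# NPPES entity type 1 is an individual, 2 is an organization.
RESOURCE_TYPE_BY_ENTITY_TYPE = {1: "Practitioner", 2: "Organization"}


def npi_from_logical_reference(ref: dict) -> Optional[str]:
    """Return the NPI from a logical reference, or None if it has none."""
    identifier = ref.get("identifier") or {}
    if identifier.get("system") == NPI_SYSTEM:
        return identifier.get("value")
    return None


def resource_type_for(ref: dict) -> Optional[str]:
    """Map a logical reference to Practitioner or Organization, if known."""
    npi = npi_from_logical_reference(ref)
    entity_type = ENTITY_TYPE_BY_NPI.get(npi)
    return RESOURCE_TYPE_BY_ENTITY_TYPE.get(entity_type)


logical_ref = {"identifier": {"system": NPI_SYSTEM, "value": "1234567890"}}
print(resource_type_for(logical_ref))
```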

4. Reference Descriptions

"provider": {
 "display": "Sunshine Doctors"
}

Lastly, FHIR states that a display-only reference is totally kosher too. FHIR dubs this type of reference a reference description. This is the most difficult type of reference for RESTful implementations to handle because all you have is an open-text string, rather than a unique key like NPI.

It suffers from the same coupling problem as the contained and logical reference types. On top of that, creating a standalone reference from an open-text string depends on building a reverse NPI lookup feature, which has its own complexities (e.g. duplicate matches).
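A consumer that wants to handle all four shapes usually has to branch on which fields of the reference are populated. The sketch below shows one way to do that; the function name and the returned labels are illustrative choices, not FHIR terminology.

```python
def classify_reference(ref: dict) -> str:
    """Label a FHIR Reference by which of its fields are populated."""
    reference = ref.get("reference")
    if reference:
        # Fragment references ("#id-1") point into the parent's contained array.
        return "contained" if reference.startswith("#") else "literal"
    if ref.get("identifier"):
        return "logical"
    if ref.get("display"):
        return "description"
    return "empty"


examples = [
    {"reference": "Practitioner/123"},
    {"reference": "#id-1"},
    {"identifier": {"system": "http://hl7.org/fhir/sid/us-npi", "value": "1234567890"}},
    {"display": "Sunshine Doctors"},
]
labels = [classify_reference(ref) for ref in examples]
```

The order of the checks matters: a reference may carry `display` text alongside a `reference` or `identifier`, so the more resolvable fields are tested first.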

Problem B: Missing and out of date information

Missing and out of date information poses another major challenge for handling provider records from multiple data sources. In fact, there are entire companies dedicated to solving this problem, for example Availity, Kyruus Health and Ribbon Health to name a few.

Given that most fields on the Practitioner and Organization resource profiles are optional, it is rare to find two implementations that look the same.

Who is this provider?
In fact, many provider records do not define most fields in the resource profile. When most or all fields are “undefined”, the resource is basically rendered useless for an end consumer who does not want to build their own NPI lookup system.

{
 "resourceType": "Practitioner",
 "identifier": [
   {"system": "http://hl7.org/fhir/sid/us-npi", "value": "1111111111"}
 ],
 "name": undefined,
 "telecom": undefined,
 "address": undefined,
 "gender": undefined,
 "qualification": undefined
}

Did they get married?
Another problem that implementers face is changing data. Some real world examples:

  • A doctor was recently married and changed their last name.
  • A clinic's practice is being renovated and was temporarily relocated.

In this example, Jane Doe’s maiden name (Jane Moe) is returned by some non-authoritative FHIR server, without the “maiden” flag. Yes, there is a “maiden” flag in FHIR (it is one of the allowed values of HumanName.use).

{
 "resourceType": "Practitioner",
 "identifier": [
   {"system": "http://hl7.org/fhir/sid/us-npi", "value": "1111111111"}
 ],
 "name": [
   {
     "given": ["JANE"],
     "family": "MOE"
   }
 ]
}

Problem statement

Overall, these problems present challenges (often outright blockers) for API consumers who need to:

  • Reliably identify and look up providers
  • Join provider data across different sources
  • Maintain referential integrity in their systems
  • Rely on up to date information about providers

Flexpa’s solution: NPI Normalization Pipeline

We've built a normalization pipeline that converts all provider reference patterns into consistent literal relative references (example 1 above), using our FHIR Provider Directory as the source of truth. Then we backfill as much information as we know about the provider. The goal is to maximize the usefulness of our output for RESTful applications. Here's how it works:  

NPI Provider Normalization Pipeline

The pipeline consists of four main phases:

  1. Import: Hosting a Provider Directory is a prerequisite for serving lookup requests in a privacy-conscious and reliable manner. Flexpa updates its Provider Directory with periodic (monthly) imports from the authoritative NPPES Registry.
  2. Discovery: The system scans incoming FHIR Bundles for NPIs. These identifiers are then used to query our Provider Directory in efficient batch operations, building a comprehensive lookup map of provider information.
  3. Reference Expansion: Using the lookup map, the system processes each type of provider reference (contained, logical and descriptions), converting them into consistent literal relative references and, as a byproduct, creating new resources. This phase handles the complexity of different reference patterns while maintaining data integrity.
  4. Backfilling: Finally, the system enhances provider resources with additional details from our Provider Directory such as names, addresses, contact information and qualifications, ensuring comprehensive and consistent provider information across the dataset.
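The phases above can be sketched roughly as follows. This is an illustrative simplification, not Flexpa's implementation: it handles only logical (NPI) references, uses a hypothetical in-memory `DIRECTORY` dictionary in place of a real Provider Directory, and treats backfilling as copying the directory record into the bundle as a standalone resource.

```python
NPI_SYSTEM = "http://hl7.org/fhir/sid/us-npi"

# Hypothetical stand-in for a provider directory, keyed by NPI.
DIRECTORY = {
    "1234567890": {
        "resourceType": "Practitioner",
        "id": "npi-1234567890",
        "identifier": [{"system": NPI_SYSTEM, "value": "1234567890"}],
        "name": [{"family": "Doe", "given": ["Jane"]}],
    }
}


def provider_npi(resource: dict):
    """Return the NPI in resource.provider.identifier, or None."""
    identifier = resource.get("provider", {}).get("identifier", {})
    if identifier.get("system") == NPI_SYSTEM:
        return identifier.get("value")
    return None


def discover_npis(bundle: dict) -> set:
    """Discovery: collect every NPI used in a logical provider reference."""
    npis = {provider_npi(e["resource"]) for e in bundle.get("entry", [])}
    npis.discard(None)
    return npis


def normalize(bundle: dict, directory: dict) -> dict:
    """Expand logical references into literal relative references."""
    # One batch lookup per bundle rather than one lookup per reference.
    lookup = {n: directory[n] for n in discover_npis(bundle) if n in directory}
    new_entries, added = [], set()
    for entry in bundle.get("entry", []):
        resource = entry["resource"]
        record = lookup.get(provider_npi(resource))
        if record is None:
            continue  # unknown provider: leave the reference untouched
        # Reference expansion: point at a standalone resource.
        resource["provider"]["reference"] = f'{record["resourceType"]}/{record["id"]}'
        # Backfilling (simplified): add the directory record once per NPI.
        if record["id"] not in added:
            new_entries.append({"resource": dict(record)})
            added.add(record["id"])
    bundle.setdefault("entry", []).extend(new_entries)
    return bundle


bundle = {
    "resourceType": "Bundle",
    "entry": [{
        "resource": {
            "resourceType": "ExplanationOfBenefit",
            "provider": {"identifier": {"system": NPI_SYSTEM, "value": "1234567890"}},
        }
    }],
}
normalized = normalize(bundle, DIRECTORY)
```

After normalization, the ExplanationOfBenefit carries a `Practitioner/npi-1234567890` reference and the bundle contains the matching Practitioner as its own entry. The `npi-` ID scheme is an assumption made for this sketch.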

Benefits of the solution

This normalization pipeline provides several key benefits:

  1. Consistent Reference Format: All provider references are converted to literal relative references, making it easier for consumers to reference providers from within other FHIR resources (e.g. ExplanationOfBenefit.provider).
  2. Decoupled Resources: The pipeline decouples provider resources from the resources that reference them, which makes it easier for consumers to work with provider resources in isolation. For example, a FHIR search for GET /fhir/Practitioner will return practitioners that were once unsearchable because they were previously hidden inside other resources.
  3. Enhanced Data Quality: By leveraging our Provider Directory, we ensure that provider information is complete and accurate. Flexpa's Provider Directory is populated by data from the NPPES, which is the official source of truth for provider information in the United States. Records are updated monthly to ensure that the information is as accurate as possible.

Conclusion

To implementers of FHIR APIs (e.g. payers who offer Patient Access APIs), we urge you to consider the downstream effects that referencing structure has on usability for API consumers. Here is our ranking of preferred reference structures:

  1. Literal, Relative (Easiest for developers to work with)
  2. Literal, Contained
  3. Logical
  4. Description (Most difficult for developers to work with)

When it comes to provider data, we suggest including as much up-to-date information as possible. As for which data fields matter most to our customers, here are the top three that we’ve heard about:

  1. Provider name
  2. Provider address
  3. Provider contact information (email and phone number)

Reference normalization and backfilling are crucial steps in making FHIR data more usable and reliable. Our solution demonstrates how careful design and implementation can transform varied reference patterns and resources into a consistent, high-quality format that better serves the needs of healthcare API consumers.
