Imagine this. You are a software developer who works at a startup called “ImmuniLink”. The startup has developed a care coordination platform specifically tailored for individuals with autoimmune diseases. It is critical that ImmuniLink can gather information on which doctors, hospitals and healthcare specialists its patients have previously visited.
Luckily your CTO has figured out a way to ingest data on patients’ past medical history via their insurance companies. Your task is to comb through the ingested data and create a map between patients and the healthcare providers they’ve visited. The data is delivered in the developer-friendly FHIR format, so in theory your task should be relatively straightforward.
You’re also aware of the National Provider Identifier (NPI) standard, so uniquely identifying providers by NPI seems like the correct approach. To create the map, something like the following could work, right?
{
"patient_a": [
{
"name": "Dr ABC",
"npi": "1111111111",
"address": "...",
"contact": { "phone": "...", "fax": "...", "email": "..." }
},
{
"name": "Dr DEF",
"npi": "2222222222",
"address": "...",
"contact": { "phone": "...", "fax": "...", "email": "..." }
}
],
"patient_b": [
{
"name": "Hospital GHI",
"npi": "3333333333",
"address": "...",
"contact": { "phone": "...", "fax": "...", "email": "..." }
},
{
"name": "Dr ABC",
"npi": "1111111111",
"address": "...",
"contact": { "phone": "...", "fax": "...", "email": "..." }
}
]
}
But then… you run into problems. You find that the shape of the FHIR data is not consistent between payers. Inconsistencies between FHIR implementations have you rummaging through pages and pages of FHIR documentation. Moreover, some data feeds are missing important details like addresses, contact information, names, or even NPIs. It seems like this task wasn’t so straightforward after all. Ugh.
Let’s get specific. There are two major problems with aggregating data sources when it comes to handling healthcare provider records.
A. Reference type variance
B. Missing and out of date information
One of the most common challenges when working with real-world FHIR data is dealing with provider references. While the FHIR specification offers flexibility in how providers can be referenced, this flexibility can lead to significant variance in how different systems implement these references. This post explores how we tackled this challenge at Flexpa by building a robust normalization pipeline that converts diverse provider reference patterns into consistent, resolvable references.
In our experience integrating with over 200 different FHIR API implementations, we've encountered four main types of provider reference. Each of the following examples has appeared inside ExplanationOfBenefit resources:
"provider": {
"reference": "Practitioner/123"
}
Literal relative references are what we consider to be the easiest type of reference to work with in RESTful environments. Each reference points to another FHIR resource on the same FHIR server. It is easy for API consumers to follow literal relative references. All you need to do is append the reference string to the base URL of the FHIR server. For example:
GET https://fhir.example.com/Practitioner/123
Referenced resources are also decoupled from the resources that referenced them. This means that the Practitioner in our example will show up in a Practitioner search query:
GET https://fhir.example.com/Practitioner
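As a minimal sketch of the "append the reference to the base URL" step, the helper below builds the absolute URL for a literal relative reference. The function name `resolve_relative_reference` is hypothetical, and the base URL is the same placeholder used in the examples above.

```python
def resolve_relative_reference(base_url: str, reference: str) -> str:
    """Build the absolute URL for a literal relative reference."""
    # Strip stray slashes so the join always yields exactly one separator.
    return f"{base_url.rstrip('/')}/{reference.lstrip('/')}"


eob = {"provider": {"reference": "Practitioner/123"}}
url = resolve_relative_reference(
    "https://fhir.example.com", eob["provider"]["reference"]
)
print(url)
```

The resulting URL is what a consumer would issue a GET request against to fetch the referenced Practitioner.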
"provider": {
"reference": "#id-1"
},
"contained": [
{
"id": "id-1",
"resourceType": "Practitioner",
"name": [{ "text": "Dr. John Doe" }]
}
]
Another valid FHIR implementation is literal contained references. This is when a FHIR system hides a resource inside another resource. It’s kind of like stapling an appendix to a document. You can’t look for the appendix directly, rather you have to first find the document that references it.
This has pros and cons for implementers working in RESTful environments. A pro is that the consumer need not make an additional API request to resolve the reference. All you need to do is look in the contained array and match the fragment ID (in this example “#id-1”).
The big con is that this implementation is difficult to work with if your system needs to reference a Practitioner in contexts outside of the base resource (e.g. the ExplanationOfBenefit) that referenced it. This means that the Practitioner in our example will NOT show up in a Practitioner search query:
GET https://fhir.example.com/Practitioner
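Resolving a contained reference needs no network call. A minimal sketch, assuming the resource has already been parsed into a Python dict (the helper name `resolve_contained_reference` is hypothetical):

```python
from typing import Optional


def resolve_contained_reference(resource: dict, reference: str) -> Optional[dict]:
    """Return the contained resource that a '#fragment' reference points to."""
    if not reference.startswith("#"):
        return None  # not a contained reference
    fragment_id = reference[1:]
    # Contained resources live in the parent's "contained" array.
    for contained in resource.get("contained", []):
        if contained.get("id") == fragment_id:
            return contained
    return None


eob = {
    "resourceType": "ExplanationOfBenefit",
    "provider": {"reference": "#id-1"},
    "contained": [
        {
            "id": "id-1",
            "resourceType": "Practitioner",
            "name": [{"text": "Dr. John Doe"}],
        }
    ],
}
practitioner = resolve_contained_reference(eob, eob["provider"]["reference"])
print(practitioner["resourceType"])
```

Note that the lookup only works with the parent ExplanationOfBenefit in hand, which is exactly the coupling problem described above.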
"provider": {
"identifier": {
"system": "http://hl7.org/fhir/sid/us-npi",
"value": "1234567890"
}
}
FHIR also allows for logical references that refer to a healthcare provider by their identifier. In the US, this means by their NPI number. Like in the literal contained reference example above, logical references bury providers’ information inside the base resource that referenced it, making it difficult to reference in another context.
Logical references pose an additional challenge for RESTful systems. NPIs can represent both individuals and organizations, but FHIR requires that healthcare providers be expressed explicitly as either a Practitioner or an Organization resource. So if your system attempts to convert a logical reference into a standalone resource, it first needs to look up the type of the NPI. There are two types as defined by the NPPES: Type 1 (individual providers) and Type 2 (organizations).
Doing this lookup would require an external request to the NPPES registry API. While the API is free and public, making requests at scale to a third party has privacy and reliability drawbacks.
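Once the registry lookup has been done, choosing the FHIR resource type is a small mapping. The sketch below assumes the registry record has already been fetched and parsed, and that it exposes the entity type in an `enumeration_type` field with the values `NPI-1` and `NPI-2`; that field name and those values are assumptions about the registry response, and the function name is hypothetical.

```python
# Assumed mapping from the registry's entity type code to a FHIR resource type.
NPI_TYPE_TO_RESOURCE = {
    "NPI-1": "Practitioner",  # individual provider
    "NPI-2": "Organization",  # organization
}


def resource_type_for_npi(registry_record: dict) -> str:
    """Pick the FHIR resource type for an already-fetched registry record."""
    entity_type = registry_record.get("enumeration_type")
    if entity_type not in NPI_TYPE_TO_RESOURCE:
        # Fail loudly rather than guess between Practitioner and Organization.
        raise ValueError(f"Unknown NPI entity type: {entity_type!r}")
    return NPI_TYPE_TO_RESOURCE[entity_type]


print(resource_type_for_npi({"enumeration_type": "NPI-1"}))
print(resource_type_for_npi({"enumeration_type": "NPI-2"}))
```

Keeping the network call out of this function makes it easy to cache registry records and avoid repeated third-party requests.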
"provider": {
"display": "Sunshine Doctors"
}
Lastly, FHIR states that a display-only reference is totally kosher too. FHIR dubs this type of reference a reference description. This is the most difficult type of reference for RESTful implementations to handle because all you have is an open-text string, rather than a unique key like NPI.
It suffers from the same coupling problem as the contained and logical reference types. On top of that, creating a standalone reference from an open-text string depends on building a reverse NPI lookup feature, which has its own complexities (e.g. duplicate matches).
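To tie the four patterns together, here is a minimal sketch of a classifier that inspects a parsed Reference and reports which pattern it uses. The function name and the returned labels are hypothetical, and absolute URLs are not treated separately in this sketch.

```python
def classify_reference(ref: dict) -> str:
    """Label a FHIR Reference with one of the four patterns described above."""
    reference = ref.get("reference")
    if reference:
        # A leading '#' points into the parent's "contained" array.
        if reference.startswith("#"):
            return "literal_contained"
        return "literal_relative"
    if ref.get("identifier", {}).get("value"):
        return "logical"
    if ref.get("display"):
        return "display_only"
    return "unknown"


examples = [
    {"reference": "Practitioner/123"},
    {"reference": "#id-1"},
    {"identifier": {"system": "http://hl7.org/fhir/sid/us-npi", "value": "1234567890"}},
    {"display": "Sunshine Doctors"},
]
for example in examples:
    print(classify_reference(example))
```

The checks run from most to least resolvable, so a reference that carries both a literal reference and a display string is classified by the literal reference.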
Missing and out of date information poses another major challenge for handling provider records from multiple data sources. In fact, there are entire companies dedicated to solving this problem, for example Availity, Kyruus Health and Ribbon Health to name a few.
Given that most fields on the Practitioner and Organization resource profiles are optional, it is rare to find two implementations that look the same.
Who is this provider?
In fact, many provider records do not define most fields in the resource profile. When most or all fields are “undefined”, the resource is basically rendered useless for an end consumer who does not want to build their own NPI lookup system.
{
"resourceType": "Practitioner",
"identifier": [
{"system": "http://hl7.org/fhir/sid/us-npi", "value": "1111111111"}
],
"name": undefined,
"telecom": undefined,
"address": undefined,
"gender": undefined,
"qualification": undefined
}
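A consumer can at least detect records like this before relying on them. The sketch below is a hypothetical check, assuming parsed dicts and treating the five fields shown above as the ones of interest:

```python
from typing import List

# Fields from the example above; adjust to whatever your application needs.
DESCRIPTIVE_FIELDS = ("name", "telecom", "address", "gender", "qualification")


def missing_fields(resource: dict) -> List[str]:
    """List the descriptive fields that are absent or empty on a resource."""
    return [field for field in DESCRIPTIVE_FIELDS if not resource.get(field)]


def is_identifier_only(resource: dict) -> bool:
    """True when none of the descriptive fields are populated."""
    return len(missing_fields(resource)) == len(DESCRIPTIVE_FIELDS)


sparse = {
    "resourceType": "Practitioner",
    "identifier": [
        {"system": "http://hl7.org/fhir/sid/us-npi", "value": "1111111111"}
    ],
}
print(missing_fields(sparse))
print(is_identifier_only(sparse))
```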
Did they get married?
Another problem that implementers face is changing data. Some real world examples:
In this example, Jane Doe’s maiden name (Jane Moe) is returned by some non-authoritative FHIR server, without the “maiden” flag. Yes, there is a “maiden” flag in FHIR (the “maiden” value of HumanName.use).
{
"resourceType": "Practitioner",
"identifier": [
{"system": "http://hl7.org/fhir/sid/us-npi", "value": "1111111111"}
],
"name": [
{
"given": ["JANE"],
"family": "MOE"
}
]
}
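One way to surface this kind of drift is to group records from different sources by NPI and flag any NPI that appears with more than one family name. This is only a sketch with hypothetical helper names, and a flagged NPI still needs review, since it cannot tell a maiden name from a data entry error.

```python
from collections import defaultdict
from typing import Dict, List, Optional

NPI_SYSTEM = "http://hl7.org/fhir/sid/us-npi"


def npi_of(resource: dict) -> Optional[str]:
    """Return the NPI from a resource's identifier list, if present."""
    for identifier in resource.get("identifier", []):
        if identifier.get("system") == NPI_SYSTEM:
            return identifier.get("value")
    return None


def conflicting_family_names(resources: List[dict]) -> Dict[str, List[str]]:
    """Map each NPI to its family names when sources disagree."""
    names = defaultdict(set)
    for resource in resources:
        npi = npi_of(resource)
        if npi is None:
            continue
        for name in resource.get("name", []):
            if name.get("family"):
                # Compare case-insensitively; payers differ on capitalization.
                names[npi].add(name["family"].upper())
    return {npi: sorted(found) for npi, found in names.items() if len(found) > 1}


records = [
    {"identifier": [{"system": NPI_SYSTEM, "value": "1111111111"}],
     "name": [{"given": ["JANE"], "family": "MOE"}]},
    {"identifier": [{"system": NPI_SYSTEM, "value": "1111111111"}],
     "name": [{"given": ["Jane"], "family": "Doe"}]},
]
print(conflicting_family_names(records))
```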
Overall, the above present challenges (oftentimes blockers) for API consumers who need to:
We've built a normalization pipeline that converts all provider reference patterns into consistent literal relative references (example 1 above), using our FHIR Provider Directory as the source of truth. Then we backfill as much information as we know about the provider. The goal is to maximize the usefulness of our output for RESTful applications. Here's how it works:
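The internals of the pipeline are not reproduced here, so the following is only a rough illustration of the idea, not Flexpa's implementation. It assumes a provider directory that can be queried by NPI, stood in for by a hypothetical in-memory dict, and rewrites logical and contained references into literal relative ones. Display-only references would need a name search and are left untouched in this sketch.

```python
from typing import Optional

NPI_SYSTEM = "http://hl7.org/fhir/sid/us-npi"

# Hypothetical stand-in for a provider directory, keyed by NPI.
DIRECTORY = {
    "1234567890": {
        "resourceType": "Practitioner",
        "id": "prac-1",
        "name": [{"text": "Dr. John Doe"}],
    }
}


def npi_from_identifiers(identifiers: list) -> Optional[str]:
    for identifier in identifiers:
        if identifier.get("system") == NPI_SYSTEM:
            return identifier.get("value")
    return None


def find_npi(eob: dict) -> Optional[str]:
    """Extract an NPI from a logical or contained provider reference."""
    provider = eob.get("provider", {})
    if "identifier" in provider:  # logical reference
        return npi_from_identifiers([provider["identifier"]])
    reference = provider.get("reference", "")
    if reference.startswith("#"):  # contained reference
        for contained in eob.get("contained", []):
            if contained.get("id") == reference[1:]:
                return npi_from_identifiers(contained.get("identifier", []))
    return None


def normalize_provider(eob: dict, directory: dict) -> dict:
    """Return a copy of the EOB whose provider is a literal relative reference."""
    npi = find_npi(eob)
    entry = directory.get(npi) if npi else None
    if entry is None:
        return eob  # nothing resolvable; leave the resource unchanged
    normalized = dict(eob)  # shallow copy so the input is not mutated
    normalized["provider"] = {
        "reference": f"{entry['resourceType']}/{entry['id']}"
    }
    return normalized


eob = {"provider": {"identifier": {"system": NPI_SYSTEM, "value": "1234567890"}}}
print(normalize_provider(eob, DIRECTORY)["provider"])
```

Because the rewritten reference points at a directory record, a consumer following it receives whatever details the directory holds for that provider.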
The pipeline consists of four main phases:
This normalization pipeline provides several key benefits:
ExplanationOfBenefit.provider).
GET /fhir/Practitioner will return practitioners that were once unsearchable because they were previously hidden inside other resources.
To implementers of FHIR APIs (e.g. payers who offer Patient Access APIs), we urge you to consider the downstream effects that reference structure has on usability for API consumers. Here is our ranking of preferred reference structures:
When it comes to provider data, we suggest including as much up-to-date information as possible. As for which data fields matter most to our customers, here are the top three that we’ve heard about:
Reference normalization and backfilling are crucial steps in making FHIR data more usable and reliable. Our solution demonstrates how careful design and implementation can transform varied reference patterns and resources into a consistent, high-quality format that better serves the needs of healthcare API consumers.