We are now in the era of “big data” which, we are told, will
answer questions we could never answer and also identify individuals before
they are sick so we can intervene and prevent their illnesses and their
problems. It is exciting,
earth-shattering and the subject of more articles, blog posts and conferences
than one can shake a stick at. However,
is it true? Can “big data”, or little
data for that matter, really lead us to salvation (medically speaking, at
least)?
Forgive me for the religious association; however, it often
seems as though people are taking the pronouncements of the medical data
gurus as holy writ from God. That
bothers me a bit because, being fundamentally a monotheist, I do tend to think
that we mere mortals are not godlike in our perfection, even if we are
physicians and even if we are those even godlier than physicians, health policy
experts. There is an old joke about a
good man dying and going to heaven. In
heaven he is shown around by one of the angels, who takes him to the dining hall
where happy people are patiently waiting in line for their turn to pick up their
food for the day. He then sees one of
the heavenly beings, wearing a white coat with a stethoscope in a pocket, cut
into the line. He asks the angel who that is
and the angel says, “Oh that is just God – sometimes he thinks he is a
doctor.” Since we are not godlike in our
analyses, we must better understand what all this data means, and whether bad
data is better than no data. Ultimately
we need to know how to use data to help those in need. That is the essence of medicine – helping those
in need.
I speak as a physician and a health data expert who has helped
health care organizations, government and large corporations design programs
based on the populations they serve. The
work I do is data-based, and thus I must understand the strengths and limitations
of data. I know that data can
potentially be used to positively impact the use of precious health care
resources and the care a patient receives.
At the same time, in my professional role, I am often the skeptic,
challenging those people who claim the data holds magical powers. Thus I enjoyed the article by Dr. Saurabh Jha
in the Health Care Blog entitled “Quality
of Skepticism and Skepticism of Quality.” It was his section on bad data
being better than no data that inspired this post. He makes the point, and I admit it is a point
I make all the time, that perfection is the enemy of good, but he does not stop
there as many do. Dr. Jha understands the limitations: while perfection
is the enemy of good, sometimes data analytics do not even achieve the standard
of good.
What then is this data we are talking about? All data depends on some information being
put into the right format to translate into the binary code that computers
understand. When people speak of big
data, at this point in time, they are mainly speaking of claims databases, which
take billing codes from insurance claims and assume that they accurately
reflect the care that is being rendered.
Billing codes are financial tools that drive payment and are used by
providers to maximize their revenues (there are courses and consultants that
constantly try to help people adjust codes to do just that) and are used by
insurance companies to minimize payments.
That results in a game with only passing interest in accurately
reflecting what is going on between a doctor and a patient. With electronic medical records, the hope is
that we will obtain more accurate information on what is really happening. The funny part is that the most popular and
widespread EMR gained its market dominance by being able to help hospitals
maximize their revenues by capturing all services and materials for accounting
and billing purposes, not by accurately telling the clinical story.
Saul Weiner
and Alan Schwartz, whom I have spoken about in previous posts, have looked
at whether medical records actually reflect what happens in an interaction
between doctor and patient and have found, by comparing records against tape-recorded
encounters and by using standardized actor patients, that the record does
not! Thus even the data inputs from a
medical record, considered to be much stronger than the claims records, have
serious flaws. They point out in their research that the medical records leave
out the emotions, competing priorities, financial concerns, spiritual beliefs
and other aspects of being human that have a major impact on the care
rendered. They call this contextualized
care and have found that the ability to understand the person, and not only the
disease, is much more important in driving quality care than the purely
biomedical issues.
Data tends to suffer from observational bias, sometimes
called the “streetlight
effect” from a joke that scientists like to tell. Late at night, a
police officer finds a drunken man crawling around on his hands and knees under
a streetlight. The drunken man tells the officer he’s looking for his wallet.
When the officer asks if he’s sure this is where he dropped the wallet, the man
replies that he thinks he more likely dropped it across the street. “Then why
are you looking over here?” the befuddled officer asks. “Because the light’s
better here,” explains the drunken man.
We tend to look at these big databases, designed and optimized for
financial purposes, because the light is better, even though the answers, the
insights, are more likely to be found ‘across the street’ in data that is not
captured.
But advances are being made.
Lab data is now included in some databases. Pharmacy information, which used to be
separate, is now incorporated. Methods
using word search and mining of audio databases of phone calls between providers,
patients, and insurers are starting to be used with some potential
effectiveness. However, the databases, on a sheer numbers
basis, are still overwhelmingly claims- or EMR-based, both of which are designed
for financial and not clinical purposes.
All this brings me back to the question which titles this
post. Is bad data better than no
data? I do not have a hard and fast
answer. Bad data can push you to make
bad decisions, and when the data is big, the bad decisions can really be
whoppers. Big data used to identify
individuals is especially prone to mistakes, as the variability in people is far
greater than can be seen from the financially based data in the databases. The danger is that we assume that the data is
correct. We assume it to be useful. Dr. Jha takes exception to this and says, “The
burden is on proponents of the metrics to prove their usefulness.” Currently that is not the case, and the burden
is on those who question the usefulness.
That needs to change, and the use of data needs to be tempered by the medical tradition of
skepticism.
None of this is to suggest that the use of data be
abandoned. Perfection is the enemy of
the good. Let’s just understand what we
are looking at, what the limitations are, and stop using even good data as if
it were perfect. We need to take a breath
and study the use of data to evaluate its effectiveness rather than assume that
all answers lie in those numbers.