By Kendra Casey Plank
The Department of Health and Human Services Office for Civil Rights has
released guidance that addresses common questions about how entities subject to
the Health Insurance Portability and Accountability Act can de-identify
protected health information (PHI).
The guidance document does not offer new ways for covered entities to meet the
de-identification standard in the HIPAA Privacy Rule. Instead, it offers insight
into how the agency expects the rule's existing de-identification methods to
apply to traditional paper records as well as digital ones.
De-identification involves stripping from records any data that could link
personal health information to individuals. The Privacy Rule prohibits covered
entities, in most cases, from using and disclosing PHI. The prohibition,
however, does not apply to de-identified PHI.
The guidance, dated Nov. 26, was published more than two years after OCR
convened a two-day workshop to get input on de-identification issues. OCR was
required under the Health Information Technology for Economic and Clinical
Health (HITECH) Act to revisit the HIPAA Privacy Rule de-identification standard
and issue guidance, given concerns that existing rules are no longer adequate as
the health care industry moves rapidly toward digitized data.
The guidance addresses the two methods in the HIPAA Privacy Rule for
de-identifying data: the expert determination method and the safe harbor method.
OCR noted, however, that neither method is fail-safe.
“Both methods, even when properly applied, yield de-identified data that
retains some risk of identification,” OCR wrote in the guidance. “Although the
risk is very small, it is not zero, and there is a possibility that
de-identified data could be linked back to the identity of the patient to which
it corresponds.”
Even so, OCR added, PHI that is de-identified using either of the methods in
the Privacy Rule is no longer considered PHI and can be used or disclosed
without restriction under HIPAA.
During the March 2010 OCR workshop on de-identification, some privacy
advocates argued that new guidance was necessary to address the changing health
care data environment, in which organizations such as health plans and large
hospitals had huge data sets that, when cross-matched with de-identified PHI,
could lead to re-identification of sensitive patient information (see
previous article).
Others, however, said that data stripped of too many identifiers would be of
little use for important health care research (see previous article).
OCR noted in the guidance document that covered entities may want to preserve
as much data as possible in de-identification efforts to maintain the usefulness
of the information.
“[C]overed entities may wish to select de-identification strategies that
minimize such loss,” the guidance said.
In its guidance on the expert determination method for de-identifying PHI,
OCR acknowledged that the Privacy Rule does not explicitly define who is
considered an expert for the purposes of the method; what levels of risk are
acceptable in determining whether de-identification efforts are sufficient; or
how experts retained by covered entities are to assess the risk of
re-identification.
OCR provided little additional specificity in the guidance, instead
offering examples of how covered entities might evaluate their efforts to
de-identify PHI.
The expert determination method for de-identifying PHI, as defined by the
Privacy Rule, requires that covered entities retain an “expert” with
“appropriate knowledge of and expertise with generally accepted statistical and
scientific principles and methods for rendering information not individually
identifiable” to determine that the risk of re-identification is “very small.”
Such experts are expected to provide documentation and methodologies for
reaching their determinations, according to the rule.
OCR said in the guidance that there is no specific professional degree or
certification required of “experts” rendering re-identification risk
determinations.
“From an enforcement perspective, OCR would review the relevant professional
experience and academic or other training of the expert used by the covered
entity, as well as actual experience of the expert using health information
de-identification methodologies,” OCR wrote.
Likewise, OCR said, there is “no explicit numerical level of identification
risk that is deemed to universally meet the 'very small' level indicated by the
method.”
Instead, OCR said experts assessing re-identification risk would have to
weigh many factors for the specific data set being evaluated.
OCR also said there is no particular process required of experts to reach
their determinations.
“However, the Rule does require that the methods and results of the analysis
that justify the determination be documented and made available to OCR upon
request,” the agency cautioned.
OCR said that among the considerations experts often weigh when assessing
re-identification risk is whether a data set can be linked to a data source that
would reveal the identity of the corresponding individuals. Without such a data
source link, OCR said, re-identification would be difficult.
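One rough way to picture the linkage concern is to count how many records in a
data set share the same combination of potentially linkable fields. The Python
sketch below is an illustration only, not a method described by OCR; the
function name and the field names are hypothetical. A group of size one means a
single record is unique on those fields and would be easier to match against an
outside data source.

```python
from collections import Counter


def smallest_group_size(records, linkable_fields):
    """Return the size of the smallest group of records that share the same
    values for the given fields (hypothetical helper, illustration only)."""
    # Build one tuple of field values per record, e.g. (1950, "123").
    keys = [tuple(record.get(field) for field in linkable_fields)
            for record in records]
    counts = Counter(keys)
    # An empty data set has no groups at all.
    return min(counts.values()) if counts else 0


# Hypothetical sample data: the third record is unique on (birth_year, zip3).
sample = [
    {"birth_year": 1950, "zip3": "123", "diagnosis": "asthma"},
    {"birth_year": 1950, "zip3": "123", "diagnosis": "diabetes"},
    {"birth_year": 1982, "zip3": "456", "diagnosis": "asthma"},
]
print(smallest_group_size(sample, ["birth_year", "zip3"]))
```

A small result does not by itself show that re-identification is likely; as the
guidance indicates, an expert would weigh many factors for the specific data set.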
Under the safe harbor method for de-identification of PHI, covered entities
must redact specific identifying information about individuals and have no
actual knowledge that the remaining data could be used alone or in combination
with other information to identify individuals.
For example, geographic subdivisions smaller than a state, such as street
addresses and counties, must be removed, as must all elements of dates,
except the year, that relate directly to an individual.
However, the guidance clarifies how much geographic data, such as ZIP codes,
and date information can remain in a data set, and under what circumstances,
while still satisfying the de-identification standard.
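The redaction steps described above can be sketched in a few lines of Python.
This is a minimal illustration under stated assumptions, not a compliant
implementation: the field names, the function name, and the population lookup
table are hypothetical, only a handful of the identifier categories listed in
the rule are covered, and the sketch does nothing to address the actual
knowledge requirement. The 20,000-person threshold for three-digit ZIP code
areas comes from the safe harbor provision of the Privacy Rule.

```python
from datetime import date

# Hypothetical field names; a real record layout will differ, and the rule
# lists more identifier categories than are shown here.
DIRECT_IDENTIFIERS = {"name", "street_address", "county", "phone", "email"}


def safe_harbor_sketch(record, zip3_population):
    """Return a redacted copy of a record (illustration only).

    zip3_population is an assumed lookup from a three-digit ZIP prefix to the
    number of people living in all ZIP codes that share that prefix.
    """
    redacted = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS:
            continue  # drop the field entirely
        if isinstance(value, date):
            redacted[field] = value.year  # keep only the year
        elif field == "zip":
            prefix = str(value)[:3]
            # Keep the prefix only for areas with more than 20,000 people;
            # otherwise replace it with 000.
            populous = zip3_population.get(prefix, 0) > 20000
            redacted[field] = prefix if populous else "000"
        else:
            redacted[field] = value
    return redacted


# Hypothetical record and population figure, for illustration only.
patient = {
    "name": "Jane Doe",
    "street_address": "1 Main St",
    "county": "Adams",
    "zip": "12345",
    "admitted": date(2012, 11, 26),
    "diagnosis": "asthma",
}
print(safe_harbor_sketch(patient, {"123": 50000}))
```

Even after such redaction, a covered entity would still need to have no actual
knowledge that the remaining data could identify an individual.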
OCR also addressed questions about the “actual knowledge” standard in the
safe harbor method, saying that PHI is not considered de-identified if the
covered entity knows that the remaining information in a data set could be used
to identify a person.
“The covered entity, in other words, is aware that the information is not
actually de-identified information,” OCR said.
For example, remaining data that a covered entity knows could be used to
positively identify an individual might include a revealing occupation, a clear
familial relationship, or a publicized clinical event, OCR said.
The guidance and additional information from OCR about de-identification are
available on the HHS website.