Department of Health and Human Services

NATIONAL COMMITTEE ON VITAL AND HEALTH STATISTICS

Working Group on HHS Data Access and Use

September 21, 2012

Hubert H. Humphrey Building
Washington, DC

Meeting Minutes


The Working Group on HHS Data Access and Use was convened on September 21, 2012 at the Hubert H. Humphrey Building in Washington, DC. The meeting was open to the public.

Present:
Working Group members:

  • Justine M. Carr, M.D., Chair
  • P. Kenyon Crowley, MBA, MS
  • Bruce Cohen, Ph.D.
  • Bill Davenhall, ESRI
  • Leslie Pickering Francis, J.D., Ph.D.
  • Mohit Kaushal, MD
  • Joshua Rosenthal, Ph.D.
  • Walter Suarez, M.D.
  • Leah Vaughan, MD

Absent:

  • M. Chris Gibbons, MD
  • Patrick Remington, MD
  • Kalahn Taylor-Clark, Ph.D.

Lead Staff and Liaisons:

  • Marjorie Greenberg, NCHS, Exec. Secretary
  • James Scanlon, ASPE, Exec. Staff Director
  • Ed Sondik, NCHS
  • Susan Queen, ASPE

Others (not including presenters):

  • Debbie Jackson, NCHS
  • Katherine Jones, NCHS
  • Marietta Squire, NCHS
  • Vickie Mays, Ph.D., MSPH, NCVHS member
  • Paul Tang, M.D., MPH, NCVHS member
  • Judy Warren, Ph.D., NCVHS member
  • Susan Baird Kanaan, consultant writer

Note: The transcript of this meeting and speakers’ slides are posted on the NCVHS Web site. Use the meeting date to locate them.


EXECUTIVE SUMMARY

The meeting began with presentations on HHS data holdings from the perspective of three HHS agencies.

HHS Chief Technology Officer Perspective — Bryan Sivac, HHS CTO (slides)

Mr. Sivac briefed the group on HealthData.gov 2.0, which was launched in June 2012, with further improvements released in October 2012. Following a large effort to push data out, the next step is to help people understand what the data are all about and to facilitate contact between HHS sources and data users and potential users. His office is also building internal engagement and cultivating a “default setting of openness” at HHS. He announced that the Fourth Datapalooza will take place on June 3-4, 2013.

Overview of NCHS Data Products and Services — Jim Craver, NCHS (slides)

Mr. Craver said the National Center for Health Statistics maintains data in three large categories: the vital statistics system, surveys of people, and national health care surveys. He described the following tools: the Health Indicators Warehouse; Health, United States; Healthy People 2010/2020; Health Data Interactive; and FastStats. All are accessible through the NCHS homepage.

Overview of CMS Data Products and Services — Allison Oelschlaeger

Ms. Oelschlaeger is with the Office of Information Products and Data Analysis of the Centers for Medicare and Medicaid Services. The Office was launched in June 2012. She described CMS data holdings in the Health Indicators Warehouse and HealthData.gov, and discussed the Blue Button Initiative.

Discussion: Working Group Process and Possible Projects

The group spent the final portion of the meeting discussing projects they might pursue and processes that would help them carry out their charge. Mr. Scanlon observed that HHS created the Working Group to advise it on how to make its extensive data holdings more broadly known, accessible, and usable, and to help develop a community of users for the data. The purpose of this effort, it was noted, is to make data available to help individuals and communities make better decisions that improve their lives.

Using slides, Dr. Rosenthal gave a short presentation on this question: How can HHS improve outreach to developers? — Initial Suggestions on how to make the ecosystem more rich, and separate ‘cool fluff’ from serious use. (His recommendations are listed in the detailed summary, below.) This presentation and written suggestions from Mr. Davenhall (both posted internally) contributed to the group’s discussion of process and projects, summarized below.

Process

Crowd-sourcing via learning centers was highlighted as a good way to engage more people in the present conversation and to elicit answers to the questions on the table. In this context, members stressed the need to reach beyond the “usual suspects” and to think inclusively about the breadth of the potential community of developers, including those in the world of population and community health.

A member suggested distinguishing between data supply and demand issues and starting by focusing on improving the supply of datasets, then working on the demand side. He also proposed “jumping into one of the big issues” at the next meeting.

Project: What should the Working Group do?

Two ideas for projects, or components of a single project, were put forward:

  • Exploring the possibility of developing apps based on the comprehensive NCVHS diagram of the influences on health (page 9, 21st Century Final Report)
  • Addressing the need for an infrastructure to support community data use and analysis, and/or enabling more complete use of the existing infrastructure.

Next Steps

In preparation for their November meeting, Working Group members were asked to review all the documents and slides associated with this meeting with an eye to principles and priorities, and to continue to think about the practical applications of the ideas shared.

DETAILED SUMMARY

This Working Group meeting immediately followed the NCVHS full Committee meeting. After Dr. Carr’s call to order and introductions around the room, it began with a series of presentations on HHS data holdings.

HHS Chief Technology Officer Perspective — Bryan Sivac, HHS CTO (slides)

Mr. Sivac briefed the group on HealthData.gov 2.0, which was launched in June 2012 in concert with the third edition of the Health Data Initiative (HDI) Forum (AKA Datapalooza). Several improvements will be released in October 2012 that will help ensure that the datasets are high quality and protect privacy. The improvements include metadata enhancement and inclusion of stories on how data are being used, and work is under way on enhancing the design and engaging people. A big step forward was providing programmatic access to the metadata catalog through APIs. Ultimately, the goal is to enable even small HHS entities to post and maintain content on HealthData.gov. Overall, Mr. Sivac sees the site as a marketing tool for HHS data holdings, and he is interested in feedback from internal and external users. He added that in his office, they are “big fans of open source.”

Following the large effort to push data out, the next step is to help people understand what the data are all about. Mr. Sivac commented on the importance of using plain language, free of jargon, and of explaining things clearly in order to “broaden the tent” and talk to a wider group of people. He also wants to facilitate contact between HHS data sources and data users and potential users. Over time, the idea is to “build communities around” the data to create “a unified band of people, working together.” He noted the international interest in this initiative.

To build internal engagement within HHS, health data leads were asked to identify their internal teams and compile a catalog of their datasets. There will be a training webinar on October 10. Mr. Sivac said his office is working to “change the culture at HHS … to a default setting of openness.” It has been quite successful, with only “pockets of resistance.” They are holding a series of meetings with data owners at agencies to discuss objectives and get feedback on issues that need to be resolved, starting with the Administration for Children and Families. Another activity is an experimental partnership with the Greater Baltimore Technology Council to hold events called “Unwired” and “Groundwork” that focus on the city’s problems. If successful, it will be replicated in other places on both coasts where interest has been expressed.

The biannual meeting of HDI data leads will take place in November. HHS is encouraging the regional affiliates to work in their regions. Finally, Mr. Sivac announced that the Fourth Datapalooza will take place on June 3-4, 2013.

Discussion

Participants asked about the relationship between HealthData.gov and the Health Indicators Warehouse and stressed the importance of precision and accuracy when technical language is translated into plain language.

Review of Working Group Charge — Mr. Scanlon

Mr. Scanlon explained that the Department created the Working Group to advise it on how to make its extensive data holdings more broadly known, accessible, and usable for the public, and to help develop a community of users for the data. The process begins at this meeting by exposing the Working Group to some HHS data holdings.

Overview of NCHS Data Products and Services — Jim Craver, NCHS (slides)

The data sources of the National Center for Health Statistics (NCHS) are in three large categories: the vital statistics system, surveys of people, and national health care surveys. In this way, NCHS provides multiple perspectives on health topics through multiple sources.

Mr. Craver described the following tools for finding statistics from multiple data sources: the Health Indicators Warehouse; Health, United States; Healthy People 2010/2020; Health Data Interactive; and FastStats. (See transcript and slides for details.) They are all accessible via the NCHS homepage, which was recently revamped to provide more user-friendly access to these tools via red arrows.

The Health Indicators Warehouse provides three views of aggregate data: by geography, topic, or initiative (e.g., Healthy People). A separate page contains the metadata on each indicator. The NCHS Board of Scientific Counselors provides oversight. Health, United States is legislatively mandated and covers a broad range of sources and topics. An interactive version with views of the data in considerable depth will soon be available worldwide. The transition from Healthy People 2010 to 2020 allowed a review of the indicators. An increase in the number of objectives increases the challenge of tracking the data and keeping them updated. HHS produces regular reports on 10 leading indicators. Mr. Craver commended to the group the HP 2010 Final Review, posted on the NCHS website as a PDF. Health Data Interactive has tabular views of aggregate data, and allows tables to be customized and manipulated. Finally, FastStats enables access to data using descriptors. Ms. Greenberg later added that the NCHS has a Research Data Center (RDC) that gives researchers controlled access to restricted data. Dr. Cohen said the RDC also can be used as a data enclave to store confidential data from other sources, such as CDC and Census.

In conclusion, Mr. Craver said NCHS is paying new attention to harmonization and standardization.

Discussion

Dr. Rosenthal suggested that it would be useful to developers and others new to these resources if they included an entity diagram to provide context and show relationships — “what the world looks like through your eyes when you’re looking at your data.” Mr. Craver said there are plans to put up an entity relationship diagram. Mr. Davenhall asked about visitor statistics, and was told that NCHS does not take “fingerprints” on visitors but does collect data on usage, and it has a mechanism for complaints and suggestions. It is interested in ways to go back to users for more targeted feedback and stories on how people are using the data.

Dr. Vaughan asked about alignment with states and counties around data uses and resources, and Mr. Craver said they had not had much focused interaction with state representatives but will soon do an evaluation study in which state directors of public health will be asked for input. Dr. Cohen noted that there are 20 to 30 state-based, web-based query systems. He expressed hope that stimulating state and local developments would increasingly be an objective of these federal efforts around data.

Overview of CMS Data Products and Services — Allison Oelschlaeger

Ms. Oelschlaeger is with the CMS Office of Information Products and Data Analysis, which was launched in June 2012. She described CMS data holdings with respect to the Health Indicators Warehouse, HealthData.gov, and the Blue Button Initiative. In the Warehouse, CMS presents its data in specific reports. Regarding HealthData.gov, she said CMS is starting to come out with standalone public use files of claims data that are available for download, stripped of identifiable information. At Dr. Francis’ request, she described CMS’ de-identification methodology. CMS also posts a Medicare and Medicaid statistical supplement on HealthData.gov and provides “Compare” tools on hospitals, nursing homes, home health, and dialysis centers. The Blue Button Initiative shares personal claims data with Medicare and VA beneficiaries. The CMS Research Data Assistance Center (ResDAC), which just redesigned its website, is an external-facing group that helps people who want to get identifiable data after signing a data use agreement. There are penalties for violations, although typically people don’t mean to commit violations and CMS works with them to correct the problem. CMS is still trying to determine the best way, such as a data enclave model, to make sure there are no serious violations. Ms. Oelschlaeger suggested that the Working Group might want to talk with the CMS privacy group. Finally, the Medicare Qualified Entity Program gives access to data on public provider performance.

Discussion

In response to a question, Ms. Oelschlaeger said there will soon be a data navigator function on the CMS website. Regarding mechanisms to understand and catalog users’ needs and data uses, she said CMS collects information on researchers’ requests, and ResDAC is creating a page where researchers can describe how they will use the CMS data and their publications.

Mr. Davenhall noted the lack of standardization of the various HHS websites. He suggested that to encourage use of the data, HHS should make the sites consistent in look and feel. He also suggested that the Working Group think about how to make crosswalk files.

Suggestions from Dr. Rosenthal

Using slides (posted internally), Dr. Rosenthal gave a short presentation on this question: How can HHS improve outreach to developers? — Initial Suggestions on how to make the ecosystem more rich, and separate ‘cool fluff’ from serious use. He offered a series of recommendations, commenting on each one (see transcript for details):

  • Taxonomy: share the meaning of the data (publish ERD diagrams);
  • Learning center: give folks a place to go and share with each other;
  • Biz value in challenges (require prize contestants to show “biz” value);
  • Files (semi/synthetic): give folks something to play with;
  • Data browsers: hook folks with information (without data);
  • Partnerships and products (private partners and data-driven products); and
  • Opt-in: allow individuals to actively contribute their data.

Discussion: Working Group Process and Possible Projects

The group spent the remainder of the meeting discussing the projects they might pursue and what processes would best enable them to carry out the charge laid out by Mr. Scanlon.

Dr. Vaughan called attention to the “very rich public health heritage” in which people have been donating data for many decades, such as the Framingham Study. This heritage models data practices that already work very well. She also noted the breadth of the potential community of developers and urged the Working Group to think as inclusively as possible and remember the domain experts in health departments across the country.

Dr. Cohen observed that public health experts need advice from application developers, because they do not know how their rich data assets “resonate to the real world” and communities that might want to use them. Mr. Crowley proposed getting at this by inviting people into the conversation in a community-mediated way that is designed to generate answers and priorities. The question, Dr. Cohen said, is how to reach beyond the “usual suspects” to those who don’t use the data. Mr. Crowley suggested incentives and/or pilots.

Dr. Carr asked members to focus on what the Working Group should, or might, do. She displayed an NCVHS diagram of the influences on health (page 9, 21st Century Final Report) and suggested that it might frame a project aligned with the NCVHS interest in empowering communities to use data to improve health. Dr. Cohen proposed exploring the possibility of developing applications based on the diagram. Dr. Green called attention to the need for an infrastructure to support communities’ systematic use of data, knowledge, and technology and enhance their analytic capacities. Dr. Vaughan observed that the nation also has a lot of infrastructure that is not being used.

Mr. Scanlon asked the group to think about the practical applications of the suggestions made during this meeting, to give HHS practical advice on bridging the gap between data producers and potential users. He asked, How can we promote the use of the data for applications of all kinds? Are there other datasets that should be made available? Are there tools, technologies, and/or a more agile platform that could help? In the longer term, what are ways to bring communities together? As a possible approach for the Working Group, he suggested picking an area such as community-level data, focusing on it at a hearing, and identifying best practices whereby HHS could get its data assets to users who could benefit from them.

Mr. Davenhall offered a written document (posted internally) containing ideas for the Working Group’s work and for creating greater participation by software/information developers. He pointed out that to enrich this ecosystem, the Working Group needs to understand what problems it is being asked to solve. Dr. Vaughan proposed the following question (paraphrased): What information and resources would empower communities if they had easier access to them? Dr. Rosenthal stressed the power of crowdsourcing, through learning centers, as a good way to get to answers to such questions.

Dr. Carr noted the benefits that are already apparent from simply convening people who speak “different languages” in these Working Group meetings. She asked members to think about how to integrate the ideas shared in this session in preparation for the Working Group’s November meeting.

Dr. Kaushal proposed that the Working Group start by focusing on improving the supply of datasets, and then work on the demand side; and he suggested “jumping into one of the big issues” at the next meeting.

Ms. Greenberg asked members to review all the documents associated with this meeting, including those from Mr. Davenhall and Dr. Rosenthal, and to tease out the principles and priorities that emerge.

In conclusion, members reiterated that the goal of this effort is to improve health and health care. The question before the group is how available data can be used to help individuals and communities make better decisions and improve their health and their lives.

With the observation that the table has now been set for a robust and exciting discussion, Dr. Carr then adjourned the meeting.

I hereby certify that, to the best of my knowledge, the foregoing summary of minutes is accurate and complete.

  /s/                                                                            11/14/2012

____________________________________________________

Chair                                                                        Date