US Medical Expense Analysis Through Frequency and Severity Bootstrapping and Regression Model

Fangjun Li, Gao Niu

Source Title: Biomedical and Business Applications Using Artificial Neural Networks and Machine Learning

DOI: 10.4018/978-1-7998-8455-2.ch007

Abstract

To help control health expenditures, several papers have investigated the characteristics of patients who may incur high expenditures. Fewer, however, are based on overall medical conditions, so this chapter looks for a relationship among the prevalence of medical conditions, the utilization of healthcare services, and average expenses per person. The authors used bootstrapping simulation for data preprocessing and then used linear regression and random forest methods to train several models. The metrics root mean square error (RMSE), mean absolute percentage error (MAPE), and mean absolute error (MAE) all showed that the selected linear regression model performs slightly better than the selected random forest regression model. The linear model used medical conditions, types of services, and their interaction terms as predictors.

Chapter Preview

Data Description And Preprocessing

The original data were collected by the Medical Expenditure Panel Survey (MEPS). MEPS is a set of large-scale surveys covering health services in the US: how frequently they are used, what they cost, and how they are paid for, as well as data on health insurance. Of the two major components of MEPS, the Household Component (HC) provides information on household-reported medical conditions. The sample of families and individuals is drawn from households that participated in the prior year's National Health Interview Survey (conducted by the National Center for Health Statistics) (Agency for Healthcare Research and Quality, 2019).

The data we use come from the HC summary data tables produced by the US Agency for Healthcare Research and Quality (AHRQ), which provide the number of people with care and the mean expenditure per person from 2016 to 2018 (Agency for Healthcare Research and Quality, n.d.). The data are grouped by 53 condition categories, collapsed from the household-reported conditions coded into ICD-10 and CCSR codes, and by six event types (emergency room visits, home health events, inpatient stays, office-based events, outpatient events, prescription medicines).

Key Terms in this Chapter

Bootstrapping: A resampling method used to estimate statistics on a population by sampling a dataset with replacement. The process resamples a single dataset to create many simulated samples.
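As a minimal sketch of the resampling idea in this definition, the snippet below draws repeated samples with replacement from one small dataset and collects the mean of each. The function name `bootstrap_means`, the seed, and the expense values are illustrative assumptions; they are not the chapter's data or its exact procedure.

```python
import random
import statistics


def bootstrap_means(data, n_resamples=1000, seed=0):
    """Return the mean of each bootstrap resample of `data`."""
    rng = random.Random(seed)
    n = len(data)
    means = []
    for _ in range(n_resamples):
        # Draw n observations with replacement from the single original dataset.
        resample = rng.choices(data, k=n)
        means.append(statistics.fmean(resample))
    return means


# Hypothetical per-person expenses (illustrative values, not MEPS data).
expenses = [120.0, 340.0, 95.0, 2100.0, 410.0, 780.0, 150.0, 60.0]

means = sorted(bootstrap_means(expenses))
# Approximate 95% percentile interval for the mean expense.
lower = means[int(0.025 * len(means))]
upper = means[int(0.975 * len(means)) - 1]
print(f"bootstrap 95% interval for the mean: [{lower:.1f}, {upper:.1f}]")
```

The percentile interval is only one way to summarize the simulated samples; the same resampled datasets could also be kept whole and used as model training data.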

Multiple R-Squared: Also known as the coefficient of determination, multiple R-squared is the proportion of the variation in the dependent variable that can be explained by the independent variables. It provides a measure of how well observed outcomes are replicated by the model.

Mean Absolute Percentage Error (MAPE): A measure of forecast accuracy that expresses the average absolute error as a percentage of the observed values.

Mean Absolute Error (MAE): MAE measures the average magnitude of the errors in a set of predictions, without considering their direction.

Root-Mean-Square Error (RMSE): A frequently used measure of the differences between the values predicted by a model or an estimator and the values observed. It is sensitive to outliers.
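The three error metrics defined above (MAPE, MAE, and RMSE) can be written out directly. The sketch below uses their standard textbook formulas; the function names and the sample values are illustrative assumptions, not results from the chapter.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error: squaring makes large errors weigh more."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error: average error magnitude, ignoring direction."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero.
    """
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical observed and predicted expenses (illustrative only).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 360.0]

print(f"RMSE = {rmse(actual, predicted):.3f}")
print(f"MAE  = {mae(actual, predicted):.3f}")
print(f"MAPE = {mape(actual, predicted):.3f}%")
```

On these values the single error of 40 pulls RMSE above MAE, which illustrates why RMSE is described as sensitive to outliers.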

Node: A node of a decision tree represents a “test” on an attribute. The root node is at the beginning of the tree, where the entire population is analyzed, and each leaf node represents a class label (or, in a regression tree, a predicted value).

Goodness of Fit: The goodness of fit of a statistical model describes how well it fits a set of observations. Commonly used measures include R-squared and the chi-squared test.

MTRY: A parameter in Random Forest modeling that represents the number of variables sampled at each split.

Type of Events: A home health event is defined as one month during which home health service was received. For prescription medicines, an event is defined as a purchase or refill.

Adjusted R-Squared: It indicates how well terms fit a curve or line but adjusts for the number of terms in a model.
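To make the difference between multiple and adjusted R-squared concrete, here is a small sketch using the standard formulas: R² = 1 − SS_res / SS_tot, and adjusted R² = 1 − (1 − R²)(n − 1) / (n − p − 1) for n observations and p predictors. The function names and data are illustrative assumptions.

```python
def r_squared(actual, predicted):
    """Proportion of variation in `actual` explained by `predicted`."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_obs, n_predictors):
    """Penalize R-squared for the number of predictors in the model."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical observations and fitted values (illustrative only).
actual = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]

r2 = r_squared(actual, predicted)
adj_one = adjusted_r_squared(r2, n_obs=len(actual), n_predictors=1)
adj_two = adjusted_r_squared(r2, n_obs=len(actual), n_predictors=2)
print(r2, adj_one, adj_two)
```

For the same R², the adjusted value falls as predictors are added, so an extra term only raises it when the term improves the fit enough to offset the penalty.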

Predictive Modeling: A commonly used statistical technique to predict future behavior. Predictive modeling solutions are a form of data-mining technology that works by analyzing historical and current data and generating a model to help predict future outcomes.

Bagging: Bagging is short for Bootstrap Aggregating. It is an ensemble meta-algorithm commonly used to reduce variance within a noisy dataset. Several data samples are generated by random selection with replacement, weak models are trained independently on them, and their predictions are combined to yield a more accurate estimate.
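The bagging procedure can be sketched in a few lines: resample with replacement, fit one model per resample, and average the predictions. The base learner here is a one-predictor least-squares line chosen only to keep the example short; that choice is an assumption, since a random forest would use decision trees instead. All names and data values are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Least-squares slope and intercept for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        # Degenerate resample (one distinct x): fall back to the mean of y.
        return 0.0, mean_y
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def bagged_predict(xs, ys, x_new, n_models=50, seed=0):
    """Average the predictions of models fitted to bootstrap resamples."""
    rng = random.Random(seed)
    n = len(xs)
    predictions = []
    for _ in range(n_models):
        # Bootstrap step: sample row indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        slope, intercept = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        predictions.append(slope * x_new + intercept)
    # Aggregation step: average the independently trained models.
    return sum(predictions) / len(predictions)


# Hypothetical data (illustrative only).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]

estimate = bagged_predict(xs, ys, x_new=7.0)
print(f"bagged prediction at x = 7: {estimate:.2f}")
```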

Split: One node can split into several branches and each branch represents the outcome of the test.

Medical Conditions: The data we used are from 2016 onward, when the household-reported conditions were coded into ICD-10 codes and then collapsed into the condition categories in the tables.

NTREE: A parameter in Random Forest modeling that represents the number of trees to grow.

Data Preprocessing: An important step in machine learning in which data are manipulated before being used to build a model, in order to enhance performance and make the data more applicable to algorithms or models. It usually involves steps such as data cleaning, data integration, data reduction, and data transformation.

Akaike's Information Criterion (AIC): A standard measure of the goodness of fit of a statistical model. Based on the concept of entropy, the criterion weighs the complexity of the estimated model against how well the model fits the data.
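The trade-off AIC makes between complexity and fit is visible in its standard formula, AIC = 2k − 2 ln(L̂), where k is the number of estimated parameters and L̂ is the maximized likelihood. The sketch below also includes the common least-squares form under Gaussian errors, with additive constants dropped. The function names and numbers are illustrative assumptions, not values from the chapter.

```python
import math


def aic(log_likelihood, n_params):
    """AIC from a model's maximized log-likelihood; lower is better."""
    return 2 * n_params - 2 * log_likelihood


def aic_least_squares(rss, n_obs, n_params):
    """AIC for a least-squares fit with Gaussian errors.

    Additive constants are dropped, so values are only comparable
    between models fitted to the same observations.
    """
    return n_obs * math.log(rss / n_obs) + 2 * n_params


# Two hypothetical models with the same residual sum of squares:
# the one with more parameters receives the higher (worse) AIC.
small_model = aic_least_squares(rss=50.0, n_obs=100, n_params=4)
large_model = aic_least_squares(rss=50.0, n_obs=100, n_params=6)
print(small_model, large_model)
```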

Simulated Dataset: A new dataset, generated by methods such as bootstrapping, that resembles but is not identical to the existing dataset.
