Disciplines

Biomedical Informatics | Community Health and Preventive Medicine | Dental Public Health and Education

Abstract (300 words maximum)

Dental caries remains a prevalent chronic disease among children, significantly affecting their quality of life, educational outcomes, and school attendance. Between 2011 and 2016, caries affected 17.4% of children aged 6-11 and 56.8% of adolescents aged 12-19, with higher prevalence among non-Hispanic Black and Mexican American youth and among those from lower-income families. This study aims to develop a robust machine learning model to predict the presence of decayed, missing, or filled permanent molars (DMFT) in children and adolescents using demographic, dietary, and oral health examination data from the National Health and Nutrition Examination Survey (NHANES) for the years 2011 to 2016. We merged two NHANES cycles (2011-2012 and 2013-2014) for training and used NHANES 2015-2016 for testing. The data include demographic information, detailed oral examinations, dietary behavior records, and health insurance information for individuals aged 6 to 19. To construct the binary target variable, DMFT was derived from the examination of eight individual permanent molars and dichotomized (0 = no caries; >0 = presence of caries). We employed a diverse array of machine learning algorithms (logistic regression, deep learning, XGBoost, support vector machines, and random forests) to enhance predictive accuracy and interpretability. Preliminary analysis identified significant predictors of dental caries, including income-to-poverty ratio, age, dietary sugar and carbohydrate intake, parental education level, and race/ethnicity. Model performance was assessed using metrics such as accuracy and area under the ROC curve. Detailed comparisons of model performance will be presented to highlight the most effective models for DMFT prediction. Machine learning models proved effective for early detection of dental caries risk among children and adolescents using NHANES data. The results can inform healthcare providers in implementing targeted preventive measures to reduce caries prevalence and improve public health outcomes. Future work will enhance the model by incorporating additional NHANES questionnaire data and behavioral factors to improve prediction accuracy.
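
The target construction and the two evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the per-tooth status codes (`"D"`, `"M"`, `"F"`, `"S"`), the function names, and the toy scores are all hypothetical, and the actual NHANES oral examination files use their own coding scheme.

```python
from typing import Optional, Sequence

# Hypothetical per-tooth status codes (placeholders, not the NHANES codebook):
# "D" = decayed, "M" = missing due to caries, "F" = filled, "S" = sound,
# None = tooth not erupted or not assessed.
CARIES_CODES = {"D", "M", "F"}


def dmft_count(molar_status: Sequence[Optional[str]]) -> int:
    """Count how many of the examined molars are decayed, missing, or filled."""
    return sum(1 for status in molar_status if status in CARIES_CODES)


def binary_target(molar_status: Sequence[Optional[str]]) -> int:
    """Dichotomize the count: 0 = no caries, 1 = DMFT > 0 (caries present)."""
    return 1 if dmft_count(molar_status) > 0 else 0


def accuracy(y_true: Sequence[int], y_score: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of correct labels after thresholding predicted scores."""
    preds = [1 if score >= threshold else 0 for score in y_score]
    return sum(p == t for p, t in zip(preds, y_true)) / len(y_true)


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a random
    positive case scores higher than a random negative case (ties count 0.5)."""
    pos = [s for s, t in zip(y_score, y_true) if t == 1]
    neg = [s for s, t in zip(y_score, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy records: eight permanent molars per participant (made-up values).
records = [
    ["S"] * 8,                # all sound        -> 0
    ["S", "D"] + ["S"] * 6,   # one decayed      -> 1
    ["F", "F"] + ["S"] * 6,   # two filled       -> 1
    [None] * 4 + ["S"] * 4,   # partly unerupted -> 0
]
y_true = [binary_target(r) for r in records]

# Made-up predicted probabilities standing in for any fitted classifier.
y_score = [0.2, 0.7, 0.4, 0.45]

print(y_true, accuracy(y_true, y_score), roc_auc(y_true, y_score))
```

In the design described above, the labels would come from the 2015-2016 test cycle and the scores from a model fitted on the merged 2011-2014 cycles. Accuracy depends on the chosen threshold, whereas the area under the ROC curve does not, which is one reason to report both.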

Academic department under which the project should be listed

SPCEET - Industrial and Systems Engineering

Primary Investigator (PI) Name

Christina Scherrer

Machine Learning Approaches for Predicting Dental Caries in Permanent Molars of Children and Adolescents Using NHANES 2011-16 Data
