J Clin Endocrinol Metab. 2021; 16;106(7):e2700-e2710. PMID: 33606028. doi: 10.1210/clinem/dgab093
This paper illustrates the power of machine learning to successfully predict adult height using growth measurements before age 6 years, without the need for bone age.
Computers beat us in games of prediction, such as chess. They also beat us at predicting adult height. This is because machine learning, after the necessary pre-processing, is particularly suited to predictions based on existing data, such as anthropometric data. Machine learning models are powered by the identification of strong, but theory-free, patterns in the data, capturing complex, nonlinear relationships that may markedly improve prediction accuracy over conventional regression models. Adult height is commonly predicted in clinical settings with the Tanner-Whitehouse (TW2) or Bayley-Pinneau (BP) methods, which use current height and bone age. Whereas TW2 and BP were derived from research studies that were carefully constructed to mitigate bias, routine clinical data sources are typically less clean.
This study aimed to use machine learning to predict adult height based on height and weight measurements up to age 6 years in 2282 children from a community-based, observational longitudinal growth study, the GrowUp 1974 Gothenburg cohort. Two additional validation cohorts were used to assess the accuracy of the algorithm: a second Swedish cohort and the Edinburgh Longitudinal Growth Study. The winning model was a Random Forest with 51 regression trees, and the best predictor variables were sex and height at age 3.4-6.0 years. Observed and predicted adult height were 173.9±8.9 cm and 173.9±7.7 cm, respectively, with a mean prediction error of only -0.4±4.0 cm. This model was more accurate than any other model, with or without bone age assessment. Accuracy remained remarkably stable in the Swedish and Edinburgh validation cohorts, despite their use of different measurement protocols.
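To make the "mean ± SD" error summary concrete, here is a minimal Python sketch of how such a figure can be computed from paired observed and predicted adult heights. It is not the authors' code. The function name `summarize_prediction_error`, the sign convention (predicted minus observed), and the four example heights are all assumptions made for illustration.

```python
from statistics import mean, stdev


def summarize_prediction_error(observed_cm, predicted_cm):
    """Return (mean, SD) of the signed errors in cm.

    Assumption: error = predicted - observed. The paper's own sign
    convention is not stated in this summary.
    """
    if len(observed_cm) != len(predicted_cm):
        raise ValueError("observed and predicted must have the same length")
    errors = [p - o for o, p in zip(observed_cm, predicted_cm)]
    # mean(errors) is the systematic bias; stdev(errors) is the sample SD,
    # i.e. how widely individual predictions scatter around that bias.
    return mean(errors), stdev(errors)


# Hypothetical adult heights (cm), invented for illustration only.
observed = [160.0, 170.0, 180.0, 190.0]
predicted = [162.0, 169.0, 176.0, 191.0]

bias, spread = summarize_prediction_error(observed, predicted)
print(f"prediction error: {bias:+.1f} ± {spread:.1f} cm")
```

A mean near zero indicates little systematic over- or under-prediction, while the SD is what limits how precise the prediction is for any individual child.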
These findings show the utility of machine learning for predicting adult height. The uneducated machine, employed by students of computer science, provided predictions that are as good as, or even better than, those of methods developed by experts over decades. Yet these new, sophisticated algorithms are only moderately superior to the 19th-century target height. The 4-5 cm prediction errors seen with all methods are not a methodological fault; rather, they reflect the fact that additional environmental cues affect growth beyond age 6 years, with a magnitude of 4-5 cm, whatever method is used.