Machine learning models can predict subsequent publication of North American Spine Society (NASS) annual general meeting abstracts
Background context: Academic meetings serve as an opportunity to present and discuss novel ideas. Previous studies have identified factors predictive of publication but did not generate predictive models. Machine learning (ML) offers a tool capable of generating such models. As such, the objective of this study was to use ML models to predict subsequent publication of abstracts presented at a major surgical conference.
Study design/setting: Database study.
Methods: All abstracts from the North American Spine Society (NASS) annual general meetings (AGM) from 2013 to 2015 were reviewed. The following information was extracted: number of authors, institution, location, conference category, subject category, study type, data collection methodology, human subjects research, and FDA approval. Abstracts were then searched in the PubMed, Google Scholar, and Scopus databases for subsequent publication. ML models were trained to predict whether each abstract would be published. Model quality was assessed using the area under the receiver operating characteristic curve (AUC). The ten most important factors were extracted from the most successful model during testing.
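The training-and-evaluation step described above can be sketched with scikit-learn; the data below is synthetic (the study's actual dataset and feature encodings are not reproduced here), and the classifier settings are illustrative, not those used by the authors:

```python
# Hedged sketch: train a classifier on abstract metadata and score by AUC.
# X stands in for encoded abstract features (e.g. author count, categories);
# the real study used the nine extracted variables listed in the Methods.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.integers(0, 10, size=(200, 9)).astype(float)  # toy feature matrix
y = rng.integers(0, 2, size=200)                      # toy publication labels

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

clf = RandomForestClassifier(n_estimators=200, random_state=0)

# Cross-validated AUC on the training split (model-selection phase).
cv_auc = cross_val_score(clf, X_train, y_train, cv=5, scoring="roc_auc")

# Held-out AUC on the test split (testing phase).
clf.fit(X_train, y_train)
test_auc = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])
```

With random labels the held-out AUC hovers near 0.5; on the real abstract metadata the study reports test AUCs up to 0.69.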
Results: A total of 1119 abstracts were presented, of which 553 (49%) were subsequently published. During training, the model with the highest AUC and accuracy was partial least squares (AUC 0.77±0.05, accuracy 75.5%±4.7%). During testing, the model with the highest AUC and accuracy was the random forest (AUC 0.69, accuracy 67%). The top-ranked features for the random forest model were, in descending order: number of authors, year, conference category, subject category, human subjects research, continent, and data collection methodology.
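A feature ranking like the one reported above can be read off a fitted random forest via its impurity-based importances. A minimal sketch on synthetic data, using the study's feature names purely as labels (the data and the resulting ordering here are illustrative, not the study's results):

```python
# Hedged sketch: rank features by random-forest importance scores.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
feature_names = [
    "num_authors", "year", "conference_category", "subject_category",
    "human_subjects_research", "continent", "data_collection",
]
# Synthetic data in which the label depends only on the first feature,
# so that feature should dominate the importance ranking.
X = rng.random((200, len(feature_names)))
y = (X[:, 0] > 0.5).astype(int)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Pair names with importances and sort in descending order.
ranking = sorted(
    zip(feature_names, clf.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
```

Impurity-based importances sum to 1 and are a common, if imperfect, way to summarize which inputs drive a forest's predictions; permutation importance is a frequently used alternative.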
Conclusions: This is the first study to use ML to predict publication of complete articles after abstract presentation at a major academic conference. Future studies should incorporate deep learning frameworks and cognitive/results-based variables, and should apply this methodology to larger conferences across other fields of medicine to improve the quality of work presented.