A personal health large language model for sleep and fitness coaching

Justin Khasentino^#¹, Anastasiya Belyaeva^#², Xin Liu^#³, Zhun Yang^#¹, Nicholas A Furlotte^#¹, Chace Lee¹, Erik Schenck¹, Yojan Patel¹, Jian Cui¹, Logan Douglas Schneider¹, Robby Bryant¹, Ryan G Gomes¹, Allen Jiang¹, Roy Lee¹, Yun Liu¹, Javier Perez¹, Jameson K Rogers¹, Cathy Speed¹, Shyam Tailor¹, Megan Walker¹, Jeffrey Yu¹, Tim Althoff¹, Conor Heneghan¹, John Hernandez¹, Mark Malhotra¹, Leor Stern¹, Yossi Matias¹, Greg S Corrado¹, Shwetak Patel¹, Shravya Shetty¹, Jiening Zhan¹, Shruthi Prabhakara¹, Daniel McDuff⁴, Cory Y McLean⁵

Affiliations

¹ Google Research, Mountain View, CA, USA.
² Google Research, Mountain View, CA, USA. belyaeva@google.com.
³ Google Research, Mountain View, CA, USA. xliucs@google.com.
⁴ Google Research, Mountain View, CA, USA. dmcduff@google.com.
⁵ Google Research, Mountain View, CA, USA. cym@google.com.

^# Contributed equally.

PMID: 40813712
DOI: 10.1038/s41591-025-03888-0

Justin Khasentino et al. Nat Med. 2025 Aug 14. doi: 10.1038/s41591-025-03888-0. Online ahead of print.

Abstract

Although large language models (LLMs) show promise for clinical healthcare applications, their utility for personalized health monitoring using wearable device data remains underexplored. Here we introduce the Personal Health Large Language Model (PH-LLM), designed for applications in sleep and fitness. PH-LLM is a version of the Gemini LLM that was fine-tuned for text understanding and reasoning when applied to aggregated daily-resolution numerical sensor data. We created three benchmark datasets to assess multiple complementary aspects of sleep and fitness: expert domain knowledge, generation of personalized insights and recommendations, and prediction of self-reported sleep quality from longitudinal data. PH-LLM achieved scores that exceeded those of a sample of human experts on multiple-choice examinations in sleep medicine (79% versus 76%) and fitness (88% versus 71%). In a comprehensive evaluation involving 857 real-world case studies, PH-LLM performed similarly to human experts for fitness-related tasks and improved over the base Gemini model in providing personalized sleep insights. Finally, PH-LLM effectively predicted self-reported sleep quality using a multimodal encoding of wearable sensor data, further demonstrating its ability to contextualize wearable modalities. This work highlights the potential of LLMs to revolutionize personal health monitoring via tailored insights and predictions from wearable data and provides datasets, rubrics and benchmark performance to further accelerate personal health-related LLM research.

Conflict of interest statement

Competing interests: This study was funded by Google LLC. All authors are employees of Alphabet and may own stock as part of the standard compensation package.
