Agrawal, R. & Prabakaran, S. Big data in digital healthcare: lessons learnt and recommendations for general practice. Heredity 124, 525–534 (2020).
Shilo, S., Rossman, H. & Segal, E. Axes of a revolution: challenges and promises of big data in healthcare. Nat. Med. 26, 29–38 (2020).
Woldemariam, M. T. & Jimma, W. Adoption of electronic health record systems to enhance the quality of healthcare in low-income countries: a systematic review. BMJ Health Care Inform. 30, e100704 (2023).
Liang, H. et al. Evaluation and accurate diagnoses of pediatric diseases using artificial intelligence. Nat. Med. 25, 433–438 (2019).
Xu, H. et al. A whole-slide foundation model for digital pathology from real-world data. Nature 630, 181–188 (2024).
Feinberg, D. A. et al. Next-generation MRI scanner designed for ultra-high-resolution human brain imaging at 7 Tesla. Nat. Methods 20, 2048–2057 (2023).
Schuijf, J. D. et al. CT imaging with ultra-high-resolution: opportunities for cardiovascular imaging in clinical practice. J. Cardiovasc. Comput. Tomogr. 16, 388–396 (2022).
Goodwin, S., McPherson, J. D. & McCombie, W. R. Coming of age: ten years of next-generation sequencing technologies. Nat. Rev. Genet. 17, 333–351 (2016).
Metzker, M. L. Sequencing technologies—the next generation. Nat. Rev. Genet. 11, 31–46 (2010).
Karczewski, K. J. & Snyder, M. P. Integrative omics for health and disease. Nat. Rev. Genet. 19, 299–310 (2018).
Van de Sande, B. et al. Applications of single-cell RNA sequencing in drug discovery and development. Nat. Rev. Drug Discov. 22, 496–520 (2023).
Wratten, L., Wilm, A. & Göke, J. Reproducible, scalable, and shareable analysis pipelines with bioinformatics workflow managers. Nat. Methods 18, 1161–1168 (2021).
Cao, Y. et al. Ensemble deep learning in bioinformatics. Nat. Mach. Intell. 2, 500–508 (2020).
Dubay, C. et al. Delivering bioinformatics training: bridging the gaps between computer science and biomedicine. Proc. AMIA Symp. 2002, 220–224 (2002).
Elmarakeby, H. A. et al. Biologically informed deep neural network for prostate cancer discovery. Nature 598, 348–352 (2021).
Li, J. et al. Proteomics and bioinformatics approaches for identification of serum biomarkers to detect breast cancer. Clin. Chem. 48, 1296–1304 (2002).
Sadybekov, A. V. & Katritch, V. Computational approaches streamlining drug discovery. Nature 616, 673–685 (2023).
Fitzgerald, R. C. et al. The future of early cancer detection. Nat. Med. 28, 666–677 (2022).
McDonald, T. O. et al. Computational approaches to modelling and optimizing cancer treatment. Nat. Rev. Bioeng. 1, 695–711 (2023).
Misra, B. B. et al. Integrated omics: tools, advances, and future approaches. J. Mol. Endocrinol. 62, R21–R45 (2019).
Brooks, T. G. et al. Challenges and best practices in omics benchmarking. Nat. Rev. Genet. 25, 326–339 (2024).
Di Tommaso, P. et al. Nextflow enables reproducible computational workflows. Nat. Biotechnol. 35, 316–319 (2017).
Goecks, J. et al. Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biol. 11, 1–13 (2010).
Malhotra, R. et al. Using the Seven Bridges Cancer Genomics Cloud to access and analyze petabytes of cancer data. Curr. Protoc. Bioinformatics 60, 11.16.1–11.16.32 (2017).
Kaur, S. & Kaur, S. Genomics with cloud computing. Int. J. Sci. Technol. Res. 4, 146–148 (2015).
Wu, T. et al. A brief overview of ChatGPT: the history, status quo and potential future development. IEEE/CAA J. Autom. Sin. 10, 1122–1136 (2023).
Brown, T. B. et al. Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 33, 1877–1901 (2020).
Theodoris, C. V. et al. Transfer learning enables predictions in network biology. Nature 618, 616–624 (2023).
Hao, M. et al. Large-scale foundation model on single-cell transcriptomics. Nat. Methods 21, 1481–1491 (2024).
Singhal, K. et al. Large language models encode clinical knowledge. Nature 620, 172–180 (2023).
Tang, X. et al. MedAgents: large language models as collaborators for zero-shot medical reasoning. Find. Assoc. Comput. Linguist. ACL 2024, 599–621 (2024).
Guo, T. et al. Large language model based multi-agents: a survey of progress and challenges. Proc. Int. Joint Conf. Artif. Intell. 33, 8048–8057 (2024).
Wang, H. et al. Scientific discovery in the age of artificial intelligence. Nature 620, 47–60 (2023).
Gao, S. et al. Empowering biomedical discovery with AI agents. Cell 187, 6125–6151 (2024).
Boiko, D. A. et al. Autonomous chemical research with large language models. Nature 624, 570–578 (2023).
Dai, T. et al. Autonomous mobile robots for exploratory synthetic chemistry. Nature 635, 890–897 (2024).
Hou, W. & Ji, Z. Assessing GPT-4 for cell type annotation in single-cell RNA-seq analysis. Nat. Methods 21, 1462–1465 (2024).
Lobentanzer, S. et al. A platform for the biomedical application of large language models. Nat. Biotechnol. 43, 166–169 (2025).
Tayebi Arasteh, S. et al. Large language models streamline automated machine learning for clinical studies. Nat. Commun. 15, 1603 (2024).
Zhou, J. et al. An AI agent for fully automated multi-omic analyses. Adv. Sci. 11, e2407094 (2024).
Xiao, Y. et al. CellAgent: an LLM-driven multi-agent framework for automated single-cell data analysis. Preprint at bioRxiv https://doi.org/10.1101/2024.05.13.593861 (2024).
Liu, H. & Wang, H. GenoTEX: a benchmark for evaluating LLM-based exploration of gene expression data in alignment with bioinformaticians. Preprint at arXiv https://doi.org/10.48550/arXiv.2406.15341 (2024).
Mitchener, L. et al. BixBench: a comprehensive benchmark for LLM-based agents in computational biology. Preprint at arXiv https://doi.org/10.48550/arXiv.2503.00096 (2025).
Gómez-López, G. et al. Precision medicine needs pioneering clinical bioinformaticians. Brief. Bioinform. 20, 752–766 (2019).
Hou, X. et al. Large language models for software engineering: a systematic literature review. ACM Trans. Softw. Eng. Methodol. (2023).
Xin, Q. et al. BioInformatics Agent (BIA): unleashing the power of large language models to reshape bioinformatics workflow. Preprint at bioRxiv https://doi.org/10.1101/2024.05.22.595240 (2024).
Su, H., Long, W. & Zhang, Y. BioMaster: multi-agent system for automated bioinformatics analysis workflow. Preprint at bioRxiv https://doi.org/10.1101/2025.01.23.634608 (2025).
Afzal, M. et al. Precision medicine informatics: principles, prospects, and challenges. IEEE Access 8, 13593–13612 (2020).
Zhang, W. & Mei, H. A constructive model for collective intelligence. Natl Sci. Rev. 7, 1273–1277 (2020).
Qian, C. et al. Iterative experience refinement of software-developing agents. Preprint at arXiv https://doi.org/10.48550/arXiv.2405.04219 (2024).
Riffle, D. et al. OLAF: an open life science analysis framework for conversational bioinformatics powered by large language models. Preprint at arXiv https://doi.org/10.48550/arXiv.2504.03976 (2025).
Xie, E. et al. CASSIA: a multi-agent large language model for automated and interpretable cell annotation. Nat. Commun. 17, 389 (2025).
Tsyganov, M. M. et al. Influence of DNA copy number aberrations in ABC transporter family genes on the survival of patients with primary operatable non-small cell lung cancer. Curr. Cancer Drug Targets (2025).
Sun, Y. et al. SERINC2-mediated serine metabolism promotes cervical cancer progression and drives T cell exhaustion. Int. J. Biol. Sci. 21, 1361–1377 (2025).
Wang, X., Jiang, C. & Li, Q. Serinc2 drives the progression of cervical cancer through regulating Myc pathway. Cancer Med. 13, e70296 (2024).
Lee, J. S. et al. SEZ6L2 is an important regulator of drug-resistant cells and tumor spheroid cells in lung adenocarcinoma. Biomedicines 8 (2020).
Ishikawa, N. et al. Characterization of SEZ6L2 cell-surface protein as a novel prognostic marker for lung cancer. Cancer Sci. 97, 737–745 (2006).
Jee, J. et al. DNA liquid biopsy-based prediction of cancer-associated venous thromboembolism. Nat. Med. 30, 2499–2507 (2024).
Xu, Z. et al. MiHATP: a multi-hybrid attention super-resolution network for pathological image based on transformation pool contrastive learning. In International Conference on Medical Image Computing and Computer-Assisted Intervention (Springer, 2024).
Yang, K. et al. If LLM is the wizard, then code is the wand: a survey on how code empowers large language models to serve as intelligent agents. Preprint at arXiv https://doi.org/10.48550/arXiv.2401.00812 (2024).
Qian, C. et al. Investigate-consolidate-exploit: a general strategy for inter-task agent self-evolution. Preprint at arXiv https://doi.org/10.48550/arXiv.2401.13996 (2024).
Gao, C. et al. Large language models empowered agent-based modeling and simulation: a survey and perspectives. Humanit. Soc. Sci. Commun. 11, 1–24 (2024).
Zhong, W. et al. MemoryBank: enhancing large language models with long-term memory. Proc. AAAI Conf. Artif. Intell. 38, 19724–19731 (2024).
Liu, L. et al. Think-in-memory: recalling and post-thinking enable LLMs with long-term memory. Preprint at arXiv https://doi.org/10.48550/arXiv.2311.08719 (2023).
Li, Z. et al. VarBen: generating in silico reference data sets for clinical next-generation sequencing bioinformatics pipeline evaluation. J. Mol. Diagn. 23, 285–299 (2021).
Pendleton, M. et al. Assembly and diploid architecture of an individual human genome via single-molecule technologies. Nat. Methods 12, 780–786 (2015).
Hao, Y. et al. Dictionary learning for integrative, multimodal and scalable single-cell analysis. Nat. Biotechnol. 42, 293–304 (2024).
Yang, Y. et al. Comprehensive landscape of resistance mechanisms for neoadjuvant therapy in esophageal squamous cell carcinoma by single-cell transcriptomics. Signal Transduct. Target. Ther. 8, 298 (2023).
Wolf, F. A., Angerer, P. & Theis, F. J. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol. 19, 1–5 (2018).
Bu, D. et al. KOBAS-i: intelligent prioritization and exploratory visualization of biological functions for gene enrichment analysis. Nucleic Acids Res. 49, W317–W325 (2021).
Jiang, X. et al. Long term memory: the foundation of AI self-evolution. Preprint at arXiv https://doi.org/10.48550/arXiv.2410.15665 (2024).
Sun, J. Biomedical dataset files collection. Zenodo https://doi.org/10.5281/zenodo.17430550 (2025).
