For a full list of publications, see Ken (Kezhi) Li and Zella King.

This survey synthesizes 55 studies on medical bias in large AI models, a rapidly growing but fragmented field spanning medicine, sociology, and computer science. It organizes the literature into three themes: bias taxonomy, detection methods, and mitigation strategies. The review identifies key challenges, including the absence of unified fairness principles, limited datasets and benchmarks, weak automatic bias-detection tools, lack of real-world and continuous validation, poor population representation, and minimal analysis of fairness-accuracy trade-offs. It also highlights emerging research opportunities and provides an indexed list of publicly available models and datasets to support future work.
Y. Xiao, Z. Chen, J. T. Huang, W. Chen, Y. Liu, K. Li, M. Mousavi, R. Dobson, J. Zhang

The study introduces PROMPT, an explainable machine-learning pipeline that predicts real-time 30-day mortality risk for critically ill children during inter-hospital transport to PICUs. Using continuous vital-sign time series, EHR data, and transport-specific information, PROMPT outperforms the conventional PIM3 score, with random forest and logistic regression models achieving AUROC up to 0.83. The system provides dynamic, patient-specific risk updates and SHAP-based explanations, helping clinicians identify deterioration during transport. This work demonstrates the feasibility of real-time, interpretable AI decision support in mobile paediatric critical care.
Z. Huo, J. Booth, T. Monks, P. Knight, L. Watson, M. Peters, C. Pagel, P. Ramnarayan, K. Li

State-of-the-art blood glucose prediction has been achieved by leveraging advanced deep learning methods to model multimodal data, i.e., sensor data and self-reported event data, organized as multivariate time series (MTS). However, these methods are mostly regarded as “black boxes” and are not entirely trusted by clinicians and patients. In this paper, we propose interpretable graph attentive recurrent neural networks (GARNNs) to model MTS. GARNNs explain variable contributions by summarizing variable importance and generating feature maps through graph attention mechanisms rather than post-hoc analysis.
C. Piao, T. Zhu, S. E. Baldeweg, P. Taylor, P. Georgiou, J. Sun, J. Wang, K. Li

This systematic review aims to critically assess the use of LLMs in mental health, specifically focusing on their applicability and efficacy in early screening, digital interventions, and clinical settings. By systematically collating and assessing the evidence from current studies, our work analyzes models, methodologies, data sources, and outcomes, thereby highlighting the potential of LLMs in mental health, the challenges they present, and the prospects for their clinical use.
Z. Guo, A. Lai, J.H. Thygesen, J. Farrington, T. Keen, K. Li

This study employed GPU-accelerated value iteration implemented with the Python library JAX to efficiently find optimal replenishment policies for perishable inventory problems with large state spaces. This approach can address complex scenarios with over 16 million states and additional features, demonstrating superior performance compared to heuristic policies with a maximum optimality gap of 2.49%. The methodology has the potential to extend the application of value iteration to a broader range of operational research problems by leveraging consumer-grade GPU hardware for large-scale parallel computation.
J. Farrington, K. Li, W. Wong, M. Utley
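
As a rough illustration of the value iteration technique named in the entry above, the sketch below solves a deliberately tiny, non-perishable inventory problem in plain Python. It is not the authors' JAX implementation: the function name, the cost parameters, and the demand distribution are all hypothetical, and the state space here has a handful of states rather than millions.

```python
import math
from typing import Dict, List, Tuple


def value_iteration(
    max_stock: int,
    max_order: int,
    demand_probs: Dict[int, float],
    price: float,
    order_cost: float,
    holding_cost: float,
    gamma: float = 0.95,
    tol: float = 1e-8,
) -> Tuple[List[float], List[int]]:
    """Return (values, policy) for a toy single-product inventory MDP.

    State  = units in stock at the start of the day (0..max_stock).
    Action = units ordered, arriving immediately (hypothetical simplification).
    """
    values = [0.0] * (max_stock + 1)
    policy = [0] * (max_stock + 1)
    while True:
        new_values = []
        for stock in range(max_stock + 1):
            best_q, best_a = -math.inf, 0
            # Orders may not push stock above capacity.
            for order in range(min(max_order, max_stock - stock) + 1):
                q = -order_cost * order
                available = stock + order
                for demand, prob in demand_probs.items():
                    sold = min(available, demand)
                    next_stock = available - sold
                    reward = price * sold - holding_cost * next_stock
                    # Bellman backup: immediate reward plus discounted value.
                    q += prob * (reward + gamma * values[next_stock])
                if q > best_q:
                    best_q, best_a = q, order
            new_values.append(best_q)
            policy[stock] = best_a
        delta = max(abs(n - o) for n, o in zip(new_values, values))
        values = new_values
        if delta < tol:
            return values, policy


if __name__ == "__main__":
    vals, pol = value_iteration(
        max_stock=5,
        max_order=3,
        demand_probs={0: 0.2, 1: 0.5, 2: 0.3},
        price=4.0,
        order_cost=1.0,
        holding_cost=0.5,
    )
    print("order quantity by stock level:", pol)
```

Each sweep over the states is independent across (state, action) pairs, which is the property that makes this kind of computation amenable to large-scale parallel execution on a GPU.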

Applying a mixed-effects logistic regression analysis to the outcomes of a six-week longitudinal study in 12 adults with type 1 diabetes (T1D) using CGM and a clinically validated wearable sensor wristband (NCT ID NCT03643692), we identified several significant associations between physiological measurements and hypo- and hyperglycemic events measured an hour later. We proceeded to develop a new smartphone-based platform, ARISES (Adaptive, Real-time, and Intelligent System to Enhance Self-care), with an embedded deep learning algorithm utilizing multi-modal data from CGM, daily entries of meal and bolus insulin, and the sensor wristband to predict glucose levels and hypo- and hyperglycemia.
T. Zhu, K. Li, P. Herrero, P. Georgiou

Machine learning for hospital operations is under-studied. We present a prediction pipeline that uses live electronic health-records for patients in a UK teaching hospital’s emergency department (ED) to generate short-term, probabilistic forecasts of emergency admissions. A set of XGBoost classifiers applied to 109,465 ED visits yielded AUROCs from 0.82 to 0.90 depending on elapsed visit-time at the point of prediction. Patient-level probabilities of admission were aggregated to forecast the number of admissions among current ED patients and, incorporating patients yet to arrive, total emergency admissions within specified time-windows.
Z. King, J. Farrington, M. Utley, E. Kung, S. Elkhodair, S. Harris, R. Sekula, J. Gillham, K. Li, S. Crowe
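
One simple way to turn patient-level admission probabilities, as described in the entry above, into a forecast of the number of admissions is to treat each patient as an independent Bernoulli trial and compute the resulting Poisson-binomial distribution. The sketch below does this by repeated convolution. The independence assumption, the function names, and the example probabilities are illustrative only and are not taken from the paper; patients yet to arrive are not modelled.

```python
from typing import List


def admissions_distribution(probs: List[float]) -> List[float]:
    """Return P(exactly k admissions) for k = 0..len(probs).

    Assumes each patient is admitted independently with the given
    probability (a Poisson-binomial distribution).
    """
    dist = [1.0]  # with no patients, zero admissions is certain
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            new[k] += mass * (1.0 - p)  # this patient is not admitted
            new[k + 1] += mass * p      # this patient is admitted
        dist = new
    return dist


def prob_at_least(dist: List[float], n: int) -> float:
    """Probability of n or more admissions under the given distribution."""
    return sum(dist[n:])


if __name__ == "__main__":
    # Hypothetical classifier outputs for five patients currently in the ED.
    patient_probs = [0.9, 0.7, 0.4, 0.2, 0.05]
    dist = admissions_distribution(patient_probs)
    for k, mass in enumerate(dist):
        print(f"P({k} admissions) = {mass:.3f}")
    print(f"P(at least 3 admissions) = {prob_at_least(dist, 3):.3f}")
```

Reporting the full distribution, rather than only the expected count, lets a bed manager read off statements such as "there is at least an 80% chance of needing n beds".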

We propose a novel deep learning framework, Fast-adaptive and Confident Neural Network (FCNN). An attention-based recurrent neural network is used to learn representations from continuous glucose monitoring input and forward a weighted sum of hidden states to an evidential output layer, aiming to compute personalized blood glucose (BG) predictions with theoretically supported model confidence. Model-agnostic meta-learning is employed to enable fast adaptation for a new T1D subject with limited training data.
T. Zhu, K. Li, P. Herrero, P. Georgiou
IEEE Transactions on Biomedical Engineering (2021)

Platelets are a blood component required for the formation of clots. Patients may receive a platelet transfusion to either stop or prevent bleeding. Hospital blood banks hold stock on-site to fill orders for patients, and place orders for replenishment with a regional blood bank that serves multiple hospitals. This stock management process is complicated by the need for compatibility between donor units and patients, and by the short shelf-life of platelets. We created a reinforcement learning environment based on a platelet replenishment scenario from the literature and demonstrate that deep reinforcement learning (DRL) methods can be used to learn effective ordering policies.
J. Farrington, K. Li, M. Utley, W. K. Wong
January 2022, UK CDT_conference

In this first-of-its-kind study, researchers from UCL combined multiple NHS datasets on national laboratory testing data, primary care consultations, hospitalisations and deaths to reveal the exact trajectory of individuals through the healthcare system during the pandemic, and what impact this had on their health outcomes.
Johan H Thygesen, Chris Tomlinson, Sam Hollings, Mehrdad Mizani, Alex Handy, Ashley Akbari, Amitava Banerjee, Jennifer Cooper, Alvina Lai, Ken Li, Bilal Mateen, Naveed Sattar, Reecha Sofat, Ana Torralbo, Honghan Wu, Angela Wood, Jonathan A C Sterne, Christina Pagel, William Whiteley, Cathie Sudlow, Harry Hemingway, Spiros Denaxas, on behalf of the CVD-COVID-UK Consortium

This challenge aimed to establish whether a single validated measure of microcirculatory perfusion (microcirculatory flow index) can be predicted directly from a DFM video sequence, without intermediate manual analysis steps. Automatic analysis that can be carried out in (near) real-time would facilitate the incorporation of microcirculatory targets into clinical trials by enabling the impact of interventions to be quantified and acted upon, with the aim of optimising the microcirculation and improving patient outcomes.
Chris Tomlinson, Jan Grols, Ramit Debnath, Sarah Johnson, Max Barton, Tianyu Han, Seyedeh Nazanin Khatami, Giacomo Baldo, Aniketh Ramesh, Diego Cammarano, Kashif Rajpoot
Turing Data Study Group Final Report (2021)

In this study, we design and build the first DL and ML pipeline that uses the routinely updated, linked EHR data for 56 million people in England accessed via NHS Digital to predict first ischaemic stroke in people with AF, and as a secondary outcome, COVID-19 death. Our pipeline improves first stroke prediction in AF by 17% compared to CHA2DS2-VASc (area under the receiver operating characteristic curve, with 95% confidence interval: 0.61 (0.57-0.65) vs 0.52 (0.52-0.52)) and provides a generalisable, open-source framework that other researchers and developers can build on.
Alex Handy, Angela Wood, Cathie Sudlow, Chris Tomlinson, Frank Kee, Johan H Thygesen, Mohammad Mamouei, Reecha Sofat, Richard Dobson, Samantha Ip, Spiros Denaxas, on behalf of the CVD-COVID-UK Consortium