Our Projects
Clinical Language Models
Large language models (LLMs) have been successfully applied to a diverse set of tasks, ranging from coding assistants to robotics. One area of keen interest is healthcare, where LLMs have the potential to assist care providers with administrative workflows and to provide clinical decision support. Accordingly, the past two years have seen significant research both on understanding the medical capabilities of LLMs and on developing LLMs trained on medical text. Generalist LLMs like GPT-4 and PaLM exhibit medical knowledge that can be elicited via prompt engineering or minimal finetuning. While this approach has often yielded state-of-the-art (SOTA) performance, training medical domain-specific LLMs has also shown significant promise, and it remains an open question which approach is superior.
We are interested in studying this question in detail and in developing the best open LLMs for medical tasks. We believe the release of domain-specialized open-source LLMs is crucial for practical medical applications of the latest LLM advances. While there has been work in this direction, such as BioMedLM, Clinical Camel, BioMistral, and Meditron, these open LLMs all remain inferior to closed models like Med-PaLM 2. We aim to build open LLMs that are on par with or superior to SOTA closed LLMs like Med-PaLM 2, and we plan to share model weights, methodology, and results transparently.
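To illustrate the prompt-engineering approach mentioned above, here is a minimal sketch of eliciting medical knowledge from a generalist open LLM with a few-shot-style multiple-choice prompt. The model checkpoint and the example question are illustrative placeholders, not part of our actual evaluation harness.

```python
# Sketch: probing a generalist open LLM with a MedQA-style prompt.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "mistralai/Mistral-7B-Instruct-v0.2"  # placeholder generalist LLM

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map="auto")

# USMLE-style multiple-choice question (illustrative, not from a benchmark).
prompt = (
    "You are a careful clinician. Answer with a single letter.\n\n"
    "Question: A 55-year-old man presents with crushing substernal chest "
    "pain radiating to the left arm. Which is the most appropriate initial "
    "test?\n"
    "(A) Chest X-ray (B) ECG (C) D-dimer (D) Echocardiogram\n"
    "Answer:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=4, do_sample=False)
# Print only the newly generated tokens, i.e. the model's answer.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```

Systematically scoring such completions against benchmark answer keys is one way to compare prompted generalist models with finetuned medical LLMs.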
Real-time Reconstructions of Visual Perception from fMRI
Recent advances in machine learning and computational neuroscience now allow neuroscientists to reconstruct visual perception from human brain activations, with the caveat that such models run offline and require large amounts of training data collected from the same individual, rendering them infeasible for clinical use.
This collaboration aims to develop a real-time system that reconstructs visual perception from human brain activations in a single brain-imaging session. Such a system would allow researchers to quickly peer into a patient's internal mental experience and could be used in closed-loop designs for brain-computer interfaces.
We are utilizing generative AI techniques to improve on state-of-the-art offline fMRI-to-image reconstruction of natural scenes and human faces. We will then use active learning to enable rapid data collection and real-time fMRI-to-image reconstruction.
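As a rough illustration of how active learning can speed up data collection, the sketch below shows one plausible design (our assumption, not the deployed system): from a candidate pool of stimuli, present next the image on which an ensemble of decoders disagrees most, so each fMRI trial is maximally informative. All array shapes and names here are hypothetical.

```python
# Sketch: uncertainty-driven stimulus selection for active learning.
import numpy as np

rng = np.random.default_rng(0)
n_models, n_candidates, embed_dim = 4, 500, 64

# Hypothetical predicted image embeddings from an ensemble of decoders,
# one prediction per (ensemble member, candidate stimulus).
predictions = rng.normal(size=(n_models, n_candidates, embed_dim))

# Disagreement = variance across ensemble members, averaged over embedding
# dimensions, giving one uncertainty score per candidate stimulus.
disagreement = predictions.var(axis=0).mean(axis=-1)

# Present the most uncertain stimulus next; after collecting the fMRI
# response, retrain the decoders and repeat.
next_stimulus = int(np.argmax(disagreement))
print(f"Next stimulus to present: candidate #{next_stimulus}")
```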
Training Foundation Models for Radiology
Multimodal models trained on large natural image-text pair datasets have exhibited astounding abilities, from image-text understanding (CLIP) to high-quality image generation (DALL-E 2 and, of course, Stable Diffusion). However, medical imaging data fundamentally differs from natural images.
The language used to capture relevant details in medical data draws on a narrow but semantically rich, domain-specific vocabulary. Not surprisingly, multimodal models trained on natural image-text pairs tend to generalize poorly to the medical domain.
For this reason, we have trained diffusion models and vision-language models on publicly available chest X-rays and corresponding prompts and instructions. Our chest X-ray diffusion model generates high-fidelity, diverse synthetic X-rays conditioned on text prompts, while our vision-language model achieves state-of-the-art results on chest X-ray interpretation tasks.
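For a sense of how such a text-conditioned diffusion model is used, here is a minimal sampling sketch with Hugging Face diffusers. The checkpoint name is a hypothetical placeholder for a Stable Diffusion model fine-tuned on X-ray/report pairs, not our released weights.

```python
# Sketch: sampling a synthetic chest X-ray from a text-conditioned
# diffusion model via the diffusers library.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "org/chest-xray-diffusion",  # hypothetical fine-tuned checkpoint
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    prompt="Frontal chest X-ray with a small right-sided pleural effusion",
    num_inference_steps=50,   # denoising steps
    guidance_scale=7.5,       # strength of text conditioning
).images[0]
image.save("synthetic_cxr.png")
```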