Google's Med-Gemini AI surpasses GPT-4 in healthcare

Danny de Kruijk
Product Lead
May 3, 2024

On Monday, Google and DeepMind released a paper on Med-Gemini, a group of advanced AI models focused on healthcare applications. The models are still in the research phase, but the authors claim that Med-Gemini outperforms competing models such as OpenAI's GPT-4. OpenAI, however, is not lagging behind in the medical field: it recently expanded its partnership with Moderna, a major pharmaceutical company.

Med-Gemini's remarkable leap forward, if it holds up in real-world settings, is its ability to capture context and temporality: understanding the background and setting of symptoms, as well as the timing and order of their onset. This has been a familiar stumbling block for existing health-related AI models.

It is true that we doctors are notorious for our abbreviations and lack of uniformity in documentation. Nevertheless, the true challenge in training medical algorithms is not textual complexity but contextual complexity. A simple example is a situation that every parent of a toddler knows well: visiting a pediatrician for your youngster's fever and rash. The doctor will always ask: which came first, the fever or the rash? Did it spread from the head down or from the legs up? These simple features can distinguish a mild, self-limiting illness, such as roseola, from a potentially life-threatening one, such as meningococcal meningitis.

These seemingly simple questions, with their multidimensionality and time-series features, can completely throw an AI model off with the slightest inaccuracy. Med-Gemini appears to address this contextuality by moving away from the massive undertaking of building a single, comprehensive medical model. Instead, Google's developers have adopted a vertical-by-vertical approach, building a "family" of related models, each optimized for a specific medical domain or scenario: image analysis in radiology and pathology, signal interpretation such as deciphering electrocardiogram exams, or long-context understanding such as reading lengthy medical records.

According to the researchers, this has led to improved, more nuanced accuracy and more transparent reasoning, offering interpretable feedback such as why a proposed diagnosis is the most likely. Since doctors are expected to stay up to date with recent research, Med-Gemini appears to be held to the same standard: the new model includes a significant additional layer, a web-based search for current information, allowing its data to be supplemented with external knowledge by integrating online results into the model's responses.

Although Med-Gemini was trained and evaluated on various data sources, including excerpts from health records, x-rays, skin lesion photos, medical exam preparation questions, and others, it is important to remember what still needs to happen: real-world validation on actual production-level data in an everyday clinical setting, or at least a prospective, double-blind, randomized clinical trial.

Here is the link to the paper: Med-Gemini Paper