CHAPTER 5: Science and Medicine
Artificial Intelligence Index Report 2024
Preview

Overview
Chapter Highlights
5.1 Notable Scientific Milestones
    AlphaDev
    FlexiCubes
    Synbot
    GraphCast
    GNoME
    Flood Forecasting
5.2
    Notable Medical Systems
        SynthSR
        Coupled Plasmonic Infrared Sensors
        EVEscape
        AlphaMissense
        Human Pangenome Reference
    Clinical Knowledge
        MedQA
        Highlighted Research: GPT-4 Medprompt
        Highlighted Research: MediTron-70B
    Diagnosis
        Highlighted Research: CoDoC
        Highlighted Research: CT Panda
        Other Diagnostic Uses
        FDA-Approved AI-Related Medical Devices
    Administration and Care
        Highlighted Research: MedAlign
Appendix
                      Overview
This year's AI Index introduces a new chapter on AI in science and medicine in recognition of AI's growing role in scientific and medical discovery. It explores 2023's standout AI-facilitated scientific achievements, including advanced weather forecasting systems like GraphCast and improved material discovery algorithms like GNoME. The chapter also examines medical AI system performance, important 2023 AI-driven medical innovations like SynthSR and ImmunoSEIRA, and trends in FDA approval of AI-related medical devices.
                     Chapter Highlights
1. Scientific progress accelerates even further, thanks to AI. In 2022, AI began to advance scientific discovery. 2023, however, saw the launch of even more significant science-related AI applications, from AlphaDev, which makes algorithmic sorting more efficient, to GNoME, which facilitates the process of materials discovery.

2. AI helps medicine take significant strides forward. In 2023, several significant medical systems were launched, including EVEscape, which enhances pandemic prediction, and AlphaMissense, which assists in AI-driven mutation classification. AI is increasingly being utilized to propel medical advancements.

3. Highly knowledgeable medical AI has arrived. Over the past few years, AI systems have shown remarkable improvement on the MedQA benchmark, a key test for assessing AI's clinical knowledge. The standout model of 2023, GPT-4 Medprompt, reached an accuracy rate of 90.2%, marking a 22.6 percentage point increase from the highest score in 2022. Since the benchmark's introduction in 2019, AI performance on MedQA has nearly tripled.

4. The FDA approves more and more AI-related medical devices. In 2022, the FDA approved 139 AI-related medical devices, a 12.1% increase from 2021. Since 2012, the number of FDA-approved AI-related medical devices has increased by more than 45-fold. AI is increasingly being used for real-world medical purposes.
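
The highlights above mix two kinds of change: a percentage point gain (MedQA) and a relative percent gain (FDA approvals). The short sketch below backs out the prior-year values each figure implies. The function names are illustrative, and the derived numbers are simple arithmetic on the reported figures rather than values stated in the report.

```python
def previous_from_point_gain(current_pct, gain_points):
    """Prior value when the change is given in percentage points."""
    return current_pct - gain_points


def previous_from_percent_gain(current_count, percent_gain):
    """Prior value when the change is given as a relative percent increase."""
    return current_count / (1 + percent_gain / 100)


# MedQA: 90.2% in 2023 after a 22.6 percentage point increase.
implied_top_score_2022 = previous_from_point_gain(90.2, 22.6)

# FDA: 139 devices in 2022 after a 12.1% increase from 2021.
implied_devices_2021 = previous_from_percent_gain(139, 12.1)

print(f"Implied top MedQA score in 2022: {implied_top_score_2022:.1f}%")
print(f"Implied FDA approvals in 2021: about {round(implied_devices_2021)}")
```

A percentage point gain is subtracted directly, while a relative gain has to be divided out, which is why the two helpers differ.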
5.1 Notable Scientific Milestones

This section highlights significant AI-related scientific breakthroughs of 2023 as chosen by the AI Index Steering Committee.
AlphaDev
AlphaDev discovers faster sorting algorithms

AlphaDev is a new AI reinforcement learning system that has improved on decades of work by scientists and engineers in the field of computational algorithmic enhancement. AlphaDev developed algorithms with fewer instructions than existing human benchmarks for fundamental sorting algorithms on short sequences such as Sort 3, Sort 4, and Sort 5 (Figure 5.1.1). Some of the new algorithms discovered by AlphaDev have been incorporated into the LLVM standard C++ sort library. This marks the first update to this part of the library in over 10 years and is the first addition designed using reinforcement learning.
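
To make "sorting algorithms on short sequences" concrete, here is a minimal sketch of a fixed three-element sort built from compare-exchange steps (a textbook sorting network). It is not AlphaDev's discovered algorithm, which operates at the assembly-instruction level; the function names are illustrative only.

```python
from itertools import permutations


def compare_exchange(values, i, j):
    """Swap values[i] and values[j] in place if they are out of order."""
    if values[i] > values[j]:
        values[i], values[j] = values[j], values[i]


def sort3(a, b, c):
    """Sort three items with a fixed sequence of three compare-exchange steps."""
    values = [a, b, c]
    compare_exchange(values, 0, 1)  # order the first pair
    compare_exchange(values, 1, 2)  # move the largest item to the end
    compare_exchange(values, 0, 1)  # order the remaining pair
    return tuple(values)


# Check the routine against every ordering of three distinct items.
all_sorted = all(sort3(*p) == (1, 2, 3) for p in permutations((1, 2, 3)))
print(all_sorted)
```

Because the sequence of comparisons is fixed regardless of the input, routines like this are short and branch-light, which is why shaving even one instruction from them matters for a widely used library.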
AlphaDev vs. human benchmarks when optimizing for algorithm length
Source: Mankowitz et al., 2023 | Chart: 2024 AI Index report

Algorithm    AlphaDev    Human benchmarks
Sort 3       17          18
Sort 4       28          28
Sort 5       42          46
VarSort3     21          33
VarSort4     37          66
VarSort5     63          115
VarInt       27          31

Figure 5.1.1
FlexiCubes
3D mesh optimization with FlexiCubes

3D mesh generation, crucial in computer graphics, involves creating a mesh of vertices, edges, and faces to define 3D objects. It is key to video games, animation, medical imaging, and scientific visualization. Traditional isosurface extraction algorithms often struggle with limited resolution, structural rigidity, and numerical instabilities, which subsequently impacts quality. FlexiCubes addresses some of these limitations by employing AI for gradient-based optimization and adaptable parameters (Figure 5.1.2) [...] mesh adjustments. Compared to other leading methods that utilize differentiable isosurfacing for mesh reconstruction, FlexiCubes achieves mesh extractions that align much more closely with the underlying ground truth (Figure 5.1.3).
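
As a small illustration of what "a mesh of vertices, edges, and faces" means as a data structure, the sketch below stores a tetrahedron as vertex coordinates plus triangular faces and derives the edges from the faces. It is a generic example, unrelated to how FlexiCubes itself extracts or optimizes meshes, and all names are illustrative.

```python
from itertools import combinations

# A tetrahedron: four 3D vertices and four triangular faces (vertex indices).
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]


def edges_from_faces(faces):
    """Collect the unique undirected edges implied by a list of triangles."""
    edges = set()
    for face in faces:
        for a, b in combinations(sorted(face), 2):
            edges.add((a, b))
    return sorted(edges)


def euler_characteristic(vertices, faces):
    """V - E + F, which equals 2 for a closed surface without holes."""
    return len(vertices) - len(edges_from_faces(faces)) + len(faces)


print(edges_from_faces(faces))
print(euler_characteristic(vertices, faces))
```

The "15k tris" and "91k tris" labels in Figure 5.1.2 count faces in exactly this sense: each reconstruction is a list of triangles over a shared vertex set.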
Sample FlexiCubes surface reconstructions
Source: Nvidia, 2023

[Image panels: Marching Cubes (15k tris), DMTet (15k tris), FlexiCubes (13k tris), Reference (91k tris). Applications shown: 3D reconstruction from images, generative 3D modeling, animated 3D reconstruction, tet-mesh physics simulation, adaptive meshing, developability.]

Figure 5.1.2
Select quantitative results on 3D mesh reconstruction
Source: Shen et al., 2023 | Chart: 2024 AI Index report

Algorithm/method (evaluated at 64^3)    Value
MC_SDF                                  80.67%
DC_Hermite                              63.34%
NDC_SDF                                 55.22%
MC                                      52.37%
DMTet(64)                               50.20%
DMTet(80)                               48.66%
FlexiCubes                              34.87%

Figure 5.1.3
Synbot
AI-driven robotic chemist for synthesizing organic molecules

Synbot employs a multilayered system, comprising an AI software layer for chemical synthesis planning, a robot software layer for translating commands, and a physical robot layer for conducting experiments. The closed-loop feedback mechanism between the AI and the robotic system enables Synbot to develop synthetic recipes with yields equal to or exceeding established references (Figure 5.1.4). In an experiment aimed at synthesizing M1 [4-(2,3-dimethoxyphenyl)-1H-pyrrolo[2,3-b]pyridine], Synbot developed multiple synthetic formulas that achieved conversion yields surpassing the mid-80% reference range and completed the synthesis in significantly less time (Figure 5.1.5). Synbot's automation of organic synthesis highlights AI's potential in fields such as pharmaceuticals and materials science.

Synbot design
Source: Ha et al., 2023

[Diagram. Input: target molecule and task. Output: synthetic recipe and material. AI S/W layer: retrosynthesis, DoE and optimization, decision-making, database. Robot S/W layer: recipe generation, recipe translation, online scheduling. Robot layer: pantry, dispensing, reaction, sample-prep., analysis, transfer-robot. Flows between layers: synthesis recipes, robot commands, experimental results.]

Figure 5.1.4
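
The closed-loop feedback idea described above can be sketched as a plan, run, observe cycle. Everything in this toy example is an assumption made for illustration: `run_experiment` is a made-up yield curve standing in for the robot layer, temperature is an arbitrary single condition to tune, and the 85% target simply mirrors the reference level in Figure 5.1.5. It does not reflect Synbot's actual planning or optimization methods.

```python
def run_experiment(temperature_c):
    """Hypothetical stand-in for the robot layer: returns a simulated yield (%)."""
    return max(0.0, 90.0 - 0.02 * (temperature_c - 80.0) ** 2)


def run_closed_loop(reference_yield=85.0, start_temperature_c=40.0,
                    step_c=16.0, max_rounds=20):
    """Propose conditions, run them, and feed results back until the target is met."""
    best_temperature = start_temperature_c
    best_yield = run_experiment(best_temperature)
    history = [(best_temperature, best_yield)]
    for _ in range(max_rounds):
        if best_yield >= reference_yield:
            break  # target reached, stop experimenting
        # Planner: propose conditions on either side of the current best.
        candidates = [best_temperature - step_c, best_temperature + step_c]
        # Robot: run each proposed experiment and report the results back.
        results = [(t, run_experiment(t)) for t in candidates]
        history.extend(results)
        temperature, observed = max(results, key=lambda r: r[1])
        if observed > best_yield:
            best_temperature, best_yield = temperature, observed
        else:
            step_c /= 2  # no improvement: search closer to the current best
    return best_temperature, best_yield, history


best_temperature, best_yield, history = run_closed_loop()
print(f"Stopped at {best_temperature} C with simulated yield {best_yield:.2f}%")
print(f"Experiments run: {len(history)}")
```

The point of the loop is that each round's measurements decide the next round's proposals, so no human has to choose conditions between experiments.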
Reaction kinetics of M1 autonomous optimization experiment, Synbot vs. reference
Source: Ha et al., 2023 | Chart: 2024 AI Index report

[Line chart. x-axis: time (hours), 0 to 24. y-axis: 0% to 100%. Reference line labeled at 85%.]

Figure 5.1.5
GraphCast
More accurate global weather forecasting with GraphCast

GraphCast is a new weather forecasting system that delivers highly accurate 10-day weather predictions in under a minute (Figure 5.1.6). Utilizing graph neural networks and machine learning, GraphCast processes vast datasets to forecast temperature, wind speed, atmospheric conditions, and more. Figure 5.1.7 compares the performance of GraphCast with the current industry state-of-the-art weather simulation system: the High Resolution Forecast (HRES). GraphCast posts a lower root mean squared error, meaning its forecasts more closely correspond to observed weather patterns. GraphCast can be a valuable tool in deciphering weather patterns, enhancing preparedness for extreme weather events, and contributing to global climate research.
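
For readers unfamiliar with the metric, root mean squared error is the square root of the average squared difference between forecast and observed values, so lower is better. The sketch below computes it for two made-up forecasts; the numbers are illustrative placeholders, not GraphCast or HRES data.

```python
import math


def rmse(forecast, observed):
    """Root mean squared error between two equal-length sequences of numbers."""
    if len(forecast) != len(observed) or not forecast:
        raise ValueError("inputs must be non-empty and the same length")
    squared_errors = [(f - o) ** 2 for f, o in zip(forecast, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical values for illustration only.
observed = [10.0, 12.0, 11.0, 9.0]
forecast_a = [10.5, 11.5, 11.0, 9.5]
forecast_b = [12.0, 10.0, 13.0, 7.0]

print(f"Forecast A RMSE: {rmse(forecast_a, observed):.3f}")
print(f"Forecast B RMSE: {rmse(forecast_b, observed):.3f}")
```

Because errors are squared before averaging, a few large misses raise RMSE more than many small ones.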
GraphCast weather prediction
Source: DeepMind, 2023

[Image panels: a) Input weather state, b) Predict the next state, c) Roll out a forecast.]

Figure 5.1.6
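
The three panels of Figure 5.1.6 describe an autoregressive rollout: each predicted state is fed back in as the input for the next prediction. A minimal sketch of that loop follows. Here `toy_step` is a made-up one-line rule standing in for a learned model, and the state layout is an assumption chosen for illustration.

```python
def rollout(initial_state, step_fn, num_steps):
    """Apply step_fn repeatedly, feeding each prediction back in as the next input."""
    states = [initial_state]
    for _ in range(num_steps):
        states.append(step_fn(states[-1]))
    return states[1:]  # predictions only, without the initial state


def toy_step(state):
    """Hypothetical one-step model: halves a temperature anomaly each step."""
    return {"temperature_anomaly": 0.5 * state["temperature_anomaly"]}


forecast = rollout({"temperature_anomaly": 8.0}, toy_step, num_steps=3)
print([s["temperature_anomaly"] for s in forecast])
```

One consequence of this design is that errors compound: any mistake in an early step becomes part of the input to every later step, which is why skill is usually reported by lead time as in Figure 5.1.7.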
Ten-day z500 forecast skill: GraphCast vs. HRES
Source: Lam et al., 2023 | Chart: 2024 AI Index report

[Line chart. Series: GraphCast, HRES (06z/18z), HRES (00z/12z). x-axis: lead time (days), 1 to 10. y-axis: 0 to 800.]

Figure 5.1.7
GNoME
Discovering new materials with GNoME

The search for new functional materials is key to advancements in various scientific fields, including robotics and semiconductor manufacturing. Yet this discovery process is typically expensive and slow. Recent advancements by Google researchers have demonstrated that graph networks, a type of AI model, can expedite this process when trained on large datasets. Their model, GNoME, outperformed the Materials Project, a leading method in materials discovery, by identifying a significantly larger number of stable crystals (Figure 5.1.8). GNoME has unveiled 2.2 million new crystal structures, many overlooked by human researchers (Figure 5.1.9 and Figure 5.1.10). The success of AI-driven projects like GNoME highlights the power of data and scaling in speeding up scientific breakthroughs.

Sample material structures
Source: Merchant et al., 2023

[Images of sample crystal structures.]

Figure 5.1.8
GNoME vs. Materials Project: stable crystal count
Source: Merchant et al., 2023 | Chart: 2024 AI Index report
[Figure 5.1.9: stable crystal count (logarithmic axis, 1,000 to 1,000,000) by number of unique elements (2 to 6), GNoME vs. Materials Project]

GNoME vs. Materials Project: distinct prototypes
Source: Merchant et al., 2023 | Chart: 2024 AI Index report
[Figure 5.1.10: distinct prototypes (axis ticks at 10,000 and 20,000) by number of unique elements (2 to 6), GNoME vs. Materials Project]
Flood Forecasting
AI for more accurate and reliable flood forecasts
New research introduced in 2023 has made significant progress in predicting large-scale flood events. Floods, among the most common natural disasters, have particularly devastating effects in less developed countries where infrastructure for prevention and mitigation is lacking. Consequently, developing more accurate prediction methods that can forecast these events further in advance could yield substantial positive impacts.

A team of Google researchers has used AI to develop highly accurate hydrological simulation models that are also applicable to ungauged basins.¹ These innovative methods can predict certain extreme flood events up to five days in advance, with accuracy that matches or surpasses current state-of-the-art models, such as GloFAS. The AI model demonstrates superior precision (accuracy of positive predictions) and recall (ability to correctly identify all relevant instances) across a range of return period events, outperforming the leading contemporary method (Figure 5.1.11).² The model is open-source and is already being used to predict flood events in over 80 countries.
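
The precision and recall definitions given above can be made concrete with a short calculation. The following is a minimal sketch only: the function name `precision_recall` and the toy forecast and observation lists are hypothetical and are not taken from Nearing et al.

```python
def precision_recall(predicted, observed):
    """Return (precision, recall) for two equal-length sequences of booleans.

    predicted[i] is True when a flood event was forecast for case i;
    observed[i] is True when a flood event actually occurred.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    true_pos = sum(1 for p, o in zip(predicted, observed) if p and o)
    false_pos = sum(1 for p, o in zip(predicted, observed) if p and not o)
    false_neg = sum(1 for p, o in zip(predicted, observed) if not p and o)
    # Precision: share of forecast events that really happened.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    # Recall: share of real events that were forecast.
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall


# Hypothetical toy data for illustration only.
predicted = [True, True, False, True, False, False]
observed = [True, False, False, True, True, False]
precision, recall = precision_recall(predicted, observed)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

In this toy case two of the three forecast events occurred and two of the three real events were forecast, so both measures equal two-thirds.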
Predictions of AI model vs. GloFAS across return periods
Source: Nearing et al., 2023 | Chart: 2024 AI Index report
[Figure 5.1.11: two panels comparing the AI model and GloFAS (y-axis 0.00 to 1.00) at return periods of 1, 2, 5, and 10. Sample sizes in the left panel: N=3,649, 3,675, 3,416, and 3,087. Sample sizes in the right panel: N=3,682, 3,691, 3,597, and 3,321.]
                      1 An ungauged basin is a watershed for which there is insufficient streamflow data to model hydrological flows.
                      2 A return period (recurrence interval) measures the likelihood of a particular hydrological event recurring within a specific period. For example, a 100-year flood means there is a 1% chance of
                      the event being equaled or exceeded in any given year.
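
The return period arithmetic in footnote 2 can be sketched in a few lines. The function names below are hypothetical, and the multi-year calculation assumes that years are statistically independent, which is a simplification not stated in the source.

```python
def annual_exceedance_probability(return_period_years):
    """Chance that the event is equaled or exceeded in any given year."""
    if return_period_years < 1:
        raise ValueError("return period must be at least 1 year")
    return 1.0 / return_period_years


def probability_within_horizon(return_period_years, horizon_years):
    """Chance of at least one exceedance in horizon_years.

    Assumes independent years (an illustrative simplification).
    """
    p = annual_exceedance_probability(return_period_years)
    return 1.0 - (1.0 - p) ** horizon_years


# A 100-year flood has a 1% chance of occurring in any given year.
p_annual = annual_exceedance_probability(100)
# Under the independence assumption, the chance of seeing at least one
# such flood over 30 years is noticeably higher than 1%.
p_30_years = probability_within_horizon(100, 30)
print(f"annual: {p_annual:.2%}, within 30 years: {p_30_years:.1%}")
```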
AI models are becoming increasingly valuable in healthcare, with applications ranging from detecting polyps to aiding clinicians in making diagnoses. As AI performance continues to improve, monitoring its impact on medical practice becomes increasingly important. This section highlights significant AI-related medical systems introduced in 2023, the current state of clinical AI knowledge, and the development of new AI diagnostic tools and models aimed at enhancing hospital administration.
                      5.2
Notable Medical Systems
This section identifies significant AI-related medical breakthroughs of 2023 as chosen by the AI Index Steering Committee.

SynthSR
Transforming brain scans for advanced analysis
SynthSR is an AI tool that converts clinical brain scans into high-resolution T1-weighted images (Figure 5.2.1). This advancement addresses the issue of scan quality variability, which previously limited the use of many scans in advanced research. By transforming these scans into T1-weighted images, known for their high contrast and clear brain structure depiction, SynthSR facilitates the creation of detailed 3D brain renderings. Experiments using SynthSR demonstrate robust correlations between observed volumes at both scan and subject levels, suggesting that SynthSR generates images closely resembling those produced by high-resolution T1 scans. Figure 5.2.2 illustrates the extent to which SynthSR scans correspond with ground-truth observations across selected brain regions. SynthSR significantly improves the visualization and analysis of brain structures, facilitating neuroscientific research and clinical diagnostics.

SynthSR generations
Source: Iglesias et al., 2023
[Figure 5.2.1: image panels labeled Input, SynthSR, FreeSurfer seg., and 3D render]
SynthSR correlation with ground-truth volumes on select brain regions
Source: Iglesias et al., 2023 | Chart: 2024 AI Index report

Brain region                              White matter   Cortical gray matter   [label missing]   Ventricles   Hippocampus   Amygdala
Subject level (n=41)                      0.91           0.93                   0.91              0.99         0.89          0.90
Scan level (ablated segmentation task)    0.79           0.79                   0.76              0.99         0.74          0.54
Scan level (n=435)                        0.79           0.83                   0.77              0.99         0.76          0.60

Figure 5.2.2
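
To illustrate what a correlation between SynthSR-derived and ground-truth volumes measures, the sketch below computes a Pearson correlation coefficient. This is an assumption made for illustration: the text above does not state which correlation statistic was used, and the function name `pearson` and the volume values are hypothetical rather than data from Iglesias et al.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical volumes (arbitrary units) for one brain region in five subjects.
ground_truth_volumes = [3.9, 4.2, 3.5, 4.8, 4.0]
synthsr_volumes = [3.8, 4.3, 3.6, 4.6, 4.1]
r = pearson(ground_truth_volumes, synthsr_volumes)
print(f"correlation: {r:.2f}")
```

A value close to 1 indicates that volumes measured on the synthesized images rise and fall together with the ground-truth volumes.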
Coupled Plasmonic Infrared Sensors
Coupled plasmonic infrared sensors for the detection of neurodegenerative diseases
Diagnosis of neurodegenerative diseases such as Parkinson's and Alzheimer's depends on fast and precise identification of biomarkers. Traditional methods, such as mass spectrometry and ELISA, are useful for quantifying protein levels; however, they cannot discern changes in structural states. In 2023, researchers introduced a new method for neurodegenerative disease diagnosis that combines AI-coupled plasmonic infrared sensors, which use Surface-Enhanced Infrared Absorption (SEIRA) spectroscopy, with an immunoassay technique (ImmunoSEIRA; Figure 5.2.3). In tests that compared actual fibril percentages with predictions made by AI systems, the predictions closely matched the actual reported percentages (Figure 5.2.4).

ImmunoSEIRA detection principle and the setup
Source: Kavungal et al., 2023
[Figure 5.2.3: panels A to E showing monomers, oligomers, and fibrils; the flow-cell setup with infrared objective, IR light, inlet, outlet, chip, Au nanorod, antibody, and analytes; and spectra over wave number (cm⁻¹) with Amide I and Amide II bands]
Deep neural network predicted vs. actual fibrils percentages in test samples
Source: Kavungal et al., 2023 | Chart: 2024 AI Index report
[Figure 5.2.4: predicted fibrils percentage (0% to 100%) plotted against actual fibrils concentration (0%, 25%, 40%, 50%, 60%, 75%, and 100%)]
EVEscape
Forecasting viral evolution for pandemic preparedness
Predicting viral mutations is vital for vaccine design and pandemic minimization. Traditional methods, which rely on real-time virus strain and antibody data, face challenges during early pandemic stages due to data scarcity. EVEscape is a new AI deep learning model trained on historical sequences and biophysical and structural information that predicts the evolution of viruses (Figure 5.2.5). EVEscape evaluates viral escape independently of current strain data, predicting 50.0% of observed SARS-CoV-2 mutations. It outperformed traditional lab studies, which predicted 46.2% and 32%, as well as a previous model, which predicted only 24% of mutations (Figure 5.2.6). This performance highlights EVEscape's potential as a valuable asset for enhancing future pandemic preparedness and response efforts.
EVEscape design
Source: Thadani et al., 2023
[Figure 5.2.5: (a) P(mutation escapes immunity) is composed of three components: fitness, P(mutation maintains fitness); accessibility, P(mutation accessible to Ab | fit); and dissimilarity, P(mutation disrupts Ab binding | fit, accessible). Inputs are deep learning of evolutionary sequences and biophysical information. (b) Timeline from the start of a pandemic, to a variant appearing, to the variant becoming a VOC: EVEscape's early warning time allows for vaccine development, compared with a warning time of roughly 2-4 months for previous models.]
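
The three conditional probabilities labeled in Figure 5.2.5 can be combined by the chain rule of probability, as sketched below. This is a reading of the figure labels only, not the EVEscape implementation: the function name `escape_probability` and the example values are hypothetical.

```python
def escape_probability(p_fit, p_accessible_given_fit, p_disrupts_given_fit_accessible):
    """Combine three conditional probabilities with the chain rule.

    P(fit, accessible, disrupts)
        = P(fit) * P(accessible | fit) * P(disrupts | fit, accessible)
    """
    for p in (p_fit, p_accessible_given_fit, p_disrupts_given_fit_accessible):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    return p_fit * p_accessible_given_fit * p_disrupts_given_fit_accessible


# Hypothetical values for a single mutation, for illustration only.
score = escape_probability(
    p_fit=0.8,
    p_accessible_given_fit=0.5,
    p_disrupts_given_fit_accessible=0.25,
)
print(f"escape probability: {score:.3f}")
```

Because the terms multiply, a mutation scores highly only when all three components are high; a low value for any one component keeps the overall score low.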
EVEscape vs. other models on SARS-CoV-2 RBD mutation prediction
Source: Thadani et al., 2023 | Chart: 2024 AI Index report
[Figure 5.2.6: line chart with a y-axis from 0% to 50% and pandemic date (January 2020 to 2023) on the x-axis. Labeled series: EVEscape (prepandemic), 50%; later experimental scans (pandemic ab + sera), 46%; earlier experimental scans (pandemic ab), 32%; previous model, 24%.]
AlphaMissense
Better classification of AI mutations
Scientists still do not fully understand which genetic mutations lead to diseases. With millions of possible genetic mutations, determining whether a mutation is benign or pathogenic requires labor-intensive experiments.

In 2023, researchers from Google DeepMind unveiled AlphaMissense, a new AI model that predicted the pathogenicity of 71 million missense variants. Missense mutations are genetic alterations that can impact the functionality of human proteins (Figure 5.2.7) and can lead to various diseases, including cancer. Of the 71 million possible missense variants, AlphaMissense classified 89%, identifying 57% as likely benign and 32% as likely pathogenic, while the remainder were categorized as uncertain (Figure 5.2.8). In contrast, human annotators have only been able to confirm the nature of 0.1% of all missense mutations.

Hemoglobin subunit beta (HBB)
Source: Google DeepMind, 2023
[Figure 5.2.7: protein structure rendering]
AlphaMissense predictions
Source: Google DeepMind, 2023 | Chart: 2024 AI Index report
[Figure 5.2.8: % of variants classified by prediction category: likely benign, 57%; likely pathogenic, 32%; uncertain, 11%]
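
The shares reported for AlphaMissense can be converted into approximate variant counts. The sketch below uses only the figures quoted in this section (71 million variants; 57% likely benign and 32% likely pathogenic). The variable names are hypothetical, and because the published percentages are rounded, the resulting counts are approximations.

```python
TOTAL_VARIANTS = 71_000_000  # missense variants scored, as reported above

# Percentages as reported; integer arithmetic avoids floating-point drift.
likely_benign_pct = 57
likely_pathogenic_pct = 32

classified_pct = likely_benign_pct + likely_pathogenic_pct
uncertain_pct = 100 - classified_pct

counts = {
    "likely benign": TOTAL_VARIANTS * likely_benign_pct // 100,
    "likely pathogenic": TOTAL_VARIANTS * likely_pathogenic_pct // 100,
    "uncertain": TOTAL_VARIANTS * uncertain_pct // 100,
}

print(f"classified: {classified_pct}%, uncertain: {uncertain_pct}%")
for category, count in counts.items():
    print(f"{category}: about {count / 1_000_000:.1f} million variants")
```

The two classified categories add up to the 89% quoted in the text, leaving 11% uncertain, which matches the breakdown in Figure 5.2.8.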
Human Pangenome Reference
Using AI to map the human genome
The human genome is a set of molecular instructions for a human. The first human genome draft was released in 2000 and updated in 2022. However, the update was somewhat incomplete. It did not incorporate various genetic mutations, like blood type, and did not map diverse ancestry groups as completely. Therefore, under the existing genome reference, it would be difficult to detect diseases or find cures in certain groups of people.

In 2023, the Human Pangenome Research Consortium, comprising 119 scientists from 60 institutions, used AI to develop an updated and more representative human genome map (Figure 5.2.9). The researchers achieved remarkable accuracy, annotating a median of 99.07% of protein-coding genes, 99.42% of protein-coding transcripts, 98.16% of noncoding genes, and 98.96% of noncoding transcripts, as detailed in Figure 5.2.10. This latest version of the genome represents the most comprehensive and genetically diverse mapping of the human genome to date.

Graph genome for the MHC region of the genome
Source: Google Research, 2023
[Figure 5.2.9: graph genome showing different individuals' sequences alongside the reference genome path]
Ensembl mapping pipeline results
Source: Liao et al., 2023 | Chart: 2024 AI Index report
[Bar chart, Figure 5.2.10. Genes and transcripts: protein-coding genes 99.07%; protein-coding transcripts 99.42%; noncoding genes 98.16%; noncoding transcripts 98.96%.]
Clinical Knowledge
Evaluating the clinical knowledge of AI models involves determining the extent of their medical expertise, particularly knowledge applicable in a clinical setting.
MedQA
Introduced in 2020, MedQA is a comprehensive dataset derived from professional medical board exams, featuring over 60,000 clinical questions designed to challenge doctors.
AI performance on the MedQA benchmark has seen remarkable improvement, with the leading system, GPT-4 Medprompt, achieving an accuracy rate of 90.2%, an increase of 22.6 percentage points from the top score in 2022 (Figure 5.2.11). Since MedQA's inception, AI capabilities on this benchmark have nearly tripled, showcasing the rapid improvements of clinically knowledgeable AI systems.
MedQA: accuracy
Source: Papers With Code, 2023 | Chart: 2024 AI Index report
[Line chart, Figure 5.2.11: top accuracy on MedQA by year, 2019 to 2023, reaching 90.20% in 2023.]
Highlighted Research:
GPT-4 Medprompt
Although LLMs exhibit impressive general knowledge, it is commonly assumed that significant fine-tuning is required for them to excel at specialized knowledge, such as answering medical questions. Fine-tuning entails training an LLM on domain-specific data.
Research from Microsoft in late 2023 has overturned this assumption. This study employed prompt engineering to direct GPT-4 toward achieving remarkable performance on the MultiMedQA benchmark suite, a group of four challenging medical benchmarks (Figure 5.2.12). GPT-4 Medprompt exceeded the performance of the top 2022 model, Flan-PaLM 540B, in the multiple-choice sections of several renowned medical benchmarks, including PubMedQA, MedMCQA, and MMLU, by 3.0, 21.5, and 16.2 percentage points, respectively. It also exceeded the performance of the then state-of-the-art Med-PaLM 2 (Figure 5.2.13).
Moreover, as noted earlier, GPT-4 Medprompt was the first to surpass the 90% accuracy mark on the MedQA benchmark. This breakthrough not only underscores GPT-4 Medprompt's exceptional and potentially clinically useful medical capabilities but also demonstrates that fine-tuning may not always be necessary for adapting models to specialized domains. Prompt engineering has been shown to be a promising alternative strategy.
GPT-4 vs. Med-PaLM 2 answering a medical question
Source: Nori et al., 2023
Question: A 22-year-old male marathon runner presents to the office with the complaint of right-sided rib pain when he runs long distances. Physical examination reveals normal heart and lung findings and an exhalation dysfunction at ribs 4-5 on the right. Which of the following muscles or muscle groups will be most useful in correcting this dysfunction utilizing a direct method?
(A) anterior scalene (B) latissimus dorsi (C) pectoralis minor (D) quadratus lumborum
Hand-crafted CoT from Med-PaLM 2 (Expert): Let's solve this step-by-step, referring to authoritative sources as needed. Among the options, only pectoralis minor muscle origins from the outer surfaces of the 3rd to 5th ribs.
GPT-4 generated CoT (GPT-4): The primary muscle involved in rib exhalation is the internal intercostal muscle. However, this option is not listed among the answer choices. Among the provided options, the pectoralis minor can contribute to rib movement, as it originates from the 3rd-5th ribs and can act to depress the ribs during exhalation when the scapula is fixed. Therefore, it could potentially be useful in correcting an exhalation dysfunction at ribs 4-5.
Figure 5.2.12
Model performance on MultiMedQA sub-benchmarks
Source: Nori et al., 2023 | Chart: 2024 AI Index report
Model              Year    MMLU      MedMCQA    PubMedQA    MedQA
Flan-PaLM 540B     2022    78.02%    57.60%     79.00%      67.60%
Med-PaLM 2         2023    89.88%    72.30%     81.80%      86.50%
GPT-4              2023    87.37%    72.40%     75.20%      81.40%
GPT-4 Medprompt    2023    94.25%    79.10%     82.00%      90.20%
Figure 5.2.13
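The percentage-point margins quoted for GPT-4 Medprompt over Flan-PaLM 540B can be checked directly against the bar values in Figure 5.2.13. The snippet below is a minimal sketch of that arithmetic; the dictionary layout and the helper name `point_gain` are illustrative choices, not anything defined by Nori et al. or by the report.

```python
# Accuracy (%) per benchmark, as read from Figure 5.2.13.
scores = {
    "Flan-PaLM 540B": {"MMLU": 78.02, "MedMCQA": 57.60, "PubMedQA": 79.00, "MedQA": 67.60},
    "GPT-4 Medprompt": {"MMLU": 94.25, "MedMCQA": 79.10, "PubMedQA": 82.00, "MedQA": 90.20},
}


def point_gain(new, old):
    """Absolute difference in percentage points, per benchmark (hypothetical helper)."""
    return {name: round(new[name] - old[name], 2) for name in new}


gains = point_gain(scores["GPT-4 Medprompt"], scores["Flan-PaLM 540B"])
for benchmark, gain in gains.items():
    print(f"{benchmark}: +{gain:.2f} percentage points")
```

A percentage-point gain is a plain subtraction of two accuracies. It differs from a relative (percent) increase, which would divide that difference by the earlier score.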
                             Highlighted Research:
                             MediTron-70B
GPT-4 Medprompt is an impressive system; however, it is closed-source, meaning its weights are not freely available to the broader public for use. New research in 2023 has also sought to advance the capabilities of open-source medical LLMs. Among this new research, MediTron-70B stands out as particularly promising. This model achieves a respectable 70.2% accuracy on the MedQA benchmark. Although this is below the performance of GPT-4 Medprompt and Med-PaLM 2 (both closed models), it represents a significant improvement over the state-of-the-art results from 2022 and surpasses other open-source models like Llama 2 (Figure 5.2.14). MediTron-70B's score on MedQA is the highest yet achieved by an open-source model. If medical AI is to reach its fullest potential, it is important that its capabilities are widely accessible. In this context, MediTron represents an encouraging step forward.
Performance of select models on MedQA
Source: Chen et al., 2023 | Table: 2024 AI Index report
Model              Release date     Access type    Score on MedQA
GPT-4 Medprompt    November 2023    Closed         90.20%
Med-PaLM 2         April 2023       Closed         86.20%
MediTron-70B       November 2023    Open           70.20%
Med-PaLM           December 2022    Closed         67.20%
Llama 2            July 2023        Open           63.80%
Figure 5.2.14
                      Diagnosis
                      AI tools can also be used for diagnostic purposes including, for example, in radiology or cancer detection.
                             Highlighted Research:
                             CoDoC
AI medical imaging systems demonstrate robust diagnostic capabilities, yet there are instances where they overlook diagnoses that clinicians catch, and vice versa. This observation suggests a logical integration of AI systems and clinicians' diagnostic abilities. In 2023, researchers unveiled CoDoC (Complementarity-Driven Deferral to Clinical Workflow), a system designed to discern when to rely on AI for diagnosis and when to defer to traditional clinical methods. CoDoC notably enhances both sensitivity (the ability to correctly identify individuals with a disease) and specificity (the ability to accurately identify those without it). Specifically, across four medical datasets, CoDoC's sensitivity surpasses clinicians' by an average of 4.5 percentage points and a standalone predictive AI model's by 6.5 percentage points (Figure 5.2.15). In terms of specificity, CoDoC outperforms clinicians by an average of 2.7 percentage points across tested datasets and a standalone predictive model by 5.7 percentage points. Moreover, CoDoC has been shown to reduce clinical workflow by 66%. These findings suggest that AI medical systems can be integrated into clinical workflows, thereby enhancing diagnostic accuracy and efficiency.
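The definitions of sensitivity and specificity, and the idea of deferring to a clinician when the AI is unsure, can be illustrated with a toy sketch. Everything here is hypothetical: the `Case` fields, the fixed confidence band in `decide`, and the eight made-up cases. It is not CoDoC's actual deferral method, which the passage above does not specify.

```python
from dataclasses import dataclass


@dataclass
class Case:
    has_disease: bool     # ground truth
    ai_score: float       # hypothetical AI confidence that disease is present (0 to 1)
    clinician_says: bool  # hypothetical clinician read


def sensitivity_specificity(truth, predicted):
    """Return (sensitivity, specificity) for paired boolean labels."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)          # diseased, flagged
    fn = sum(1 for t, p in pairs if t and not p)      # diseased, missed
    tn = sum(1 for t, p in pairs if not t and not p)  # healthy, cleared
    fp = sum(1 for t, p in pairs if not t and p)      # healthy, flagged
    return tp / (tp + fn), tn / (tn + fp)


def decide(case, low=0.2, high=0.8):
    """Use the AI when it is confident; otherwise defer to the clinician.

    The band [low, high] is an assumed, hand-picked deferral region.
    """
    if case.ai_score >= high:
        return True
    if case.ai_score <= low:
        return False
    return case.clinician_says


# Made-up cases in which the AI and the clinician err on different patients.
cases = [
    Case(True, 0.95, True),
    Case(True, 0.90, False),
    Case(True, 0.55, True),
    Case(True, 0.40, True),
    Case(False, 0.05, True),
    Case(False, 0.10, False),
    Case(False, 0.60, False),
    Case(False, 0.30, False),
]

truth = [c.has_disease for c in cases]
ai_only = sensitivity_specificity(truth, [c.ai_score >= 0.5 for c in cases])
clinician_only = sensitivity_specificity(truth, [c.clinician_says for c in cases])
combined = sensitivity_specificity(truth, [decide(c) for c in cases])

print("AI only:        ", ai_only)
print("Clinician only: ", clinician_only)
print("With deferral:  ", combined)
```

The toy data is built so that the two readers' errors are complementary, which is why deferral helps so much here. The gains reported for CoDoC are a few percentage points, not the clean sweep this example produces.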
CoDoC vs. standalone predictive AI system and clinical readers: sensitivity
Source: Dvijotham et al., 2023 | Chart: 2024 AI Index report
[Bar chart, Figure 5.2.15. Series: CoDoC, clinician(s), standalone predictive AI model. Task and dataset: breast cancer detection (UK mammography dataset, US mammography dataset 1, US mammography dataset 2) and TB detection (TB dataset). UK mammography dataset: CoDoC 72.60%, clinician(s) 62.70%, standalone predictive AI model 64.90%. Other bar labels, series not recoverable: 56.90%, 50.00%, 96.70%, 86.70%, 90.50%.]
                             Highlighted Research:
                             CT Panda
Pancreatic ductal adenocarcinoma (PDAC) is a particularly lethal cancer, often detected too late for surgical intervention. Screening for PDAC in asymptomatic individuals is challenging due to its low prevalence and the risk of false positives. In 2023, a Chinese research team developed PANDA (pancreatic cancer detection with artificial intelligence), an AI model capable of efficiently detecting and classifying pancreatic lesions in non-contrast CT scans (Figure 5.2.16). In validation tests, PANDA surpassed the average radiologist in sensitivity by 34.1% and in specificity by 6.3% (Figure 5.2.17). In a large-scale, real-world test involving approximately 20,000 patients, PANDA achieved a sensitivity of 92.9% and a specificity of 99.9% (Figure 5.2.18). AI medical tools like PANDA represent significant advancements in diagnosing challenging conditions, offering cost-effective and accurate detection previously considered difficult or prohibitive.
PANDA detection
Source: Cao et al., 2023
[Image, Figure 5.2.16: PANDA prediction (on non-contrast CT).]
PANDA vs. mean radiologist on multicenter validation (6,239 patients)
Source: Cao et al., 2023 | Chart: 2024 AI Index report
[Bar chart, Figure 5.2.17. PANDA's margin over the mean radiologist: sensitivity 34.10%; specificity 6.30%.]
PANDA performance on real-world multi-scenario validation (20,530 patients)
Source: Cao et al., 2023 | Chart: 2024 AI Index report
[Bar chart, Figure 5.2.18: sensitivity 92.90%; specificity 99.90%.]
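To see why low prevalence and false positives make screening hard, and why a specificity near 99.9% matters, a short calculation may help. The sketch below applies Bayes' rule to the sensitivity and specificity shown in Figure 5.2.18. The prevalence of 0.1% and the 99% comparison specificity are assumed round numbers chosen only for illustration. Neither comes from Cao et al. or from this report.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Share of positive results that are true positives (Bayes' rule)."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


ASSUMED_PREVALENCE = 0.001  # hypothetical: 1 case per 1,000 people screened

# Sensitivity and specificity as shown in Figure 5.2.18.
ppv_panda = positive_predictive_value(0.929, 0.999, ASSUMED_PREVALENCE)

# Same sensitivity with a hypothetical, slightly lower specificity of 99%.
ppv_lower_spec = positive_predictive_value(0.929, 0.99, ASSUMED_PREVALENCE)

print(f"PPV at 99.9% specificity: {ppv_panda:.1%}")
print(f"PPV at 99.0% specificity: {ppv_lower_spec:.1%}")
```

Under this assumed prevalence, roughly half of positive results would be true cancers at 99.9% specificity, compared with under one in ten at 99% specificity. When a disease is rare, small differences in specificity dominate the number of false alarms.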
Other Diagnostic Uses
New research published in 2023 highlights how AI can be used in other diagnostic contexts. Figure 5.2.19 summarizes some of the findings.
Additional research on diagnostic AI use cases
Source: AI Index, 2024
Research                    Use case                Findings
Schopf et al., 2023         Breast cancer           The authors conducted a meta-review of the literature exploring mammography-image-based AI algorithms. They discovered that predicting future breast cancer risk using only mammography images achieves accuracy that is comparable to or better than traditional risk assessment tools.
Dicente Cid et al., 2023    X-ray interpretation    The researchers developed two open-source neural networks, X-Raydar and X-Raydar-NLP, for classifying chest X-rays using images and free-text reports. They found that these automated classification methods perform at levels comparable to human experts and demonstrate robustness when applied to external data sets.
Figure 5.2.19
                      FDA-Approved AI-Related Medical Devices
The U.S. Food and Drug Administration (FDA) maintains a list of AI/ML-enabled medical devices that have received approval. The devices featured on this list meet the FDA's premarket standards, which include a detailed review of their effectiveness and safety. As of October 2023, the FDA has not approved any devices that utilize generative AI or are powered by LLMs.
Figure 5.2.20 illustrates the number of AI medical devices approved by the FDA over the past decade. In 2022, a total of 139 AI-related medical devices received FDA approval, marking a 12.1% increase from the total approved in 2021. Since 2012, the number of these devices has increased by more than 45-fold.
                      Number of AI medical devices approved by the FDA, 2012-22
                      Source: FDA, 2023 | Chart: 2024 AI Index report
                      Year    Devices approved
                      2012      3
                      2013      3
                      2014      6
                      2015      5
                      2016     18
                      2017     26
                      2018     63
                      2019     77
                      2020    107
                      2021    124
                      2022    139
                      Figure 5.2.20
                      3 The FDA last updated the list in October 2023, meaning that the totals for 2023 were incomplete. Consequently, the AI Index limited its data presentation to include only information
                      up to 2022.
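                      The growth figures quoted for Figure 5.2.20 can be re-derived from the yearly counts. The following is a
                      minimal sketch and not part of the original report; the names approvals, pct_change, and fold_increase
                      are illustrative assumptions, and the counts are the values read from the chart.

```python
# Yearly approval counts as read from Figure 5.2.20 (Source: FDA, 2023).
approvals = {
    2012: 3, 2013: 3, 2014: 6, 2015: 5, 2016: 18, 2017: 26,
    2018: 63, 2019: 77, 2020: 107, 2021: 124, 2022: 139,
}


def pct_change(series, year):
    """Percent change from the previous year to `year` (hypothetical helper)."""
    prev, cur = series[year - 1], series[year]
    return (cur - prev) / prev * 100


def fold_increase(series, start, end):
    """Ratio of the count in `end` to the count in `start` (hypothetical helper)."""
    return series[end] / series[start]


yoy_2022 = round(pct_change(approvals, 2022), 1)
fold_2012_2022 = round(fold_increase(approvals, 2012, 2022), 1)
print(yoy_2022, fold_2012_2022)  # prints: 12.1 46.3
```

                      The two results match the 12.1% year-over-year increase and the "more than 45-fold" growth stated in the text.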
                      Figure 5.2.21 illustrates the specialties associated with FDA-approved medical devices. Of the 139 devices
                      approved in 2022, a significant majority, 87.1%, were related to radiology. The next most common specialty was
                      cardiovascular, accounting for 7.2% of the approvals.
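                      As a quick check of the shares above, the percentages follow from the 2022 column of Figure 5.2.21 divided by
                      the 139 devices approved that year. This is a minimal sketch with illustrative variable names, not code from
                      the report.

```python
# 2022 counts for the two largest specialties, as read from Figure 5.2.21.
total_2022 = 139
by_specialty_2022 = {"Radiology": 121, "Cardiovascular": 10}

# Share of all 2022 approvals, in percent, rounded to one decimal place.
shares = {
    name: round(count / total_2022 * 100, 1)
    for name, count in by_specialty_2022.items()
}
print(shares)  # prints: {'Radiology': 87.1, 'Cardiovascular': 7.2}
```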
                      Number of AI medical devices approved by the FDA by specialty, 2012-22
                      Source: FDA, 2023 | Chart: 2024 AI Index report
                      Specialty                       2012  2013  2014  2015  2016  2017  2018  2019  2020  2021  2022
                      Radiology                          2           5          11    15    39    51    94   105   121
                      Cardiovascular                                             4     6     9    12     7    11    10
                      Neurology                                                              4     4           2     2
                      Gastroenterology and urology                                           1     1           3     1
                      Hematology                                                       2     2     1     3           1
                      Microbiology                             2                                   2
                      General hospital                                                             2
                      General and plastic surgery                                            2     1
                      Ophthalmic                                                             2     1
                      Clinical chemistry                                                     2
                      Anesthesiology
                      Pathology
                      Ear nose and throat
                      Dental
                      Orthopedic
                      Obstetrics and gynecology
                      Figure 5.2.21
                      Administration and Care
                      AI tools also hold the potential to enhance medical administration efficiency and elevate the standard of patient care.
                             Highlighted Research:
                             MedAlign
                             Despite significant advances in AI for healthcare, existing benchmarks like MedQA and USMLE, focused on
                             knowledge-based questions, do not fully capture the diverse tasks clinicians perform in patient care.
                             Clinicians often engage in information-intensive tasks, such as creating tailored diagnostic plans, and
                             spend a significant proportion of their working hours on administrative tasks. Although AI has the
                             potential to streamline these processes, there is a lack of suitable electronic health records (EHR)
                             datasets for benchmarking and fine-tuning medically administrative LLMs. This year researchers have made
                             strides to address this gap by introducing MedAlign: a comprehensive EHR-based benchmark with 983
                             questions and instructions and 303 clinician responses, drawn from seven different medical specialties
                             (Figure 5.2.22).

                             The researchers then tested various existing LLMs on MedAlign. Of all LLMs, a GPT-4 variant using
                             multistep refinement achieved the highest correctness rate (65.0%) and was routinely preferred over
                             other LLMs (Figure 5.2.23). MedAlign is a valuable milestone toward using AI to alleviate
                             administrative burdens in healthcare.

                             MedAlign workflow
                             Source: Fleming et al., 2023
                             Clinician Instruction: Summarize from the EHR the strokes that the patient had and their associated
                             neurologic deficits.
                             Clinician Response: The patient had strokes in the L basal ganglia in 2018 and multiple strokes in 2022:
                             R occipital, left temporal, L frontal. The patient had right sided weakness associated with the 2018
                             stroke after which she was admitted to rehab. She then had a left sided hemianopsia related to the 2022
                             stroke.
                             Workflow: Clinician Instruction + EHR -> LLM -> LLM Response
                             Evaluating LLMs with MedAlign
                             Figure 5.2.22
                             Evaluation of model performance: human vs. COMET ranks
                             Source: Fleming et al., 2023 | Chart: 2024 AI Index report
                             Columns: Model B (loser), in the same order as the rows.

                             Human ranks
                                                     GPT-4       GPT-4    GPT-4    Vicuna-13B  Vicuna-7B  MPT-7B-Instruct
                                                     (32k + MR)  (32k)    (2k)     (2k)        (2k)       (2k)
                             GPT-4 (32k + MR)                    48%      56%      73%         71%        82%
                             GPT-4 (32k)             52%                  58%      72%         74%        81%
                             GPT-4 (2k)              44%         42%               67%         70%        76%
                             Vicuna-13B (2k)         27%         28%      33%                  50%        63%
                             Vicuna-7B (2k)          29%         26%      30%      50%                    64%
                             MPT-7B-Instruct (2k)    18%         19%      24%      37%         36%

                             COMET ranks
                                                     GPT-4       GPT-4    GPT-4    Vicuna-13B  Vicuna-7B  MPT-7B-Instruct
                                                     (32k + MR)  (32k)    (2k)     (2k)        (2k)       (2k)
                             GPT-4 (32k + MR)                    50%      52%      66%         63%        79%
                             GPT-4 (32k)             50%                  51%      63%         58%        77%
                             GPT-4 (2k)              48%         49%               66%         61%        79%
                             Vicuna-13B (2k)         34%         37%      34%                  49%        70%
                             Vicuna-7B (2k)          37%         42%      39%      51%                    71%
                             MPT-7B-Instruct (2k)    21%         23%      21%      30%         29%
                             Figure 5.2.23
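                             The human-rank panel of Figure 5.2.23 can be read as a pairwise win-rate matrix. The sketch below
                             assumes each cell is the share of head-to-head comparisons in which the row model was ranked above
                             the column model (Model B, the loser). It checks that mirrored cells sum to 100% and computes a simple
                             mean win rate per model. The mean win rate is an illustrative summary, not a metric reported by the
                             source, and the names human, is_complementary, and mean_win_rate are hypothetical.

```python
models = [
    "GPT-4 (32k + MR)", "GPT-4 (32k)", "GPT-4 (2k)",
    "Vicuna-13B (2k)", "Vicuna-7B (2k)", "MPT-7B-Instruct (2k)",
]

# human[i][j]: percent of comparisons in which models[i] was ranked above
# models[j] (assumed reading of the figure); None marks the diagonal.
human = [
    [None, 48, 56, 73, 71, 82],
    [52, None, 58, 72, 74, 81],
    [44, 42, None, 67, 70, 76],
    [27, 28, 33, None, 50, 63],
    [29, 26, 30, 50, None, 64],
    [18, 19, 24, 37, 36, None],
]


def is_complementary(matrix):
    """True if every pair of mirrored cells sums to 100 percent."""
    n = len(matrix)
    return all(
        matrix[i][j] + matrix[j][i] == 100
        for i in range(n)
        for j in range(i + 1, n)
    )


def mean_win_rate(matrix):
    """Average of each row's off-diagonal cells."""
    return [
        sum(v for v in row if v is not None) / (len(row) - 1)
        for row in matrix
    ]


rates = dict(zip(models, mean_win_rate(human)))
print(is_complementary(human))
for name, rate in rates.items():
    print(f"{name}: {rate:.1f}%")
```

                             The complementarity check is also what supports reading the columns in the same order as the rows.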
                      Appendix
                      Acknowledgments
                      The AI Index would like to acknowledge Emma
                      Williamson for her work surveying the literature on
                      significant AI-related science and medicine trends.
                       Benchmarks
                      1. MedQA: Data on MedQA was taken from the
                         MedQA Papers With Code leaderboard in January
                         2024. For details on MedQA, see the original
                         paper (Jin et al., 2020).
                      FDA-Approved AI-Medical
                      Devices
                       Data on FDA-approved AI-medical devices is
                      sourced from the FDA website that tracks artificial
                      intelligence and machine learning (AI/ML)-enabled
                       medical devices.
                      Works Cited
                      Cao, K., Xia, Y., Yao, J., Han, X., Lambert, L., Zhang, T., Tang, W., Jin, G., Jiang, H., Fang, X., Nogues, I., Li, X., Guo, W., Wang, Y.,
                      Fang, W., Qiu, M., Hou, Y., Kovarnik, T., Vocka, M., Lu, J. (2023). "Large-Scale Pancreatic Cancer Detection via Non-contrast
                      CT and Deep Learning." Nature Medicine 29, no. 12: 3033-3043. https://doi.org/10.1038/s41591-023-02640-w.
                      Chen, Z., Cano, A. H., Romanou, A., Bonnet, A., Matoba, K., Salvi, F., Pagliardini, M., Fan, S., Kopf, A., Mohtashami, A.,
                      Sallinen, A., Sakhaeirad, A., Swamy, V., Krawczuk, I., Bayazit, D., Marmet, A., Montariol, S., Hartley, M.-A., Jaggi, M. & Bosselut,
                      A. (2023). MEDITRON-70B: Scaling Medical Pretraining for Large Language Models (arXiv:2311.16079). arXiv.
                      http://arxiv.org/abs/2311.16079.
                      Cheng, J., Novati, G., Pan, J., Bycroft, C., Zemgulyte, A., Applebaum, T., Pritzel, A., Wong, L. H., Zielinski, M., Sargeant, T.,
                      Schneider, R. G., Senior, A. W., Jumper, J., Hassabis, D., Kohli, P. & Avsec, Z. (2023). "Accurate Proteome-Wide Missense
                      Variant Effect Prediction With AlphaMissense." Science 381. https://doi.org/10.1126/science.adg7492.
                      Dicente Cid, Y., Macpherson, M., Gervais-Andre, L., Zhu, Y., Franco, G., Santeramo, R., Lim, C., Selby, I., Muthuswamy, K., Amlani,
                      A., Hopewell, H., Indrajeet, D., Liakata, M., Hutchinson, C. E., Goh, V. & Montana, G. (2024). "Development and Validation
                      of Open-Source Deep Neural Networks for Comprehensive Chest X-Ray Reading: A Retrospective, Multicentre Study."
                      The Lancet Digital Health 6, no. 1: e44-e57. https://doi.org/10.1016/S2589-7500(23)00218-2.
                      Fleming, S. L., Lozano, A., Haberkorn, W. J., Jindal, J. A., Reis, E. P., Thapa, R., Blankemeier, L., Genkins, J. Z., Steinberg,
                      E., Nayak, A., Patel, B. S., Chiang, C.-C., Callahan, A., Huo, Z., Gatidis, S., Adams, S. J., Fayanju, O., Shah, S. J., Savage, T.,
                      ... Shah, N. H. (2023). MedAlign: A Clinician-Generated Dataset for Instruction Following With Electronic Medical Records
                      (arXiv:2308.14089). arXiv. http://arxiv.org/abs/2308.14089.
                      Ha, T., Lee, D., Kwon, Y., Park, M. S., Lee, S., Jang, J., Choi, B., Jeon, H., Kim, J., Choi, H., Seo, H.-T., Choi, W., Hong, W., Park,
                      Y. J., Jang, J., Cho, J., Kim, B., Kwon, H., Kim, G., ... Choi, Y.-S. (2023). "AI-Driven Robotic Chemist for Autonomous Synthesis
                      of Organic Molecules." Science Advances 9, no. 44. https://doi.org/10.1126/sciadv.adj0461.
                      Iglesias, J. E., Billot, B., Balbastre, Y., Magdamo, C., Arnold, S. E., Das, S., Edlow, B. L., Alexander, D. C., Golland, P. & Fischl, B.
                      (2023). "SynthSR: A Public AI Tool to Turn Heterogeneous Clinical Brain Scans into High-Resolution T1-Weighted Images for 3D
                      Morphometry." Science Advances 9, no. 5. https://doi.org/10.1126/sciadv.add3607.
                      Jin, D., Pan, E., Oufattole, N., Weng, W.-H., Fang, H. & Szolovits, P. (2020). What Disease Does This Patient Have?
                      A Large-Scale Open Domain Question Answering Dataset From Medical Exams (arXiv:2009.13081; Version 1). arXiv.
                      http://arxiv.org/abs/2009.13081.
                      Kavungal, D., Magalhaes, P., Kumar, S. T., Kolla, R., Lashuel, H. A. & Altug, H. (2023). "Artificial Intelligence-Coupled Plasmonic
                      Infrared Sensor for Detection of Structural Protein Biomarkers in Neurodegenerative Diseases." Science Advances 9, no. 28.
                      https://doi.org/10.1126/sciadv.adg9644.
                      Lam, R., Sanchez-Gonzalez, A., Willson, M., Wirnsberger, P., Fortunato, M., Alet, F., Ravuri, S., Ewalds, T., Eaton-Rosen, Z., Hu,
                      W., Merose, A., Hoyer, S., Holland, G., Vinyals, O., Stott, J., Pritzel, A., Mohamed, S. & Battaglia, P. (2023). "Learning Skillful
                      Medium-Range Global Weather Forecasting." Science 382. https://doi.org/10.1126/science.adi2336.
                      Liao, W.-W., Asri, M., Ebler, J., Doerr, D., Haukness, M., Hickey, G., Lu, S., Lucas, J. K., Monlong, J., Abel, H. J., Buonaiuto, S.,
                      Chang, X. H., Cheng, H., Chu, J., Colonna, V., Eizenga, J. M., Feng, X., Fischer, C., Fulton, R. S., ... Paten, B. (2023). "A Draft
                      Human Pangenome Reference." Nature 617: 312-24. https://doi.org/10.1038/s41586-023-05896-x.
                      Mankowitz, D. J., Michi, A., Zhernov, A., Gelmi, M., Selvi, M., Paduraru, C., Leurent, E., Iqbal, S., Lespiau, J.-B., Ahern, A., Koppe,
                      T., Millikin, K., Gaffney, S., Elster, S., Broshear, J., Gamble, C., Milan, K., Tung, R., Hwang, M., ... Silver, D. (2023). "Faster Sorting
                      Algorithms Discovered Using Deep Reinforcement Learning." Nature 618: 257-63. https://doi.org/10.1038/s41586-023-06004-9.
                      Merchant, A., Batzner, S., Schoenholz, S. S., Aykol, M., Cheon, G. & Cubuk, E. D. (2023). "Scaling Deep Learning for Materials
                      Discovery." Nature 624: 80-85. https://doi.org/10.1038/s41586-023-06735-9.
                      Nearing, G., Cohen, D., Dube, V., Gauch, M., Gilon, O., Harrigan, S., Hassidim, A., Klotz, D., Kratzert, F., Metzger, A., Nevo, S.,
                      Pappenberger, F., Prudhomme, C., Shalev, G., Shenzis, S., Tekalign, T., Weitzner, D. & Matias, Y. (2023). AI Increases Global
                      Access to Reliable Flood Forecasts (arXiv:2307.16104). arXiv. http://arxiv.org/abs/2307.16104.
                      Nori, H., Lee, Y. T., Zhang, S., Carignan, D., Edgar, R., Fusi, N., King, N., Larson, J., Li, Y., Liu, W., Luo, R., McKinney, S. M.,
                      Ness, R. O., Poon, H., Qin, T., Usuyama, N., White, C. & Horvitz, E. (2023a). Can Generalist Foundation Models Outcompete
                      Special-Purpose Tuning? Case Study in Medicine (arXiv:2311.16452; Version 1). arXiv. http://arxiv.org/abs/2311.16452.
                      Schopf, C. M., Ramwala, O. A., Lowry, K. P., Hofvind, S., Marinovich, M. L., Houssami, N., Elmore, J. G., Dontchos, B. N., Lee, J.
                      M. & Lee, C. I. (2024). "Artificial Intelligence-Driven Mammography-Based Future Breast Cancer Risk Prediction: A Systematic
                      Review." Journal of the American College of Radiology 21, no. 2: 319-28. https://doi.org/10.1016/j.jacr.2023.10.018.
                      Shen, T., Munkberg, J., Hasselgren, J., Yin, K., Wang, Z., Chen, W., Gojcic, Z., Fidler, S., Sharp, N. & Gao, J. (2023).
                      "Flexible Isosurface Extraction for Gradient-Based Mesh Optimization." ACM Transactions on Graphics 42, no. 4: 1-16.
                      https://doi.org/10.1145/3592430.
                      Thadani, N. N., Gurev, S., Notin, P., Youssef, N., Rollins, N. J., Ritter, D., Sander, C., Gal, Y. & Marks, D. S. (2023).
                      "Learning From Prepandemic Data to Forecast Viral Escape." Nature 622: 818-25. https://doi.org/10.1038/s41586-023-06617-0.