


ChatGPT passes radiology board exam
Published in Daily News Egypt on 16 - 05 - 2023

According to two recent studies published in Radiology, a journal of the Radiological Society of North America (RSNA), the most recent version of ChatGPT passed a radiology board-style exam, showcasing the promise of large language models but also exposing flaws that impair reliability.
ChatGPT is an artificial intelligence (AI) chatbot that uses a deep learning model to identify patterns and relationships between words in its massive training data, allowing it to produce human-like responses to a prompt. However, because its training data contains no reliable source of truth, the tool may produce factually inaccurate replies.
"The use of large language models like ChatGPT is exploding and only going to increase," said senior author Rajesh Bhayana, M.D., FRCPC, an abdominal radiologist and technology lead at University Medical Imaging Toronto, Toronto General Hospital, in Toronto, Canada. "Our research offers insight into ChatGPT's performance in a radiology context, highlighting the tremendous potential of large language models, along with the current constraints that render it unreliable."
ChatGPT was recently declared the fastest-growing consumer application in history, and similar chatbots are being built into well-known search engines like Google and Bing that physicians and patients use to look up medical information, Bhayana said.
Bhayana and colleagues first tested ChatGPT based on GPT-3.5, the most widely used version, on radiology board exam questions to explore its strengths and limitations. The researchers used 150 multiple-choice questions designed to match the style, content, and difficulty of the Canadian Royal College and American Board of Radiology exams.
To gain insight into performance, the questions were grouped by the type of thinking they required: lower-order (knowledge recall, basic understanding) or higher-order (apply, analyse, synthesise). The higher-order thinking questions were further subdivided by type (description of imaging findings, clinical management, calculation and classification, and disease associations).
ChatGPT's performance was assessed overall as well as by question type and topic. The confidence of the language used in its responses was also evaluated.
The researchers found that ChatGPT based on GPT-3.5 answered 69% of questions correctly (104 of 150), just short of the 70% passing mark used by the Royal College in Canada. The model did reasonably well on questions requiring lower-order thinking (84%, 51 of 61) but struggled with those demanding higher-order thinking (60%, 53 of 89). Among the higher-order questions, it had particular difficulty with description of imaging findings (61%, 28 of 46), calculation and classification (25%, 2 of 8), and application of concepts (30%, 3 of 10). Given its lack of radiology-specific pretraining, its poor performance on higher-order thinking questions was not surprising.
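The percentages above follow directly from the reported counts. As a quick check, the short Python sketch below recomputes them; the dictionary layout and the `percent` helper are illustrative choices, and only the counts come from the article.

```python
# (correct, total) per question group, as reported for GPT-3.5 in the article.
reported = {
    "overall": (104, 150),
    "lower-order": (51, 61),
    "higher-order": (53, 89),
    "description of imaging findings": (28, 46),
    "calculation and classification": (2, 8),
    "application of concepts": (3, 10),
}


def percent(correct, total):
    """Return the share of correct answers, rounded to a whole percent."""
    return round(100 * correct / total)


# Recompute every percentage from its raw counts.
scores = {name: percent(c, t) for name, (c, t) in reported.items()}
for name, pct in scores.items():
    print(f"{name}: {pct}%")

# The lower-order and higher-order groups should add up to the overall figures.
lower, higher, overall = (reported[k] for k in ("lower-order", "higher-order", "overall"))
groups_add_up = (lower[0] + higher[0], lower[1] + higher[1]) == overall
print("groups add up to overall:", groups_add_up)
```

The two thinking-level groups sum to the overall result (51 + 53 = 104 correct out of 61 + 89 = 150 questions), so the reported breakdown is internally consistent.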
GPT-4 was released in March 2023 in limited form to paying users. It was specifically promoted as having improved advanced reasoning capabilities over GPT-3.5.
In a follow-up study, GPT-4 answered 81% of the same questions correctly (121 of 150), outperforming GPT-3.5 and exceeding the 70% passing mark. GPT-4 did much better than GPT-3.5 on higher-order thinking questions (81%), in particular those involving the description of imaging findings (85%) and application of concepts (90%).
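To put the two overall scores side by side, the sketch below works out how many correct answers a 70% passing mark implies on a 150-question exam and how far each model landed from it. It assumes that passing means at least 70% correct, which the article does not spell out; the variable names are illustrative.

```python
TOTAL_QUESTIONS = 150
PASS_PERCENT = 70  # assumption: a pass requires at least 70% correct

# Smallest whole number of correct answers that reaches the passing mark
# (integer ceiling division avoids floating-point rounding issues).
needed = (TOTAL_QUESTIONS * PASS_PERCENT + 99) // 100

# Correct answers out of 150, as reported in the article.
correct = {"GPT-3.5": 104, "GPT-4": 121}

# Positive margin: questions above the mark; negative: questions short of it.
margins = {model: n - needed for model, n in correct.items()}

print("correct answers needed to pass:", needed)
for model, margin in margins.items():
    status = "pass" if margin >= 0 else "fail"
    print(f"{model}: {correct[model]}/{TOTAL_QUESTIONS} ({status}, margin {margin:+d})")
```

Under that assumption, GPT-3.5 fell a single question short of passing, while GPT-4 cleared the mark by 16 questions.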
The results suggest that GPT-4's claimed improvements in advanced reasoning translate to better performance in a radiology setting. They also point to a better contextual understanding of radiology-specific terminology, including imaging descriptions, which is essential to support future downstream applications.
"Our study demonstrates an impressive improvement in ChatGPT's performance in radiology over a short period, highlighting the growing potential of large language models in this context," Bhayana added.
However, GPT-4's performance on questions requiring lower-order thinking did not improve (80% vs. 84%), and it gave incorrect answers to 12 questions that GPT-3.5 had answered correctly, raising questions about its reliability as a source of information.
"We were initially taken aback by ChatGPT's precise and assured responses to some difficult radiological questions, but we were also taken aback by some incredibly irrational and incorrect comments," Bhayana said. Naturally, the erroneous results should not be particularly surprising considering how these models operate.
Hallucinations, ChatGPT's potentially harmful tendency to produce inaccurate responses, are less frequent in GPT-4 but still limit its usability in medical education and practice at present.
Both studies showed that ChatGPT consistently used confident language, even when its answers were incorrect. Bhayana noted that this is particularly risky if the tool is used as the sole source of information, especially for novices who may not recognise that a confident response is wrong.
"This, in my opinion, is its biggest drawback," Bhayana said. "Currently, the best uses of ChatGPT are to generate ideas, aid in the beginning stages of medical writing, and summarise data. If utilised for quick information recall, it always needs to be fact-checked."

