









ChatGPT passes radiology board exam
Published in Daily News Egypt on 16 - 05 - 2023

According to two recent research studies published in Radiology, a journal of the Radiological Society of North America (RSNA), the latest version of ChatGPT passed a radiology board-style exam, showcasing the promise of large language models but also exposing flaws that impair its reliability.
ChatGPT, an artificial intelligence (AI) chatbot, employs a deep learning model to identify patterns and relationships between words in its massive training data and generate human-like responses to a prompt. However, because its training data lacks a reliable source of truth, the tool may produce factually inaccurate replies.
"The use of large language models like ChatGPT is exploding and only going to increase," said senior author Rajesh Bhayana, M.D., FRCPC, an abdominal radiologist and technology lead at University Medical Imaging Toronto, Toronto General Hospital, in Toronto, Canada. "Our research offers insight into ChatGPT's performance in a radiology context, highlighting the tremendous potential of large language models, along with the current limitations that render it unreliable."
Bhayana noted that similar chatbots are being integrated into popular search engines like Google and Bing, which physicians and the public use to search for medical information. ChatGPT was recently declared the fastest-growing consumer application in history.
Bhayana and colleagues first tested ChatGPT based on GPT-3.5, the most widely used version, to assess its performance on radiology board exam questions and explore its strengths and limitations. The researchers used 150 multiple-choice questions designed to match the style, content, and difficulty of the Canadian Royal College and American Board of Radiology exams.
To gain insight into performance, the questions were grouped into lower-order (knowledge recall, basic understanding) and higher-order (apply, analyse, synthesise) thinking categories. The higher-order thinking questions were further subdivided by type (description of imaging findings, clinical management, calculation and classification, and disease associations).
ChatGPT's performance was assessed overall and by question type and topic. The confidence of the language used in its responses was also evaluated.
The researchers found that ChatGPT based on GPT-3.5 answered 69% of questions correctly (104 of 150), near the passing grade of 70% used by the Royal College in Canada. The model performed relatively well on questions requiring lower-order thinking (84%, 51 of 61) but struggled with questions involving higher-order thinking (60%, 53 of 89). It found higher-order questions involving description of imaging findings (61%, 28 of 46), calculation and classification (25%, 2 of 8), and application of concepts (30%, 3 of 10) especially difficult. Given its lack of radiology-specific pretraining, its poor performance on higher-order thinking questions was not surprising.
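The percentages above follow directly from the correct/total counts reported for each category. A minimal sketch recomputing them (the counts are taken from the article; the `pct` helper is purely illustrative):

```python
# Reported GPT-3.5 results by question category: (correct, total).
results = {
    "overall": (104, 150),
    "lower-order": (51, 61),
    "higher-order": (53, 89),
    "imaging findings": (28, 46),
    "calculation and classification": (2, 8),
    "application of concepts": (3, 10),
}

def pct(correct: int, total: int) -> int:
    """Score as a whole-number percentage, rounded."""
    return round(100 * correct / total)

for category, (correct, total) in results.items():
    print(f"{category}: {pct(correct, total)}% ({correct} of {total})")
```

Running this reproduces the figures quoted in the study, e.g. 69% overall and 25% on calculation and classification.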
GPT-4 was released in limited form to paid users in March 2023, with a particular claim of improved advanced reasoning capabilities over GPT-3.5.
In a follow-up study, GPT-4 answered 81% (121 of 150) of the same questions correctly, exceeding the passing threshold and outperforming GPT-3.5. GPT-4 performed much better than GPT-3.5 on higher-order thinking questions (81%), particularly those involving description of imaging findings (85%) and application of concepts (90%).
The findings suggest that GPT-4's claimed improvement in advanced reasoning translates to better performance in a radiology context. They also point to improved contextual understanding of radiology-specific terminology, such as imaging descriptions, which is critical to enable future downstream applications.
"Our study demonstrates an impressive improvement in ChatGPT's performance in radiology over a short period, highlighting the growing potential of large language models in this context," Bhayana added.
However, GPT-4 showed no improvement on questions requiring lower-order thinking (80% vs. 84% for GPT-3.5) and answered 12 questions incorrectly that GPT-3.5 had answered correctly, raising questions about its reliability for information gathering.
"We were initially surprised by ChatGPT's accurate and confident answers to some challenging radiology questions, but then equally surprised by some very illogical and inaccurate assertions," Bhayana said. Of course, given how these models work, the inaccurate responses should not be particularly surprising.
ChatGPT's tendency to produce inaccurate responses, termed hallucinations, is less frequent in GPT-4 but still a potentially dangerous limitation on its use in medical practice and education at present.
Both studies showed that ChatGPT used confident language consistently, even when it was incorrect. Bhayana notes that this is particularly dangerous if the tool is relied on as a sole source of information, especially for novices who may not recognise that a confident response is wrong.
"This, in my opinion, is its biggest limitation. Currently, ChatGPT is best used to spark ideas, help with the early stages of medical writing, and summarise data. If used for quick information recall, it always needs to be fact-checked," Bhayana said.

