Benchmarking Human–AI collaboration for common evidence appraisal tools
Article 2024 en
Authors
Tim Woelfle
Julian Hirt
Perrine Janiaud
Abstract
On their own, current LLMs appraised evidence worse than humans did. Human-AI collaboration may reduce the workload of the second human rater when assessing reporting quality (PRISMA) and methodological rigor (AMSTAR), but not for more complex tasks such as PRECIS-2.