Table 4 Studies evaluated by Kane’s validity argument

From: Tools for measuring technical skills during gynaecologic surgery: a scoping review

| Assessment tool | Scoring | Generalisation | Extrapolation |
|---|---|---|---|
| Objective Structured Assessment of Technical Skills (OSATS) [17] | Comparison of OSATS scores over time | Not reported | Construct validity was demonstrated as a significant rise in score with increasing caseload, 1.10 OSATS points per assessed procedure (p = 0.008, 95% CI 0.44–1.77) |
| Vaginal Surgical Skills Index (VSSI) [18] | Comparison of GRS and VSSI; a visual analogue scale was added for overall performance | Internal consistency for the VSSI and GRS: Cronbach's alpha 0.95–0.97; interrater reliability = 0.53; intrarater reliability = 0.82 | Construct validity was evaluated as convergent validity using the Pearson correlation coefficient (VSSI: r = 0.64, p = 0.01, 95% CI 0.53–0.73; GRS: r = 0.51, p = 0.001, 95% CI 0.40–0.61); VSSI scores discriminated between training levels |
| Hopkins Assessment of Surgical Competency (HASC) [19] | Surgeons rated on general surgical skills and case-specific surgical skills; no comparison | Internal consistency of the items: Cronbach's alpha = 0.80 (p < 0.001) | Discriminative validity for inexperienced vs intermediate surgeons (p < 0.001) |
| Objective Structured Assessment of Laparoscopic Salpingectomy (OSA-LS) [20] | Surgeons rated by OSA-LS; no comparison | Interrater reliability = 0.831; intrarater reliability not reported | Discriminative validity for inexperienced vs intermediate vs experienced surgeons (p < 0.03) |
| Robotic Hysterectomy Assessment Score (RHAS) [21] | Surgeons rated by expert viewers using RHAS; no comparison | Interrater reliability for total domain score = 0.600 (p < 0.001); intrarater reliability not reported | Discriminative validity for experts, advanced beginners and novices in all domains except vaginal cuff closure (p = 0.006) |
| Competence Assessment for Laparoscopic Supracervical Hysterectomy (CAT-LSH) [22] | Comparison of GOALS and CAT-LSH | Interrater reliability = 0.75; intrarater reliability not reported | Discriminative validity for inexperienced vs intermediate (p < 0.001) and intermediate vs expert surgeons (p < 0.001) as rated by the assistant surgeon; for blinded reviewers, inexperienced vs intermediate (p < 0.006) and intermediate vs experts (p < 0.011) |
| Feasible rating scale for formative and summative feedback [23] | Surgeons rated by expert viewers using a 12-item procedure-specific checklist | Interrater reliability = 0.996 for one rater and 0.998 for two raters; intrarater reliability not reported | Discriminative validity for beginners vs experienced surgeons (p < 0.001) |
| Generic Error Rating Tool (GERT) [24] | Comparison of OSATS and GERT | Interrater reliability > 0.95; intrarater reliability > 0.95 | Significant negative correlation between OSATS and GERT scores (rater 1: Spearman = −0.76, p < 0.001; rater 2: Spearman = −0.88, p < 0.001) |
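Several of the generalisation entries above report internal consistency as Cronbach's alpha. For readers unfamiliar with the statistic, the following minimal sketch shows how alpha is computed from the per-item score variances and the variance of the summed totals. The scores are synthetic, purely for illustration, and do not come from any of the cited studies.

```python
# Illustrative sketch of Cronbach's alpha; synthetic data only.
from statistics import pvariance

def cronbach_alpha(items):
    """items: one list of scores per checklist item, each inner list
    holding one score per rated subject (all lists equal length)."""
    k = len(items)  # number of items
    # Total score per subject, summed across items.
    totals = [sum(scores) for scores in zip(*items)]
    # Sum of the individual item variances.
    item_var = sum(pvariance(scores) for scores in items)
    # alpha = k/(k-1) * (1 - sum of item variances / variance of totals)
    return k / (k - 1) * (1 - item_var / pvariance(totals))

# Three hypothetical checklist items scored for five subjects.
items = [
    [3, 4, 5, 2, 4],
    [3, 5, 5, 2, 3],
    [4, 4, 5, 3, 4],
]
print(round(cronbach_alpha(items), 2))  # → 0.91
```

Values near 1 (such as the 0.95–0.97 range reported for the VSSI and GRS) indicate that the checklist items vary together, i.e. they appear to measure a common underlying construct.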