One of the first attempts to assess interrater reliability in osteopathic medicine was made by Upledger [9] in 1977. His study reported substantial agreement in the examination of craniosacral motion, although the examination was performed on children and the methodology was poor. In 2001, Moran and Gibbons [10] showed that interrater reliability for simultaneous palpation at the head and the sacrum ranged from poor to nonexistent, with intraclass correlation coefficients ranging from −0.09 to 0.31. Degenhardt et al [11] studied the vertebral region and found differences in interrater reliability before and after consensus training. In the pretraining evaluation, κ ranged from 0.02 to 0.34, which is within the poor to fair reliability range. After consensus training, reliability improved, rising into the moderate range for tissue texture changes (κ=0.45) and the substantial range for tenderness assessments (κ=0.68). Reliability for positional asymmetry in the transverse plane (κ=0.34) and rotational motion asymmetry (κ=0.20) improved but remained in the fair range [11]. For pelvic anatomical landmark asymmetry, Kmita and Lucas [12] found low reliability, with κ ranging from −0.38 to 0.51. Rajendran and Gallagher [13] studied interrater reliability for both the pelvis and the spine, assessing Mitchell's pelvic diagnostic procedures [14], and obtained κ values ranging from −0.05 to 0.03. Some of these studies used TART to describe the findings. The lower limb was also studied by Kmita and Lucas [12], who assessed the interrater reliability of the medial malleoli asymmetry test and obtained κ values ranging from −0.05 to 0.49 [12]. Overall, the systematic review by Basile et al [15] reported that the levels of diagnostic reliability in osteopathy were heterogeneous. Bush and Vorro [16] acknowledged that reaching a reliable palpatory diagnosis is a challenging task. They therefore suggested correlating palpatory findings with kinematic parameters to make the findings more objective and thereby increase reliability.
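For readers less familiar with the κ values quoted above, the following is a minimal sketch of how Cohen's κ could be computed for two examiners rating the same subjects. It assumes the unweighted two-rater form of the statistic; the cited studies may have used other variants, and the intraclass correlation coefficient reported by Moran and Gibbons is a different statistic that is not shown here. The function name `cohen_kappa` and the ratings are hypothetical and are not taken from any of the cited studies.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same subjects."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)

    # Observed agreement: fraction of subjects given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: sum over labels of the product of each rater's
    # marginal frequency for that label.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if p_e == 1.0:
        raise ValueError("kappa is undefined when both raters use one identical label")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical findings for 10 subjects (1 = asymmetry present, 0 = absent).
examiner_1 = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
examiner_2 = [1, 1, 0, 0, 0, 1, 0, 0, 0, 1]

kappa = cohen_kappa(examiner_1, examiner_2)
print(f"kappa = {kappa:.2f}")
```

In this made-up example the examiners agree on 7 of 10 subjects, yet half of that agreement is expected by chance, so κ is about 0.40. Two examiners who disagree more often than chance would predict obtain a negative κ, which is how values such as −0.38 arise.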