This systematic literature review investigated the inter-rater and test-retest reliability of case formulations. We considered the reliability of case formulations across a range of theoretical modalities and the general quality of the primary research studies.

A systematic search of five electronic databases was conducted, supplemented by trawling of reference lists, to find studies that assessed the reliability of case formulation. This yielded 18 studies for review. A methodological quality assessment tool was developed to assess the quality of the studies, and this assessment informed interpretation of the findings.

Results indicated that inter-rater reliability mainly ranged from slight (.1-.4) to substantial (.81-1.0). Some studies highlighted that training and greater experience led to higher levels of agreement. In general, psychodynamic formulations appeared to generate somewhat higher levels of reliability than cognitive or behavioural formulations; however, these studies also used methods that may have inflated reliability, for example, pooling the scores of judges. Only one study investigated the test-retest reliability of case formulations, and it supported the stability of formulations over a 3-month period.

The reliability of case formulations varies across a range of theoretical modalities but can be improved; however, further research is required to strengthen these conclusions.

Clinical implications: The findings of the review provide some support for case formulation being congruent with the scientist-practitioner approach. The reliability of case formulation is likely to be improved through training and clinical experience.

Limitations: The broad inclusion criteria may have introduced heterogeneity into the sample, which may have affected the results. The studies reviewed were limited to peer-reviewed journal articles written in English, which may represent a source of publication and selection bias.