Replication studies serve different goals in the empirical sciences, depending on whether research aims at developing new theories or at testing existing ones (context of discovery vs. context of justification, cf. Reichenbach, 1938). Conceptual replications strive for generalization and can be useful in the context of discovery. Direct replications, by contrast, target the replicability of a specific empirical research result under independent conditions and are thus indispensable in the context of justification. Without assuming replicability, it is impossible to reach a consensus about generally accepted empirical facts; yet such accepted facts are a prerequisite for testing theories in the empirical sciences. On the basis of this framework, we suggest and motivate standards for replication studies. A characteristic feature of psychological science is the probabilistic nature of the to-be-replicated empirical claim, which typically takes the form of a statistical hypothesis. This raises a number of methodological problems concerning the nature of the replicability hypothesis, the control of error probabilities in statistical decisions about it, the determination of the to-be-detected effect size given distortions of published effect sizes by publication bias, the a priori determination of sample sizes for replication studies, and the correct interpretation of the replication rate (i.e., the success rate in a series of replication studies). We propose and discuss solutions for all these problems.
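
To make the a priori sample-size step concrete, the following minimal sketch assumes a direct replication analyzed with a two-sided two-sample t-test, a publication-bias-adjusted effect size of d = 0.30, alpha = .05, and power = .95; all of these numerical values are illustrative assumptions, not figures from the paper. It uses the statsmodels power module to solve for the required per-group sample size.

```python
# Hypothetical sketch: a priori sample-size determination for a direct
# replication, assuming the to-be-detected effect size has already been
# corrected for publication bias (d = 0.30 is an illustrative value).
from statsmodels.stats.power import TTestIndPower

adjusted_d = 0.30   # bias-adjusted standardized mean difference (assumed)
alpha = 0.05        # Type I error probability of the replication test
power = 0.95        # desired probability of detecting d if it is real

analysis = TTestIndPower()
# solve_power returns the parameter left unspecified, here nobs1
n_per_group = analysis.solve_power(
    effect_size=adjusted_d,
    alpha=alpha,
    power=power,
    ratio=1.0,               # equal group sizes
    alternative="two-sided",
)
print(f"Required sample size per group: {n_per_group:.0f}")
```

Under these assumptions the required n per group is roughly 290; note how lowering the assumed effect size (as a publication-bias correction typically does) or raising the desired power sharply increases the required sample size, which is why the bias-corrected rather than the published effect size should enter the calculation.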