In this PhD thesis, we explore and apply methods inspired by the free energy principle to two important areas in machine learning and neuroscience. The free energy principle is a general mathematical theory of the necessary information-theoretic behaviours of systems that maintain a separation from their environment. A core postulate of the theory is that complex systems can be seen as performing variational Bayesian inference and minimizing an information-theoretic quantity called the variational free energy. The thesis is structured into three independent sections. Firstly, we focus on predictive coding, a neurobiologically plausible process theory derived from the free energy principle which argues that the primary function of the brain is to minimize prediction errors, showing how predictive coding can be scaled up and extended to be more biologically plausible, and elucidating its close links with other methods such as Kalman Filtering. Secondly, we study active inference, a neurobiologically grounded account of action through variational message passing, and investigate how these methods can be scaled up to match the performance of deep reinforcement learning methods. We additionally provide a detailed mathematical understanding of the nature and origin of the information-theoretic objectives that underlie exploratory behaviour. Finally, we investigate biologically plausible methods of credit assignment in the brain. We first demonstrate a close link between predictive coding and the backpropagation of error algorithm. We go on to propose novel and simpler algorithms which allow for backprop to be implemented in purely local, biologically plausible computations.