Assessing the pilot’s cognitive state is of increasing importance in aviation, especially for the development of adaptive assistance systems. For this purpose, the assessment of mental workload (MWL) is of special interest as an indication of when and how to adapt the automation to fit the pilot’s current needs. Thus, there is a need to assess the pilot continuously, objectively and non-intrusively. Neurophysiological measurements like electroencephalography (EEG) and functional near-infrared spectroscopy (fNIRS) are promising candidates for such an assessment. Yet, there is evidence that EEG- and fNIRS-based MWL measures are susceptible to influences from other concepts like mental fatigue (MF), and decrease in accuracy when MWL and MF are confounded. Still, only a few studies have targeted this problem, and no systematic investigation of it has taken place. Thus, the validity of neurophysiological MWL measures is not yet clear. In order to undertake such a systematic investigation, I conducted three studies: one experiment in which I investigated the effects of increasing MWL on cortical activation when MF is controlled for; a second experiment in which I examined the effects of increasing MF on cortical activation when MWL is controlled for; and a further comparative analysis of the gathered data. In order to induce MWL and MF in a controllable and comparable fashion, I conceived and used a simplified simulated flight task with an incorporated adapted n-back and monitoring task. I used a concurrent EEG-fNIRS measurement to gain neurophysiological data, and collected performance data and self-reported MWL and MF. In the first study (N = 35), I induced four different levels of MWL by increasing the difficulty of the n-back task, and controlled for MF by means of randomization and a short task duration (≤ 45 minutes).
Higher task difficulty elicited higher subjective MWL ratings, declining performance, increased frontal theta band power and decreased frontal deoxyhaemoglobin (HbR) concentration. Furthermore, fNIRS proved more sensitive to tasks with low difficulty, and EEG to tasks with high difficulty. Only the combination of both methods was able to discriminate all four induced MWL levels. Thus, frontal theta band power and HbR were sensitive to changing MWL. In the second study (N = 31), I induced MF by means of time on task. To this end, I prolonged the task duration to approx. 90 minutes, and controlled for MWL by using a low but constant task difficulty derived from the first experiment. Over the course of the experiment, the participants’ subjective MF increased linearly, but their performance remained stable. In the EEG data, there was an early increase and levelling in parietal alpha band power and a slower, but steady increase in frontal theta band power. The fNIRS data did not show a consistent trend in any direction with increasing MF. Thus, only parietal alpha and frontal theta band power were sensitive to changing MF. In the third study, I investigated the validity of two EEG indices commonly used for MWL assessment, the Task Load Index (TLI) and the Engagement Index (EI). I computed the indices from the data of the two experiments, and compared the results between the datasets, and to single band powers. The TLI increased with increasing MWL, but was less sensitive than theta band power alone, and varied slightly with increasing MF. The EI did not vary with MWL, and was not sensitive to gradually increasing MF. Thus, neither index could be considered a valid MWL measure. In sum, neurophysiological measures can be used to assess changes in MWL. Yet, frontal HbR was the only measure sensitive to MWL that did not also vary with MF, and further research is needed to determine whether this finding holds true under different task characteristics.
Thus, the tested EEG and fNIRS measures are only valid indications of MWL when confounding effects of MF are explicitly controlled for. I discuss further influences on the tested EEG and fNIRS measures, possible combinations with other data sources, and practical challenges for a neurophysiological MWL assessment. I conclude that neurophysiological measures should be used carefully outside the laboratory, as their validity will likely suffer in realistic settings. When their limitations are understood and respected, they can help to understand the cognitive processes involved in MWL, and can be a valuable addition to an MWL assessment.
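
For illustration, the two EEG indices examined in the third study are commonly described as ratios of band powers: the TLI as frontal theta power divided by parietal alpha power, and the EI as beta power divided by the sum of alpha and theta power. The following is a minimal sketch under these assumptions, not the analysis pipeline used in the studies. The band limits (theta 4 to 8 Hz, alpha 8 to 13 Hz, beta 13 to 30 Hz), the plain periodogram estimate, and all function names are hypothetical choices made for this example only.

```python
import numpy as np

# Assumed band limits in Hz; the abstract does not specify them.
THETA, ALPHA, BETA = (4.0, 8.0), (8.0, 13.0), (13.0, 30.0)

def band_power(signal, fs, band):
    """Power of a 1-D signal within [low, high) Hz, from a plain periodogram.

    The absolute scaling is arbitrary; it cancels in the ratios below.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                                  # remove DC offset
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    psd = np.abs(np.fft.rfft(x)) ** 2 / (fs * x.size)
    mask = (freqs >= band[0]) & (freqs < band[1])
    return psd[mask].sum() * (freqs[1] - freqs[0])

def task_load_index(frontal, parietal, fs):
    """Assumed TLI definition: frontal theta power / parietal alpha power."""
    return band_power(frontal, fs, THETA) / band_power(parietal, fs, ALPHA)

def engagement_index(signal, fs):
    """Assumed EI definition: beta / (alpha + theta) on a single channel."""
    theta = band_power(signal, fs, THETA)
    alpha = band_power(signal, fs, ALPHA)
    beta = band_power(signal, fs, BETA)
    return beta / (alpha + theta)

if __name__ == "__main__":
    # Synthetic stand-in data: pure sinusoids, not real EEG.
    fs = 250
    t = np.arange(0, 20, 1.0 / fs)
    frontal = np.sin(2 * np.pi * 6 * t)               # theta-band component
    parietal = np.sin(2 * np.pi * 10 * t)             # alpha-band component
    print("TLI:", task_load_index(frontal, parietal, fs))
    print("TLI, doubled theta amplitude:",
          task_load_index(2 * frontal, parietal, fs))
```

Because both indices are ratios, a change in either the numerator or the denominator moves them. This is one way to see why an index built from theta and alpha power can respond to MF as well as MWL when both bands vary with time on task.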