This paper proposes a new method for the automatic evaluation of speech therapy exercises, intended as one part of a future software system for speech therapy support. The method is based on detecting lip and tongue movements, which helps to evaluate how well an exercise is performed. Four types of exercises are introduced, and the corresponding features, which capture the quality of the movements, are presented. The method was tested on manually annotated data, and the proposed features were evaluated and analyzed. In the second part, tongue detection based on a convolutional neural network approach is proposed, and preliminary results are shown.