The fusion of image data from trans-esophageal echography (TEE) and X-ray fluoroscopy is attracting increasing interest in minimally-invasive treatment of structural heart disease. In order to calculate the needed transformation between both imaging systems, we employ a discriminative learning (DL) based approach to localize the TEE transducer in X-ray images. The successful application of DL methods is strongly dependent on the available training data, which entails three challenges: (1) the transducer can move with six degrees of freedom meaning it requires a large number of images to represent its appearance, (2) manual labeling is time consuming, and (3) manual labeling has inherent errors. This paper proposes to generate the required training data automatically from a single volumetric image of the transducer. In order to adapt this system to real X-ray data, we use unlabeled fluoroscopy images to estimate differences in feature space density and correct covariate shift by instance weighting. Two approaches for instance weighting, probabilistic classification and Kullback-Leibler importance estimation (KLIEP), are evaluated for different stages of the proposed DL pipeline. An analysis on more than 1900 images reveals that our approach reduces detection failures from 7.3% in cross validation on the test set to zero and improves the localization error from 1.5 to 0.8mm. Due to the automatic generation of training data, the proposed system is highly flexible and can be adapted to any medical device with minimal efforts.