Takafumi Aoki’s research while affiliated with Tohoku University and other places


Publications (355)


Multibiometrics Using a Single Face Image
  • Conference Paper

December 2024 · 2 Reads

Taito Tonosaki · Takafumi Aoki · [...] · Masakatsu Nishigaki



Fig. 1. OCT images in the World's Largest Online Annotated SD-OCT dataset [4] before and after flattening, where the yellow, green, and blue lines indicate the inner limiting membrane (ILM), the inner aspect of the retinal pigment epithelium drusen complex (IRPE), and the outer aspect of Bruch's membrane (OBM), respectively: (a) and (b) are from healthy subjects, and (c) and (d) are from age-related macular degeneration (AMD) patients.
Fig. 2. Overview of FDDA using only the first-order shift ∆1(n2) with a(1) = 1.
Fig. 3. Examples of applying FDDA to OCT images in MSHC, where each of the zero-order, first-order, and second-order shifts is applied to the input image for simplicity: (a) the zero-order shift ∆0(n2), (b) the first-order shift ∆1(n2), (c) the second-order shift ∆2(n2), and (d) the combined shift ∆(n2). An example of applying RandomAffine is also shown for comparison. Colored lines on each image indicate the annotated boundaries between the retinal layers.
Fig. 4. Examples of applying PRLC to OCT images in MSHC, where the red dashed box indicates the pasted retinal layer area. An example of applying CutMix is also shown for comparison. Colored lines on each image indicate the annotated boundaries between the retinal layers.
Fig. 5. Detection results of the retinal layer boundaries from the OCT images of MSHC using each method. The red dotted circles indicate regions with large errors.


Formula-Driven Data Augmentation and Partial Retinal Layer Copying for Retinal Layer Segmentation
  • Preprint
  • File available

October 2024 · 7 Reads

Major retinal layer segmentation methods for OCT images assume that the retina has been flattened in advance, and thus cannot always deal with retinas whose structure has changed due to ophthalmopathy and/or that are curved due to myopia. To eliminate the flattening step and make such methods practical, we propose novel data augmentation methods for OCT images. Formula-driven data augmentation (FDDA) emulates a variety of retinal structures by vertically shifting each column of an OCT image according to a given mathematical formula. We also propose partial retinal layer copying (PRLC), which copies a part of the retinal layers and pastes it into a region outside the retinal layers. Through experiments using the OCT MS and Healthy Control dataset and the Duke Cyst DME dataset, we demonstrate that FDDA and PRLC make it possible to detect the boundaries of retinal layers without flattening, even with segmentation methods that assume a flattened retina.
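The abstract describes FDDA as a per-column vertical shift driven by a mathematical formula (the figure captions above mention zero-, first-, and second-order shifts ∆0, ∆1, ∆2) and PRLC as copying retinal-layer pixels into a region outside the layers. The following NumPy sketch is one reading of those two ideas, not the authors' code: the polynomial displacement, coefficient ranges, wrap-around shift, and the paste-above-the-retina rule are all assumptions.

```python
import numpy as np

def fdda(image, max_coeff=(20.0, 0.5, 0.002), rng=None):
    """Formula-driven data augmentation (sketch): shift each column n2 of an
    OCT B-scan vertically by delta(n2) = a0 + a1*n2 + a2*n2**2, emulating
    tilted/curved retinal structures.  Coefficients are drawn at random;
    np.roll wraps around, where the paper may instead pad (assumption)."""
    rng = rng or np.random.default_rng()
    h, w = image.shape[:2]
    a = [rng.uniform(-m, m) for m in max_coeff]
    n2 = np.arange(w)
    delta = np.rint(a[0] + a[1] * n2 + a[2] * n2 ** 2).astype(int)
    out = np.empty_like(image)
    for col in range(w):
        out[:, col] = np.roll(image[:, col], delta[col], axis=0)
    return out

def prlc(image, layer_mask, patch_w=64, rng=None):
    """Partial retinal layer copying (sketch): copy a slab of the retinal-layer
    region (given by a binary mask) and paste it into the background above the
    retina, i.e., outside the retinal layers."""
    rng = rng or np.random.default_rng()
    h, w = image.shape[:2]
    x0 = int(rng.integers(0, w - patch_w))
    rows = np.where(layer_mask[:, x0:x0 + patch_w].any(axis=1))[0]
    if rows.size == 0:
        return image  # no retinal layers under the chosen columns
    top, bot = int(rows.min()), int(rows.max())
    patch = image[top:bot + 1, x0:x0 + patch_w]
    ph = patch.shape[0]
    if top <= ph:
        return image  # not enough background above the retina to paste into
    y0 = int(rng.integers(0, top - ph))
    out = image.copy()
    out[y0:y0 + ph, x0:x0 + patch_w] = patch
    return out
```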


Multibiometrics Using a Single Face Image

September 2024 · 5 Reads

Multibiometrics, which uses multiple biometric traits instead of a single trait to authenticate individuals, has been investigated to improve recognition performance. Previous studies have combined individually acquired biometric traits or have not fully considered the convenience of the system. Focusing on a single face image, we propose a novel multibiometric method that combines five biometric traits, i.e., face, iris, periocular, nose, and eyebrow, all extracted from a single face image. The proposed method does not sacrifice the convenience of biometrics, since only a single face image is used as input. Through a variety of experiments using the CASIA Iris Distance database, we demonstrate the effectiveness of the proposed multibiometric method.
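The abstract does not state how the five per-trait comparisons are combined; a common baseline is score-level fusion of per-trait similarities. Below is a minimal sketch under that assumption (trait names from the abstract; weights and threshold are purely illustrative, not the paper's values):

```python
import numpy as np

def cosine_score(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def fused_score(probe: dict, gallery: dict, weights: dict) -> float:
    """Weighted-sum score fusion over the traits present in `weights`."""
    return sum(w * cosine_score(probe[t], gallery[t]) for t, w in weights.items())

# Illustrative weights; the paper's actual fusion rule may differ.
weights = {"face": 0.4, "iris": 0.2, "periocular": 0.2, "nose": 0.1, "eyebrow": 0.1}

# Toy usage with random 512-d embeddings standing in for per-trait CNN features.
rng = np.random.default_rng(0)
probe = {t: rng.standard_normal(512) for t in weights}
gallery = {t: rng.standard_normal(512) for t in weights}
accept = fused_score(probe, gallery, weights) >= 0.35  # threshold is illustrative
```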


Face image de-identification based on feature embedding

September 2024 · 29 Reads · EURASIP Journal on Image and Video Processing

A large number of images are available on the Internet with the growth of social networking services, and many of them are face photos or contain faces. It is necessary to protect the privacy of face images to prevent their malicious use; face image de-identification techniques make face recognition difficult and thereby prevent the collection of specific face images via face recognition. In this paper, we propose a face image de-identification method that generates a de-identified image from an input face image by embedding facial features extracted from the face image of another person into the input image. We develop a novel framework for embedding facial features into a face image, together with loss functions based on images and features, to de-identify a face image while preserving its appearance. Through a set of experiments using public face image datasets, we demonstrate that the proposed method exhibits higher de-identification performance against unknown face recognition models than conventional methods while preserving the appearance of the input face images.
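As a rough illustration of the embedding idea (an encoder-decoder that injects a donor's features into the input image, trained with an image loss and a feature loss), here is a minimal PyTorch sketch. The architecture, loss weights, and the `feat_extractor` placeholder (any frozen face recognition model) are assumptions, not the paper's design:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureEmbedder(nn.Module):
    """Sketch: encode the input face, fuse a donor feature vector into the
    bottleneck, and decode a de-identified image of the same size."""
    def __init__(self, feat_dim: int = 512):
        super().__init__()
        self.img_enc = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.feat_proj = nn.Linear(feat_dim, 64)
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(128, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Tanh())

    def forward(self, img, donor_feat):
        z = self.img_enc(img)                             # B x 64 x H/4 x W/4
        f = self.feat_proj(donor_feat)[:, :, None, None]  # B x 64 x 1 x 1
        f = f.expand(-1, -1, z.size(2), z.size(3))        # broadcast spatially
        return self.dec(torch.cat([z, f], dim=1))

def deid_loss(out, img, feat_extractor, donor_feat, alpha=1.0, beta=0.1):
    """Image loss preserves appearance; feature loss pulls the recognizer's
    embedding of the output toward the donor's features."""
    img_loss = F.l1_loss(out, img)
    feat_loss = 1 - F.cosine_similarity(feat_extractor(out), donor_feat).mean()
    return alpha * img_loss + beta * feat_loss
```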



FSErasing: Improving Face Recognition with Data Augmentation Using Face Parsing

June 2024 · 148 Reads

We propose original semantic labels for detailed face parsing to improve the accuracy of face recognition by focusing on parts of the face. The part labels used in conventional face parsing are defined based on biological features, so one label covers a large region, such as the skin. Our semantic labels are defined by separating large-area parts based on the structure of the face, and by distinguishing the left and right sides of every part to account for head pose changes, occlusion, and other factors. Exploiting this ability to assign detailed part labels to face images, we propose a novel data augmentation method based on detailed face parsing, called Face Semantic Erasing (FSErasing), to improve the performance of face recognition. FSErasing randomly masks a part of the face image according to the detailed part labels, so erasing-type data augmentation can be applied in a way that respects the characteristics of the face. Through experiments using public face image datasets, we demonstrate that FSErasing is effective for improving the performance of face recognition and face attribute estimation. In face recognition, adding FSErasing when training ResNet-34 with Softmax on CelebA improves the average accuracy by 0.354 points and the average equal error rate (EER) by 0.312 points; with ArcFace, the average accuracy and EER improve by 0.752 and 0.802 points, respectively. Training ResNet-50 with Softmax on CASIA-WebFace improves the average accuracy by 0.442 points and the average EER by 0.452 points; with ArcFace, the average accuracy and EER improve by 0.228 and 0.500 points, respectively. In face attribute estimation, adding FSErasing as a data augmentation method when training on CelebA improves the estimation accuracy by 0.54 points. We also apply our detailed face parsing model to visualize face recognition models and demonstrate that it offers higher explainability than general visualization methods.
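A minimal sketch of how FSErasing-style augmentation could look, based only on the abstract's description (randomly mask one part according to detailed part labels); the label conventions, the random-noise fill, and the application probability are assumptions:

```python
import numpy as np

def fs_erasing(image: np.ndarray, part_labels: np.ndarray, p: float = 0.5,
               rng=None) -> np.ndarray:
    """Erase one randomly chosen face part (H x W x C uint8 image, H x W
    integer part-label map; label 0 is assumed to be background)."""
    rng = rng or np.random.default_rng()
    if rng.random() > p:
        return image                      # apply with probability p
    labels = np.unique(part_labels)
    labels = labels[labels != 0]          # skip background
    if labels.size == 0:
        return image
    target = rng.choice(labels)
    mask = part_labels == target
    out = image.copy()
    # Fill the erased part with random pixel values, as in Random Erasing.
    out[mask] = rng.integers(0, 256, size=(int(mask.sum()), image.shape[2]),
                             dtype=image.dtype)
    return out
```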



Face Image De-identification Based on Feature Embedding

April 2024 · 83 Reads



Citations (64)


... Others try to optimize for photometric consistency across views. For instance, Ito et al. [19] aim to densify sparse 3D information from MVS depth maps by leveraging photometric consistency. This allows missing data to be filled in using a NeRF-based optimization scheme. ...

Reference:

Refinement of Monocular Depth Maps via Multi-View Differentiable Rendering
Neural Radiance Field-Inspired Depth Map Refinement for Accurate Multi-View Stereo

Journal of Imaging

... Ng et al. proposed CMB-Net, which combines face and periocular features [9]. In addition, there are other methods that combine the face, left iris, and right iris [10], or the periocular region and iris [11], [12]. Since CNNs can extract features with high discriminative performance and have improved the recognition accuracy of single biometric traits, multibiometrics using CNNs has also improved in recognition accuracy. ...

Eye Biometrics Combined with Periocular and Iris Recognition Using CNN
  • Citing Conference Paper
  • October 2023

... This section describes a de-identification method for face images by embedding facial features of other persons into the face images. The proposed method is inspired by deep steganography [30,31], which generates a stego image by embedding another image into the input image, i.e., cover image, while preserving the appearance of the input image. Face images de-identified by the proposed method have high image quality since the face images are not perturbed like AEs. Figure 3 illustrates the overview of the proposed method, which is used in the inference phase. ...

Cancelable Face Recognition Using Deep Steganography

IEEE Transactions on Biometrics, Behavior, and Identity Science

... By using multiple low-cost visible light cameras from different views to simultaneously capture images of the same object and fusing the image information through 3D reconstruction algorithms, it is possible to obtain the 3D contour of the object, including surface texture and color information. Multi-view 3D reconstruction technology is widely used in industrial production [18][19][20][21], medical treatment [22], cultural heritage protection [23][24][25], and other fields [26,27]. This study aims to apply multi-view 3D reconstruction to scan and reconstruct complete 3D contours of coarse aggregate particles. ...

PM-MVS: PatchMatch multi-view stereo

Machine Vision and Applications

... Reflections and lighting conditions may obscure glasses or cause misleading highlights, resulting in occlusions that reduce detection accuracy [57,58]. These factors create a difficult environment for traditional and deep learning models to accurately localize and identify eyewear [59], highlighting the need for a dataset that can account for these variables [60,61]. ...

Eyeglass Frame Segmentation for Face Image Processing
  • Citing Conference Paper
  • November 2022

... Next, [47] and [48] extend the application of vision transformers to zero-shot anti-spoofing and data augmentation, respectively, also achieving state-of-the-art performance. Lastly, [29] and [49] both report significant improvements in accuracy and reduced equal error rates using transformer-based models. These studies collectively highlight the potential of vision transformers in enhancing the security of face recognition systems. ...

Spoofing Attack Detection in Face Recognition System Using Vision Transformer with Patch-wise Data Augmentation
  • Citing Conference Paper
  • November 2022

... Similarly, the center of the nose and the center of the left eyebrow are used to obtain the nose and eyebrow images for nose and eyebrow recognition, respectively. The iris images are extracted from the periocular image using the method proposed by Kawakami et al. [13], as shown in Fig. 4. The details of the process are omitted due to space limitations. For more details, refer to [13]. ...

A Simple and Accurate CNN for Iris Recognition
  • Citing Conference Paper
  • November 2022

... Especially in low-texture scenes, reconstruction has been a challenging issue, which is significantly influenced by the extraction, description, and matching of feature points. In this regard, the SIFT method commonly used in SfM does not perform well, and an insufficient number of feature points will directly affect the reconstruction outcome [25]. In order to address this issue, Yang & Jiang [26] utilized films of spark patterns overlaid on the object's surface to increase the number of feature points. Hoshi et al. [25] combined SuperPoint, SuperGlue, and LoFTR from deep learning, using SuperPoint+SuperGlue to process regular image areas and LoFTR to handle low-texture areas. ...

Accurate and Robust Image Correspondence for Structure-From-Motion and its Application to Multi-View Stereo
  • Citing Conference Paper
  • October 2022

... The modest sample size and the need for further testing on a larger dataset before real-world application are acknowledged constraints. A newborn fingerprint identification technique using deep learning for fingerprint classification was described in [169]. The approach obtained 78.4% classification accuracy compared with hand classification, using a dataset consisting of 1,357 training images, 166 validation images, and 1,181 test images. ...

Fundamental Study of Neonate Fingerprint Recognition Using Fingerprint Classification
  • Citing Conference Paper
  • September 2022