Erdem Akagündüz
Middle East Technical University | METU · Graduate School of Informatics
PhD
Publications: 60
Reads: 7,883
Citations: 523
Publications (60)
Assessing seismic hazards and thereby designing earthquake-resilient structures or evaluating structural damage that has been incurred after an earthquake are important objectives in earthquake engineering. Both tasks require critical evaluation of strong ground motion records, and the knowledge of site conditions at the earthquake stations plays a...
Plant health can be monitored dynamically using multispectral sensors that measure Near-Infrared reflectance (NIR). Despite this potential, obtaining and annotating high-resolution NIR images poses a significant challenge for training deep neural networks. Typically, large networks pre-trained on the RGB domain are utilized to fine-tune infrared im...
In this survey, we compile a list of publicly available infrared image and video sets for artificial intelligence and computer vision researchers. We mainly focus on IR image and video sets, which are collected and labelled for computer vision applications such as object detection, object segmentation, classification, and motion detection. We categ...
Tracking objects can be a difficult task in computer vision, especially when faced with challenges such as occlusion, changes in lighting, and motion blur. Recent advances in deep learning have shown promise in challenging these conditions. However, most deep learning-based object trackers only use visible band (RGB) images. Thermal infrared electr...
In this paper, an LSTM autoencoder-based architecture is utilized for drowsiness detection with ResNet-34 as feature extractor. The problem is considered as anomaly detection for a single subject; therefore, only the normal driving representations are learned and it is expected that drowsiness representations, yielding higher reconstruction losses,...
Drone detection has become an essential task in object detection as drone costs have decreased and drone technology has improved. It is, however, difficult to detect distant drones when there is weak contrast, long range, and low visibility. In this work, we propose several sequence classification architectures to reduce the detected false-positive...
Atmospheric turbulence has a degrading effect on the image quality of long-range observation systems. As a result of various elements such as temperature, wind velocity, humidity, etc., turbulence is characterized by random fluctuations in the refractive index of the atmosphere. It is a phenomenon that may occur in various imaging spectra such as t...
In this survey, we compile a list of publicly available infrared image and video sets for artificial intelligence and computer vision researchers. We mainly focus on IR image and video sets which are collected and labelled for computer vision applications such as object detection, object segmentation, classification, and motion detection. We catego...
Semantic segmentation is the pixel-wise labeling of an image. Boosted by the extraordinary ability of convolutional neural networks (CNN) in creating semantic, high-level and hierarchical image features; several deep learning-based 2D semantic segmentation approaches have been proposed within the last decade. In this survey, we mainly focus on the...
Forests can be efficiently monitored by automatic semantic segmentation of trees using satellite and/or aerial images. Still, several challenges can make the problem difficult, including the varying spectral signature of different trees, lack of sufficient labelled data, and geometrical occlusions. In this paper, we address the tree segmentation pr...
In this paper, we investigate the parameter identification problem in dynamical systems through a deep learning approach. Focusing mainly on second-order, linear time-invariant dynamical systems, the topic of damping factor identification is studied. By utilizing a six-layer deep neural network with different recurrent cells, namely GRUs, LSTMs or...
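For orientation, the damping factor of an underdamped second-order LTI system can be recovered from two successive peaks of its free response. The sketch below uses the classical logarithmic-decrement formula, not the recurrent network described in the abstract; the function names, the initial conditions (released from rest at `x0`), and the example values are assumptions made for illustration.

```python
import math


def free_response(zeta, omega_n, t, x0=1.0):
    """Displacement of an underdamped second-order LTI system.

    Assumes x(0) = x0, zero initial velocity, and 0 < zeta < 1.
    """
    omega_d = omega_n * math.sqrt(1.0 - zeta ** 2)  # damped natural frequency
    envelope = x0 * math.exp(-zeta * omega_n * t)
    return envelope * (
        math.cos(omega_d * t) + (zeta * omega_n / omega_d) * math.sin(omega_d * t)
    )


def damping_from_log_decrement(peak1, peak2):
    """Estimate the damping factor from two successive positive peaks."""
    delta = math.log(peak1 / peak2)  # logarithmic decrement
    return delta / math.sqrt(4.0 * math.pi ** 2 + delta ** 2)


# Hypothetical example values: zeta = 0.1, omega_n = 2 rad/s.
zeta_true, omega_n = 0.1, 2.0
period = 2.0 * math.pi / (omega_n * math.sqrt(1.0 - zeta_true ** 2))
zeta_est = damping_from_log_decrement(
    free_response(zeta_true, omega_n, 0.0),
    free_response(zeta_true, omega_n, period),
)
```

A learned estimator such as the one the abstract describes would presumably be aimed at cases where clean successive peaks are not available, for example noisy or short records.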
Memorability of an image is a characteristic determined by the human observers' ability to remember images they have seen. Yet recent work on image memorability defines it as an intrinsic property that can be obtained independent of the observer. The current study aims to enhance our understanding and prediction of image memorability, improving upo...
In this paper, we introduce a machine learning approach to the problem of infrared small target detection filter design. For this purpose, similarly to a convolutional layer of a neural network, the normalized-cross-correlational (NCC) layer, which we utilize for designing a target detection/recognition filter bank, is proposed. By employing the NC...
We propose a real-time image matching framework, which is hybrid in the sense that it uses both hand-crafted features and deep features obtained from a well-tuned deep convolutional network. The matching problem, which we concentrate on, is specific to a certain application, that is, printing design to product photo matching. Printing designs are a...
Semantic segmentation is the pixel-wise labelling of an image. Since the problem is defined at the pixel level, determining image class labels only is not acceptable, but localising them at the original image pixel resolution is necessary. Boosted by the extraordinary ability of convolutional neural networks (CNN) in creating semantic, high level a...
Memorability of an image is a characteristic determined by the human observers' ability to remember images they have seen. Yet recent work on image memorability defines it as an intrinsic property that can be obtained independent of the observer. The current study aims to enhance our understanding and prediction of image memorability, improving up...
A novel method to detect human falls in depth videos is presented in this paper. A fast and robust shape sequence descriptor, namely the Silhouette Orientation Volume (SOV), is used to represent actions and classify falls. The SOV descriptor provides high classification accuracy even with a combination of simple associated models such as Bag-of-Wor...
The VISCHEMA dataset was created using a subset of the SUN dataset that was used for image memorability tests at the University of York. The images are from the SUN image database, available at https://groups.csail.mit.edu/vision/SUN/. The compressed image set used in the experiments is provided in the file VISCHEMA SUN folders.zip. The selected memorable are p...
Our complete MATLAB code which uses the VMS selection to improve the performance of predicting image memorability can be found here (zip file). For explanations on the code please check README.txt. https://www.cs.york.ac.uk/vischema/
The subject of this paper is the visual object tracking in infrared (IR) videos. Our contribution is twofold. First, the performance behaviour of the state-of-the-art trackers is investigated via a comparative study using IR-visible band video conjugates, i.e., video pairs captured observing the same scene simultaneously, to identify the IR specifi...
In this study, a civilian ship dataset is constructed via images captured by an infrared camera on an unmanned flying vehicle. By using this dataset and synchronized inertial data (UAV altitude and orientation, gimbal angles), a vessel classification method is proposed. The method first calculates the ship base length in meters by using se...
In this study, a scale-invariant representation for closed planar curves (silhouettes) is proposed. The orientations of all points within the Gaussian scale-space of the curve are extracted. This orientation scale-space is used to create the silhouette orientation image in which the positions of each pixel indicate the curve's pixel positions and s...
In this study, a feature extractor and a global descriptor for closed planar curves, i.e. silhouettes, are proposed. Initially, the closed curve is arc-length sampled and the Gaussian scale-space is constructed. Using the absolute curvature values and orientations of the curves within the higher scale levels, scale invariant features are obtained....
This paper summarizes the developed real-time algorithm for registering subsequent video frames and experiments performed on an embedded processor. The features extracted from subsequent frames are matched using a RANSAC technique and frames are registered with an affine transformation model. The techniques used in the literature are improved to wo...
This study presents an open and interoperable maritime surveillance framework with multimodal sensor networks and automated decision-making. The intention is to improve sea-border control, plugging the gaps in maritime security with interoperability solutions, and to provide wide-area situational awareness, thus in particular reducing the number of il...
Biomechanical modeling of soft tissue is a complex problem for achieving realistic surgical simulations, surgical planning, and scientific analysis. In the literature, three categories of biomechanical models are mainly used for dealing with this problem: spline-based models, spring models, and finite element models (FEMs). Among these, spline base...
3D object recognition is performed using a scale and orientation invariant feature extraction method and a scale and orientation invariant topological representation. 3D surfaces are represented by sparse, repeatable, informative and semantically meaningful 3D surface structures, which are called multiscale features. These features are extracted wi...
The shape of the face can be estimated before the surgery by using 3-dimensional computer programs that provide tools to guide skull modifications. The aim of this study was to present the dynamic volume spline method to predict facial soft tissue changes after the modification of the skull associated with orthognathic surgery.
Soft tissue volume i...
A generic, transform invariant 3D facial feature detection method based on mean (H) and Gaussian (K) curvature analysis is proposed. A scale space of the HK values is constructed differently from the previous HK attempts. The 3D features are extracted from this scale space and used in a global topology, which is trained with a Gaussian model using...
Human facial soft tissue is modelled by a dynamic volume spline and the parameters of this model are estimated from the actual human facial skin properties by the optimization-based inverse solution. In the literature, various compressive force–displacement experiments are conducted and the parameters of the proposed models are estimated from the a...
Soft tissue models are of many types, but each type has its own problems in terms of realistic appearance and computational efficiency. In this study, we propose a hybrid model based on dynamic volume splines which combine the advantages of these models. To simulate a realistic appearance, biomechanical properties of the real tissue are obtained by...
Although they are orientation invariant, mean (H) and Gaussian (K) curvature values are essentially variant under scale and resolution changes. In order to overcome this fact, in this study, scale-spaces of the 3D surface and the curvature values are constructed. Then features with their scale information are sought within the scale-space. Thus, di...
Using mean curvature (H) and Gaussian curvature (K) values or shape index (S) and curvedness (C) values, HK and SC curvature spaces are constructed in order to classify surface patches into types such as pits, peaks, saddles etc. Since both HK and SC curvature spaces classify surface patches into similar types, their classification capabilities ar...
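As a rough illustration of HK classification, a surface patch can be labelled from the signs of its mean and Gaussian curvature. The sketch below assumes one common sign convention (negative H for peaks, positive H for pits) and a hypothetical zero tolerance `eps`; the function name and label strings are illustrative and are not taken from the publication.

```python
def classify_hk(h, k, eps=1e-6):
    """Label a surface patch from mean (h) and Gaussian (k) curvature signs.

    Assumed convention: h < 0 with k > 0 is a peak, h > 0 with k > 0 is a pit.
    Values with magnitude below eps are treated as zero.
    """

    def sign(value):
        if abs(value) < eps:
            return 0
        return 1 if value > 0 else -1

    labels = {
        (-1, 1): "peak",
        (-1, 0): "ridge",
        (-1, -1): "saddle ridge",
        (0, 0): "flat",
        (0, -1): "minimal surface",
        (1, 1): "pit",
        (1, 0): "valley",
        (1, -1): "saddle valley",
    }
    # h == 0 with k > 0 cannot occur on a real surface, since k <= h**2.
    return labels.get((sign(h), sign(k)), "undefined")
```

With the opposite surface-normal orientation the peak and pit (and ridge and valley) labels swap, so the convention should be checked against the data at hand.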
Most 3D object recognition methods use mean-Gaussian curvatures (HK) or shape index-curvedness (SC) values for classification. Although these two curvature descriptions classify objects into the same categories, their mathematical definitions vary. In this study a comparison between the two curvature descriptions is carried out for the purpose of 3D obje...
The data acquired by 3D face scanners have distortions such as spikes, holes and noise. Enhancement of 3D face data by removing these distortions while keeping the face features is important for the applications using these data. In this study, thresholding is used for removing spikes, thresholding together with face symmetry is used for hole filli...
Using transform invariant 3D features obtained from a database of 3D range images, geometric hashing is applied for the purpose of 3D object recognition. Mean (H) and Gaussian (K) curvature values within a scale-space of the surface are used. Since H and K values are used and a scale-space of the surface is constructed, the method is independent of t...
Finding slope units for a given watershed is an important task before analyzing landslide susceptibility. Usually, slope unit generation requires integration of several digital elevation model (DEM)-based outputs obtained from GIS and related hydrological software. Therefore, it is time consuming due to involved steps and compilation of various sof...
In this paper, the 3D face scanner that we developed using stereo cameras and structured light together is presented. Structured light having a pattern of vertical lines is used to create feature points and to match them easily. 3D point cloud obtained by stereo analysis is post processed to obtain the 3D model in obj format.
In this study a representation using scale and invariant generic 3D features, for 3D facial models is proposed. These generic feature vectors obtained from descriptive parts of the face like eyes, nose, or nose saddle, are then convolved into a graphical model where a characteristic topology for a 3D facial model representation is achieved. These s...
In this paper we present a physically-based 3D facial skin model based on biomechanical properties of human facial anatomy. The skin model is mainly formed of two DNURBS surfaces as a two layered membrane structure. The model represents the soft tissue layer which covers the skull and is attached to the skull via muscle insertion points. The model...
An algorithm is proposed for 3D object representation using generic 3D features which are transformation and scale invariant. Descriptive 3D features and their relations are used to construct a graphical model for the object which is later trained and then used for detection purposes. Descriptive 3D features are the fundamental structures which are...
In this paper, a scale and orientation invariant feature representation for 2.5 D objects is introduced, which may be used to classify, detect and recognize objects even under the cases of cluttering and/or occlusion. With this representation a 2.5D object is defined by an attributed graph structure, in which the nodes are the pit and peak regions...
An algorithm is proposed to extract transformation and scale invariant 3D fundamental elements from the surface structure of 3D range scan data. The surface is described by mean and Gaussian curvature values at every data point at various scales and a scale-space search is performed in order to extract the fundamental structures and to estimate the...
3D face modeling based on real images is one of the important subjects of Computer Vision that has been studied recently. In this paper, the study that we conducted in our Computer Vision and Intelligent Systems Research Laboratory on 3D face model generation using uncalibrated multiple still images is explained.
In this thesis, 3D animation of human facial expressions and lip motion and their synchronization with a Turkish Speech engine using JAVA programming language, JAVA3D API and Java Speech API, is analyzed. A three-dimensional animation model for simulating Turkish lip motion and facial expressions is developed. In addition to lip motion, synchroniza...