
Alptekin Temizel
Middle East Technical University | METU · Graduate School of Informatics
Professor
Publications: 117
Reads: 43,892
Citations: 1,199
Introduction
Dr. Alptekin Temizel is a Professor at the Graduate School of Informatics, Middle East Technical University.
B.Sc.: Electrical and Electronic Eng., METU, Turkey, 1999.
Ph.D.: CVSSP, University of Surrey, UK, 2006.
Research Interests: video surveillance, computer vision, machine learning, deep learning, GPU programming, CUDA.
Publications (117)
The introduction of RGB-D sensors, together with efforts on open-source point-cloud processing tools, has boosted research in both computer vision and robotics. One of the key areas that has drawn particular attention is object recognition, since it is a crucial step for various applications. In this paper, two spatially enhanced local 3D desc...
Coherent nature of crowd movement allows representing the crowd motion using sparse features. However, surveillance videos recorded at different periods of time are likely to have different crowd densities and motion characteristics. These varying scene properties necessitate use of different models for an effective representation of behaviour at different p...
The recently proposed TLD (Tracking-Learning-Detection) method has become a popular visual tracking algorithm as it was shown to provide promising long-term tracking results. On the other hand, the high computational cost of the algorithm prevents it from being used at higher resolutions and frame rates. In this paper, we describe the design and implem...
Mean-shift tracking plays an important role in computer vision applications because of its robustness, ease of implementation and computational efficiency. In this study, a fully automatic multiple-object tracker based on mean-shift algorithm is presented. Foreground is extracted using a mixture of Gaussian followed by shadow and noise removal to i...
When directly applied to images with different scales, scale invariant feature transform (SIFT) matching performance decreases significantly. In this reported work, this phenomenon is demonstrated and a simple method to increase the performance of SIFT matching is proposed. The proposed method includes preprocessing the images before matching and i...
The usage of wearable devices in daily life has grown rapidly with the advancements in sensor technologies. These devices primarily rely on optical sensors to capture videos from an egocentric perspective, known as First Person Vision (FPV). FPV videos possess distinct characteristics compared to third-person videos, such as significant ego-motions...
The driving scene understanding task involves detecting static elements such as lanes, traffic signs, and traffic lights, and their relationships with each other. To facilitate the development of comprehensive scene understanding solutions using multiple camera views, a new dataset called Road Genome (OpenLane-V2) has been released. This dataset allows...
Rating a video based on its content is an important step for classifying video age categories. Movie content rating and TV show rating are the two most common rating systems established by professional committees. However, manually reviewing and evaluating scene/film content by a committee is tedious work and it becomes increasingly difficult wit...
Background: Assessment of endoscopic activity in ulcerative colitis (UC) is important for treatment decisions and monitoring disease progress. However, substantial inter- and intraobserver variability in grading impairs the assessment. Our aim was to develop a computer-aided diagnosis system using deep learning to reduce subjectivity and improve the...
Detection of small objects and objects far away in the scene is a major challenge in surveillance applications. Such objects are represented by a small number of pixels in the image and lack sufficient detail, making them difficult to detect using conventional detectors. In this work, an open-source framework called Slicing Aided Hyper Inference (SA...
In scoring systems used to measure the endoscopic activity of ulcerative colitis, such as the Mayo endoscopic score or the Ulcerative Colitis Endoscopic Index of Severity, levels increase with the severity of the disease activity. Such relative ranking among the scores makes it an ordinal regression problem. On the other hand, most studies use categorical cross-e...
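The ranking argument above can be illustrated with a small numeric sketch. The four-level scale and the probability vectors below are hypothetical values chosen for illustration, not data or the loss function from the paper. The sketch only shows that categorical cross-entropy assigns the same penalty to a near miss and a far miss, while a rank-aware measure separates them.

```python
import math


def cross_entropy(probs, true_level):
    """Categorical cross-entropy for one sample: -log p(true level)."""
    return -math.log(probs[true_level])


def expected_rank_error(probs, true_level):
    """Expected absolute distance between the predicted and the true level."""
    return sum(p * abs(level - true_level) for level, p in enumerate(probs))


# Hypothetical predictions over four ordered levels (0..3); the true level is 0.
near_miss = [0.1, 0.9, 0.0, 0.0]  # most probability mass one level away
far_miss = [0.1, 0.0, 0.0, 0.9]   # most probability mass three levels away

ce_near = cross_entropy(near_miss, 0)
ce_far = cross_entropy(far_miss, 0)
err_near = expected_rank_error(near_miss, 0)
err_far = expected_rank_error(far_miss, 0)

print(ce_near == ce_far)    # cross-entropy cannot tell the two apart
print(err_near < err_far)   # the rank-aware measure can
```

Both predictions put probability 0.1 on the true level, so the cross-entropy is identical even though one prediction is clinically much further off than the other.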
Drone detection has become an essential task in object detection as drone costs have decreased and drone technology has improved. It is, however, difficult to detect distant drones when there is weak contrast, long range, and low visibility. In this work, we propose several sequence classification architectures to reduce the detected false-positive...
Availability of large, diverse, and multi-national datasets is crucial for the development of effective and clinically applicable AI systems in the medical imaging domain. However, forming a global model by bringing these datasets together at a central location comes with various data privacy and ownership problems. To alleviate these proble...
Predicting transfer values of association football players, despite its importance, has been studied in a limited way in the literature. The existing approaches have mainly focused on explanatory models that cannot be used in predicting future values. In this paper, we propose a method where we fuse in-game performance data, player popularity metri...
Endoscopic Mayo score and Ulcerative Colitis Endoscopic Index of Severity are commonly used scoring systems for the assessment of endoscopic severity of ulcerative colitis. They are based on assigning a score in relation to the disease activity, which creates a rank among the levels, making it an ordinal regression problem. On the other hand, most...
Deep convolutional networks are prominently used in object detection tasks due to their notable performance. These networks typically have pooling layers following the convolution, which effectively subsample the convolution output, potentially introducing aliasing. An aliased signal emerging in the earlier layers inevitably propagates throughout...
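As a toy illustration of how subsampling can introduce aliasing, the sketch below uses a one-dimensional alternating signal. The helper names and the simple two-tap averaging filter are assumptions made for this example and are not taken from the paper.

```python
def subsample(signal, stride):
    """Keep every `stride`-th sample, as a strided pooling step would."""
    return signal[::stride]


def blur_pairs(signal):
    """Average each sample with its right neighbour (the last one wraps around)."""
    n = len(signal)
    return [(signal[i] + signal[(i + 1) % n]) / 2 for i in range(n)]


# Highest-frequency signal representable at this sampling rate.
signal = [1, -1] * 4

# Direct subsampling keeps only the +1 samples, so the oscillation
# shows up as a constant offset that was never in the input.
aliased = subsample(signal, 2)

# Low-pass filtering before subsampling removes the oscillation instead.
smoothed = subsample(blur_pairs(signal), 2)

print(aliased)
print(smoothed)
```

The first output is a constant non-zero signal, a frequency component that the original did not contain, whereas filtering first yields zeros.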
While exam-style questions are a fundamental educational tool serving a variety of purposes, manual construction of questions is a complex process that requires training, experience and resources. To reduce the expenses associated with the manual construction of questions and to satisfy the need for a continuous supply of new questions, automatic q...
Player performance evaluation is a challenging problem with multiple dimensions. Football (soccer) is the largest sports industry in terms of monetary value and it is paramount that teams can assess the performance of players for both financial and operational reasons. However, this is a difficult task, not only because performance differs from pos...
Deep Neural Networks have been shown to be vulnerable to various kinds of adversarial perturbations. In addition to widely studied additive noise based perturbations, adversarial examples can also be created by applying a per pixel spatial drift on input images. While spatial transformation based adversarial examples look more natural to human obse...
Triple DES (3DES) is a NIST and ISO/IEC standard block cipher that is also used in some web browsers and several electronic payment applications. We propose an optimized bit‐level parallelization of 3DES for GPU accelerated encryption to allow processing high volumes of data. Since the block size of 3DES is 64 bits, our approach considers a kernel...
Background: Multi-layered convolutional neural networks are artificial intelligence (AI) algorithms that can process specific datasets. The endoscopic Mayo score (EMS) is an endoscopic scoring tool for ulcerative colitis (UC) that is widely used for evaluating disease activity and planning further treatment. EMS is an endoscopist-dependent su...
Existing methods for egocentric activity recognition are mostly based on extracting motion characteristics from videos. On the other hand, the ubiquity of wearable sensors allows acquisition of information from different sources. Although the increase in sensor diversity brings out the need for adaptive fusion, most of the studies use pre-determined wei...
Computer-aided polyp detection is playing an increasingly more important role in the colonoscopy procedure. Although many methods have been proposed to tackle the polyp detection problem, their out-of-distribution test results, which is an important indicator of their clinical readiness, are not demonstrated. In this study, we propose an ensemble-b...
In this paper, we focus on latent modification and generation of 3D point cloud object models with respect to their semantic parts. Different to the existing methods which use separate networks for part generation and assembly, we propose a single end-to-end Autoencoder model that can handle generation and modification of both semantic parts, and g...
The Endoscopy Computer Vision Challenge (EndoCV) is a crowd-sourcing initiative to address eminent problems in developing reliable computer aided detection and diagnosis endoscopy systems and suggest a pathway for clinical translation of technologies. Whilst endoscopy is a widely used diagnostic and treatment tool for hollow-organs, there are sever...
Scarcity of training data is one of the prominent problems for deep networks, which require large amounts of data. Data augmentation is a widely used method to increase the number of training samples and their variations. In this paper, we focus on improving vehicle detection performance in aerial images and propose a generative augmentation method whi...
Binary convolutional networks have lower computational load and lower memory foot-print compared to their full-precision counterparts. So, they are a feasible alternative for the deployment of computer vision applications on limited capacity embedded devices. Once trained on less resource-constrained computational environments, they can be deployed...
High-performance computing of array signal processing problems is a critical task as real-time system performance is required for many applications. Noise subspace-based Direction-of-Arrival (DOA) estimation algorithms are popular in the literature since they provide higher angular resolution and higher robustness. In this study, we investigate var...
Triple DES (3DES) is a standard encryption algorithm used in several electronic payment applications and web browsers. In this paper, we propose a parallel implementation of 3DES on GPU. Since 3DES encrypts data in 64-bit blocks, our approach treats each 64-bit block as a kernel block and assigns a separate thread to process each bit....
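The mapping of one kernel block per 64-bit data block and one thread per bit can be sketched on the CPU as follows. This is only a sequential Python simulation of that indexing idea, applied to a single DES step (the standard initial permutation); it is not the authors' kernel and does not implement full 3DES. The function names are hypothetical.

```python
def build_initial_permutation():
    """Build the DES initial permutation table (1-based source bit positions)."""
    row_starts = [58, 60, 62, 64, 57, 59, 61, 63]
    return [start - 8 * k for start in row_starts for k in range(8)]


def invert(table):
    """Build the inverse of a 64-entry permutation table."""
    inverse = [0] * 64
    for out_pos, in_pos in enumerate(table):
        inverse[in_pos - 1] = out_pos + 1
    return inverse


def permute_bit(block_bits, table, thread_id):
    """Work of a single simulated thread: produce output bit `thread_id`."""
    return block_bits[table[thread_id] - 1]


def permute_blocks(blocks, table):
    """One simulated kernel block per data block, one simulated thread per bit.

    On a GPU both loops would run in parallel; here they run sequentially.
    """
    return [[permute_bit(bits, table, t) for t in range(64)] for bits in blocks]


def int_to_bits(value):
    """Convert a 64-bit integer to a list of bits, most significant first."""
    return [(value >> (63 - i)) & 1 for i in range(64)]


IP = build_initial_permutation()
blocks = [int_to_bits(0x0123456789ABCDEF), int_to_bits(0xFEDCBA9876543210)]
permuted = permute_blocks(blocks, IP)
restored = permute_blocks(permuted, invert(IP))
print(restored == blocks)
```

Because every output bit depends on exactly one input bit in this step, the 64 simulated threads need no communication, which is what makes a per-bit mapping natural for permutation stages.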
High Dynamic Range (HDR) images are generated using multiple exposures of a scene. When a hand-held camera is used to capture a static scene, these images need to be aligned by globally shifting each image in both dimensions. For a fast and robust alignment, the shift amount is commonly calculated using Median Threshold Bitmaps (MTB) and creating a...
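A minimal sketch of shift estimation with Median Threshold Bitmaps is given below. It thresholds each image at its own median, which makes the bitmaps largely insensitive to exposure, and then searches a small window of shifts exhaustively. The brute-force search, the function names, and the tiny test images are assumptions made for illustration; they do not reproduce the method of the paper.

```python
import statistics


def median_threshold_bitmap(image):
    """Return 1 where a pixel is above the image median, else 0."""
    median = statistics.median(p for row in image for p in row)
    return [[1 if p > median else 0 for p in row] for row in image]


def mismatch(bitmap_a, bitmap_b, dx, dy):
    """Count differing bits when bitmap_b is shifted by (dx, dy), over the overlap."""
    height, width = len(bitmap_a), len(bitmap_a[0])
    count = 0
    for y in range(height):
        for x in range(width):
            sy, sx = y - dy, x - dx
            if 0 <= sy < height and 0 <= sx < width:
                count += bitmap_a[y][x] ^ bitmap_b[sy][sx]
    return count


def best_shift(image_a, image_b, max_shift=2):
    """Exhaustively find the (dx, dy) that best aligns image_b to image_a."""
    a = median_threshold_bitmap(image_a)
    b = median_threshold_bitmap(image_b)
    shifts = range(-max_shift, max_shift + 1)
    candidates = [(dx, dy) for dy in shifts for dx in shifts]
    return min(candidates, key=lambda s: mismatch(a, b, s[0], s[1]))


def make_image(background, foreground, top, left, size=6):
    """Hypothetical test image: a 2x2 bright square on a flat background."""
    image = [[background] * size for _ in range(size)]
    for y in range(top, top + 2):
        for x in range(left, left + 2):
            image[y][x] = foreground
    return image


short_exposure = make_image(10, 100, top=2, left=2)
long_exposure = make_image(20, 200, top=1, left=1)  # brighter and displaced
shift = best_shift(short_exposure, long_exposure)
print(shift)
```

The two test images differ in brightness by a factor of two, yet their bitmaps are directly comparable because each is thresholded at its own median.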
Acoustic features extracted from speech are widely used in problems such as biometric speaker identification and first-person activity detection. However, the use of speech for such purposes raises privacy issues as the content is accessible to the processing party. In this work, we propose a method for speaker and posture classification using intr...
Video frames obtained through endoscopic examination can be corrupted by many artefacts. These artefacts adversely affect the diagnosis process and make the examination of the underlying tissue difficult for the professionals. In addition, detection of these artefacts is essential for further automated analysis of the images and high-quality frame...
Weather events such as rain, snow, and fog degrade the quality of images taken under these conditions. Enhancement of such images is critical for intelligent transport and outdoor surveillance systems. Generative Adversarial Networks (GAN) based methods have been shown to be promising for enhancing these images in recent years. In this study, we ad...
Adversarial images are samples that are intentionally modified to deceive machine learning systems. They are widely used in applications such as CAPTCHAs to help distinguish legitimate human users from bots. However, the noise introduced during the adversarial image generation process degrades the perceptual quality and introduces artificial colours...
Generative Adversarial Networks (GANs) are shown to be successful at generating new and realistic samples including 3D object models. Conditional GAN, a variant of GANs, allows generating samples in given conditions. However, objects generated for each condition are different and it does not allow generation of the same object in different conditio...
Adversarial examples have a negative effect on the performance of classifiers which have otherwise good performance on undisturbed images. These examples are generated by adding non-random noise to the test samples in order to fool the classifier. Adversarial attacks use these intentionally generated examples and they pose a security risk to the ma...
Many people view the food photos available in social media when choosing a restaurant. The attractiveness of these photos is an important factor in shaping initial impressions about a restaurant. Properties such as colour harmony, colour balance, and content together constitute the aesthetics of an image. Although there ar...
Egocentric activity recognition in first-person videos has an increasing importance with a variety of applications such as lifelogging, summarization, assisted-living and activity tracking. Existing methods for this task are based on interpretation of various sensor information using pre-determined weights for each feature. In this work, we propose...
Adversarial examples are known to have a negative effect on the performance of classifiers which have otherwise good performance on undisturbed images. These examples are generated by adding non-random noise to the testing samples in order to make the classifier misclassify the given data. Adversarial attacks use these intentionally generated examples...
In this paper, we investigate fusion of different types of classifiers for activity recognition on first-person videos in a data-driven approach. The algorithm first uses the classifiers, which are composed of kernel and descriptor combinations, through well-known AdaBoost trials. After all trials, classifiers are ordered and assigned ranks with re...
Line of sight (LOS) analysis is a set of methods and algorithms to determine the visible points in a terrain with reference to a specific observer point. This analysis is used in simulations, Geographic Information System (GIS) applications and games. For this reason, it is important to have a capability to get results quickly and facilitate analys...
Poster for "Accelerating Johnson's All-Pairs Shortest Paths Algorithm on GPU"
In graph theory, finding shortest paths from each node to all the others is a common problem, known as all-pairs shortest path (APSP). However, it is challenging to process large graphs containing hundreds of thousands of nodes and vertices in feasible time for real-world applications. In this paper, we present a parallel implementation of Johnson's al...
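For reference, the textbook sequential form of Johnson's algorithm can be sketched as below: a Bellman-Ford pass computes node potentials, edges are reweighted to be non-negative, and Dijkstra is then run from every source. This is a plain CPU sketch under those standard steps, not the parallel GPU implementation described in the paper; the function name and the small example graph are hypothetical.

```python
import heapq

INF = float("inf")


def johnson_apsp(num_nodes, edges):
    """All-pairs shortest paths for directed edges (u, v, weight).

    Returns a distance matrix, or None if a negative cycle exists.
    """
    # Bellman-Ford from a virtual source linked to every node with weight 0.
    # Starting all potentials at 0 is equivalent to relaxing those links first.
    h = [0] * num_nodes
    changed = False
    for _ in range(num_nodes):
        changed = False
        for u, v, w in edges:
            if h[u] + w < h[v]:
                h[v] = h[u] + w
                changed = True
        if not changed:
            break
    if changed:
        return None  # still improving after num_nodes passes: negative cycle

    # Reweight so that every edge weight becomes non-negative.
    adjacency = [[] for _ in range(num_nodes)]
    for u, v, w in edges:
        adjacency[u].append((v, w + h[u] - h[v]))

    result = []
    for source in range(num_nodes):
        dist = [INF] * num_nodes
        dist[source] = 0
        heap = [(0, source)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[u]:
                continue  # stale heap entry
            for v, w in adjacency[u]:
                if d + w < dist[v]:
                    dist[v] = d + w
                    heapq.heappush(heap, (d + w, v))
        # Undo the reweighting to recover the original path lengths.
        result.append([d - h[source] + h[v] if d < INF else INF
                       for v, d in enumerate(dist)])
    return result


example_edges = [(0, 1, 4), (0, 2, 1), (2, 1, -2), (1, 3, 3)]
distances = johnson_apsp(4, example_edges)
print(distances[0])
```

The per-source Dijkstra runs are independent of each other, which is the part of the algorithm that most naturally lends itself to parallel execution.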
Activity recognition from first-person (ego-centric) videos has recently gained attention due to the increasing ubiquity of the wearable cameras. There has been a surge of efforts adapting existing feature descriptors and designing new descriptors for the first-person videos. An effective activity recognition system requires selection and use of co...
Tracking groups of people is a challenging problem. Groups may grow or shrink dynamically with merging and splitting of individuals and conventional trackers are not designed to handle such cases. In this paper, we present a Conjoint Individual and Group Tracking (CIGT) framework based on particle filter and online learning. CIGT has four complemen...
Range-gated imaging systems are active systems which use a high-power pulsed-light source and control the opening and closing times of the camera shutter in conjunction with the light source. By calculating the arrival time of the reflected light from the object, the camera shutter is opened for a short time period to form an image using the return...
In this study, we propose a method for real-time panoramic background subtraction using Pan-Tilt cameras. The proposed method is based on parallelization of image registration, panorama generation and background subtraction operations to run on a Graphics Processing Unit (GPU). Experimental results showed that GPU usage increases speed of the al...
In recent years, mobile technologies have come to be used frequently for carrying out everyday routine tasks. The rapid spread of mobile technologies has in turn made them a reliable source of user data for many sectors and for academic studies. With the data obtained from mobile phones, users' hab...
In this paper, we present a method for joint tracking of individuals and groups in surveillance scenarios. Groups are dynamic entities and they may grow or shrink with merge-split events. This dynamic nature makes it difficult to track groups using conventional trackers. In this paper, we propose a new tracking method named Conjoint Individual and...
In this study, a fully automatic surveillance system for indoor environments which is capable of tracking multiple objects using both visible and thermal band images is proposed. These two modalities are fused to track people and the objects they carry separately using their heat signatures and the owners of the belongings are determined. Fusion of...
With the increasing focus on safety and security in public areas, anomaly detection in video surveillance systems has become increasingly more important. In this paper, we describe a method that models the temporal behavior and detects behavioral anomalies in the scene using probabilistic graphical models. The Coupled Hidden Markov Model (CHMM) met...
Surveillance of crowded public spaces and detection of anomalies from the video is important for public safety and security. While anomaly detection is possible by detection and tracking of individuals in low-density areas, such methods are not reliable in high-density crowded scenes. In this work we propose a holistic unsupervised approach to clus...
In crowd surveillance systems, it is important to select the proper analysis algorithm considering the properties of the video content. The inappropriate algorithm selection may result in performance degradation and generation of false alarms. An important feature of crowd videos is the density of the crowd. While object detection and tracking base...
Surveillance cameras are playing a more important role in our daily life with the increasing human population and number of surveillance cameras. While there are a myriad of methods for video analysis, they are generally designed for low-density areas. Running these algorithms in crowded areas would not give the expected results and results in high nu...
Automated analysis of crowd behaviour using surveillance videos is an important issue for public security as it allows detection of potentially dangerous situations in crowds. Although there is a considerable amount of study in crowd behaviour analysis, the majority are limited in several ways. A few problems to mention are: limited real-time consi...
Extraction of crowd dynamics from video is the fundamental step for automatic detection of abnormal events. However, it is difficult to obtain sufficient performance with object tracking due to occlusions and insufficient resolution of the objects in the scene. As a result, optical flow or feature tracking methods are preferred in crowd v...
Anomaly detection from crowd videos is becoming more important due to the difficulties in maintaining public security in crowded places. Surveillance videos have a significant role in enabling real-time analysis of events occurring in crowded places. This paper presents a method that detects anomalies in crowd...
In this article, parallel implementation of a real-time intelligent video surveillance system on Graphics Processing Unit (GPU) is described. The system is based on background subtraction and composed of motion detection, camera sabotage detection (moved camera, out-of-focus camera and covered camera detection), abandoned object detection, and obje...