Chapter

Self-enhancing GPS-Based Authentication Using Corresponding Address


Abstract

Behavioral-based authentication is a new research approach for user authentication. A promising idea for this approach is to use location history as the behavioral feature for user classification: location history is relatively unique even when many people live in the same area, and apart from occasional travel it does not vary much from day to day. For Global Positioning System (GPS) location data, most previous work used the longitude and latitude values. In this paper, we investigate the advantage of metadata extracted from the longitude and latitude themselves, without requiring any information other than those two values: the location identification name (i.e., the address). Our idea is based on the fact that every pair of longitude and latitude has a corresponding address, which is why we use the term self-enhancing in the title. We then applied text mining to the address and combined the extracted text features with the longitude and latitude as the classification features. The results showed that the combined approach outperforms the GPS-only approach under the Adaptive Boosting and Gradient Boosting algorithms.
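The pipeline the abstract describes, reverse-geocode each coordinate pair, mine the resulting address text, and append the text features to the raw coordinates, can be sketched as follows. The lookup table, coordinates, addresses, and vocabulary below are illustrative stand-ins (a real system would query a reverse-geocoding service), not the paper's actual data:

```python
# Sketch of the "self-enhancing" idea: derive an address from a
# (latitude, longitude) pair and turn its tokens into extra features.
# The dictionary stands in for a real reverse-geocoding service;
# all coordinates and addresses here are made up for illustration.

REVERSE_GEOCODE = {  # hypothetical stand-in for a reverse geocoder
    (35.6586, 139.7454): "4 Chome-2-8 Shibakoen, Minato City, Tokyo",
    (35.7101, 139.8107): "1 Chome-1-2 Oshiage, Sumida City, Tokyo",
}

def address_tokens(lat, lon):
    """Look up the address for a coordinate pair and tokenize it."""
    address = REVERSE_GEOCODE[(lat, lon)]
    return [tok.strip(",").lower() for tok in address.split()]

def feature_vector(lat, lon, vocabulary):
    """Combine raw GPS values with bag-of-words counts over the address."""
    tokens = address_tokens(lat, lon)
    text_features = [tokens.count(word) for word in vocabulary]
    return [lat, lon] + text_features

vocab = ["tokyo", "minato", "sumida", "shibakoen", "oshiage"]
vec = feature_vector(35.6586, 139.7454, vocab)
```

The combined vector (coordinates plus address-token counts) is what a classifier such as AdaBoost or Gradient Boosting would then be trained on.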


... Several approaches attempt to mitigate these problems in the context of mobile web applications. The most recent and promising of these rely on machine learning (ML) techniques to analyze user behavior and thereby provide a basis for authentication decisions, e.g., using data arising on the user's mobile device (frontend) [14], [15], [16], [17] or the application servers (backend) [18], [19]. Popular data sources include sensor data or locations of the mobile device, as well as network addresses and browser information. ...
... -Location and Network Connection Data: In contrast to conventional computers, mobile devices usually change their location frequently, and the resulting movement profiles are strongly dependent on the specific user. Hence, both conventional [40], [41] and novel authentication systems [15], [16] rely on the physical locations for authentication purposes. User locations can be determined in several ways, the most common being GPS [42], but information about the current network connection can also approximate them [43]. ...
... Depending on the scenario, EER values of well below 5% were determined. In addition, many other efforts deal with behavior-based authentication based on various data sources, such as location data [16], [53], sensor data [54], [55], and network connection information [56]. Furthermore, Acien et al. [39] summarized further work in this area in a structured way. ...
Preprint
As most web application requests nowadays originate from mobile devices, authentication of mobile users is essential in terms of security. To this end, recent approaches rely on machine learning techniques to analyze various aspects of user behavior as a basis for authentication decisions. These approaches face two challenges: first, examining behavioral data raises significant privacy concerns, and second, approaches must scale to support a large number of users. Existing approaches do not address these challenges sufficiently. We propose mPSAuth, an approach that continuously tracks various data sources reflecting user behavior (e.g., touchscreen interactions, sensor data) and estimates the likelihood that the current user is legitimate based on machine learning techniques. With mPSAuth, both the authentication protocol and the machine learning models operate on homomorphically encrypted data to ensure the users' privacy. Furthermore, the number of machine learning models used by mPSAuth is independent of the number of users, thus providing adequate scalability. In an extensive evaluation based on real-world data from a mobile application, we illustrate that mPSAuth can provide high accuracy with low encryption and communication overhead, while the inference effort increases only to a tolerable extent.
... A popular method is to examine incoming requests upon arrival at the application server and check whether transmitted information, such as IP addresses, is consistent with previous requests of the same user [4], [5]. Additionally, it can be helpful to observe user interactions with the application and detect discrepancies, for example by observing typing patterns, touchscreen usage, and the physical location of the devices used [6]- [9]. As a result, continuous authentication settings [10] can be established, in which the user's behavior is permanently evaluated and authentication decisions are revised. ...
... One special characteristic of SPCAuth is the way deviations from regular usage behavior are detected. Related approaches [7]- [9] usually establish one machine learning model per user, trained on the basis of previous interactions. The trained model is then used to decide whether the user behavior in the context of a given action originates from the known user. ...
... The first group of related approaches uses data collected directly on the frontend side. MultiLock [8] and T.P. Thao's [9] approach address user authentication by means of biometric and behavioral features. MultiLock uses several data sources (e.g. ...
... Given the limited information from the GPS (longitude, latitude, and timestamp), if metadata carrying extra independent information can be obtained from the GPS itself, it can help to improve the accuracy. An example of GPS-based self-enhancement comes from [7], in which the address is extracted from the pair of longitude and latitude using reverse geocoding. ...
... Their benchmark results for face detection, face verification, touch-based user identification, and location-based next-place prediction showed that more robust methods fine-tuned to the mobile platform are needed to achieve satisfactory verification accuracy. T. Thao et al. [7] extracted the addresses from the longitudes and latitudes in the GPS records and then applied text mining to the addresses. The data was collected from 50 users over about four months. ...
... Five features were extracted from the timestamp: month, day, hour, minute, and day of the week (Monday through Sunday), each represented as an integer. The valid ranges for these features are [1, 12], [1, 31], [0, 23], [0, 59], and [1, 7], respectively. The year was not extracted as a feature because all samples in the dataset were collected in the same year (2017). ...
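A minimal sketch of this timestamp feature extraction, using Python's standard datetime module (the example timestamp is made up):

```python
from datetime import datetime

def timestamp_features(ts: datetime):
    """Extract the five integer features described above.

    Ranges: month [1, 12], day [1, 31], hour [0, 23],
    minute [0, 59], weekday [1, 7] (Monday=1 ... Sunday=7).
    """
    return (ts.month, ts.day, ts.hour, ts.minute, ts.isoweekday())

# Example record from the collection year mentioned above (2017).
feats = timestamp_features(datetime(2017, 3, 14, 9, 26))
```

Note that `isoweekday()` already yields the [1, 7] Monday-to-Sunday convention, so no remapping is needed.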
Preprint
Most current user authentication systems are based on a PIN code, password, or biometric traits, which can have limitations in usage and security. Lifestyle authentication has become a new research approach. A promising idea is to use the location history, since it is relatively unique: even when people live in the same area or travel occasionally, it does not vary much from day to day. For Global Positioning System (GPS) data, previous work used the longitude, the latitude, and the timestamp as the features for classification. In this paper, we investigate a new approach utilizing distance coherence, which can be extracted from the GPS data itself without requiring other information. We applied three ensemble classifiers, RandomForest, ExtraTrees, and Bagging, and the experimental results showed that the approach can achieve 99.42%, 99.12%, and 99.25% accuracy, respectively.
... Understanding individual human mobility plays an important role especially when the geographic spread of the infectious virus that causes COVID-19 has taken the world into uncharted territory. Not only that, it is also a critical factor in policy planning [1], [2], travel demand forecasting [3], [4], location-based recommendation/service advertising [6], or location-based personal authentication [5]. M. Gonzalez et al. [25] proved that human mobility follows a high degree of regularity. ...
Preprint
Analysis of human mobility from GPS trajectories is crucial in many respects, such as policy planning for urban citizens, location-based service recommendation/prediction, and especially mitigating the spread of biological and mobile viruses. In this paper, we propose a method to find temporal factors affecting the human mobility lifestyle. We collected GPS data from 100 smartphone users in Japan and designed a model that consists of 13 temporal patterns. We then applied multiple linear regression and found that people tend to keep their mobility habits on Thursdays and on days in the second week of a month, but tend to break their habits on Fridays. We also explain some reasons behind these findings.
Article
Despite password-based security schemes offering increased protection against unwanted access, two-factor authentication (TFA) is still not widely used, primarily because users find additional login steps exhausting. To address this challenge, SoundSignature, a unique TFA technique, is introduced that is intended to be both deployable and user friendly. It verifies that the user's phone is in close proximity to the login device by recording background noise using the microphones on the device. By mimicking the seamless experience of password-only authentication, this technique enhances security without requiring direct user intervention. Our experiments demonstrate that ambient noise serves as an effective classifier for device proximity in both indoor and outdoor scenarios, as well as those in which a phone is carried in a pocket or handbag. SoundSignature is easy to set up and compatible with most devices and browsers without additional plugins, and it stops attackers from impersonating or predicting responses by exploiting the complicated relationship between acoustic propagation mechanisms and the constant frequency response of hardware components. A prototype implementation of SoundSignature, which attains an average accuracy of about 90.2% with a minimal equal error rate of 8.5%, validates the system's robustness and effectiveness in five different environments.
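One plausible reading of the proximity check is that the two devices' ambient-noise profiles should correlate strongly when they share an environment. The sketch below illustrates that idea with a plain Pearson correlation over made-up energy profiles; it is not SoundSignature's actual signal pipeline:

```python
from math import sqrt

def correlation(a, b):
    """Pearson correlation between two equal-length energy profiles."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sqrt(sum((x - ma) ** 2 for x in a))
    vb = sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (va * vb)

def same_environment(profile_a, profile_b, threshold=0.8):
    """Declare proximity when the ambient profiles agree strongly.

    The threshold is an illustrative assumption, not a tuned value."""
    return correlation(profile_a, profile_b) >= threshold

# Toy ambient-noise energy profiles (illustrative values).
phone  = [0.1, 0.9, 0.4, 0.8, 0.2, 0.7]
laptop = [0.2, 1.0, 0.5, 0.9, 0.1, 0.8]   # recorded in the same room
remote = [0.9, 0.1, 0.8, 0.2, 0.9, 0.1]   # a different environment
```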
Chapter
Full-text available
This paper presents a usage-based insurance (UBI) platform that incorporates Internet of Vehicles (IoV) and blockchain technologies, discussing the potential stakeholders, business models, and interaction modes involved in this platform. Existing UBI products mostly use data on the driver's mileage, driving period, or driving region for more accurate insurance calculations. Automobile UBI encourages customers to continually improve their safe-driving ability and provides a means to smoothly, transparently, and rationally calculate insurance pricing and payouts. This paper proposes a blockchain architecture to remedy management problems in a UBI environment. A bidding mechanism suitable for the blockchain-based UBI platform was designed to close the information gap between the insurance company and consumer, thus increasing consumer trust in the platform.
Chapter
Deep learning technology is widely used in medicine. The automation of medical image classification and segmentation is essential and inevitable. This study proposes a transfer learning–based kidney segmentation model with an encoder–decoder architecture. Transfer learning was introduced through the utilization of the parameters from other organ segmentation models as the initial input parameters. The results indicated that the transfer learning–based method outperforms the single-organ segmentation model. Experiments with different encoders, such as ResNet-50 and VGG-16, were implemented under the same Unet structure. The proposed method using transfer learning under the ResNet-50 encoder achieved the best Dice score of 0.9689. The proposed model’s use of two public data sets from online competitions means that it requires fewer computing resources. The difference in Dice scores between our model and 3D Unet (Isensee) was less than 1%. The average difference between the estimated kidney volume and the ground truth was only 1.4%, reflecting a seven times higher accuracy than that of conventional kidney volume estimation in clinical medicine.
Chapter
Current user authentication systems are based on a PIN code, password, or biometric traits, which can have limitations in usage and security. Lifestyle authentication has become a new research approach whose promising idea is to use the location history, since it is relatively unique: even when people live in the same area or travel occasionally, it does not vary much from day to day. For Global Positioning System (GPS) data, previous work used the longitude, latitude, and timestamp as the classification features. In this paper, we investigate a new approach utilizing distance coherence, which can be extracted from the GPS data itself without requiring other information. We applied three ensemble classifiers: the RandomForest, ExtraTrees, and Bagging algorithms. The experimental results showed that our approach could achieve 99.42%, 99.12%, and 99.25% accuracy, respectively.
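The ensemble idea behind this chapter (illustrated here with Bagging, the simplest of the three algorithms named) can be sketched from scratch: each base model is a nearest-centroid classifier over (latitude, longitude) points trained on a bootstrap resample, and predictions are combined by majority vote. All data, the base learner, and parameters below are illustrative, not the paper's setup:

```python
import random
from collections import Counter

def nearest_centroid_fit(points, labels):
    """Base learner: per-class centroid of (lat, lon) points."""
    cent = {}
    for lbl in set(labels):
        pts = [p for p, l in zip(points, labels) if l == lbl]
        cent[lbl] = (sum(x for x, _ in pts) / len(pts),
                     sum(y for _, y in pts) / len(pts))
    return cent

def nearest_centroid_predict(cent, p):
    """Predict the class whose centroid is closest to point p."""
    return min(cent, key=lambda l: (cent[l][0] - p[0]) ** 2
                                 + (cent[l][1] - p[1]) ** 2)

def bagging_fit(points, labels, n_models=5, seed=0):
    """Train each base model on a bootstrap resample of the data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(len(points)) for _ in range(len(points))]
        models.append(nearest_centroid_fit([points[i] for i in idx],
                                           [labels[i] for i in idx]))
    return models

def bagging_predict(models, p):
    """Majority vote over the base models' predictions."""
    votes = Counter(nearest_centroid_predict(m, p) for m in models)
    return votes.most_common(1)[0][0]

# Two toy "users" with well-separated location histories.
pts = [(35.60, 139.70), (35.61, 139.71), (35.59, 139.69),
       (43.00, 141.30), (43.01, 141.31), (42.99, 141.29)]
lbls = ["a", "a", "a", "b", "b", "b"]
models = bagging_fit(pts, lbls)
```

RandomForest and ExtraTrees follow the same resample-and-vote pattern but additionally randomize the tree-building step itself.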
Conference Paper
Full-text available
In this paper we evaluate how discriminative behavior-based signals obtained from smartphone sensors are. The main aim is to evaluate these signals for person recognition. Recognition based on these signals increases device security, but also raises privacy concerns. We consider seven different data channels and their combinations. Touch dynamics (touch gestures and keystroking), accelerometer, gyroscope, WiFi, GPS location, and app usage are all collected during human-mobile interaction to authenticate the users. We evaluate two approaches: one-time authentication and active authentication. In one-time authentication, we employ the information of all channels available during one session. For active authentication we take advantage of mobile user behavior across multiple sessions by updating a confidence value of the authentication score. Our experiments are conducted on the semi-uncontrolled UMDAA-02 database, which comprises smartphone sensor signals acquired during natural human-mobile interaction. Our results show that different traits can be complementary, and multimodal systems clearly increase the performance, with accuracies ranging from 82.2% to 97.1% depending on the authentication scenario. These results confirm the discriminative power of these signals.
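The active-authentication scheme described above maintains a confidence value that is updated with each session's authentication score. A minimal sketch of one such update rule follows; the smoothing weight, thresholds, and scores are assumptions for illustration, not the paper's parameters:

```python
def update_confidence(confidence, session_score, weight=0.7):
    """Exponentially smooth the per-session match score into a
    running confidence value (both assumed to lie in [0, 1])."""
    return weight * confidence + (1 - weight) * session_score

def authenticate(scores, start=0.5, lock_threshold=0.2):
    """Replay session scores; lock out once confidence drops too low."""
    confidence = start
    for score in scores:
        confidence = update_confidence(confidence, score)
        if confidence < lock_threshold:
            return "locked"
    return "accepted"
```

A legitimate user's consistently high scores keep the confidence up, while a run of low scores from an impostor drives it below the lockout threshold.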
Article
Full-text available
Wearables and mobile devices see the world through the lens of half a dozen low-power sensors, such as barometers, accelerometers, microphones, and proximity detectors. But differences between sensors, ranging from sampling rates to discrete versus continuous data or even the data type itself, make principled approaches to integrating these streams challenging. How, for example, is barometric pressure best combined with an audio sample to infer if a user is in a car, plane, or bike? Critically for applications, how successfully sensor devices are able to maximize the information contained across these multi-modal sensor streams often dictates the fidelity at which they can track user behaviors and context changes. This paper studies the benefits of adopting deep learning algorithms for interpreting user activity and context as captured by multi-sensor systems. Specifically, we focus on four variations of deep neural networks that are based either on fully-connected Deep Neural Networks (DNNs) or Convolutional Neural Networks (CNNs). Two of these architectures follow conventional deep models by performing feature representation learning from a concatenation of sensor types. This classic approach is contrasted with a promising deep model variant characterized by modality-specific partitions of the architecture to maximize intra-modality learning. Our exploration represents the first time these architectures have been evaluated for multimodal deep learning under wearable data; and for convolutional layers within this architecture, it represents a novel architecture entirely. Experiments show these generic multimodal neural network models compete well with a rich variety of conventional hand-designed shallow methods (including feature extraction and classifier construction) and task-specific modeling pipelines, across a wide range of sensor types and inference tasks (four different datasets).
Although the training and inference overhead of these multimodal deep approaches is in some cases appreciable, we also demonstrate that on-device mobile and wearable execution is feasible and not a barrier to adoption. This study is carefully constructed to focus on the multimodal aspects of wearable data modeling for deep learning by providing a wide range of empirical observations, which we expect to have considerable value in the community. We summarize our observations into a series of practitioner rules-of-thumb and lessons learned that can guide the usage of multimodal deep learning for activity and context detection.
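The modality-specific partitioning contrasted above can be sketched in a few lines: each modality passes through its own small branch, and the shared layer sees only the concatenated branch outputs. The toy weights and inputs below are illustrative; a real model would learn these in a deep learning framework:

```python
def matvec(W, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def relu(v):
    return [max(0.0, a) for a in v]

def modality_branch(W, x):
    """Per-modality sub-network: one linear layer followed by ReLU."""
    return relu(matvec(W, x))

def multimodal_forward(W_accel, W_audio, W_fuse, accel, audio):
    """Modality-specific partitions: each input is encoded by its own
    branch, and the fusion layer only sees the concatenated outputs."""
    hidden = modality_branch(W_accel, accel) + modality_branch(W_audio, audio)
    return matvec(W_fuse, hidden)

# Toy weights (illustrative only; a trained model would learn these).
W_accel = [[1.0, 0.0], [0.0, 1.0]]   # 2-D accelerometer branch
W_audio = [[1.0, 1.0, 1.0]]          # 3-D audio branch -> 1 unit
W_fuse  = [[1.0, 1.0, 1.0]]          # fusion layer over 3 hidden units
out = multimodal_forward(W_accel, W_audio, W_fuse, [0.5, -0.2], [0.1, 0.2, 0.3])
```

The concatenation-first ("classic") variant would instead join the raw sensor vectors before the first layer, so no layer is dedicated to a single modality.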
Article
Full-text available
Active authentication is the problem of continuously verifying the identity of a person based on behavioral aspects of their interaction with a computing device. In this study, we collect and analyze behavioral biometrics data from 200 subjects, each using their personal Android mobile device for a period of at least 30 days. This dataset is novel in the context of active authentication due to its size, duration, number of modalities, and absence of restrictions on tracked activity. The geographical colocation of the subjects in the study is representative of a large closed-world environment such as an organization where the unauthorized user of a device is likely to be an insider threat: coming from within the organization. We consider four biometric modalities: (1) text entered via soft keyboard, (2) applications used, (3) websites visited, and (4) physical location of the device as determined from GPS (when outdoors) or WiFi (when indoors). We implement and test a classifier for each modality and organize the classifiers as a parallel binary decision fusion architecture. We are able to characterize the performance of the system with respect to intruder detection time and to quantify the contribution of each modality to the overall performance.
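The parallel binary decision fusion described above can be sketched as a vote over per-modality accept/reject decisions. The modalities, votes, and threshold below are illustrative assumptions, not the paper's tuned fusion rule:

```python
def fuse_decisions(decisions, min_accepts=3):
    """Parallel binary decision fusion: each modality classifier casts an
    accept (True) / reject (False) vote, and the fused decision accepts
    only when enough modalities agree.  The threshold is illustrative."""
    return sum(decisions) >= min_accepts

# One accept/reject vote per modality (example values only).
votes = {"keystrokes": True, "apps": True, "websites": False, "location": True}
fused = fuse_decisions(list(votes.values()))
```

Raising `min_accepts` trades faster intruder detection for more false rejections of the legitimate user, which is exactly the kind of trade-off the paper quantifies per modality.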
Conference Paper
Full-text available
A lot of research is being conducted into improving the usability and security of phone-unlocking. There is however a severe lack of scientific data on users' current unlocking behavior and perceptions. We performed an online survey (n = 260) and a one-month field study (n = 52) to gain insights into real world (un)locking behavior of smartphone users. One of the main goals was to find out how much overhead unlocking and authenticating adds to the overall phone usage and in how many unlock interactions security (i.e. authentication) was perceived as necessary. We also investigated why users do or do not use a lock screen and how they cope with smartphone-related risks, such as shoulder-surfing or unwanted accesses. Among other results, we found that on average, participants spent around 2.9% of their smartphone interaction time with authenticating (9% in the worst case). Participants that used a secure lock screen like PIN or Android unlock patterns considered it unnecessary in 24.1% of situations. Shoulder surfing was perceived to be a relevant risk in only 11 of 3410 sampled situations.
Conference Paper
Full-text available
In this paper, we consider supervised learning under the assumption that the available memory is small compared to the dataset size. This general framework is relevant in the context of big data, distributed databases, and embedded systems. We investigate a very simple, yet effective, ensemble framework that builds each individual model of the ensemble from a random patch of data obtained by drawing random subsets of both instances and features from the whole dataset. We carry out an extensive and systematic evaluation of this method on 29 datasets, using decision tree-based estimators. With respect to popular ensemble methods, these experiments show that the proposed method provides on-par performance in terms of accuracy while simultaneously lowering the memory needs, and attains significantly better performance when memory is severely constrained.
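The core operation, drawing a random patch by subsampling both instances and features, can be sketched as follows; each base estimator of the ensemble would then be fit on its own independent patch. The dataset and patch sizes below are illustrative:

```python
import random

def draw_patch(X, y, n_rows, n_feats, rng):
    """Draw a random patch: a random subset of both instances (rows)
    and features (columns) of the full dataset."""
    rows = rng.sample(range(len(X)), n_rows)
    feats = rng.sample(range(len(X[0])), n_feats)
    patch_X = [[X[r][f] for f in feats] for r in rows]
    patch_y = [y[r] for r in rows]
    return patch_X, patch_y, feats

# Toy dataset: 6 samples with 4 features each (illustrative values).
rng = random.Random(0)
X = [[row * 10 + col for col in range(4)] for row in range(6)]
y = [0, 1, 0, 1, 0, 1]
patch_X, patch_y, feats = draw_patch(X, y, n_rows=3, n_feats=2, rng=rng)
```

Because each base model only ever sees its patch, the peak memory needed for training one model is bounded by the patch size rather than the dataset size, which is the framework's point.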
Article
Full-text available
Boosting has been a very successful technique for solving the two-class classification problem. In going from two-class to multi-class classification, most algorithms have been restricted to reducing the multi-class classification problem to multiple two-class problems. In this paper, we propose a new algorithm that naturally extends the original AdaBoost algorithm to the multi-class case without reducing it to multiple two-class problems. Similar to AdaBoost in the two-class case, this new algorithm combines weak classifiers and only requires that each weak classifier perform better than random guessing (i.e., better than 1/K accuracy for K classes, rather than better than 1/2). We further provide a statistical justification for the new algorithm using a novel multi-class exponential loss function and forward stage-wise additive modeling. As shown in the paper, the new algorithm is extremely easy to implement and is highly competitive with the best currently available multi-class classification methods.
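This algorithm is widely known as SAMME. Its key difference from binary AdaBoost is an extra log(K-1) term in the classifier weight, which is what relaxes the weak-learner requirement to "better than 1/K". A sketch of one boosting round (the sample data are illustrative):

```python
from math import exp, log

def samme_round(weights, y_true, y_pred, n_classes):
    """One SAMME boosting round: compute the classifier weight alpha and
    the renormalized sample weights.  Assumes the weak classifier's
    weighted error lies strictly between 0 and 1 - 1/n_classes."""
    total = sum(weights)
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p) / total
    # The log(K - 1) term is SAMME's multi-class extension of AdaBoost.
    alpha = log((1 - err) / err) + log(n_classes - 1)
    new_w = [w * exp(alpha) if t != p else w
             for w, t, p in zip(weights, y_true, y_pred)]
    norm = sum(new_w)
    return alpha, [w / norm for w in new_w]

# Illustrative round: four samples, three classes, one misclassification.
alpha, new_w = samme_round([0.25] * 4, [0, 1, 2, 0], [0, 1, 2, 1], n_classes=3)
```

With K = 2 the extra term vanishes (log 1 = 0) and the round reduces exactly to classic AdaBoost.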
Conference Paper
Full-text available
User identification and access control have become highly demanded features on mobile devices because those devices are widely used by employees in corporations and government agencies for business and store increasing amounts of sensitive data. This paper describes SenGuard, a user identification framework that enables a continuous and implicit user identification service for smartphones. Different from traditional active user authentication and access control, SenGuard leverages the availability of multiple sensors on today's smartphones and passively uses sensor inputs as sources of user authentication. It extracts sensor-modality-dependent user identification features from captured sensor data and performs user identification in the background. SenGuard invokes active user authentication when there is mounting evidence that the phone user has changed. In addition, SenGuard uses a novel virtualization-based system architecture as a safeguard to prevent subversion of the background user identification mechanism by moving it into a privileged virtual domain. An initial prototype of SenGuard was created using four sensor modalities: voice, location, multitouch, and locomotion. Preliminary empirical studies with a set of users indicate that these four modalities are well suited as data sources for implicit mobile user identification.
Article
Full-text available
This paper proposes a new tree-based ensemble method for supervised classification and regression problems. It essentially consists of strongly randomizing both the attribute and cut-point choice while splitting a tree node. In the extreme case, it builds totally randomized trees whose structures are independent of the output values of the learning sample. The strength of the randomization can be tuned to problem specifics by the appropriate choice of a parameter. We evaluate the robustness of the default choice of this parameter, and we also provide insight on how to adjust it in particular situations. Besides accuracy, the main strength of the resulting algorithm is computational efficiency. A bias/variance analysis of the Extra-Trees algorithm is also provided, as well as a geometrical and a kernel characterization of the models induced.
Article
Full-text available
Function approximation is viewed from the perspective of numerical optimization in function space, rather than parameter space. A connection is made between stagewise additive expansions and steepest-descent minimization. A general gradient-descent "boosting" paradigm is developed for additive expansions based on any fitting criterion. Specific algorithms are presented for least-squares, least-absolute-deviation, and Huber-M loss functions for regression, and multi-class logistic likelihood for classification. Special enhancements are derived for the particular case where the individual additive components are decision trees, and tools for interpreting such "TreeBoost" models are presented. Gradient boosting of decision trees produces competitive, highly robust, interpretable procedures for regression and classification, especially appropriate for mining less than clean data. Connections between this approach and the boosting methods of Freund and Schapire 1996, and Fr...
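For the least-squares case, the negative gradient of the loss is simply the residual, so each stage fits a small learner to the current residuals. A from-scratch sketch with 1-D regression stumps follows; the data and hyperparameters are illustrative, not from the paper:

```python
def fit_stump(x, y):
    """Best single-split regression stump on 1-D inputs (least squares)."""
    best = None
    for split in sorted(set(x)):
        left = [yi for xi, yi in zip(x, y) if xi <= split]
        right = [yi for xi, yi in zip(x, y) if xi > split]
        if not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((yi - lm) ** 2 for yi in left)
               + sum((yi - rm) ** 2 for yi in right))
        if best is None or sse < best[0]:
            best = (sse, split, lm, rm)
    _, s, lm, rm = best
    return lambda xi: lm if xi <= s else rm

def gradient_boost(x, y, n_rounds=20, lr=0.5):
    """Least-squares gradient boosting: each stump fits the residuals
    (the negative gradient of squared loss) of the current model."""
    f0 = sum(y) / len(y)
    stumps, pred = [], [f0] * len(y)
    for _ in range(n_rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        h = fit_stump(x, resid)
        stumps.append(h)
        pred = [pi + lr * h(xi) for pi, xi in zip(pred, x)]
    return lambda xi: f0 + lr * sum(h(xi) for h in stumps)

# Toy step-shaped data; the boosted model recovers both levels.
x = [1, 2, 3, 4, 5, 6]
y = [1, 1, 1, 5, 5, 5]
model = gradient_boost(x, y)
```

Other losses change only the pseudo-residual formula (e.g., sign of the residual for least absolute deviation); the stagewise fitting loop stays the same.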
Chapter
Homograph attack is a way for attackers to deceive victims about which website domain name they are communicating with by exploiting the fact that many characters look alike. The attack has become serious and is drawing broad attention, as many brand domains have recently been attacked, such as Apple Inc., Adobe Inc., Lloyds Bank, etc. We first design a survey of human demographics, brand familiarity, and security backgrounds and administer it to 2,067 participants. We build a regression model to study which factors affect participants' ability to recognize homograph domains. We find that for different levels of visual similarity, the participants exhibit different abilities. 13.95% of participants can recognize non-homographs, while 16.60% of participants can recognize homographs whose visual similarity with the target brand domains is under 99.9%; but when the similarity increases to 99.9%, the proportion of participants who can recognize homographs drops to only 0.19%; and for homographs with 100% visual similarity, there is no way for the participants to recognize them. We also find that female participants tend to recognize homographs better than male participants, but male participants tend to be able to recognize non-homographs better than female participants. Security knowledge is a significant factor affecting both homographs and non-homographs; surprisingly, people who have strong security knowledge tend to be able to recognize homographs but not non-homographs. Furthermore, working or being educated in computer science or computer engineering does not appear as a factor affecting the ability to recognize homographs; however, interestingly, right after being told about the homograph attack, people who work or are educated in computer science or computer engineering are the ones who grasp the situation most quickly.
Conference Paper
Homograph attack is a way for attackers to deceive victims about which website domain name they are communicating with by exploiting the fact that many characters look alike. The attack has become serious and is drawing broad attention, as many brand domains have recently been attacked, such as Apple Inc., Adobe Inc., Lloyds Bank, etc. We first design a survey of human demographics, brand familiarity, and security backgrounds and administer it to 2,067 participants. We build a regression model to study which factors affect participants' ability to recognize homograph domains. We find that for different levels of visual similarity, the participants exhibit different abilities. 13.95% of participants can recognize non-homographs, while 16.60% of participants can recognize homographs whose visual similarity with the target brand domains is under 99.9%; but when the similarity increases to 99.9%, the proportion of participants who can recognize homographs drops to only 0.19%; and for homographs with 100% visual similarity, there is no way for the participants to recognize them. We also find that female participants tend to recognize homographs better than male participants, but male participants tend to be able to recognize non-homographs better than female participants. Security knowledge is a significant factor affecting both homographs and non-homographs; surprisingly, people who have strong security knowledge tend to be able to recognize homographs but not non-homographs. Furthermore, working or being educated in computer science or computer engineering does not appear as a factor affecting the ability to recognize homographs; however, interestingly, right after being told about the homograph attack, people who work or are educated in computer science or computer engineering are the ones who grasp the situation most quickly.
Chapter
Visual homograph attack is a way that attackers deceive victims about what domain they are communicating with by exploiting the fact that many characters look alike. The attack is growing into a serious problem and drawing broad attention, as many brand domains have recently been attacked, such as apple.com (Apple Inc.), adobe.com (Adobe Systems Incorporated), lloydsbank.co.uk (Lloyds Bank), etc. Therefore, how to detect visual homographs has become a hot topic in both industry and the research community. Several existing papers and tools have been proposed to find some homographs of a given domain based on different subsets of certain look-alike characters, or based on an analysis of the registered International Domain Name (IDN) database. However, we still lack a scalable and systematic approach that can detect homographs registered by attackers with high accuracy and a low false positive rate. In this paper, we construct a classification model to detect homographs and potential homographs registered by attackers using machine learning on feasible and novel features: the visual similarity of each character and selected information from Whois. The implementation results show that our approach achieves up to 95.90% accuracy with merely a 3.27% false positive rate. Furthermore, we also make an empirical analysis of the collected homographs and find some interesting statistics along with concrete misbehaviors and purposes of the attackers.
Article
Objectives: Data extraction from original study reports is a time-consuming, error-prone process in systematic review development. Information extraction (IE) systems have the potential to assist humans in the extraction task; however, the majority of IE systems were not designed to work on Portable Document Format (PDF) documents, an important and common extraction source for systematic reviews. In a PDF document, narrative content is often mixed with publication metadata or semi-structured text, which adds challenges for the underlying natural language processing algorithms. Our goal is to categorize PDF texts for strategic use by IE systems. Methods: We used an open-source tool to extract raw text from a PDF document and developed a text classification algorithm that follows a multi-pass sieve framework to automatically classify PDF text snippets (for brevity, texts) into TITLE, ABSTRACT, BODYTEXT, SEMISTRUCTURE, and METADATA categories. To validate the algorithm, we developed a gold standard of PDF reports that were included in the development of previous systematic reviews by the Cochrane Collaboration. In a two-step procedure, we evaluated (1) classification performance, compared with a machine learning classifier, and (2) the effects of the algorithm on an IE system that extracts clinical outcome mentions. Results: The multi-pass sieve algorithm achieved an accuracy of 92.6%, which was 9.7% (p<0.001) higher than the best performing machine learning classifier, which used a logistic regression algorithm. F-measure improvements were observed in the classification of TITLE (+15.6%), ABSTRACT (+54.2%), BODYTEXT (+3.7%), SEMISTRUCTURE (+34%), and METADATA (+14.2%). In addition, using the algorithm to filter semi-structured texts and publication metadata improved the performance of the outcome extraction system (F-measure +4.1%, p=0.002). It also reduced the number of sentences to be processed by 44.9% (p<0.001), which corresponds to a processing time reduction of 50% (p=0.005). Conclusions: The rule-based multi-pass sieve framework can be used effectively to categorize texts extracted from PDF documents. Text classification is an important prerequisite step to leverage information extraction from PDF documents.
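The multi-pass sieve pattern described in this abstract can be sketched as a chain of rules, each of which either assigns a category or defers to the next pass. The rules and thresholds below are simplified stand-ins, not the paper's actual sieves.

```python
# Minimal multi-pass sieve sketch: each pass is a rule that labels a text
# snippet or returns None to defer to the next pass. Illustrative rules only.

def title_sieve(text, position):
    # Crude heuristic: a short snippet at the top of the document.
    if position == 0 and len(text.split()) < 25:
        return "TITLE"
    return None

def abstract_sieve(text, position):
    if text.lower().startswith("abstract"):
        return "ABSTRACT"
    return None

def metadata_sieve(text, position):
    if "doi:" in text.lower() or "\u00a9" in text:
        return "METADATA"
    return None

SIEVES = [title_sieve, abstract_sieve, metadata_sieve]

def classify(text, position):
    for sieve in SIEVES:
        label = sieve(text, position)
        if label is not None:
            return label
    return "BODYTEXT"  # default category when no rule fires
```

The ordering matters: high-precision sieves run first, and everything they leave unlabeled falls through to broader categories, which is what lets a rule-based pipeline like this beat a single flat classifier on clearly structured classes.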
Article
In addition to storing a plethora of sensitive personal and work information, smartphones also store sensor data about users and their daily activities. In order to understand users' behaviors and attitudes towards the security of their smartphone data, we conducted 28 qualitative interviews. We examined why users choose (or choose not) to employ locking mechanisms (e.g., PINs) and their perceptions and awareness about the sensitivity of the data stored on their devices. We performed two additional online experiments to quantify our interview results and the extent to which sensitive data could be found in a user's smartphone-accessible email archive. We observed a strong correlation between use of security features and risk perceptions, which indicates rational behavior. However, we also observed that most users likely underestimate the extent to which data stored on their smartphones pervades their identities, online and offline.
Article
Function estimation/approximation is viewed from the perspective of numerical optimization in function space, rather than parameter space. A connection is made between stagewise additive expansions and steepest-descent minimization. A general gradient-descent "boosting" paradigm is developed for additive expansions based on any fitting criterion. Specific algorithms are presented for least-squares, least absolute deviation, and Huber-M loss functions for regression, and multiclass logistic likelihood for classification. Special enhancements are derived for the particular case where the individual additive components are regression trees, and tools for interpreting such "TreeBoost" models are presented. Gradient boosting of regression trees produces competitive, highly robust, interpretable procedures for both regression and classification, especially appropriate for mining less than clean data. Connections between this approach and the boosting methods of Freund and Schapire, and of Friedman, Hastie and Tibshirani, are discussed.
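The stagewise additive view in this abstract can be made concrete for the least-squares case: each stage fits a weak learner to the current residuals (the negative gradient of the squared loss) and adds a damped copy of it to the model. The sketch below uses 1-D decision stumps as the weak learner; it illustrates the paradigm, not Friedman's full TreeBoost algorithm.

```python
# Minimal least-squares gradient boosting on 1-D data with decision stumps.

def fit_stump(x, residuals):
    """Find the threshold split and leaf means minimising squared error."""
    best = None
    for split in x:
        left = [r for xi, r in zip(x, residuals) if xi <= split]
        right = [r for xi, r in zip(x, residuals) if xi > split]
        if not left or not right:
            continue
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        err = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, split, lmean, rmean)
    _, split, lmean, rmean = best
    return lambda xi: lmean if xi <= split else rmean

def gradient_boost(x, y, n_stages=50, lr=0.1):
    f0 = sum(y) / len(y)              # initial constant model
    pred = [f0] * len(y)
    stumps = []
    for _ in range(n_stages):
        # Residuals = negative gradient of 1/2 * squared loss.
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        pred = [pi + lr * stump(xi) for pi, xi in zip(pred, x)]
    return lambda xi: f0 + lr * sum(s(xi) for s in stumps)
```

Swapping the residual computation for the gradient of another loss (absolute deviation, Huber, logistic) yields the other algorithms the abstract lists, which is the sense in which this is a general paradigm rather than a single method.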
Article
With an increasing number of organizations allowing personal smart phones onto their networks, considerable security risk is introduced. The security risk is exacerbated by the tremendous heterogeneity of the personal mobile devices and their respective installed pool of applications. Furthermore, by virtue of the devices not being owned by the organization, the ability to authoritatively enforce organizational security polices is challenging. As a result, a critical part of organizational security is the ability to drive user security behavior through either on-device mechanisms or security awareness programs. In this paper, we establish a baseline for user security behavior from a population of over one hundred fifty smart phone users. We then systematically evaluate the ability to drive behavioral change via messaging centered on morality, deterrence, and incentives. Our findings suggest that appeals to morality are most effective over time, whereas deterrence produces the most immediate reaction. Additionally, our findings show that while a significant portion of users are securing their devices without prior intervention, it is difficult to influence change in those who do not.
Article
Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the forest becomes large. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Using a random selection of features to split each node yields error rates that compare favorably to Adaboost (Y. Freund & R. Schapire, Machine Learning: Proceedings of the Thirteenth International conference, ***, 148–156), but are more robust with respect to noise. Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the splitting. Internal estimates are also used to measure variable importance. These ideas are also applicable to regression.
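The two sources of randomness described in this abstract, bootstrap sampling of the training set and random feature selection at the split, can be sketched with depth-1 "stump" trees and a majority vote. This is a toy illustration of the mechanism, not a production random forest (real forests grow full trees and re-draw the feature subset at every node).

```python
import random

def fit_stump(X, y, feature):
    """Threshold one feature at its mean; predict the majority class per side."""
    thr = sum(row[feature] for row in X) / len(X)
    left = [yi for row, yi in zip(X, y) if row[feature] <= thr]
    right = [yi for row, yi in zip(X, y) if row[feature] > thr]
    lmaj = max(set(left), key=left.count) if left else max(set(y), key=y.count)
    rmaj = max(set(right), key=right.count) if right else max(set(y), key=y.count)
    return lambda row: lmaj if row[feature] <= thr else rmaj

def fit_forest(X, y, n_trees=25, seed=0):
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample
        feature = rng.randrange(d)                   # random feature choice
        trees.append(fit_stump([X[i] for i in idx],
                               [y[i] for i in idx], feature))
    def predict(row):
        votes = [t(row) for t in trees]
        return max(set(votes), key=votes.count)      # majority vote
    return predict
```

Because each tree sees a different sample and a different feature, their errors are only weakly correlated, and the vote averages them out; this is the strength-versus-correlation trade-off the abstract analyzes.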
Article
All the information contained in a plain-text document is visible to everybody. On the other hand, compound documents using opaque formats, like the Microsoft Compound Document File Format, may contain undisclosed data such as author names, organizational information about the users involved, previously deleted text, machine-related information, and much more. This information could be exploited by third parties for illegal purposes. Computer users are unaware of the problem and, even though the Internet offers several tools to clean hidden data from documents, they are not widespread. Furthermore, there is only one paper about this problem in the scientific literature, and it contains no detailed analysis. In this paper we fill the gap, analyzing the problem and its causes, and then show how to take advantage of this issue: we show how hidden data may be extracted to gain evidence in forensic environments, where even a small piece of information may be relevant, and we also introduce a new stegosystem especially designed for Microsoft Office documents. We developed FTA, a tool to improve forensic analysis of Microsoft Office documents, and StegOlè, another tool that implements a new stegosystem for Microsoft Office documents. This is the first scientific paper to address the problem from both a steganographic and a forensic point of view.
System and method for real world biometric analytics through the use of a multimodal biometric analytic wallet. US patent
  • B Aaron
  • D Christopher
  • G Barry
  • K David
Aaron B, Christopher D, Barry G, David K (2018) System and method for real world biometric analytics through the use of a multimodal biometric analytic wallet. In: US patent, US20180276362A1.
Multi-class AdaBoost
  • J Zhu
  • H Zou
  • S Rosset
  • T Hastie
Zhu J, Zou H, Rosset S, Hastie T (2009) Multi-class AdaBoost. In: Statistics and Its Interface, vol. 2, pp 349-360.
Influences of Human Demographics, Brand Familiarity and Security Backgrounds on Homograph Recognition
  • T P Thao
  • Y Sawaya
  • H Nguyen-Son
  • A Yamada
  • A Kubota
  • T Sang
  • R Yamaguchi
Thao TP, Sawaya Y, Nguyen-Son H, Yamada A, Kubota A, Sang T, Yamaguchi R (2020) Influences of Human Demographics, Brand Familiarity and Security Backgrounds on Homograph Recognition. In: arXiv:1904.10595. Available: https://arxiv.org/abs/1904.10595
  • Scikit-Learn
Scikit-learn. Available: scikit-learn.org
  • A Acien
  • A Morales
  • R Vera-Rodriguez
  • J Fierrez
  • R Tolosana
Acien A, Morales A, Vera-Rodriguez R, Fierrez J, Tolosana R (2019) MultiLock: Mobile Active Authentication based on Multiple Biometric and Behavioral Patterns. In: Multimodal Understanding and Learning for Embodied Applications (MULEA'19), pp 53-59.