Hai Jin's research while affiliated with Huazhong University of Science and Technology and other places

Publications (585)

Chapter
Many practical secure systems have been designed to prevent real-world attacks via maximizing the attacking cost so as to reduce attack intentions. Inspired by this philosophy, we propose a new concept named delay encryption with keyword search (DEKS) to resist the notorious keyword guessing attack (KGA), in the context of secure cloud-based search...
Article
Benefiting from the proposal of margin-based loss functions, face recognition has achieved significant improvements in recent years. These losses aim to increase the margin between different identities to enhance discriminability. Ideally, the class centers of different identities are far from each other, and face samples are compact around...
Article
The blockchain platform Ethereum has attracted millions of accounts due to its strong potential for providing numerous services based on smart contracts. These accounts can be divided into diverse categories, such as miners, tokens, and exchanges, a property termed account diversity in this paper. The benefits of investigating diversity are mult...
Article
Nowadays, the rapid growth of the number and variety of malware brings great security challenges. Machine learning has become a mainstream tool for effective malware detection, which can mainly be classified into static and dynamic analysis methods. The purpose of malware detection is to have a good and stable detection performance for different so...
Article
In vehicular edge computing (VEC), tasks and data collected by sensors on the vehicles can be offloaded through wireless transmission to roadside units (RSUs) equipped with a set of servers. These tasks may depend on each other and can be modeled as a directed acyclic graph (DAG). The DAG scheduling problem is aimed at scheduling the task...
Article
Many systems have been built to employ the delta-based iterative execution model to support iterative algorithms on distributed platforms by exploiting the sparse computational dependencies between data items of these iterative algorithms in a synchronous or asynchronous approach. However, for large-scale iterative algorithms, existing synchronous...
Preprint
Due to its powerful feature learning capability and high efficiency, deep hashing has achieved great success in large-scale image retrieval. Meanwhile, extensive works have demonstrated that deep neural networks (DNNs) are susceptible to adversarial examples, and exploring adversarial attack against deep hashing has attracted many research efforts....
Article
Cyber security is dynamic as defenders often need to adapt their defense postures. The state-of-the-art is that the adaptation of network defense is done manually (i.e., tedious and error-prone). The ideal solution is to automate adaptive network defense, which is however a difficult problem. As a first step towards automation, we propose investiga...
Conference Paper
Federated learning (FL) enables multiple clients to collaboratively train an accurate global model while protecting clients' data privacy. However, FL is susceptible to Byzantine attacks from malicious participants. Although the problem has gained significant attention, existing defenses have several flaws: the server irrationally chooses malicious...
Article
The introduction of deep learning techniques in intrusion detection problems has enabled an enhanced standard of detection effectiveness. However, most of the progress has occurred in supervised learning, which requires a vast amount of labeled training samples. In the real world, there is a limited amount of labeled data available to train a deep...
Conference Paper
Positive-unlabeled (PU) learning deals with the circumstances where only a portion of positive instances are labeled, while the rest and all negative instances are unlabeled; because of this ambiguity, the class prior is not directly available. Existing PU learning methods usually estimate the class prior by training a nontraditional probabilis...
Preprint
Community search, which retrieves a cohesive subgraph containing the query vertex, has been widely studied over the past decades. Existing studies on community search mainly focus on static networks. However, real-world networks are usually temporal networks where each edge is associated with timestamps. The previous methods do not work when h...
Article
In the context of heavy investments in life sciences, one consequence of rapid biotechnical developments is the generation of very large amounts of biological data that not only support bioinformatics research used in medical and health services by data analysis but also cause genetic privacy breaches for the data originators, limiting biomedical d...
Article
Traditional large-scale process manufacturing is gradually being transformed into customized discrete manufacturing under fierce global competition. From the view of engineering management, production planning has an important impact on improving manufacturing efficiency in an ever-changing environment. However, many nonprocessing-related factors in the flexible...
Article
As one of the emerging cloud computing technologies, containers are widely used in academia and industry. The cloud computing built by the container in the high performance computing (HPC) center can provide high-quality services to users at the edge. Singularity Definition File and Dockerfile (we refer to such files as recipes) have attracted wide...
Article
Since the advent of cryptocurrencies such as Bitcoin, blockchain, as their underlying technology, has drawn a massive amount of attention from both academia and industry. This ever-evolving technology inherits the “genes” of distributed systems, offering significant advantages of immutability, transparency, auditability, and tamper-resistance...
Conference Paper
Graph random walk is widely used in graph processing as it is a fundamental component in graph analysis, ranging from vertex ranking to graph embedding. Different from traditional graph processing workloads, random walk features massive processing parallelism and poor graph data reuse, being limited by low I/O efficiency. Prior designs fo...
Article
Supporting secure location-based services on encrypted data that is outsourced to cloud computing platforms remains an ongoing efficiency challenge due to expensive ciphertext calculation overhead. Furthermore, since the clouds may be untrustworthy or even malicious, data security and result authenticity have caused huge concerns. Unfortunatel...
Article
In this paper, we propose efficient distributed algorithms for three holistic aggregation functions on random regular graphs that are good candidates for network topology in next-generation data centers. The three holistic aggregation functions include SELECTION (select the k-th largest or smallest element), DISTINCT (query the count of distinct el...
Preprint
Federated learning (FL) enables multiple clients to collaboratively train an accurate global model while protecting clients' data privacy. However, FL is susceptible to Byzantine attacks from malicious participants. Although the problem has gained significant attention, existing defenses have several flaws: the server irrationally chooses malicious...
Article
In the last decade, many studies have significantly pushed the limits of wireless device-free human sensing (WDHS) technology and facilitated various applications, ranging from activity identification to vital sign monitoring. This survey presents a novel taxonomy that classifies the state-of-the-art WDHS systems into eleven categories according to...
Preprint
The emerging Graph Convolutional Network (GCN) has now been widely used in many domains, and it is challenging to improve the efficiency of applications by accelerating GCN training. Due to the sparse nature and exploding scale of real-world input graphs, state-of-the-art GCN training systems (e.g., GNNAdvisor) employ graph processing techni...
Preprint
While deep face recognition (FR) systems have shown amazing performance in identification and verification, they also arouse privacy concerns for their excessive surveillance on users, especially for public face images widely spread on social networks. Recently, some studies adopt adversarial examples to protect photos from being identified by unau...
Article
Wireless capsule endoscopy is a modern non-invasive Internet of Medical Imaging Things that has been increasingly used in gastrointestinal tract examination. With about one gigabyte image data generated for a patient in each examination, automatic lesion detection is highly desirable to improve the efficiency of the diagnosis process and mitigate h...
Preprint
Recently, many pre-trained language models for source code have been proposed to model the context of code and serve as a basis for downstream code intelligence tasks such as code completion, code search, and code summarization. These models leverage masked pre-training and Transformer and have achieved promising results. However, currently there i...
Article
The fast growth of pre-trained models (PTMs) has brought natural language processing (NLP) to a new era, and PTMs have become a dominant technique for various NLP applications. Every user can download the weights of PTMs, then fine-tune the weights for a task on the local side. However, the pre-training of a model relies heavily...
Article
This paper points out an important threat that application-level Garbage Collection (GC) poses to the use of non-volatile memory (NVM). Data movements incurred by GC may invalidate the pointers to objects on NVM, and hence harm the reusability of persistent data across executions. The paper proposes the concept of movement-oblivious addressing (M...
Preprint
Binary-source code matching plays an important role in many security and software engineering related tasks such as malware detection, reverse engineering and vulnerability assessment. Currently, several approaches have been proposed for binary-source code matching by jointly learning the embeddings of binary code and source code in a common vector...
Article
Deep learning has gained tremendous success in various fields, but training deep neural networks (DNNs) is very compute-intensive, which has resulted in numerous deep learning frameworks that aim to offer better usability and higher performance to deep learning practitioners. TensorFlow and PyTorch are the two most popular frameworks. TensorFlow is mor...
Article
Wi-Fi technology is becoming a promising enabler of device-free fitness tracking to provide reviews and recommendations for effective home exercise. State-of-the-art Wi-Fi fitness assistants succeed in recognizing simple meta-movements (e.g., Push-Up and Squat) with discrete and repeatable patterns. Unfortunately, these prior attempts can har...
Article
The emerging Graph Convolutional Network (GCN) has been widely used in many domains, where it is important to improve the efficiency of applications by accelerating GCN training. Due to the sparse nature and exploding scale of real-world input graphs, state-of-the-art GCN training systems (e.g., GNNAdvisor) employ graph processing technique...
Article
Due to the powerful automatic feature extraction, deep learning-based vulnerability detection methods have evolved significantly in recent years. However, almost all current work focuses on detecting vulnerabilities at a single granularity (i.e., slice-level or function-level). In practice, slice-level vulnerability detection is fine-grained but...