
Kai Lei
Peking University | PKU · School of Electronic and Computer Engineering
Associate Professor
ienlab.com; icnlab.cn; hoticn.com; icenat.net
About
187
Publications
83,204
Reads
2,991
Citations
Introduction
Dr. Kai Lei is an Associate Professor at Shenzhen Graduate School, Peking University, and Director of the “Shenzhen Key Lab for Information Centric Networking and Blockchain Technologies” (ICNLab, http://www.icnlab.cn). He has published over 160 papers. His current research interests include ICN/NDN, blockchain, edge networking, and federated learning. In 2017, he first proposed Intelligent Eco Networking (IEN) = NDN + Blockchain + AI.
Additional affiliations
February 2005 - December 2018
Peking University Shenzhen Graduate School
Position
- Professor (Associate)
Description
- Director of the Shenzhen Key Lab for Information Centric Networking and Blockchain Technologies (ICNLab) and Center for Internet Research and Engineering (CIRE): http://netlab.pkusz.edu.cn. He has published over 100 papers.
January 2005 - present
Peking University Shenzhen Graduate School
Position
- Professor (Associate)
Education
September 2010 - July 2015
August 1998 - May 1999
September 1994 - July 1998
Publications (187)
Blockchain provides a new approach for participants to maintain reliable databases in untrusted networks without centralized authorities. It resolves the present problem that all the deals and cooperations between strangers have to rely on trusted third parties. Meanwhile, blockchain makes it possible that a network of completely homogeneous nodes...
Intelligent Eco Networking (IEN) makes significant progress to be a shared in-network computing infrastructure towards the future Internet, owing to its value-oriented ideology, content-centric fashion, intelligent collaborative management, and decentralized consensus trust preservation. Comprehensively uniting resources of computing, storage, and...
Once deployed, a decentralized blockchain system ensures that it will operate faithfully so that no one can interfere with or manipulate its predefined regulations, such as block size and block creation interval investigated in this paper. However, fixed regulations prevent that system from adapting to the change of the environment, such as increas...
The recent marriage of materials science and artificial intelligence has created the need to extract and collate materials information from the tremendous backlog of academic publications. However, this is notoriously hard to achieve in sophisticated application domains, such as Li‐ion battery (LIB) cathodes, which require multiple variables for ma...
In the 5G scenario of the convergence of information technology (IT) and communication technology (CT), multi-operators collaborate to form edge computing, which makes the problem of resource optimization more complicated than ever. Users may access resources deployed by various MEC’s operators to achieve ultra-low latency. However, traditional res...
Software Defined Networking (SDN) simplifies network control and management by decoupling the control plane from the data plane. However, the actual packet behaviors, conforming to the rules in the data plane flow tables, may violate the original policies in the controller due to the inconsistency between the data plane and control plane. To addres...
Named data networking (NDN) is a promising future network architecture in 5G edge computing scenarios because it supports multicast, mobility, in-network caching, and security. The key problem of service invocation in edge computing is how to dynamically select the appropriate edge CNs for the computing requester according to the edge CNs and netwo...
In data center networks (DCNs), flows with different objectives coexist and compete for limited network resources (such as bandwidth and buffer space). Without harmonious resource planning, chaotic competition among these flows would lead to severe performance degradation. Furthermore, low latency is critical for many emerging applications such as...
A smart Ponzi scheme is a new form of economic crime that uses Ethereum smart contract accounts and cryptocurrency to implement a Ponzi scheme. Smart Ponzi schemes have harmed the interests of many investors, but research on smart Ponzi scheme detection is still very limited. The existing smart Ponzi scheme detection methods have the problems of r...
Reliable identity management and authentication are significant for network security. In recent years, as traditional centralized identity management systems suffer from security and scalability problems, decentralized identity management has received considerable attention in academia and industry. However, with the increasing sharing interaction...
Non-orthogonal multiple access (NOMA) has been considered a promising technique for the fifth generation (5G) mobile communication networks because of its high spectrum efficiency. In NOMA, by using successive interference cancellation (SIC) techniques at the receivers, multiple users with different channel gain can be multiplexed together in the s...
Information-Centric Networking (ICN) takes advances in content distribution, thanks to the in-network caches which naturally reduce the delay of upper-layer applications. Though attaining high dissemination efficiency, the in-network caching amplifies the challenges in the consistency of cached copies. Since ICN decouples data from the locations, i...
Confronted with the explosive growth of data in various cyber systems, decentralized storage has gained popularity thanks to its ubiquitousness, since the devices can share their idle storage space for higher scalability. However, current decentralized storage networks suffer the failure risk due to the insufficient trust among nodes, in addition t...
Recent advances in artificial intelligence, big data, mobile edge computing and embedded systems have successfully driven the emergence and adoption of smart vehicles and vehicle edge computing which will improve road safety, traffic congestions, and vehicle exhaust emissions. The high-mobility, ad-hoc network topology, and diverse vehicle-to-every...
This study considers the problem of hybrid community detection in attributed networks based on the information of network topology and attributes with the aim to address the following two shortcomings of existing hybrid community detection methods. First, many of these methods are based on the assumption that network topology and attributes carry c...
As a key problem in artificial intelligence, question answering (QA) has always been a topic of intensive research. Most existing methods cast question answering as an answer selection task. The size of the candidate answer pool is usually very large, so it is difficult to accurately select the correct answer. One of the solutions is to narrow the...
Sentence matching, which aims to capture the semantic relationship between two sequences, is a crucial problem in NLP research. It plays a vital role in various natural language tasks such as question answering, natural language inference and paraphrase identification. The state-of-the-art works utilize the interactive information of sentence pairs...
With the emergence of ever-advancing network threats, the guarantee of system security becomes increasingly crucial, especially in the dynamic and decentralized ad-hoc networks. One essential part of cybersecurity is intrusion detection, which identifies anomalous activities according to traffic patterns. However, the class-imbalanced data have cau...
Congestion control is a fundamental network task that modulates the data transmission rates of traffic sources to efficiently utilize network capacity. With the advent of machine learning, congestion control based on deep reinforcement learning is the subject of extensive attention. At present, research on machine-learning-based congestion control...
Review text has been widely studied in traditional tasks such as sentiment analysis and aspect extraction. However, to date, no work has addressed end-to-end abstractive review summarization, which is essential for business organizations and individual consumers to make informed decisions. This study takes the lead to study the aspect/sentiment-aware...
Multi-hop reasoning over paths in knowledge graphs has attracted rising research interest in the field of knowledge graph completion. Entity types and relation types both contain various kinds of information content though only a subset of them are helpful in the specific triples. Although significant progress has been made by existing models, they...
Credit scoring on class imbalance data, where the class of defaulters is insufficiently represented compared with the class of non-defaulters, is an important but challenging task. In this paper, we propose an imbalanced generative adversarial fusion network (IGAFN) to cope with the class imbalance credit scoring based on multi-source heterogeneous...
Tag recommendation has been attracting much attention with the growth of digital resources. The goal of a tag recommendation system is to provide a set of tags for a piece of text to ease the tagging process done manually by a user. These tags have been shown to enhance the capabilities of search engines for navigating, organizing and searching con...
Distant supervision based methods for entity and relation extraction have received increasing popularity due to the fact that these methods require light human annotation efforts. In this paper, we consider the problem of shifted label distribution, which is caused by the inconsistency between the noisy-labeled training set subject to exte...
Large-scale Internet of Things (IoT) applications in the 5G era pose severe challenges to the network architecture in terms of heterogeneity, scalability, mobility, and security. Due to the identification and location overloading problem of IP, the TCP/IP-based network architecture appears inefficient in addressing the challenges mentioned above. Named Data N...
Powered by a number of smart devices distributed throughout the whole network, the Internet of Things (IoT) is supposed to provide services computing for massive data from devices. Fog computing, an extension of cloud-based IoT-oriented solutions, has emerged with requirements for distribution and decentralization. In this respect, the conjunction...
We study the community question answering (CQA) problem that emerges with the advent of numerous community forums in the recent past. The task of finding appropriate answers to questions from informative but noisy crowdsourced answers is important yet challenging in practice. We present an Attentive User-engaged Adversarial Neural Network (AUANN),...
Open-domain dialog generation, which is a crucial component of artificial intelligence, is an essential and challenging problem. In this article, we present a personalized dialog system, which leverages the advantages of multitask learning and reinforcement learning for personalized dialogue generation (MRPDG). Specifically, MRPDG consists of two s...
Recently, attention-based encoder-decoder models have been used extensively in image captioning. Yet there is still great difficulty for the current methods to achieve deep image understanding. In this work, we argue that such understanding requires visual attention to correlated image regions and semantic attention to coherent attributes of intere...
Reachability preserving compression generates small graphs that preserve the information only relevant to reachability queries, and the compressed graph can answer any reachability query without decompression. Existing reachability preserving compression algorithms either require a long compression time or include redundant data in the compressed...
The trends of knowledgelization of content, evaluation of knowledge, networking of value, ecologicalization of network, and intellectualization of ecology are becoming increasingly prominent in the future Internet. In this paper, we propose the concept of intelligent eco networking (IEN) for the future Internet, countering with the deficiencies of...
Blockchain builds a distributed point-to-point system, which is widely used in the fields of financial economy, Internet of Things (IoT), big data, cloud computing and edge computing. Meanwhile, edge artificial intelligence (AI) computing refers to the emergence of swarm intelligence AI computing model for edge network application scenarios. Althou...
Community detection is a fundamental step in social network analysis. In this study, we focus on the hybrid identification of overlapping communities with network topology (e.g., user interaction) and edge-induced content (e.g., user messages). Conventional hybrid methods tend to combine topology and node-induced content to explore disjoint communi...
Intelligent Eco Networking (IEN) is intent on evolving the IP-based network architecture into a Knowledge-Driven Future Internet Infrastructure based on hierarchical heterogeneous networks composed of virtualized and configurable devices, taking the valuable contents as first-class entity, driven by knowledge intelligence, integrating th...
Relations between medical concepts convey meaningful medical knowledge. Relation extraction on medical corpus is an important task of information extraction and is the key step of building medical knowledge graph. However, medical entity relation extraction generally has two major issues: (i) In the medical domain, the sentences containing entities...
Knowledge Graph (KG) contains entities and the relations between entities. Due to its representation ability, KG has been successfully applied to support many medical/healthcare tasks. However, in the medical domain, knowledge holds under certain conditions. Such conditions for medical knowledge are crucial for decision-making in various medical ap...
The current research on the blockchain includes study on network architecture and the incentive. In this paper, an introduction to the architecture of information technology was given and the goal and research status of the incentive layer of blockchain were illustrated with digital economy development as the backdrop. The existing issuance of toke...
In this study, we focus on the graph representation learning task in attributed networks. Different from existing embedding methods that treat the incorporation of network structure and semantic as the simple combination of two optimization objectives, we propose a novel Semantic Graph Representation (SGR) model to formulate the joint optimization...
In this paper, we study the network traffic classification task. Different from existing supervised methods that rely heavily on the labeled statistic features in a long period (e.g., several hours or days), we adopt a novel view of unsupervised profiling to explore the flow features and link patterns in a short time window (e.g., several seconds),...
Kai Lei, Maoyu Du, Liwei Yang, [...], Kuai Xu
The last decade has witnessed the explosive growth of digital tokens and cryptocurrencies including Bitcoin, reward programs and various tokens on Ethereum. However, trading these heterogeneous and non-financial digital assets often encounters substantial liquidity barriers due to the lack of interoperable standards, and faces an equilibrium dilemm...
Answer selection and knowledge base question answering (KBQA) are two important tasks of question answering (QA) systems. Existing methods solve these two tasks separately, which requires a large amount of repetitive work and neglects the rich correlation information between tasks. In this paper, we tackle answer selection and KBQA tasks simultaneous...
Multi-task learning is a machine learning approach learning multiple tasks jointly while exploiting commonalities and differences across tasks. A shared representation is learned by multi-task learning, and what is learned for each task can help other tasks be learned better. Most of existing multi-task learning methods adopt deep neural network as...
The last decade has witnessed the explosive growth of malicious Internet domains which serve as the fundamental infrastructure for establishing advanced persistent threat command and control communication channels or hosting phishing Web sites. Given the big data nature of Internet traffic data and the ability of algorithmically generating domains...
In image-grounded text generation, fine-grained representations of the image are considered to be of paramount importance. Most of the current systems incorporate visual features and textual concepts as a sketch of an image. However, plainly inferred representations are usually undesirable in that they are composed of separate components, the relat...
Heterogeneous Internet of Things (IoT) and multi-access mobile edge computing (MA-MEC) are regarded as supporting technologies for building smart cities. The advancement and flourishing of IoT are facilitating the entry of human society into the Internet of everything era, which lays the foundation of the smart city. To address the conflict between computati...
In this paper, we generally formulate the dynamics prediction problem of various network systems (e.g., the prediction of mobility, traffic and topology) as the temporal link prediction task. Different from conventional techniques of temporal link prediction that ignore the potential non-linear characteristics and the informative link weights in th...
Communication bandwidth is a bottleneck in distributed machine learning, and limits the system scalability. The transmission of gradients often dominates the communication in distributed SGD. One promising technique is using the gradient compression to reduce the communication cost. Recently, many approaches have been developed for the deep neural...
Efficient representations of drugs provide important support for healthcare analytics, such as drug–drug interaction (DDI) prediction and drug–drug similarity (DDS) computation. However, incomplete annotated data and drug feature sparseness create substantial barriers for drug representation learning, making it difficult to accurately iden...
Measuring drug-drug similarity is important but challenging. Significant progress has been made for drugs whose labeled training data is sufficient and available. However, handling data skewness and incompleteness with a domain-specific knowledge graph is still a relatively new territory and an under-explored prospect. In this paper, we present a...
Automatic real-time summarization of massive document streams on the Web has become an important tool for quickly transforming the overwhelming documents into a novel, comprehensive and concise overview of an event for users. Significant progress has been made in static text summarization. However, most previous work does not consider the tempora...
Computing the semantic similarity accurately between words is an important but challenging task in the semantic web field. However, the semantic similarity measures involve the comprehensiveness of knowledge learning and the sufficient training of words of both high and low frequency. In this study, an approach MedSim is presented for semantic sim...
The Practical Byzantine Fault Tolerance algorithm (PBFT) has been widely applied in consortium blockchain systems. However, this kind of consensus algorithm can hardly identify and remove faulty nodes in time, and is also vulnerable to many attacks against the primary node of PBFT. The equality of consortium members' discourse rights is inapplicable...
Fog Computing which extends the cloud computing paradigm to the edge of the network provides great opportunities for applications with stringent latency requirement. How to allocate the limited caching resources of Fog Nodes (FNs) influences the performance of the fog computing system. In contrast to previous works on caching resource allocation wi...
Existing flow scheduling schemes in Data Center Network (DCN) are designed mainly to minimize the flow completion time (FCT) of short flows and do not consider optimizing the FCT of latency-sensitive long flows (e.g., VR video streaming, interactive artificial intelligence question-answer streams). Besides, among these traffic scheduling schemes, the in...
Named entity discovery and linking is the fundamental and core component of question answering. In Question Entity Discovery and Linking (QEDL) problem, traditional methods are challenged because multiple entities in one short question are difficult to be discovered entirely and the incomplete information in short text makes entity linking hard to...
Automatic text classification (TC) research can be used for real-world problems such as the classification of in-patient discharge summaries and medical text reports, which is beneficial to make medical documents more understandable to doctors. However, in electronic medical records (EMR), the texts containing sentences are shorter than that in gen...