Shumin Shi

Beijing Institute of Technology | BIT · School of Computer Science & Technology

About

60
Publications
4,888
Reads
215
Citations

Publications (60)
Article
Full-text available
Using sarcasm on social media platforms to express negative opinions towards a person or object has become increasingly common. However, detecting sarcasm in various forms of communication can be difficult due to conflicting sentiments. In this paper, we introduce a contrasting sentiment‐based model for multimodal sarcasm detection (CS4MSD), which...
Article
Full-text available
In recent years, significant progress has been made in the field of non-autoregressive machine translation. However, the accuracy of non-autoregressive models still lags behind that of their autoregressive counterparts. This discrepancy can be attributed to the abundance of repetitive tokens in the target sequences generated by non-autoregressive models....
Article
Dialogue state tracking plays a key role in tracking user intentions in task-oriented dialogue systems. Traditional dialogue state tracking methods usually rely on selecting slot values from a fixed ontology to represent the dialogue state. In recent years, more flexible open vocabulary based approaches have become the mainstream focus which are ma...
Chapter
The key issue of session-based recommendation (SBR) is how to efficiently predict the next interaction item based on the item sequence of anonymous users. In order to mine the complex multivariate relationship between items and sessions, we propose a novel model for session-based recommendation named Interest aware Dual-channel Graph Contrastive le...
Article
The Non-Autoregressive Transformer has attracted considerable attention from researchers because of its low inference latency. Although the performance of the non-autoregressive transformer has improved significantly in recent years, there is still a gap between the non-autoregressive transformer and the autoregressive transformer. Considering the success of localnes...
Article
With the development of Internet of Things and cloud computing, intelligent question-answering (QA) has brought great convenience to human’s daily activities. As one of the core technologies, sentence semantic matching (SSM) plays a critical role in a variety of intelligent QA systems. However, existing SSM methods usually first encode sentences on...
Preprint
Full-text available
More tasks in Machine Reading Comprehension (MRC) require, in addition to answer prediction, the extraction of evidence sentences that support the answer. However, the annotation of supporting evidence sentences is usually time-consuming and labor-intensive. In this paper, to address this issue and considering that most of the existing extraction me...
Chapter
We propose a novel unsupervised image captioning method. Image captioning involves two fields of deep learning: natural language processing and computer vision. The excessive pursuit of model evaluation results makes the caption style generated by the model too monotonous to meet people’s demands for vivid and stylized image cap...
Article
In non-autoregressive machine translation, the target tokens are generated by the decoder in one shot. Although this decoding process can significantly reduce the decoding latency, non-autoregressive machine translation still sacrifices translation accuracy. We argue that the reason for such a decrease is the lack of the target dep...
Chapter
Named Entity Recognition (NER) is one of the fundamental research tasks in natural language processing. Chinese nested NER is even more challenging. Recent studies on NER have generally focused on the extraction of flat structures by a sequence annotation strategy while ignoring nested structures. In this paper, we propose a novel model, named LACNNER, th...
Article
Full-text available
The inference stage can be accelerated significantly using a Non-Autoregressive Transformer (NAT). However, the training objective used in the NAT model also aims to minimize the loss between the generated words and the golden words in the reference. Since the dependencies between the target words are lacking, this training objective computed at wor...
Article
Full-text available
The emotion cause extraction (ECE) task, which aims at extracting potential trigger events of certain emotions, has attracted extensive attention recently. However, current work neglects implicit emotion expressed without any explicit emotional keywords, which appears more frequently in application scenarios. The lack of explicit emotion information m...
Chapter
Relation Extraction (RE) requires the model to classify the correct relation from a set of relation candidates given the corresponding sentence and two entities. Recent work mainly studies how to utilize more data or incorporate extra context information, especially with Pre-trained Language Models (PLMs). However, these models still face the c...
Article
Statistical machine translation (SMT) models rely on word-, phrase-, and syntax-level alignments. But neural machine translation (NMT) models rarely explicitly learn the phrase- and syntax-level alignments. In this paper, we propose to improve NMT by explicitly learning the bilingual syntactic constituent alignments. Specifically, we first utiliz...
Article
Neural Machine Translation (NMT) brings promising improvements in translation quality, but until recently, these models relied on large-scale parallel corpora. As such corpora only exist for a handful of language pairs, the translation performance is far from the desired effect in the majority of low-resource languages. Thus developing low-resource la...
Article
Full-text available
Non-autoregressive machine translation aims to speed up the decoding procedure by discarding the autoregressive model and generating the target words independently. Because non-autoregressive machine translation fails to exploit target-side information, the ability to accurately model source representations is critical. In this paper, we propose an...
Chapter
In recent years, non-autoregressive machine translation has achieved great success due to its promising inference speedup. Non-autoregressive machine translation reduces the decoding latency by generating the target words in a single pass. However, there is a considerable gap in the accuracy between non-autoregressive machine translation and autoregr...
Article
Full-text available
In recent years, non-autoregressive machine translation has attracted the attention of many researchers. Non-autoregressive translation (NAT) achieves faster decoding speed at the cost of translation accuracy compared with autoregressive translation (AT). Since NAT and AT models have similar architecture, a natural idea is to use AT task assisting NAT t...
Article
Dependency parsing is an important task for Natural Language Processing (NLP). However, a mature parser requires a large treebank for training, which is still extremely costly to create. Tibetan is an extremely low-resource language for NLP; there is no available Tibetan dependency treebank, which is currently obtained by manual annotation....
Chapter
Open-world knowledge graph completion aims to find a set of missing triples through entity description, where entities can be either in or out of the graph. However, when aggregating entity description’s word embedding matrix to a single embedding, most existing models either use CNN and LSTM to make the model complex and ineffective, or use simple...
Article
Neural machine translation has greatly improved translation accuracy and received great attention from the machine translation community. Tree-based translation models aim to model the syntactic or semantic relation among long-distance words or phrases in a sentence. However, they face the difficulties of expensive manual annotation cost and poor...
Chapter
Emotion cause detection (ECD), which aims to extract the trigger event of a certain emotion explicitly expressed in text, has become a hot topic in natural language processing. However, existing models all suffer from inadequate sentiment information fusion and the limited size of corpora. In this paper, we propose a novel model to...
Article
Full-text available
Semantic textual similarity (STS) seeks to assess the degree of semantic equivalence between two sentences or snippets of text. Most STS methods are based on word surface and deem words as meaning-unrelated symbols, which makes these methods indiscriminative for ubiquitous conceptual association among words. Recent...
Article
Full-text available
Word order is one of the most significant differences between Chinese and Vietnamese. In phrase-based statistical machine translation, the reordering model learns reordering rules from bilingual corpora. If the bilingual corpora are large and good enough, the reordering rules are exact and coverable. However, Chinese-Vietnamese is a low...
Article
Statistical machine translation for low-resource languages suffers from the lack of abundant training corpora. Several methods, such as the use of a pivot language, have been proposed as a bridge to translate from one language to another. However, errors will accumulate during the extensive translation pipelines. In this paper, we propose an appro...
Conference Paper
Along with the imperatives of Chinese education strategy and the practical requirements of our university (Beijing Institute of Technology, BIT)'s Double-First Class development blueprint, the cultivation of innovative talents becomes even more important. Since more than a decade of professional accumulation and talent training, school of computer...
Conference Paper
Tibetan syntactic functional chunk parsing is aimed at identifying syntactic constituents of Tibetan sentences. In this paper, based on the Tibetan syntactic functional chunk description system, we propose a method which puts syllables in groups instead of word segmentation and tagging and uses Conditional Random Fields (CRFs) to identify the fu...
Conference Paper
Crowdsourcing has been used recently as an alternative to traditional costly annotation by many natural language processing groups. In this paper, we explore the use of WeChat Official Account Platform (WOAP) in order to build a speech corpus and to assess the feasibility of using WOAP followers (also known as contributors) to assemble speech corpu...
Conference Paper
The Chinese-Mongolian Speech Corpus (CMSC) has been utilized in many practical applications in recent years, and it is a low-resource corpus due to its high construction cost. We describe a crowdsourcing method to build a collection of bilingual speech corpus through the use of a messaging app called WeChat, in which followers can send voice and text...
Conference Paper
According to the status of unified enrollment of computer specialty for undergraduates in BIT (Beijing Institute of Technology) and the mode of EEEP (Excellent Engineer Education Project) in China, we analyze the concrete requirements of "Output-Oriented" teaching evaluation in International Engineering Education Accreditation, and propose a FCTS:...
Conference Paper
This paper describes an approach to identify suspected cybermobs on social media. Much research involves making predictions of group emotion on the Internet (such as quantifying sentiment polarity), but this paper instead focuses on the origin of information diffusion, namely back to its makers and contributors. According to our previous findings that hav...
Conference Paper
Minimal Recursion Semantics (MRS) is a framework for computational semantics that is suitable for parsing and generation. To represent Chinese texts using MRS, we built a lexicon with the rich semantic knowledge of HowNet and defined 36 grammar rules and 47 types. The types, words, and rules are described by TDL (Type Description Language) and imple...
Conference Paper
This paper proposed a novel method to evaluate the performance of New Word Detection (NWD) based on repeats extraction. For small-scale corpus, we put forward employing Conditional Random Field (CRF) as statistical framework to estimate the effects of different strategies of NWD. For the situations of large-scale corpus, as there is no infinity of...
Conference Paper
Semantic chunks can describe the semantic framework of a sentence well. They play a very important role in Natural Language Processing applications, such as machine translation and QA systems. At present, Tibetan chunk research is mainly based on rule-based methods. In this paper, according to the distinctive language characteristics of Tibe...
Conference Paper
The Internet has become an excellent source for gathering consumer reviews, and the opinions in consumer reviews are expressed in sentiment words. However, due to the fuzziness of Chinese words themselves, people's sentiment judgments are more subjective. Studies have shown that the polarity and strength judgments of sentiment words obey Gaussian distributi...
Article
Full-text available
Theoretical studies on the cyclopentadienyliron derivatives Cp2Fe2(CN)n (Cp = η5-C5H5; n = 6, 5, 4, 3, 2, 1) indicate that high-spin species with terminal Cp rings and bridging cyanide ligands up to a maximum of two bridges are predicted to be the lowest energy structures, which are paramagnetic complexes. Cp2Fe2(CN)3 and Cp2Fe2(CN)2 appear to be...
Article
Full-text available
The inhibition performance of nine imidazoline molecules against the corrosion of steel in 15 wt.% HCl and 3 wt.% HF solution was studied by weight-loss method, quantum chemical calculation, molecular dynamics simulation and the quantitative structure-activity relationship (QSAR) analysis. The quantum chemical calculation involved in local reactivi...
Conference Paper
It is difficult to sort Chinese strings in a user-defined sequence with existing sort algorithms. This paper proposes an efficient string sort method in user-defined character order. On the basis of the consecutive numbers which are used to define the custom order of characters, the hash table structure is employed to convert each string into correspon...
Conference Paper
The number of people who obtain information and express ideas via the Internet is increasing rapidly. Research on identifying how much attention is paid to a given online topic plays an important role in the field of public opinion management. We propose a method to predict the netizens' attention on a specific online topic in this paper. Firstly, we...
Conference Paper
Named entity recognition (NER) is a very important task in natural language processing (NLP). In this paper we present a semi-supervised approach to extract bilingual named entities, starting from a bilingual corpus where the named entities are extracted independently for each language. Then a bilingual co-training algorithm is used to improve the na...
Conference Paper
Domain dictionaries are very useful in many Natural Language Processing (NLP) applications. This paper proposes a gloss-based word domain assignment algorithm to build domain dictionaries from a machine-readable dictionary. Experiments on WordNet2.0 show that 62.53% of the first domain labels can match with the WordNet Domains3.0. Compared with the trad...
Article
Source code documents are vulnerable to being plagiarized. As the central component of Code Plagiarism Detection (CPD), Code Similarity Detection (CSD) attracts more and more attention. In this paper, we proposed a new method for CSD by combining structure metric with semantic computing techniques. It is capable of identifying not only the primary...
Article
This paper proposed a pragmatic model for repeat-based Chinese New Word Extraction (NWE). It contains two innovations. The first is a formal description for the process of NWE, which gives instructions on feature selection in theory. On the basis of this, the Conditional Random Fields model (CRF) is selected as statistical framework to solve the fo...
