Meng Zhao’s research while affiliated with Beijing Institute of Technology and other places


Publications (5)


Multi-Features Group Emotion Analysis Based on CNN for Weibo Events
  • Article · December 2017 · 86 Reads · 5 Citations

DEStech Transactions on Computer Science and Engineering

MENG ZHAO · JUN GUAN · HEYAN HUANG


Utilizing Crowdsourcing for the Construction of Chinese-Mongolian Speech Corpus with Evaluation Mechanism

September 2017 · 32 Reads · 3 Citations

Communications in Computer and Information Science

Crowdsourcing has recently been used by many natural language processing groups as an alternative to costly traditional annotation. In this paper, we explore the use of the WeChat Official Account Platform (WOAP) to build a speech corpus, and we assess the feasibility of using WOAP followers (also known as contributors) to assemble a Mongolian speech corpus. A Mongolian language qualification test was used to filter out potentially unqualified participants. We gathered natural speech recordings from daily life and constructed a Chinese-Mongolian Speech Corpus (CMSC) of 31472 utterances from 296 native speakers who are fluent in Mongolian, totalling 30.8 h of speech. An evaluation experiment was then performed, in which the contributors were asked to choose the correct sentence from a multiple-choice list to ensure the high quality of the corpus. The results obtained so far show that crowdsourcing the construction of CMSC with an evaluation mechanism could be more effective than traditional experiments requiring expertise.


Assembling Chinese-Mongolian Speech Corpus via Crowdsourcing

June 2017 · 24 Reads

Lecture Notes in Computer Science

The Chinese-Mongolian Speech Corpus (CMSC) has been used in many practical applications in recent years, yet it remains a low-resource corpus because it is costly to construct. We describe a crowdsourcing method for building a bilingual speech corpus through the messaging app WeChat, in which followers can freely send voice and text messages to our Official Account Platform. Because most followers are fluent in both Chinese and Mongolian, we gathered natural speech recordings from daily life and constructed a parallel speech corpus of 20547 utterances from 296 speakers, totalling 21.43 h of speech, during the first 25 days after the collection notification was pushed. Moreover, we present a quality control measure in the evaluation stage, in which independent subscribers vote on the translations of each source sentence; this markedly improves the quality of the corpus. We show that the WeChat Official Account Platform can be used to assemble a speech corpus quickly and cheaply, with near-expert accuracy. As basic research in natural language processing (NLP), the construction of a bilingual speech corpus via crowdsourcing has reference value for similar studies.


Identifying Suspected Cybermob on Tieba

October 2016 · 58 Reads · 1 Citation

Lecture Notes in Computer Science

This paper describes an approach to identifying suspected cybermobs on social media. Much research involves predicting group emotion on the Internet (for example, quantifying sentiment polarity), but this paper instead focuses on the origin of information diffusion, namely its makers and contributors. Our previous findings showed that, at the level of Tieba’s content, negative information and emotions spread faster than positive ones. In this paper we therefore centre on the makers of negative messages, the so-called cybermobs who post aggressive, provocative or insulting remarks on social websites. We explore the characteristics that distinguish suspected cybermobs from general netizens and then extract features that are relatively unique to suspected cybermobs. We construct a real system that identifies suspected cybermobs automatically using a machine learning method with these features, together with other common features such as user-based and content-based ones. Empirical results from evaluating it against benchmark models and applying it to actual cases show that our approach can detect suspected cybermobs correctly and efficiently.

Citations (3)


... Authors used two distinct pre-trained word embeddings to produce different input images, which were then input into the LSTM. Aside from word embedding characteristics, a research [48] identified individual as well as evidenced automated characteristics from a sample obtained from the Chinese online community Sina Weibo. The characteristics derived from their research are sent into the LSTM network for classification purposes. ...

Reference:

Systematic Literature Review on Sentiment Analysis in Airline Industry
A hierarchical lstm model with multiple features for sentiment analysis of sina weibo texts
  • Citing Conference Paper
  • December 2017

... Literature [18] proposes a multilingual translation model with an incremental self-learning strategy, which solves the problem of data scarcity by generating pseudo-bilingual data automatically, but the pseudo-bilingual data may have noise issues, lowering translation quality. The co-occurrence rule of words in the target language in both parallel and comparable corpora is essentially the same as in the source language, according to the literature [19]. ...

Utilizing Crowdsourcing for the Construction of Chinese-Mongolian Speech Corpus with Evaluation Mechanism
  • Citing Conference Paper
  • September 2017

Communications in Computer and Information Science