Donghua Zhu’s research while affiliated with Beijing Institute of Technology and other places


Publications (81)


Unraveling Scientific Evolutionary Paths: An Embedding-Based Topic Analysis
  • Article
  • Full-text available

October 2023 · 349 Reads · 5 Citations

IEEE Transactions on Engineering Management

[...] · Donghua Zhu

Understanding the evolution of knowledge has been and will continue to be the key task of science, technology, and innovation management. Existing research on evolutionary path identification relies primarily on traditional co-occurrence analysis and bag-of-words (BOW)–based models for topic extraction. However, these approaches have limitations in effectively capturing the underlying semantics and linkages of the topics. In this article, we propose a novel embedding-based methodology for scientific evolution analysis, in which word embedding, document embedding, clustering, and network analysis are applied to extract topics, measure topical semantic similarities, and quantitatively distinguish topics’ evolutionary states. We first perform benchmark experiments to demonstrate that doc2vec generally outperforms the BOW-based models in topic extraction before evolution analysis. We then consider topic consistency in vector spaces to identify evolutionary states including newborn, convergence, inheritance, and extinction. Scientific evolutionary paths are finally unraveled based on topic similarity matrices and evolutionary states. We conduct a case study on object detection research to validate the effectiveness of our methodology. The empirical results, validated by domain experts, demonstrate that the proposed methodology is capable of effectively revealing patterns of knowledge inheritance and integration. Consequently, this methodology can be used to improve decision-making processes in future innovation management.
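To make the idea of labelling evolutionary states from topic similarities concrete, here is a minimal sketch. It is not the authors' procedure: the abstract does not give the exact rules, so the cosine threshold, the counting rule (zero, one, or several similar predecessors), and the names `cosine` and `evolutionary_states` are all illustrative assumptions. Topic vectors stand in for whatever embedding-based representation is actually used.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def evolutionary_states(prev_topics, curr_topics, threshold=0.8):
    """Label topics in two adjacent time slices.

    prev_topics / curr_topics map a topic name to its vector. Names are
    assumed to be unique across both slices. The threshold and the rules
    below are assumptions made for illustration only.
    """
    sim = {
        (p, c): cosine(pv, cv)
        for p, pv in prev_topics.items()
        for c, cv in curr_topics.items()
    }
    states = {}
    for c in curr_topics:
        # Predecessors that are similar enough to count as a source.
        parents = [p for p in prev_topics if sim[(p, c)] >= threshold]
        if not parents:
            states[c] = "newborn"
        elif len(parents) == 1:
            states[c] = "inheritance"
        else:
            states[c] = "convergence"
    for p in prev_topics:
        # A previous topic with no similar successor is treated as extinct.
        if all(sim[(p, c)] < threshold for c in curr_topics):
            states[p] = "extinction"
    return states


prev = {"A": [1, 0, 0], "B": [0, 1, 0], "C": [0, 0, 1]}
curr = {"D": [1, 0.1, 0], "E": [1, 1, 0], "G": [1, 1, 1]}
print(evolutionary_states(prev, curr, threshold=0.7))
```

In this toy input, D resembles only A, E resembles both A and B, G resembles no earlier topic strongly enough, and C has no successor above the threshold.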




Reviewer recommendation method for scientific research proposals: a case for NSFC

May 2022 · 269 Reads · 10 Citations

Scientometrics

Peer review is an important procedure for determining which research proposals are funded and for evaluating the quality of scientific research. Finding suitable reviewers for scientific research proposals is an important task for funding agencies. Traditional methods for reviewer recommendation focus on the relevance between a proposal and the knowledge of candidate reviewers, mainly by matching keywords or disciplines. However, the sparsity of keyword space and the broadness of disciplines lead to inaccurate reviewer recommendations. To overcome these limitations, this paper introduces a reviewer recommendation method (RRM) for scientific research proposals. This research applies word embedding to construct vector representations for terms, which provides a semantic and syntactic measurement. Further, we develop representation models for reviewers’ knowledge and proposals, and recommend reviewers by matching the two representation models and incorporating ranking fusion. The proposed method is implemented and tested by recommending reviewers for scientific research proposals of the National Natural Science Foundation of China. This research invites reviewers to provide feedback, which serves as the benchmark for evaluation. We construct three evaluation metrics: Precision, Strict-precision, and Recall. The results show that the proposed reviewer recommendation method substantially improves accuracy. Research results can provide feasible options for the decision-making of the committee, and improve the efficiency of funding agencies.
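As an illustration of fusing several reviewer rankings and scoring the result, the sketch below uses reciprocal rank fusion, a common generic technique. The abstract does not say which fusion scheme the paper uses, so this is a swapped-in stand-in, and the function names, the constant `k=60`, and the reviewer IDs are assumptions. Only standard precision and recall are shown; the paper's Strict-precision metric is not defined in the abstract and is not reproduced here.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of reviewer IDs into a single ranking.

    Each reviewer accumulates 1 / (k + rank) for every list that
    contains them (rank starts at 1). Assumed fusion rule, not the
    paper's.
    """
    scores = {}
    for ranking in rankings:
        for rank, reviewer in enumerate(ranking, start=1):
            scores[reviewer] = scores.get(reviewer, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties broken by reviewer ID for determinism.
    return sorted(scores, key=lambda r: (-scores[r], r))


def precision_recall(recommended, relevant):
    """Standard precision and recall of a recommendation list."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical rankings from two different matching models.
by_keywords = ["r1", "r2", "r3"]
by_embeddings = ["r2", "r3", "r4"]
fused = reciprocal_rank_fusion([by_keywords, by_embeddings])
print(fused)
print(precision_recall(fused[:2], relevant={"r2", "r4"}))
```

Reviewers that appear high in both lists rise to the top of the fused ranking, which is the behaviour one would want from any fusion step regardless of the exact formula.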


Discovering technology opportunities based on the linkage between technology and business areas: matching patents and trademarks

November 2021 · 122 Reads · 12 Citations

Technology Analysis and Strategic Management

Technology opportunity analysis (TOA) is of great help to technology innovation and R&D strategy of enterprises. Most previous studies of TOA only focused on scientific research or technology development phases, seldom linking technology opportunities to business applications. Using both patent and trademark data, we focused on the linkage of technology and business areas and discovered technology opportunities based on firms’ existing technology base. First, we extracted common product keywords from two data sources to construct document-keyword matrices. Then, we matched patents and trademarks with common keywords to build the relevancy network between technology and the business areas. Finally, we discovered technology opportunities from potential undeveloped business areas, taking into account technology-business relevancy and firms’ technology base. We conducted a case study on 3D printing to test our method.
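The three steps in the abstract (shared keywords, a technology-business relevancy network, then opportunities in undeveloped business areas) can be sketched roughly as follows. This is a simplified illustration under stated assumptions: keyword sets replace the document-keyword matrices, the edge weight is just the number of shared keywords, and the area names, keywords, and function names are invented for the example.

```python
def relevancy_network(patent_keywords, trademark_keywords):
    """Link technology areas to business areas that share keywords.

    Returns {(tech_area, business_area): number_of_shared_keywords}.
    Counting shared keywords is an assumed weighting.
    """
    edges = {}
    for tech, p_kw in patent_keywords.items():
        for biz, t_kw in trademark_keywords.items():
            shared = set(p_kw) & set(t_kw)
            if shared:
                edges[(tech, biz)] = len(shared)
    return edges


def opportunities(edges, firm_tech_base, firm_business_areas):
    """Rank business areas the firm has not entered yet.

    Only areas reachable from the firm's existing technologies count;
    the score is the summed edge weight.
    """
    scores = {}
    for (tech, biz), weight in edges.items():
        if tech in firm_tech_base and biz not in firm_business_areas:
            scores[biz] = scores.get(biz, 0) + weight
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical keyword sets extracted from patents and trademarks.
patents = {
    "extrusion": {"nozzle", "filament", "resin"},
    "laser sintering": {"powder", "laser"},
}
trademarks = {
    "toys": {"filament", "figurine"},
    "dental": {"resin", "powder", "crown"},
    "jewelry": {"laser", "ring"},
}
edges = relevancy_network(patents, trademarks)
print(opportunities(edges, {"extrusion", "laser sintering"}, {"toys"}))
```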


Evaluating scientific impact of publications: combining citation polarity and purpose

October 2021 · 48 Reads · 21 Citations

Scientometrics

Citation counts are commonly used to evaluate the scientific impact of a publication on the general premise that more citations probably mean more endorsements. However, two questionable assumptions underpin this idea: a) that all authors contributed equally to the paper; and b) that the endorsement is positive. Obviously, neither of these assumptions holds true. Hence, with this study, we examine two components of citations: their purpose, i.e., the reason for the citation, and their polarity, i.e., the author’s attitude toward the cited work. Our findings provide a new perspective on the scientific impact of highly-cited publications. Our methodology consists of three steps. Firstly, a pre-trained model composed of Word2Vec, a well-known word embedding approach, and a convolutional neural network (CNN) is used to identify citation polarity and purpose. Secondly, in a set of highly-cited papers, we compare eight categories of purpose from foundational to critical and three categories of polarity: positive, negative, and neutral. We further explore how different types of papers (those discussing discoveries or those discussing utilitarian topics) influence the evaluation of scientific impact of papers. Finally, we mine and discover the knowledge (e.g. method, concept, tool or data) to explain the actual scientific impact of a highly-cited paper. To demonstrate how combining citation polarity with purpose can provide far greater details of a paper’s scientific impact, we undertake a case study with 370 highly-cited journal articles spanning “Biochemistry & Molecular Biology” and “Genetics & Heredity”. The results yield valuable insights into the assumption about citation counts as a metric for evaluating scientific impact.


An ensemble learning framework for potential miRNA-disease association prediction with positive-unlabeled data

August 2021 · 27 Reads · 11 Citations

Computational Biology and Chemistry

To explore the pathogenic mechanisms of microRNA (miRNA) in diverse diseases, many researchers have concentrated on discovering the potential associations between miRNA and disease using machine learning methods. However, the prediction accuracy of supervised machine learning methods is limited by the lack of experimentally validated uncorrelated miRNA-disease pairs. Without these negative samples, training a highly accurate model is much more difficult. Unlike traditional miRNA-disease prediction models, which use randomly selected unknown samples as negative training samples, we propose an ensemble learning framework to solve this positive-unlabeled (PU) learning problem. The framework incorporates two steps: a novel semi-supervised Kmeans (SS-Kmeans) to extract reliable negative samples from unknown miRNA-disease pairs, and a subagging method to generate diverse training sample sets that make full use of those reliable negative samples for ensemble learning. Combined with an effective random vector functional link (RVFL) network as the prediction model, the proposed framework showed superior prediction accuracy compared with other popular approaches. A case study on lung and gastric neoplasms further confirms the framework’s efficacy at identifying miRNA-disease associations.
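The two-step structure described above (pick reliable negatives from the unlabeled pool, then build several subsampled training sets) can be sketched as follows. This is not SS-Kmeans: as a deliberately simpler stand-in, unlabeled points farthest from the centroid of the known positives are treated as reliable negatives. The feature tuples, function names, and set sizes are assumptions for illustration, and the RVFL classifier itself is left out.

```python
import math
import random


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def reliable_negatives(positives, unlabeled, n_neg):
    """Pick the n_neg unlabeled points farthest from the positive centroid.

    Stand-in for the paper's SS-Kmeans step (assumed simplification).
    """
    c = centroid(positives)
    return sorted(unlabeled, key=lambda x: math.dist(x, c), reverse=True)[:n_neg]


def subagging_sets(positives, negatives, n_sets, subsample_size, seed=0):
    """Build n_sets training sets: all positives plus a random subsample
    of the reliable negatives, drawn without replacement."""
    rng = random.Random(seed)
    training_sets = []
    for _ in range(n_sets):
        sampled = rng.sample(negatives, subsample_size)
        labelled = [(x, 1) for x in positives] + [(x, 0) for x in sampled]
        training_sets.append(labelled)
    return training_sets


# Hypothetical 2-D features for miRNA-disease pairs.
positives = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
unlabeled = [(0.5, 0.5), (5.0, 5.0), (4.0, 6.0), (0.0, 0.2)]
negatives = reliable_negatives(positives, unlabeled, n_neg=2)
print(negatives)
print(len(subagging_sets(positives, negatives, n_sets=3, subsample_size=1)))
```

Each training set would then feed one base learner, with the ensemble prediction formed by averaging or voting across learners.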


R&D trend analysis based on patent mining: An integrated use of patent applications and invalidation data

June 2021 · 50 Reads · 40 Citations

Technological Forecasting and Social Change

Formulating good R&D strategy requires sound knowledge of the past and present R&D trends in various industry sectors. Therefore, this paper outlines a framework, designed for enterprises, for mining industry-level R&D trends from patents. Unlike the current alternatives, the approach presented here considers both patent applications and invalidated patents, i.e., those patents that have expired, lapsed, or been revoked. The result is a richer and more comprehensive analysis that covers the full lifespan of a targeted technology from emergence to decline. The framework comprises an LDA topic model that identifies the technologies and sub-technologies of each individual patent application and invalidated patent. Then, two specifically designed measures chart the stages of the technologies’ life. An application metric reflects annual levels of interest in an area, while an invalidation metric traces waning interest. The output is a series of trend maps that show the levels of interest and disinterest in different avenues of inquiry over time. Charted on different axes, these two metrics create two distinct trend lines that reflect the different changes over a technology's lifecycle. A case study that focused on China's 3-D printing technology illustrates the approach. The analysis results are highly consistent with the present technology trends across industries, which indicates that the method could serve as a useful reference tool for analyzing R&D trends and creating new R&D strategies.


Exploring the role of companies in scientific research: a case study of genetically modified maize

September 2020 · 86 Reads · 5 Citations

Technology Analysis and Strategic Management

Technology is a key factor for the competitiveness of the world-leading companies. Collaborating with and funding other research organisations are effective and economical ways to improve their innovation performance. This study aims to investigate the roles of companies in scientific research, as authors and as funders, and the links between these roles. Big companies in the agrobiotechnology sector have made great efforts to drive the development of genetically modified techniques. Genetically modified maize, as one of the commercialised genetically modified organisms, is selected as the case study. From an investigation of data available from the Web of Science database, we find that companies’ publications are important in terms of quality and quantity. Companies collaborate with various types of organisations, and build international funding networks. These publications cover a wide range of emerging topics, and company-authored publications have higher citations. Implications for further research are discussed.


Exploring Technology Evolution Pathways to Facilitate Technology Management: From a Technology Life Cycle Perspective

January 2020 · 729 Reads · 42 Citations

IEEE Transactions on Engineering Management

Technological innovation is a dynamic process that spans the life cycle of an idea, from scientific research to production. Within this process, there are often a few key innovations that significantly impact a technology's development, and the ability to identify and trace the development of these key innovations comes with a great payoff for researchers and technology managers. In this article, we present a framework for identifying the technology's main evolutionary pathway. What is unique about this framework is that we introduce new indicators that reflect the connectivity and the modularity in the interior citation network to distinguish between the stages of a technology's development. We also show how information about a family of patents can be used to build a comprehensive patent citation network. Finally, we apply integrated approaches of main path analysis (MPA)—namely global MPA and global key-route main analysis—for extracting technological trajectories at different technological stages. We illustrate this approach with dye-sensitized solar cells (DSSCs), a low-cost solar cell belonging to the group of thin-film solar cells, contributing to the remarkable growth in the renewable energy industry. The results show how this approach can trace the main development trajectory of a research field and distinguish key technologies to help decision makers manage the technological stages of their innovation processes more effectively.
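For readers unfamiliar with main path analysis, the sketch below shows one common form of global MPA on a small citation network. It assumes search path count (SPC) as the traversal weight and picks the source-to-sink path with the largest summed weight. The abstract does not state which weight the authors use, and neither key-route analysis nor the stage indicators are covered. The node names and the function name are hypothetical, and the input is assumed to be an acyclic edge list without duplicates.

```python
from collections import defaultdict


def global_main_path(edges):
    """Global main path of a citation DAG using SPC weights.

    edges: list of (cited, citing) pairs, so knowledge flows along
    each edge. Returns (path, total_spc, spc_by_edge).
    """
    succ, pred, nodes = defaultdict(list), defaultdict(list), set()
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)
        nodes.update((u, v))

    # Topological order (Kahn's algorithm), starting from the sources.
    indegree = {n: len(pred[n]) for n in nodes}
    queue = sorted(n for n in nodes if indegree[n] == 0)
    order = []
    while queue:
        n = queue.pop(0)
        order.append(n)
        for m in succ[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                queue.append(m)

    # Number of paths reaching each node from any source,
    # and leaving each node towards any sink.
    from_source, to_sink = {}, {}
    for n in order:
        from_source[n] = sum(from_source[p] for p in pred[n]) if pred[n] else 1
    for n in reversed(order):
        to_sink[n] = sum(to_sink[s] for s in succ[n]) if succ[n] else 1

    # SPC of an edge: how many source-to-sink paths pass through it.
    spc = {(u, v): from_source[u] * to_sink[v] for u, v in edges}

    # Dynamic programme for the path with the largest summed SPC.
    best = {}
    for n in order:
        if not pred[n]:
            best[n] = (0, [n])
        else:
            best[n] = max(
                ((best[p][0] + spc[(p, n)], best[p][1] + [n]) for p in pred[n]),
                key=lambda item: item[0],
            )
    sinks = sorted(n for n in nodes if not succ[n])
    score, path = max((best[s] for s in sinks), key=lambda item: item[0])
    return path, score, spc


# Hypothetical patent citation network.
edges = [("A", "C"), ("C", "D"), ("C", "E"), ("D", "F"), ("E", "F"), ("B", "E")]
path, score, spc = global_main_path(edges)
print(path, score)
```

Here the edge from E to F lies on two of the three source-to-sink paths, so the selected path runs through E rather than D.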


Citations (67)


... Although mainstream studies on patent classification continue to strive for higher precision and recall via deep learning with a large amount of samples, particularly at finer granularity levels, the approach of patent analysis or mining concerning a specific technology theme remains critical for employing technology opportunity analysis (TOA) (Wang et al., 2019; Zhang et al., 2019), forecasting emerging technologies (Li et al., 2021), and analyzing the interactions between science and technology (Suominen et al., 2021). Additionally, the need for large-grained multilabel classification of patents, especially when dealing with relatively small samples, has become increasingly relevant in the era of deep learning. ...

Reference:

Leveraging patent classification based on deep learning: The case study on smart cities and industrial Internet of Things
Identifying R&D partners using SAO analysis: a case study of dye-sensitised solar cells
  • Citing Article
  • January 2019

International Journal of Technology Management

... The application of the Word2Vec model in archive compliance detection not only improves the accuracy of detection, but also handles large-scale document collections [18], thus showing great potential in compliance monitoring and document classification tasks. ...

Unraveling Scientific Evolutionary Paths: An Embedding-Based Topic Analysis

IEEE Transactions on Engineering Management

... We apply regularization [24], [84] on the input layer by introducing a penalty term $\lambda \lVert W \rVert_p$ into the loss function, where $W$ represents the input weights, $\lambda$ is the regularization parameter and $p$ is the specific norm for the penalty. The idea is to prevent similar features from weighing too much in the loss objective and to learn a more robust representation, especially when highly correlated features are present. ...

Contrastive learning enhanced deep neural network with serial regularization for high-dimensional tabular data
  • Citing Article
  • April 2023

Expert Systems with Applications

... Traditionally, the diagnostic process for lung cancer has been heavily reliant on the expertise of radiologists. These professionals meticulously examine lung images, searching for the subtlest indicators of malignancy [1,2]. However, this approach has its limitations. ...

Tree enhanced deep adaptive network for cancer prediction with high dimension low sample size microarray data
  • Citing Article
  • February 2023

Applied Soft Computing

... According to [1], three application areas have been the most studied: the assignment of reviewers to conference papers [10][11][12][13][14][15][16], to journal papers [2,7,17], and to project proposals [5,18]. In addition, solutions have been proposed for the assignment of thesis reviewers and the allocation of grants. ...

Reviewer recommendation method for scientific research proposals: a case for NSFC

Scientometrics

... This helped biotechnology researchers to effectively retrieve and analyze relevant information, promoting the rapid launch of innovative products. X. Han et al. [7] proposed a technology opportunity analysis method based on patent and trademark data to link technology opportunities with commercial applications. By constructing a file-keyword matrix and correlation network, potential technological opportunities were identified based on the existing technological foundation of the enterprise. ...

Discovering technology opportunities based on the linkage between technology and business areas: matching patents and trademarks
  • Citing Article
  • November 2021

Technology Analysis and Strategic Management

... Cluster 4, with 87 members, includes significant documents like Iqbal (2021) [86], Kilicoglu (2019) [87], and Zhu (2015a) [88]. Iqbal (2021) [86] leads with a collaboration score of 266, showing strong bibliographic coupling with Huang (2022a) [89], Jha (2017) [90], and Ihsan (2019) [91] (weight 14), indicating its pivotal role in shaping the cluster's intellectual structure. ...

Evaluating scientific impact of publications: combining citation polarity and purpose
  • Citing Article
  • October 2021

Scientometrics

... Wu et al. [17] proposed a learning framework for miRNA for the positive-unlabeled problem. For the negative extraction process, a semi-supervised K-means model is used. ...

An ensemble learning framework for potential miRNA-disease association prediction with positive-unlabeled data
  • Citing Article
  • August 2021

Computational Biology and Chemistry

... Numerous studies on unstructured data, particularly patent data, have employed text analytics, such as titles, abstracts, and claims [12][13][14]. Text mining and natural language processing techniques are widely used to extract meaningful keyword information from patent data [15,16]. ...

R&D trend analysis based on patent mining: An integrated use of patent applications and invalidation data
  • Citing Article
  • June 2021

Technological Forecasting and Social Change

... In contrast, external scientific capabilities enable firms to collaborate with external organizations to access and integrate scientific knowledge. This collaborative approach is not only cost-effective but also fosters innovation performance (Liu et al. 2021). Guerrero et al. (2019) emphasize that firms increasingly engage in partnerships with academic and industrial organizations to leverage external knowledge for scientific innovation. ...

Exploring the role of companies in scientific research: a case study of genetically modified maize
  • Citing Article
  • September 2020

Technology Analysis and Strategic Management