Liang Zhao’s research while affiliated with University of São Paulo and other places


Publications (25)


Fig. 1. Comparison between existing few-shot NER methods and our knowledge-enriched deep prompt based framework, which makes use of threefold knowledge features: sememe, label, and context knowledge.
TKDP: Threefold Knowledge-enriched Deep Prompt Tuning for Few-shot Named Entity Recognition
  • Preprint
  • File available

June 2023

·

37 Reads

Jiang Liu · [...] · Donghong Ji

Few-shot named entity recognition (NER) exploits limited annotated instances to identify named mentions. Effectively transferring internal or external resources thus becomes the key to few-shot NER. While existing prompt tuning methods have shown remarkable few-shot performance, they still fail to make full use of knowledge. In this work, we investigate the integration of rich knowledge into prompt tuning for stronger few-shot NER. We propose combining the deep prompt tuning framework with threefold knowledge (TKDP): 1) internal context knowledge, and external 2) label knowledge and 3) sememe knowledge. TKDP encodes the three feature sources and incorporates them into the soft prompt embeddings, which are then injected into an existing pre-trained language model to facilitate predictions. On five benchmark datasets, our knowledge-enriched model improves F1 by up to 11.53% over the raw deep prompt method and significantly outperforms 8 strong baseline systems in 5-/10-/20-shot settings, showing great potential in few-shot NER. TKDP can be broadly adapted to other few-shot tasks without effort.


TOE: A Grid-Tagging Discontinuous NER Model Enhanced by Embedding Tag/Word Relations and More Fine-Grained Tags

November 2022

·

24 Reads

Discontinuous named entity recognition (NER) has received increasing research attention, and many related methods have emerged, such as hypergraph-based, span-based, and sequence-to-sequence (Seq2Seq) methods. However, these methods suffer to varying degrees from problems such as decoding ambiguity and inefficiency, which limit their performance. Recently, grid-tagging methods, which benefit from the flexible design of tagging systems and model architectures, have proven well suited to various information extraction tasks. In this paper, we follow this line of methods and propose a competitive grid-tagging model for discontinuous NER. We call our model TOE because we incorporate two kinds of Tag-Oriented Enhancement mechanisms into a state-of-the-art (SOTA) grid-tagging model that casts the NER problem as word-word relationship prediction. First, we design a Tag Representation Embedding Module (TREM) to force our model to consider not only word-word relationships but also word-tag and tag-tag relationships. Concretely, we construct tag representations and embed them into TREM, so that TREM can treat tag and word representations as queries/keys/values and use self-attention to model their relationships. Second, motivated by the Next-Neighboring-Word (NNW) and Tail-Head-Word (THW) tags in the SOTA model, we add two new symmetric tags, namely Previous-Neighboring-Word (PNW) and Head-Tail-Word (HTW), to model more fine-grained word-word relationships and alleviate error propagation from tag prediction. In experiments on three benchmark datasets, namely CADEC, ShARe13 and ShARe14, our TOE model improves on the SOTA results by about 0.83%, 0.05% and 0.66% in F1, demonstrating its effectiveness.
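The tagging scheme in this abstract can be illustrated with a toy grid. The following is only a minimal sketch, not the TOE model: it assumes that NNW at cell (i, j) marks word j as the next word of word i inside an entity, that THW at cell (tail, head) marks an entity boundary, and that PNW and HTW are simply the mirrored cells. The function names `build_grid` and `decode` and the example sentence are hypothetical.

```python
def build_grid(entities):
    """Map (row, col) -> set of tags for a list of entities.

    Each entity is a list of increasing word indices; a gap in the
    indices means the entity is discontinuous.
    """
    grid = {}

    def add(i, j, tag):
        grid.setdefault((i, j), set()).add(tag)

    for idx in entities:
        for a, b in zip(idx, idx[1:]):
            add(a, b, "NNW")  # b is the next word of a within the entity
            add(b, a, "PNW")  # assumed mirror of NNW
        head, tail = idx[0], idx[-1]
        add(tail, head, "THW")  # tail word points back to head word
        add(head, tail, "HTW")  # assumed mirror of THW
    return grid


def decode(grid):
    """Recover entities by walking NNW links from each head to its tail."""
    nnw = {}
    for (i, j), tags in grid.items():
        if "NNW" in tags:
            nnw.setdefault(i, []).append(j)
    entities = []
    for (tail, head), tags in grid.items():
        if "THW" not in tags:
            continue
        stack = [[head]]
        while stack:
            path = stack.pop()
            if path[-1] == tail:
                entities.append(path)
                continue
            for nxt in nnw.get(path[-1], []):
                stack.append(path + [nxt])
    return sorted(entities)


# Hypothetical sentence: "aching in legs and shoulders"
# (0=aching, 1=in, 2=legs, 3=and, 4=shoulders)
gold = [[0, 1, 2], [0, 1, 4]]  # the second entity skips words 2 and 3
grid = build_grid(gold)
recovered = decode(grid)
```

In this sketch the mirrored PNW and HTW cells are redundant for decoding; the abstract motivates them as extra supervision for more fine-grained word-word relationships.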


Figure 1: Examples of three kinds of events, including a flat event (a), overlapped events (b), and nested events (c). Different event mentions are denoted in distinct colors. Triggers are marked with red boxes while arguments are underlined.
OneEE: A One-Stage Framework for Fast Overlapping and Nested Event Extraction

September 2022

·

23 Reads

Event extraction (EE) is an essential task of information extraction, which aims to extract structured event information from unstructured text. Most prior work focuses on extracting flat events while neglecting overlapped or nested ones. The few models for overlapped and nested EE include several successive stages to extract event triggers and arguments, and thus suffer from error propagation. Therefore, we design a simple yet effective tagging scheme and model, called OneEE, that formulates EE as word-word relation recognition. The relations between trigger or argument words are recognized simultaneously in one stage with parallel grid tagging, yielding very fast event extraction. The model is equipped with an adaptive event fusion module to generate event-aware representations and a distance-aware predictor to integrate relative distance information for word-word relation recognition, both of which are empirically shown to be effective. Experiments on 3 overlapped and nested EE benchmarks, namely FewFC, Genia11, and Genia13, show that OneEE achieves state-of-the-art (SOTA) results. Moreover, OneEE's inference is faster than that of the baselines under the same conditions, and it can be further substantially accelerated since it supports parallel inference.


Feature Ranking from Random Forest Through Complex Network’s Centrality Measures: A Robust Ranking Method Without Using Out-of-Bag Examples

August 2022

·

10 Reads

·

2 Citations

Lecture Notes in Computer Science

The volume of available data has increased rapidly in recent years. As a consequence, datasets commonly end up with many irrelevant features, which may hinder human understanding and even lead to poor machine learning models. This research proposes a novel feature ranking method that employs trees from a Random Forest to transform a dataset into a complex network to which centrality measures are applied to rank the features. Each tree is represented as a graph in which all the features used by the tree are vertices, and each link between nodes of the tree (parent → child) is represented by a weighted edge between the two respective vertices. The union of all graphs from individual trees forms the complex network. Then, three centrality measures are applied to rank the features in the complex network. Experiments were performed on eighty-five supervised classification datasets, with varying feature noise levels, to evaluate our method. Results show that centrality measures in undirected complex networks are comparable and may be correlated to the Random Forest’s variable importance ranking algorithm. Vertex strength and eigenvector centrality outperformed the Random Forest on datasets with 40% noise, although the difference was not statistically significant at the 95% confidence level.
Keywords: Feature ranking; Random Forest; Complex networks; Centrality measures
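The tree-to-network construction described in this abstract can be sketched as follows. This is an illustrative sketch only: the trees are hand-written nested tuples `(feature, left, right)` rather than a trained Random Forest, every parent-to-child link is assumed to add one unit of edge weight (the abstract does not state the weighting), and only vertex strength is computed. All names are hypothetical.

```python
from collections import defaultdict


def add_tree_edges(node, weights):
    """Add one unit of weight per parent -> child feature link in a tree."""
    if node is None:  # None stands for a leaf
        return
    feature, left, right = node
    for child in (left, right):
        if child is not None:
            # frozenset makes the edge undirected
            weights[frozenset((feature, child[0]))] += 1.0
            add_tree_edges(child, weights)


def rank_by_strength(forest):
    """Union the per-tree graphs and rank features by vertex strength."""
    weights = defaultdict(float)
    for tree in forest:
        add_tree_edges(tree, weights)
    strength = defaultdict(float)
    for edge, w in weights.items():
        for vertex in edge:
            strength[vertex] += w
    ranking = sorted(strength, key=lambda f: (-strength[f], f))
    return ranking, dict(strength)


# Two hypothetical trees over features x1, x2, x3.
forest = [
    ("x1", ("x2", None, None), ("x3", None, None)),
    ("x1", ("x2", ("x3", None, None), None), None),
]
ranking, strength = rank_by_strength(forest)
```

Here the edge x1-x2 appears in both trees and so receives weight 2, which gives x1 and x2 a strength of 3 and x3 a strength of 2.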


Multilevel Coarsening for Interactive Visualization of Large Bipartite Networks

June 2022

·

66 Reads

Frontiers in Research Metrics and Analytics

Bipartite networks are pervasive in modeling real-world phenomena and play a fundamental role in graph theory. Interactive exploratory visualization of such networks is an important problem, and it is particularly challenging for large networks. In this paper we present results from an investigation on using a general multilevel method for this purpose. Multilevel methods on networks have been introduced as a general approach to increase the scalability of community detection and other complex optimization algorithms. They employ graph coarsening algorithms to create a hierarchy of increasingly coarser (reduced) approximations of an original network. Multilevel coarsening has been applied, e.g., to the problem of drawing simple (“unipartite”) networks. We build on previous work that extended multilevel coarsening to bipartite graphs to propose a visualization interface that uses multilevel coarsening to compute a multi-resolution hierarchical representation of an input bipartite network. From this hierarchy, interactive node-link drawings are displayed following the “overview first, zoom and filter, details on demand” visual information seeking mantra. Analysts may start from the coarsest representation and select nodes or sub-graphs to be expanded and shown in greater detail. Besides intuitive navigation of large-scale networks, this solution affords great flexibility, as users are free to select different coarsening strategies in different scenarios. We illustrate its potential with case studies involving real networks from distinct domains. The experimental analysis shows that our strategy is effective at revealing topological structures, such as communities and holes, that may remain hidden in a conventional node-link layout. It is also useful for highlighting connectivity patterns across the bipartite layers, as illustrated in an example that emphasizes the correlation between diseases and genes in genetic disorders, and in a study of a scientific collaboration network of authors and papers.


Clustered and deep echo state networks for signal noise reduction

March 2022

·

219 Reads

·

8 Citations

Machine Learning

Echo State Networks (ESNs) are Recurrent Neural Networks with fixed input and internal (hidden) weights, and adaptable output weights. The hidden part of an ESN can be considered a discrete-time dynamical system, called the reservoir. In classical ESNs, the internal connections are obtained from an Erdős-Rényi graph. A recent study proposed ESNs with clustered adjacency matrices (CESNs), where the clusters are either Erdős-Rényi graphs or Barabási-Albert-like graphs. In this work, we investigate the effectiveness of CESNs and apply them to signal denoising. In addition, we introduce and study deep CESNs with multiple clustered layers. We found that CESNs and deep CESNs can compete with deep ESNs on all tasks that we considered.
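The division of labour in a classical ESN (fixed input and reservoir weights, trainable output weights) can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the clustered or deep variants studied in the paper: the reservoir size, connection probability, and weight scale are arbitrary illustrative values, and the readout is fitted with a simple LMS update instead of the ridge regression more commonly used for ESNs.

```python
import math
import random


def make_reservoir(n, p, scale, rng):
    """Erdős-Rényi style reservoir: each connection exists with probability p."""
    return [[rng.uniform(-scale, scale) if rng.random() < p else 0.0
             for _ in range(n)] for _ in range(n)]


def step(W, w_in, x, u):
    """One reservoir update x <- tanh(w_in * u + W x); W and w_in stay fixed."""
    return [math.tanh(w_in[i] * u + sum(W[i][j] * x[j] for j in range(len(x))))
            for i in range(len(x))]


def train_readout(states, targets, lr=0.01, epochs=50):
    """Fit only the output weights (assumption: plain LMS, not ridge regression)."""
    w_out = [0.0] * len(states[0])
    for _ in range(epochs):
        for x, y in zip(states, targets):
            err = y - sum(w * xi for w, xi in zip(w_out, x))
            w_out = [w + lr * err * xi for w, xi in zip(w_out, x)]
    return w_out


rng = random.Random(0)
n = 20
W = make_reservoir(n, p=0.1, scale=0.5, rng=rng)
w_in = [rng.uniform(-1, 1) for _ in range(n)]

# Toy denoising task: noisy sine in, clean sine as the target.
clean = [math.sin(0.2 * t) for t in range(200)]
noisy = [c + rng.gauss(0, 0.1) for c in clean]

x, states = [0.0] * n, []
for u in noisy:
    x = step(W, w_in, x, u)
    states.append(x)

w_out = train_readout(states, clean)
denoised = [sum(w * xi for w, xi in zip(w_out, s)) for s in states]
```

Only `w_out` changes during training, which is what keeps ESN training cheap compared with backpropagation through a recurrent network.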


TOE: A Grid-Tagging Discontinuous NER Model Enhanced by Embedding Tag/Word Relations and More Fine-Grained Tags

January 2022

·

15 Reads

·

25 Citations

IEEE/ACM Transactions on Audio Speech and Language Processing



Coarsening Algorithm via Semi-synchronous Label Propagation for Bipartite Networks

November 2021

·

26 Reads

·

3 Citations

Lecture Notes in Computer Science

Several coarsening algorithms have been developed as a powerful strategy for dealing with difficult machine learning problems represented by large-scale networks, including network visualization, trajectory mining, community detection, and dimension reduction. Coarsening iteratively reduces the original network into a hierarchy of gradually smaller informative representations. However, few of these algorithms have been specifically designed for bipartite networks, and they still face theoretical limitations that need to be explored. Specifically, a recently introduced algorithm, called MLPb, is based on a synchronous label propagation strategy. Although it is an interesting approach, it presents two problems: 1) a high-cost search strategy in dense networks and 2) the cyclic oscillation problem caused by the synchronous propagation scheme. In this paper, we address these issues and propose a novel fast coarsening algorithm more suitable for large-scale bipartite networks. Our proposal introduces a semi-synchronous strategy via cross-propagation, which allows a time-efficient implementation and greatly reduces the oscillation phenomenon. The empirical analysis on both synthetic and real-world networks shows that our coarsening strategy outperforms previous approaches in accuracy and runtime.
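The oscillation problem mentioned in this abstract is easy to reproduce with generic label propagation on a bipartite graph. The sketch below is neither MLPb nor the proposed algorithm; it only contrasts a fully synchronous update with an update that processes one layer at a time, which is one simple reading of "semi-synchronous". The graph, labels, and function names are hypothetical.

```python
from collections import Counter


def majority_label(v, adj, labels):
    """Most frequent label among v's neighbours (ties broken alphabetically)."""
    if not adj[v]:
        return labels[v]  # an isolated vertex keeps its own label
    counts = Counter(labels[u] for u in adj[v])
    best = max(counts.values())
    return min(lab for lab, c in counts.items() if c == best)


def synchronous_step(adj, labels):
    """Every vertex reads the labels of the previous round."""
    return {v: majority_label(v, adj, labels) for v in adj}


def semi_synchronous_step(adj, labels, layers):
    """Update one layer at a time; later layers see the fresh labels."""
    labels = dict(labels)
    for layer in layers:
        labels.update({v: majority_label(v, adj, labels) for v in layer})
    return labels


# Smallest bipartite graph: one vertex per layer, joined by one edge.
adj = {"a1": ["b1"], "b1": ["a1"]}
start = {"a1": "A", "b1": "B"}

swapped = synchronous_step(adj, start)   # the two labels trade places
back = synchronous_step(adj, swapped)    # and trade back: a 2-cycle
settled = semi_synchronous_step(adj, start, layers=(["a1"], ["b1"]))
```

With the synchronous rule the two vertices exchange labels forever, whereas the layer-by-layer rule reaches a fixed point after a single pass.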


Detecting Early Signs of Insufficiency in COVID-19 Patients from CBC Tests Through a Supervised Learning Approach

November 2021

·

15 Reads

Lecture Notes in Computer Science

One important task in the COVID-19 clinical protocol involves the constant monitoring of patients to detect possible signs of insufficiency, which may rapidly progress to hepatic, renal or respiratory failure. Hence, a prompt and correct clinical decision is not only critical for patients' prognosis but can also help when making collective decisions regarding hospital resource management. In this work, we present a network-based high-level classification technique to help healthcare professionals in this activity, by detecting early signs of insufficiency based on Complete Blood Count (CBC) test results. We start by building a training dataset, comprising both CBC and specific tests from a total of 2,982 COVID-19 patients, provided by a Brazilian hospital, to identify which CBC results are most effective as biomarkers for detecting early signs of insufficiency. The trained classifier measures the compliance of the test instance with the pattern formation of the network constructed from the training data. To facilitate the application of the technique to larger datasets, a network reduction option is also introduced and tested. Numerical results show encouraging performance of our approach when compared to traditional techniques, both on benchmark datasets and on the built COVID-19 dataset, indicating that the proposed technique has the potential to help medical workers in the severity assessment of patients, especially those who work in regions with scarce material resources.


Anomaly Detection in Brazilian Federal Government Purchase Cards Through Unsupervised Learning Techniques

November 2021

·

40 Reads

Lecture Notes in Computer Science

The Federal Government Purchase Card (CPGF) has been used in Brazil since 2005, allowing agencies and entities of the federal public administration to purchase materials and services through this method. Although this payment system offers several technological and administrative advances, it is also susceptible to card misuse and, consequently, waste of public funds, in the form of purchases that do not comply with current legislation. In this work, we approach this problem by testing and evaluating unsupervised learning techniques for detecting anomalies in CPGF historical data. Four different methods are considered for this task: K-means, agglomerative clustering, a network-based approach, which is also introduced in this study, and a hybrid model. The experimental results indicate that unsupervised methods, in particular the network-based approach, can indeed help in the task of monitoring government purchase card expenses, by flagging suspect transactions for further investigation without requiring the presence of a specialist in this process.
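Of the four methods listed in this abstract, K-means is the simplest to illustrate. The sketch below is a generic distance-to-centroid detector and not the study's pipeline: the transactions are invented `(amount, hour)` pairs, the centroids are initialised deterministically from the data, and the flagging threshold is an arbitrary assumption.

```python
import math


def kmeans(points, k, iters=20):
    """Plain K-means with a deterministic, evenly spaced initialisation."""
    centroids = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(coord) / len(c) for coord in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids


def flag_anomalies(points, centroids, threshold):
    """Flag points whose distance to the nearest centroid exceeds threshold."""
    return [p for p in points
            if min(math.dist(p, c) for c in centroids) > threshold]


# Hypothetical (amount, hour of day) transactions.
purchases = [(10, 9), (12, 10), (11, 9), (100, 14), (102, 15), (98, 14), (160, 3)]
centroids = kmeans(purchases, k=2)
suspects = flag_anomalies(purchases, centroids, threshold=30)
```

A known weakness of this approach is that a sufficiently extreme outlier can capture a centroid of its own and then sit at distance zero from it, so in practice very small clusters also deserve inspection.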


Citations (13)


... However, the model only focused on the relationship between words and had the problem of the sparse distribution of grid labels. To solve the problem of the sparse distribution of grid labels, Liu et al. [27] added two labels, Previous-Neighborhood-Word (PNW) and Head-Tail-Word (HTW), based on the W²NER to model more fine-grained word-to-word relationships, which can alleviate some error propagation in the model to a certain extent, but predicting head-to-tail relationships is still a complex problem in this model. Inspired by the W²NER method, Lu et al. [28] used an attention mechanism that integrates feature information and a GRU module that enhances internal dependencies and boundary information to optimize the W²NER model. ...

Reference:

A discontinuous NER model based on token prediction and contrastive learning to enhance span
TOE: A Grid-Tagging Discontinuous NER Model Enhanced by Embedding Tag/Word Relations and More Fine-Grained Tags
  • Citing Article
  • January 2022

IEEE/ACM Transactions on Audio Speech and Language Processing

... These methods offer several inherent advantages. For instance, Cantão et al. (2022) Wandelt et al. (2020) introduced a novel community-based approach to systematically analyze the robustness of transportation networks. By identifying and optimizing the structure of critical communities, this method significantly improved overall network stability. ...

Feature Ranking from Random Forest Through Complex Network’s Centrality Measures: A Robust Ranking Method Without Using Out-of-Bag Examples
  • Citing Chapter
  • August 2022

Lecture Notes in Computer Science

... Gauthier et al. proposed a nonlinear vector autoregressive model to replace traditional reservoirs [22], while Li et al. proved the universality of simple cycle reservoirs [23]. In addition, scholars have studied the impact of deeper structural reservoirs and node clustering within the reservoir on reservoir computing [24,25]. Advanced designs, including intrinsic plasticity rules and biologically inspired synaptic plasticity [26–28], have further enhanced representational capacity. ...

Clustered and deep echo state networks for signal noise reduction

Machine Learning

... As the interest in techniques for heterogeneous networks increases, as seen in studies such as Zhou et al. (2020), Liu et al. (2018), Wu et al. (2021), research on coarsening methods has also gained traction, particularly for bipartite networks (Valejo et al. 2017a, b, 2018, 2020a, 2021). However, methods designed explicitly for heterogeneous networks have yet to be extensively explored. ...

Coarsening Algorithm via Semi-synchronous Label Propagation for Bipartite Networks
  • Citing Chapter
  • November 2021

Lecture Notes in Computer Science

... Bipartite networks model the heterogeneous relationships between different types of entities. The heterogeneous nature of bipartite networks makes them a powerful and flexible tool for modeling the complex relationships within various social and physical systems (Valejo et al., 2021). In CSCL, a one-mode network can be used to capture the social interactions among group members, with nodes in the network representing students. ...

A review and comparative analysis of coarsening algorithms on bipartite networks
  • Citing Article
  • June 2021

The European Physical Journal Special Topics

... The authors [21] have also introduced a network-based classification technique that uses the average lengths of the transition and the attractive cycle of the tourist walk initiated from each node to represent network patterns. Another network-based classification technique employing the community concept has been proposed for detecting stock market trends [22]. ...

Stock market trend detection and automatic decision-making through a network-based classification model

Natural Computing

... Notwithstanding such a major simplification, ESNs have successfully been employed for multi-step-ahead prediction of nonlinear time series and modeling chaotic dynamical systems at low computational cost (Bianchi et al., 2017; Han et al., 2021), triggering the development of several network topologies in recent years. For instance, clustered ESNs (CESNs) (Deng & Zhang, 2006; Junior et al., 2020), where multiple sub-graphs of sparsely connected hidden units form the reservoir, and deep ESNs, where the reservoir consists of multiple sub-reservoir layers stacked hierarchically, are two widely used architectures. Hybrid ESNs (HESNs) are another category of RC techniques introduced in a physics-informed ML framework (Oh, 2020; Willard et al., 2020), where additional inputs from physics-based mathematical models integrate corresponding domain knowledge into data-driven models (Doan et al., 2019;. ...

Clustered Echo State Networks for Signal Observation and Frequency Filtering
  • Citing Conference Paper
  • October 2020

... To have performance in government spending on complex financial systems according to Zheng and Du (2015), it is recommended to use the laws of design synchronization control of hyperchaotic financial systems. Complex systems based on economics and finance according to Tabak et al. (2020) are showing interest in the micro-level analysis, but empirical methodologies are limited to mainly linear methods brought by traditional econometric methods. According to Huang et al. (2014), although much research has been done, there are still challenging open-ended questions about complex financial and economic systems where it is recommended to develop new theories and methods as well as to refine techniques known for analyzing financial problem classes and public spending time intervals. ...

Applications of Machine Learning Methods in Complex Economics and Financial Networks

... In addition, different types of assets are the focus of price forecasting through AI algorithms, such as market indices (Cavdar & Aydin, 2020; Ding & Qin, 2020; Shynkevich et al., 2017); stock prices (Colliri & Zhao, 2019; Awan et al., 2021); and prices of other financial assets, such as options (Sheu & Wei, 2011). ...

A Network-Based Model for Optimizing Returns in the Stock Market
  • Citing Conference Paper
  • October 2019

... Despite the widespread use of, and the many recent advances in machine learning methods for graphs, applications involving criminal networks are surprisingly scarce. This can likely be attributed to the fact that the complex network framework has only recently entered the toolbox of researchers working with crime data [15,16], although it is already considered an ideal approach to investigating and understanding the intricate associations among criminals [15,17–20]. Indeed, recent research has demonstrated that patterns exhibited by complex networks related to criminal activities can tie criminal associations not only with individual skills but also with the global structure of these networks [21–33]. Similar to evidence at a crime scene, patterns among networked criminals may serve as predictive features for identifying missing links or properties of criminal associations, and they may even provide indications for future criminal behavior. ...

Analyzing the Bills-Voting Dynamics and Predicting Corruption-Convictions Among Brazilian Congressmen Through Temporal Networks