Chuan-Ju Wang

Academia Sinica · Research Center for Information Technology Innovation

PhD

About

81
Publications
13,373
Reads
872
Citations
Introduction
Dr. Chuan-Ju Wang (王釧茹) received her Ph.D. degree in Computer Science and Information Engineering at National Taiwan University in 2011. She is now a Research Fellow of the Research Center for IT Innovation, Academia Sinica in Taiwan. Her research interests include computational finance and data analytics. http://cfda.citi.sinica.edu.tw/~cjwang/ http://cfda.citi.sinica.edu.tw/
Additional affiliations
August 2016 - June 2019
Academia Sinica
Position
  • Fellow
June 2019 - present
Academia Sinica
Position
  • Fellow
February 2016 - July 2016
University of Taipei
Position
  • Professor (Associate)

Publications (81)
Preprint
This paper presents ConvRerank, a conversational passage re-ranker that employs a newly developed pseudo-labeling approach. Our proposed view-ensemble method enhances the quality of pseudo-labeled data, thus improving the fine-tuning of ConvRerank. Our experimental evaluation on benchmark datasets shows that combining ConvRerank with a conversation...
Chapter
Data sparsity is a well-known challenge in recommender systems. One way to alleviate this problem is to leverage knowledge from relevant domains. In this paper, we focus on an important real-world scenario in which some users overlap two different domains but items of the two domains are distinct. Although several studies leverage side information...
Article
This paper proposes a novel equity‐price‐tree‐based convertible bond (CB) pricing model based on the first‐passage default model under stochastic interest rates. By regarding equity values as down‐and‐out call options on firm values (FVs), at each tree node, we solve the implied FV and equity‐price volatility (EPV), and then endogenously settle the...
Article
Full-text available
Options can be priced by the lattice model, the results of which converge to the theoretical option value as the lattice’s number of time steps n approaches infinity. The time complexity of a common dynamic programming pricing approach on the lattice is high (at least O(n^2)), and a large n is required to obtain accurate option values. Although O n...
Chapter
As foreign exchange (Forex) markets reflect real-world events, locally or globally, financial news is often leveraged to predict Forex trends. In this demonstration, we propose INForex, an interactive web-based system that displays a Forex plot alongside related financial news. To the best of our knowledge, this is the first system to successfully align t...
Article
Full-text available
This paper proposes analytically vulnerable vanilla option pricing formulae that simultaneously consider the premature default, the correlation between the underlying asset and the issuer’s asset, and other outstanding debts of the issuer. Our pricing formulae can be easily extended to solve the problem of pricing vulnerable barrier options, which...
Article
Full-text available
Conversational search plays a vital role in conversational information seeking. As queries in information seeking dialogues are ambiguous for traditional ad hoc information retrieval (IR) systems due to the coreference and omission resolution problems inherent in natural language dialogue, resolving these ambiguities is crucial. In this article, we...
Chapter
We propose an eXplainable Risk Ranking (XRR) model that uses multilevel encoders and attention mechanisms to analyze financial risks among companies. Specifically, the proposed method utilizes the textual information in financial reports to rank the relative risks among companies and locate top high-risk companies; moreover, via attention mechanisms...
Conference Paper
In this demonstration, we develop an interactive tool, HIVE, to demonstrate the ability and versatility of an explainable risk ranking model with a special focus on financial use cases. HIVE is a web-based tool that provides users with automated highlighted financial statements, and HIVE is designed for making comparing statements rather more effic...
Preprint
Full-text available
In this paper, we examine the existence of the Rényi divergence between two time invariant general hidden Markov models with arbitrary positive initial distributions. By making use of a Markov chain representation of the probability distribution for the general hidden Markov model and eigenvalue for the associated Markovian operator, we obtain, u...
Article
Full-text available
Predicting extreme weather events such as tropical and extratropical cyclones is of vital scientific and societal importance. Of late, machine learning methods have found their way to weather analysis and prediction, but mostly, these methods use machine learning merely as a complement to traditional numerical weather prediction models. Although so...
Preprint
Recently, much progress in natural language processing has been driven by deep contextualized representations pretrained on large corpora. Typically, the fine-tuning on these pretrained models for a specific downstream task is based on single-view learning, which is however inadequate as a sentence can be interpreted differently from different pers...
Conference Paper
Full-text available
Textual data is common and informative auxiliary information for recommender systems. Most prior art utilizes text for rating prediction, but little work connects it to top-N recommendation. Moreover, although advanced recommendation models capable of incorporating auxiliary information have been developed, none of these are specifically designed to...
Preprint
Full-text available
In this paper, we propose a two-stage ranking approach for recommending linear TV programs. The proposed approach first leverages user viewing patterns regarding time and TV channels to identify potential candidates for recommendation and then further leverages user preferences to rank these candidates given textual information about programs. To e...
Preprint
Full-text available
In this paper, we propose a novel optimization criterion that leverages features of the skew normal distribution to better model the problem of personalized recommendation. Specifically, the developed criterion borrows the concept and the flexibility of the skew normal distribution, based on which three hyperparameters are attached to the optimizat...
Article
Item concept modeling is commonly achieved by leveraging textual information. However, many existing models do not leverage the inferential property of concepts to capture word meanings, which therefore ignores the relatedness between correlated concepts, a phenomenon which we term conceptual “correlation sparsity.” In this paper, we distinguish be...
Preprint
Full-text available
Passage retrieval in a conversational context is essential for many downstream applications; it is however extremely challenging due to limited data resources. To address this problem, we present an effective multi-stage pipeline for passage ranking in conversational search that integrates a widely-used IR system with a conversational query reformu...
Preprint
Full-text available
This paper presents an empirical study of conversational question reformulation (CQR) with sequence-to-sequence architectures and pretrained language models (PLMs). We leverage PLMs to address the strong token-to-token independence assumption made in the common objective, maximum likelihood estimation, for the CQR task. In CQR benchmarks of task-or...
Preprint
Full-text available
We applied the T5 sequence-to-sequence model to tackle the AI2 WinoGrande Challenge by decomposing each example into two input text strings, each containing a hypothesis, and using the probabilities assigned to the "entailment" token as a score of the hypothesis. Our first (and only) submission to the official leaderboard yielded 0.7673 AUC on Marc...
Conference Paper
In the Age of Big Data, graph embedding has received increasing attention for its ability to accommodate the explosion in data volume and diversity, which challenge the foundation of modern recommender systems. Respectively, graph facilitates fusing complex systems of interactions into a unified structure and distributed embedding enables efficient...
Article
We present FRIDAYS, a financial risk information detecting and analyzing system that enables financial professionals to efficiently comprehend financial reports in terms of risk and domain-specific sentiment cues. Our system is designed to integrate multiple NLP models trained on financial reports but on different levels (i.e., word, multi-word, an...
Article
Customer reviews on platforms such as TripAdvisor and Amazon provide rich information about the ways that people convey sentiment on certain domains. Given these kinds of user reviews, this paper proposes UGSD, a representation learning framework for constructing domain-specific sentiment dictionaries from online customer reviews, in which we lever...
Conference Paper
We present collaborative similarity embedding (CSE), a unified framework that exploits comprehensive collaborative relations available in a user-item bipartite graph for representation learning and recommendation. In the proposed framework, we differentiate two types of proximity relations: direct proximity and k-th order neighborhood proximity. Wh...
Chapter
Keyword extraction is a critical technique in natural language processing. For this essential task we present a simple yet efficient architecture involving character-level convolutional neural tensor networks. The proposed architecture learns the relations between a document and each word within the document and treats keyword extraction as a super...
Preprint
We present collaborative similarity embedding (CSE), a unified framework that exploits comprehensive collaborative relations available in a user-item bipartite graph for representation learning and recommendation. In the proposed framework, we differentiate two types of proximity relations: direct proximity and k-th order neighborhood proximity. Wh...
Article
Full-text available
This paper presents timely open range breakout (TORB) strategies for index futures market trading via using one-minute intraday data. We observe that the trading volumes and fluctuations in returns on each one-minute interval of trading hours in the futures markets reach their peaks at the opening and closing of the underlying stock markets. With t...
Article
A catastrophe equity put (CatEPut) is constructed to recapitalize an insurance company that suffers huge compensation payouts due to catastrophic events (CEs). The company can exercise its CatEPut to sell its stock to the counterparty at a predetermined price when its accumulated loss due to CEs exceeds a predetermined threshold and its own stock p...
Conference Paper
Recommender systems are vital ingredients for many e-commerce services. In the literature, two of the most popular approaches are based on factorization and graph-based models; the former approach captures user preferences by factorizing the observed direct interactions between users and items, and the latter extracts indirect preferences from the...
Preprint
Full-text available
Image perception is one of the most direct ways to provide contextual information about a user concerning his/her surrounding environment; hence images are a suitable proxy for contextual recommendation. We propose a novel representation learning framework for image-based music recommendation that bridges the heterogeneity gap between music and ima...
Preprint
Cross-domain collaborative filtering (CF) aims to alleviate data sparsity in single-domain CF by leveraging knowledge transferred from related domains. Many traditional methods focus on enriching compared neighborhood relations in CF directly to address the sparsity problem. In this paper, we propose superhighway construction, an alternative explic...
Article
This paper considers the problem of measuring the credit risk in portfolios of loans, bonds, and other instruments subject to possible default. One such performance measure of interest is the probability that the portfolio incurs large losses over a fixed time horizon. To capture the extremal dependence among obligors, we study a multi-factor model...
Conference Paper
This paper attempts to conduct analysis for one certain type of user reviews; that is, the reviews on a super-entity (e.g., restaurant) involve descriptions for many sub-entities (e.g., dishes). To deal with such analysis, we propose a text embedding framework for ranking sub-entities from user reviews of a given super-entity. Experiments on two re...
Article
This paper presents the persistent behavior hypothesis for financial markets, which is tested statistically on five stock indices from 2001 to 2014. We find significant results in all five stock markets for the full sample period as well as subperiods. A persistent behavior strategy (PBS) on index futures is also presented, the net annual returns o...
Conference Paper
This paper proposes an item concept embedding (ICE) framework to model item concepts via textual information. Specifically, in the proposed framework there are two stages: graph construction and embedding learning. In the first stage, we propose a generalized network construction method to build a network involving heterogeneous nodes and a mixture...
Conference Paper
Full-text available
In this demonstration, we present FIN10K, a web-based information system that facilitates the analysis of textual information in financial reports. The proposed system has three main components: (1) a 10-K Corpus, including an inverted index of financial reports on Form 10-K, several numerical finance measures, and pre-trained word embeddings; (2)...
Article
The growing amount of public financial data makes it increasingly important to learn how to discover valuable information for financial decision making. This article proposes an approach to discovering financial keywords from a large number of financial reports. In particular, we apply the continuous bag-of-words (CBOW) model, a well-known continuo...
Article
We attempt in this paper to utilize soft information in financial reports to analyze financial risk among companies. Specifically, on the basis of the text information in financial reports, which is the so-called soft information, we apply analytical techniques to study relations between texts and financial risk. Furthermore, we conduct a study on...
Article
Although many different aspects of debt structures such as bond covenants and repayment schedules are empirically found to significantly influence values of bonds and equity, many theoretical structural models still oversimplify debt structures and fail to capture phenomena found in financial markets. This paper proposes a carefully designed struct...
Article
This paper provides a novel and general framework for the problem of searching parameter space in Monte Carlo simulations. We propose a deterministic online algorithm and a randomized online algorithm to search for suitable parameter values for derivative pricing which are needed to achieve desired precisions. We also give the competitive ratios of...
Conference Paper
How will the reputations of individuals in a social network be influenced by their communities in a quantitative way? This work attempts to observe the collaborative events occurring at individuals involved in a social network to obtain such crucial knowledge. We propose a Factorization Machine approach to find out the latent social influence among...
Article
We propose a novel approach, called random traders, to benchmark equity funds' performance. A random trader adopts an all-in-all-out strategy to buy and sell the market index at random timing with capital being negligible as compared with the market size. With the empirical distribution of a random trader's return, each equity fund is scored by the...
Article
This paper presents a general and numerically accurate lattice methodology to price risky corporate bonds. It can handle complex default boundaries, discrete payments, various asset sales assumptions, and early redemption provisions for which closed-form solutions are unavailable. Furthermore, it can price a portfolio of bonds that accounts for the...
Conference Paper
This paper attempts to use soft information in finance to rank the risk levels of a set of companies. Specifically, we deal with a ranking problem with a collection of financial reports, in which each report is associated with a company. By using text information in the reports, which is the so-called soft information, we apply learning-to-rank tec...
Conference Paper
Information Retrieval (IR) aims to discover relevant information according to a user's information need. The classic Probability Ranking Principle (PRP) forms the theoretical basis for probabilistic IR models. This ranking principle, however, neglects the uncertainty introduced through the estimations from retrieval models. Inspired by the Post-Mod...
Article
Full-text available
We examine the change of levered firm's capital structures due to different investment decisions of realised tax benefits and various sources of fund to finance coupon and dividend payouts. The complexity is analytically intractable but numerical approaches provide insights. Retaining realised tax benefits and investing them in risk-free assets ins...
Conference Paper
With the rapid growth of financial markets, many complex derivatives have been structured to meet specific financial goals. But most complex derivatives have no analytical formulas for their prices, e.g., when more than one market variable is factored. As a result, they must be priced by numerical methods such as lattice. A derivative is called mul...
Article
Complex financial instruments with multiple state variables often have no analytical formulas and therefore must be priced by numerical methods, like lattice ones. For pricing convertible bonds and many other interest rate-sensitive products, research has focused on bivariate lattices for models with two state variables: stock price and interest ra...
Article
This article presents a methodology to derive analytical formulas for a class of complicated financial derivatives with a continuously monitored barrier and a few discretely monitored ones. Numerical results based on concrete numbers for the parameters are presented and analyzed.
Article
Derivatives are popular financial instruments whose values depend on other more fundamental financial assets (called the underlying assets). As they play essential roles in financial markets, evaluating them efficiently and accurately is critical. Most derivatives have no simple valuation formulas; as a result, they must be priced by numerical meth...
Article
The structural model views debts as contingent claims written on the asset of the firm. Evaluating defaultable coupon bonds (or a set of defaultable bonds with the same issuer) is more difficult than risk-less ones since we cannot evaluate each coupon (or bond) individually as we do for evaluating risk-less bonds. This is because the repayment for...
