
Chenguang Lu
Changsha University & Liaoning Technical University
Bachelor of Engineering
Retired
About
59
Publications
11,349
Reads
220
Citations
Introduction
My interests are semantic information theory, mechanism of color vision, beauty and evolution, statistical learning, probability theory (related to philosophy and mathematics). Currently, I am working on the P-T probability framework for the unification of statistics and logic.
Additional affiliations
August 1986 - July 2010
Changsha University
Position
- Professor (Associate)
Description
- I am now retired; I have spent some time as a guest professor at the Intelligence Engineering and Mathematics Institute, Liaoning Technical University, China
Education
September 1991 - September 1992
February 1987 - February 1988
Publications (59)
Recent advances in deep learning suggest that we need to maximize and minimize two different kinds of information simultaneously. The Information Max-Min (IMM) method has been used in deep learning, reinforcement learning, and maximum entropy control. Shannon's information rate-distortion function is the theoretical basis of Minimizing Mutual Infor...
The Variational Bayesian method (VB) is used to solve the probability distributions of latent variables with the minimum free energy criterion. This criterion is not easy to understand, and the computation is complex. For these reasons, this paper proposes the Semantic Variational Bayes' method (SVB). The Semantic Information Theory the author prev...
Reflection Theory holds that our sensations reflect physical properties, whereas Empiricism believes that sense (data), presentations, and phenomena are the ultimate existence. Lenin adhered to Reflection Theory and criticized Helmholtz's sensory symbolism for affirming the similarity between a sensation and a physical property. By using informati...
A new trend in deep learning, represented by Mutual Information Neural Estimation (MINE) and Information Noise Contrast Estimation (InfoNCE), is emerging. In this trend, similarity functions and Estimated Mutual Information (EMI) are used as learning and objective functions. Coincidentally, EMI is essentially the same as Semantic Mutual Information...
When we compare the influences of two causes on an outcome, if the conclusion from every group is against that from the conflation, we think there is Simpson’s Paradox. The Existing Causal Inference Theory (ECIT) can make the overall conclusion consistent with the grouping conclusion by removing the confounder’s influence to eliminate the paradox....
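As a rough illustration of the paradox described in this abstract, the Python sketch below uses the widely cited kidney-stone treatment counts (an outside textbook example, not data from the paper). It shows every group favoring one treatment while the pooled totals favor the other. The `adjusted_rate` function standardizes over the pooled group sizes, which is one common way to remove a confounder's influence; it is an assumption about what such an adjustment can look like, not the paper's own method.

```python
from fractions import Fraction

# (successes, cases) by treatment and stone size; widely cited kidney-stone example.
data = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}
groups = ("small", "large")


def group_rate(treatment, group):
    """Success rate of a treatment within one group."""
    successes, cases = data[treatment][group]
    return Fraction(successes, cases)


def pooled_rate(treatment):
    """Success rate after conflating all groups."""
    successes = sum(data[treatment][g][0] for g in groups)
    cases = sum(data[treatment][g][1] for g in groups)
    return Fraction(successes, cases)


def adjusted_rate(treatment):
    """Group rates weighted by the pooled share of each group (standardization)."""
    group_cases = {g: sum(data[t][g][1] for t in data) for g in groups}
    total_cases = sum(group_cases.values())
    return sum(Fraction(group_cases[g], total_cases) * group_rate(treatment, g)
               for g in groups)


group_winner = {g: max(data, key=lambda t: group_rate(t, g)) for g in groups}
pooled_winner = max(data, key=pooled_rate)
adjusted_winner = max(data, key=adjusted_rate)

print("winner within each group:", group_winner)
print("winner after conflation: ", pooled_winner)
print("winner after adjustment: ", adjusted_winner)
```

Exact fractions are used so that the comparison does not depend on floating-point rounding.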
To improve communication efficiency and provide more useful information, we need to measure semantic information by combining inaccuracy or distortion, freshness, purposiveness, and efficiency. The author proposed the semantic information G measure before. This measure is more compatible with Shannon information theory than other semantic or genera...
The core idea of Darwin’s theory of sexual selection is beauty preference selection. But where did birds’ initial beauty preferences come from? The new answer is that birds’ appreciation for beauty comes from needs for survival, such as good food, shelter, or water.
16 images reveal the mystery of birds' colourful plumages or appearances.
(publishe...
In the rate-distortion function and the Maximum Entropy (ME) method, Minimum Mutual Information (MMI) distributions and ME distributions are expressed by Bayes-like formulas, including Negative Exponential Functions (NEFs) and partition functions. Why do these non-probability functions exist in Bayes-like formulas? On the other hand, the rate-disto...
Data Science: Measuring Uncertainties
Why can the Expectation-Maximization (EM) algorithm for mixture models converge? Why can different initial parameters cause various convergence difficulties? The Q-L synchronization theory explains that the observed data log-likelihood L and the complete data log-likelihood Q are positively correlated; we can achieve maximum L by maximizing Q. Acco...
To apply information theory to more areas, the author proposed semantic information G theory, which is a natural generalization of Shannon’s information theory. This theory uses the P-T probability framework so that likelihood functions and truth functions (or membership functions), as well as sampling distributions, can be put into the semantic mu...
This chapter aims to present a theoretical framework on the evolution stages of the machine brain and cognitive computation and systems for machine computation, learning and understanding. We divide AI subject into 2 branches—pure AI and applied AI (defined as an integration of AI with another subject: geoAI as an example). To stretch the continuat...
This chapter aims to explain how to construct a machine learning system and summarize the current level of machine intelligence as cognitive computation. A one-dimensional deep neural network is used for illustration, and the detection of recorded epileptic seizure activity in Electroencephalogram (EEG) segments is given as a practical application. Ma...
To understand the basic evolution law of the machine brain, we first need to understand machine cognition, which depends largely on machine vision, machine touch, and so on. Artificial intelligence (AI) has developed rapidly over the last decade, and its importance to machine cognition has been widely recognized. But machine minds are still a dream ab...
This chapter aims to explain the processes of machine cognition for a better understanding of environmental changes at the current level of machine intelligence and to conjecture how the evolution of the machine brain would change the future way of knowledge discovery (data mining) in environmental sensing. In order to strengthen the continuity of Chap. 2,...
This chapter finally presents a characterization of the interdisciplinary evolution of the machine brain. Perspective schemes for rebuilding a real vision brain in the future are analyzed, and the major principles for constructing the machine brain are presented, which include memory, thinking, imagination, feeling, speaking and other aspects assoc...
This chapter aims to explain the pattern of machine understanding, using a medical test as a practical example. The explanation is based on the semantic information theory. After long arguments between positivism and falsificationism, the verification of universal hypotheses was replaced with the confirmation of uncertain major premises. Unfortun...
This book seeks to interpret connections between the machine brain, mind and vision in an alternative way and promote future research into the Interdisciplinary Evolution of Machine Brain (IEMB). It gathers novel research on IEMB, and offers readers a step-by-step introduction to the theory and algorithms involved, including data-driven approaches...
Many researchers want to unify probability and logic by defining logical probability or probabilistic logic reasonably. This paper tries to unify statistics and logic so that we can use both statistical probability and logical probability at the same time. For this purpose, this paper proposes the P-T probability framework, which is assembled with...
The Bayes classifier is often used because it is simple, and the Maximum Posterior Probability (MPP) criterion it uses is equivalent to the least error rate criterion. However, it has issues in the following circumstances: 1) If information instead of correctness is more important, we should use the maximum likelihood criterion or maximum informati...
The popular convergence theory of the EM algorithm explains that the observed incomplete data log-likelihood L and the complete data log-likelihood Q are positively correlated, and we can maximize L by maximizing Q. The Deterministic Annealing EM (DAEM) algorithm was hence proposed for avoiding locally maximal Q. This paper provides different concl...
The author proposed the decoding model of color vision in 1987. The International Commission on Illumination (CIE) recommended almost the same symmetric color model for color transform in 2006. For readers to understand the decoding model better, this paper first introduces the decoding model, then uses this model to explain the opponent-process, color...
After long arguments between positivism and falsificationism, the verification of universal hypotheses was replaced with the confirmation of uncertain major premises. Unfortunately, Hempel proposed the Raven Paradox. Then, Carnap used the increment of logical probability as the confirmation measure. So far, many confirmation measures have been prop...
After long arguments between positivism and falsificationism, the verification of universal hypotheses was replaced with the confirmation of uncertain major premises. Unfortunately, Hempel discovered the Raven Paradox (RP). Then, Carnap used the logical probability increment as the confirmation measure. So far, many confirmation measures have been...
An important problem in machine learning is that, when using more than two labels, it is very difficult to construct and optimize a group of learning functions that are still useful when the prior distribution of instances is changed. To resolve this problem, semantic information G theory, Logical Bayesian Inference (LBI), and a group of Channel Ma...
Example 1. U={1, 2, 3, …, 150}, true model ratio P*(y1)=0.5, true model parameters μ1*=65, μ2*=95, and σ1*= σ2*=10. Assume that the guessed ratios and parameters are P(y1)=P(y2)=0.5, μ1= μ1*, μ2=μ2*, and σ1= σ2=5.
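The setup in Example 1 can be tried out numerically. The sketch below is a minimal, standard textbook EM for a two-component Gaussian mixture, run directly on the sampling distribution P(x) over U rather than on drawn samples. It is not the author's own algorithm; the function names, the choice of 500 iterations, and the use of P(x) as exact weights are all assumptions made for illustration.

```python
import math


def normal_pdf(x, mu, sigma):
    """Gaussian density at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def mixture(x, ratios, mus, sigmas):
    """Mixture density sum_j P(yj) * N(x; mu_j, sigma_j)."""
    return sum(r * normal_pdf(x, m, s) for r, m, s in zip(ratios, mus, sigmas))


U = list(range(1, 151))

# Sampling distribution P(x) produced by the true model of Example 1, normalized on U.
raw = [mixture(x, [0.5, 0.5], [65.0, 95.0], [10.0, 10.0]) for x in U]
raw_total = sum(raw)
P = [v / raw_total for v in raw]


def log_likelihood(ratios, mus, sigmas):
    """Expected log-likelihood of the guessed mixture under P(x)."""
    return sum(p * math.log(mixture(x, ratios, mus, sigmas)) for p, x in zip(P, U))


def em_step(ratios, mus, sigmas):
    """One standard EM iteration with P(x) as the data weights."""
    k = len(ratios)
    # E-step: responsibilities P(yj | x) under the current guess.
    resp = []
    for x in U:
        joint = [ratios[j] * normal_pdf(x, mus[j], sigmas[j]) for j in range(k)]
        total = sum(joint)
        resp.append([v / total for v in joint])
    # M-step: weighted ratio, mean, and standard deviation of each component.
    new_ratios, new_mus, new_sigmas = [], [], []
    for j in range(k):
        weights = [p * r[j] for p, r in zip(P, resp)]
        mass = sum(weights)
        mean = sum(w * x for w, x in zip(weights, U)) / mass
        var = sum(w * (x - mean) ** 2 for w, x in zip(weights, U)) / mass
        new_ratios.append(mass)
        new_mus.append(mean)
        new_sigmas.append(math.sqrt(var))
    return new_ratios, new_mus, new_sigmas


# Guessed starting point from Example 1: correct ratios and means, sigma too small.
ratios, mus, sigmas = [0.5, 0.5], [65.0, 95.0], [5.0, 5.0]
history = [log_likelihood(ratios, mus, sigmas)]
for _ in range(500):
    ratios, mus, sigmas = em_step(ratios, mus, sigmas)
    history.append(log_likelihood(ratios, mus, sigmas))

print("ratios:", ratios)
print("means: ", mus)
print("sigmas:", sigmas)
```

Printing `history` as well shows how the expected log-likelihood changes from one iteration to the next.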
Given the prior age distribution P(x) of a population and the posterior distribution P(x|adult), how do we obtain the denotation of the label y = “adult”? With the denotation, e.g., the membership function of the class {Adult}, we can make a new probability prediction, e.g., a likelihood function, for a changed P(x). However, existing methods including Likelihood M...
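One way to picture the question posed in this abstract is sketched below in Python. It assumes that the membership function is taken proportional to P(x|adult)/P(x) and scaled so that its maximum is 1, and that a prediction for a changed prior multiplies the new prior by the membership function and renormalizes. The truncated abstract does not state these formulas, so both steps are assumptions, and all distributions here are hypothetical.

```python
import math

ages = list(range(0, 100))


def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical prior P(x) over ages and hypothetical labeling behaviour P(adult | x).
prior = normalize([math.exp(-a / 40.0) for a in ages])
p_adult_given_x = [1.0 / (1.0 + math.exp(-(a - 18) / 1.5)) for a in ages]

# Posterior P(x | adult) that an observer would see for this prior.
posterior = normalize([p * t for p, t in zip(prior, p_adult_given_x)])

# Step 1 (assumed rule): membership is proportional to P(x|adult) / P(x), with maximum 1.
ratio = [post / pri for post, pri in zip(posterior, prior)]
peak = max(ratio)
membership = [r / peak for r in ratio]

# Step 2 (assumed rule): prediction for a changed prior P'(x).
new_prior = normalize([math.exp(-((a - 45) / 20.0) ** 2) for a in ages])
new_likelihood = normalize([p * t for p, t in zip(new_prior, membership)])

print("membership at age 5: ", membership[5])
print("membership at age 40:", membership[40])
print("predicted P(x=40 | adult) under the new prior:", new_likelihood[40])
```

Because the membership function depends only on the ratio of posterior to prior, it does not change when the prior does, which is what allows it to be reused with `new_prior`.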
Popper’s and Fisher’s ideas on hypothesis testing are very important. However, Shannon’s information theory does not consider hypothesis testing. The combination of information theory and the likelihood method is attracting more and more researchers’ attention, especially when they solve Maximum Mutual Information (MMI) and Maximum Likelihood (ML). This...
The Maximum Mutual Information (MMI) criterion is different from the Least Error Rate (LER) criterion. It can reduce failures to report small-probability events. This paper introduces the Channels Matching (CM) algorithm for the MMI classifications of unseen instances. It also introduces some semantic information methods, which base the CM algorithm...
An example of maximum mutual information classification. After two iterations, MI reaches 99.99% of the convergent MMI.
The original version of this chapter contained a mistake. There was an error in Equation (31). The original chapter has been corrected.
The Expectation-Maximization (EM) algorithm for mixture models often results in slow or invalid convergence. The popular convergence proof affirms that the likelihood increases with Q; Q is increasing in the M-step and non-decreasing in the E-step. The author found that (1) Q may and should decrease in some E-steps; (2) The Shannon channel from th...
Bayesian Inference (BI) uses the Bayes’ posterior whereas Logical Bayesian Inference (LBI) uses the truth function or membership function as the inference tool. LBI is proposed because BI is not compatible with the classical Bayes’ prediction and does not use logical probability and hence cannot express semantic meaning. In LBI, statistical probabi...
A semantic channel consists of a set of membership functions or truth functions which indicate the denotations of a set of labels. In multi-label learning, we obtain a semantic channel from a sampling distribution or Shannon’s channel. If samples are large enough, we can directly convert a Shannon’s channel into a semantic channel by the third kind of...
Bayesian Inference (BI) uses the Bayes' posterior whereas Logical Bayesian Inference (LBI) uses the truth function or membership function as the inference tool. LBI was proposed because BI was not compatible with the classical Bayes' prediction and didn't use logical probability and hence couldn't express semantic meaning. In LBI, statistical proba...
The core idea of Charles Darwin’s theory of sexual selection is beauty preference selection. It was conceived as a companion to natural selection to help explain birds’ colourful plumage and behaviours for beauty. However, British naturalist Alfred Russel Wallace strongly objected to this idea, saying it adds another principle to the principle of...
A group of transition probability functions form a Shannon's channel whereas a group of truth functions form a semantic channel. Label learning is to let semantic channels match Shannon's channels and label selection is to let Shannon's channels match semantic channels. The Channel Matching (CM) algorithm is provided for multi-label classification....
Why do many male birds display specific colorful patterns on their plumage? The demand-relationship theory explains that beauty preferences reflect humans' and birds' desire for approaching some objects; these patterns look beautiful because they resemble their ideal food sources or environments. Mutants that have enhanced humans' and birds' ability an...
A group of transition probability functions form a Shannon's channel whereas a group of truth functions form a semantic channel. By the third kind of Bayes' theorem, we can directly convert a Shannon's channel into an optimized semantic channel. When a sample is not big enough, we can use a truth function with parameters to produce the likelihood f...
To solve the Maximum Mutual Information (MMI) and Maximum Likelihood (ML) for tests, estimations, and mixture models, it is found that we can obtain a new iterative algorithm by the Semantic Mutual Information (SMI) and R(G) function proposed by Chenguang Lu (1993) (where R(G) function is an extension of information rate distortion function R(D), G...
It is very difficult to solve the Maximum Mutual Information (MMI) or Maximum Likelihood (ML) for all possible Shannon Channels or uncertain rules of choosing hypotheses, so that we have to use iterative methods. According to the Semantic Mutual Information (SMI) and R(G) function proposed by Chenguang Lu (1993) (where R(G) is an extension of infor...
I proposed rate tolerance and discussed its relation to rate distortion in my
book "A Generalized Information Theory" published in 1993. Recently, I examined
the structure function and the complexity distortion based on Kolmogorov's
complexity theory. It is my understanding now that complexity-distortion is
only a special case of rate tolerance whi...
A symmetrical model of color vision, the decoding model as a new version of
zone model, was introduced. The model adopts new continuous-valued logic and
works in a way very similar to the way a 3-8 decoder in a numerical circuit
works. By the decoding model, Young and Helmholtz's tri-pigment theory and
Hering's opponent theory are unified more natu...
Based on the author’s new findings on the relationship between beauty and utility and the phenomenon of birds’ appreciating beauty, this paper tries to apply historical materialism to the biological area in order to solve the problems with fragrance, sweetness, and beauty and to explain the cause of beauty sense. It also provides some pictures, which s...
A generalized information formula related to logical probability and fuzzy sets is deduced from the classical information formula. The new information measure accords well with Popper's criterion for knowledge evolution. In comparison with the square error criterion, the information criterion not only reflects the error of a proposition, but als...
Using the fish-covering model, this paper intuitively explains how to extend Hartley's information formula to the generalized information formula step by step for measuring subjective information: metrical information (such as conveyed by thermometers), sensory information (such as conveyed by color vision), and semantic information (such as conveyed b...
A generalized information theory is proposed as a natural extension of Shannon's information theory. It proposes that information comes from forecasts. The more precise and the more unexpected a forecast is, the more information it conveys. If subjective forecasts always conform to objective facts, then the generalized information measure will be...
A symmetrical model of color vision, the decoding model, has been established to help us understand color vision better. It adopts new continuous-valued logic or fuzzy logic and works in a way very similar to the way a 3-8 decoder in a numerical circuit does. Unlike a popular zone model of color vision, the decoding model has four pairs of opponent...
Questions (14)
A new explanation is that needing relationships first selected birds' tastes, and later the females' tastes selected the males' plumage. A video and a paper show how birds' colorful plumage reflects their favorite foods and environments. https://researchfeatures.com/needing-aesthetics-explain-birds-beauty-preferences/
Welcome to discuss!
I found that there is a new trend in machine learning, especially in deep learning: researchers use a similarity function proportional to P(x|yj)/P(x)=P(yj|x)/P(yj) to construct an estimated mutual information that approximates Shannon mutual information. So I wrote a paper about this trend: https://www.mdpi.com/1099-4300/25/5/802
Welcome to discuss!
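The identity quoted in this post can be checked on a small joint distribution. The Python sketch below uses a hypothetical 3-by-2 joint table P(x, yj); it confirms that P(x|yj)/P(x) equals P(yj|x)/P(yj) and that averaging the logarithm of this ratio over the joint distribution gives the Shannon mutual information. It does not reproduce MINE, InfoNCE, or the method in the linked paper, where the ratio is learned rather than known.

```python
import math

# Hypothetical joint distribution P(x, y): rows are x values, columns are labels yj.
joint = [
    [0.30, 0.10],
    [0.05, 0.25],
    [0.05, 0.25],
]
n_x, n_y = len(joint), len(joint[0])
Px = [sum(joint[i]) for i in range(n_x)]
Py = [sum(joint[i][j] for i in range(n_x)) for j in range(n_y)]


def similarity(i, j):
    """P(x | yj) / P(x)."""
    return (joint[i][j] / Py[j]) / Px[i]


def similarity_alt(i, j):
    """P(yj | x) / P(yj), which should equal similarity(i, j)."""
    return (joint[i][j] / Px[i]) / Py[j]


def entropy(probabilities):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Mutual information as the expected log of the similarity ratio.
mutual_information = sum(
    joint[i][j] * math.log2(similarity(i, j))
    for i in range(n_x) for j in range(n_y) if joint[i][j] > 0
)

# Cross-check with I(X;Y) = H(X) + H(Y) - H(X,Y).
flat_joint = [p for row in joint for p in row]
mi_from_entropies = entropy(Px) + entropy(Py) - entropy(flat_joint)

print("I(X;Y) from the similarity ratio:", mutual_information)
print("I(X;Y) from entropies:           ", mi_from_entropies)
```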
Two different viewpoints can be found as follows:
Confirmation, causation, and Simpson's paradox
Causal Confirmation Measures: From Simpson’s Paradox to COVID-19
What is your opinion?
Darwin uses peahens' beauty preference to explain peacocks' colourful plumage. But where did peahens' beauty preferences come from? I found that many birds mimic their favorite food items; the peacock mimics a berry tree. A paper introduces my discovery:
Welcome to discuss.