The reliability of mass storage systems, such as optical data recording and non-volatile memory (Flash), is seriously hampered by uncertainty of the actual value of the offset (drift) or gain (amplitude) of the retrieved signal. The recently introduced minimum Pearson distance detection is immune to unknown offset or gain, but this virtue comes at the cost of a lessened noise margin at nominal channel conditions. We will present a novel hybrid detection method, where we combine the outputs of the minimum Euclidean distance and Pearson distance detectors so that we may trade detection robustness versus noise margin. We will compute the error performance of hybrid detection in the presence of unknown channel mismatch and additive noise.
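The trade-off between robustness and noise margin described above can be sketched as a weighted combination of the two distance measures. The following is a minimal illustration, not the authors' exact formulation; the weighting parameter name `gamma` and the linear combination rule are assumptions:

```python
import numpy as np

def pearson_distance(r, x):
    # 1 minus the Pearson correlation coefficient between r and x;
    # requires both words to be non-constant (nonzero variance)
    rm, xm = r - r.mean(), x - x.mean()
    return 1.0 - np.dot(rm, xm) / (np.linalg.norm(rm) * np.linalg.norm(xm))

def euclidean_distance(r, x):
    # squared Euclidean distance between received word r and candidate x
    return np.sum((r - x) ** 2)

def hybrid_detect(r, codebook, gamma=0.5):
    # gamma = 0 -> pure Euclidean detection; gamma = 1 -> pure Pearson
    def score(x):
        return (1 - gamma) * euclidean_distance(r, x) + gamma * pearson_distance(r, x)
    return min(codebook, key=score)
```

With `gamma` near zero the detector keeps the noise margin of Euclidean detection; with `gamma` near one it inherits the offset/gain immunity of Pearson detection.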


... Furthermore, a considerable body of literature has grown around the theme of Pearson distance for tackling offset mismatch. In [56], a decoder is proposed that minimizes a weighted sum of Euclidean and Pearson distances. A dynamic threshold detection scheme is proposed in [57], where the gain and offset are first estimated based on Pearson distance detection. ...

... In addition, for Gaussian-distributed noise and offset mismatch, we derive the ML criterion considering successive channel outputs, which includes the results in [56,62] as special cases. A concatenated coding scheme is proposed in the case of Gaussian noise and offset mismatch. ...

... Thus, we achieve more than 4 dB SNR improvement at BER = 10⁻⁴ with the proposed RS-Coset codes. ...

... A second method, orthogonal to the first approach, is based on the premise that the detector should be designed so that channel mismatch does not cause undue error performance degradation; redundancy of training sequences, parameter estimation, and receiver adjustment are then not needed, which matters where they cannot be applied, for example where the offset/gain changes quickly from page to page. Minimum Pearson Distance (MPD) detection has been advocated since it has innate resistance, or is said to be immune, to unknown variations of the signal amplitude (gain) and offset of the received signal [4], [5], [6]. The authors assume that the offset is constant (uniform) for all symbols in the codeword. ...

... , x_1) denote the reverse of x. We simply find using (5) ...

We consider the transmission and storage of data that use coded binary symbols over a channel, where a Pearson-distance-based detector is used for achieving resilience against additive noise, unknown channel gain, and varying offset. We study Minimum Pearson Distance (MPD) detection in conjunction with a set, S, of codewords satisfying a center-of-mass constraint. We investigate the properties of the codewords in S, compute the size of S, and derive its redundancy for asymptotically large values of the codeword length n. The redundancy of S is approximately (3/2) log2 n + α, where α = log2 √(π/24) = −1.467... for n odd and α = −0.467... for n even. We describe a simple encoding algorithm whose redundancy equals 2 log2 n + o(log n). We also compute the word error rate of the MPD detector when the channel is corrupted with additive Gaussian noise.

... Prior art. Minimum Pearson Distance (MPD) detection has been advocated since it has innate resistance, or is said to be immune, to unknown signal amplitude and offset of the received signal [2], [3]. It is assumed in this prior art that the offset is constant (uniform) for all symbols in the codeword. ...

... Properties of S are collected in Section IV. In Sections V and VI, we count the number of constrained codewords, and show that for asymptotically large n, the redundancy of S is approximately (3/2) log2 n + α, where α = log2 √(π/24) = −1.467... for n odd and α = −0.467... for n even. The Conclusions section concludes this paper. ...

... This range is usually measured by a threshold. In order to acquire the threshold σ_i, we first need to determine the Euclidean distance [44]. The Euclidean distance S_ij between the i-th band and the j-th band is as follows: ...

Band selection is one of the main methods of reducing the number of dimensions in a hyperspectral image. Recently, various methods have been proposed to address this issue. However, these methods usually obtain the band subset in the perspective of a locally optimal solution. To achieve an optimal solution with a global perspective, this paper developed a novel method for hyperspectral band selection via optimal combination strategy (OCS). The main contributions are as follows: (1) a subspace partitioning approach is proposed which can accurately obtain the partitioning points of the subspace. This ensures that similar bands can be divided into the same subspace; (2) two candidate representative bands with a large amount of information and high similarity are chosen from each subspace, which can fully represent all bands in the subspace; and (3) an optimal combination strategy is designed to acquire the optimal band subset, which achieves an optimal solution with a global perspective. The results on four public datasets illustrate that the proposed method achieves satisfactory performance against other methods.
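The pairwise Euclidean distances between spectral bands used in the thresholding step above can be computed as below. This is an illustrative sketch; the cube layout `(rows, cols, bands)` and the function name are assumptions, not the paper's implementation:

```python
import numpy as np

def band_distances(cube):
    # cube: (rows, cols, bands) hyperspectral image; flatten each band
    # to a pixel vector and compute all pairwise Euclidean distances
    bands = cube.reshape(-1, cube.shape[-1]).T          # (bands, pixels)
    diffs = bands[:, None, :] - bands[None, :, :]       # (bands, bands, pixels)
    return np.sqrt((diffs ** 2).sum(axis=-1))           # S[i, j]
```

The resulting symmetric matrix S can then be thresholded to decide which bands are similar enough to share a subspace.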

... Since searches of analogs rely on embedding vectors being spatially similar over time, it is not certain that Euclidean distance ever leads to first-rate analogs, particularly for the spatiotemporal state processes. Pearson distance has mathematical similarities to the Euclidean approach (Immink and Weber, 2015), and could be a simplified way of rewriting its notation. ...

Accurately predicting electricity prices allows us to minimize risks and establish more reliable decision support mechanisms. In particular, the theory of analogs has gained increasing prominence in this area. The analog approach is constructed from the similarity measurement, using fast search methods in time series. The present paper introduces a rapid method for finding analogs. Specifically, we intend to: (i) simplify the leading algorithms for similarity searching and (ii) present a case study with data from electricity prices in the Nordic market. To do so, Pearson's distance correlation coefficient was rewritten in simplified notation. This new metric was implemented in the main similarity search algorithms, namely: Brute Force, JustInTime, and Mass. Next, the results were compared to the Euclidean distance approach. Pearson's correlation, as an instrument for detecting similarity patterns in time series, has shown promising results. The present study provides innovation in that Pearson's distance correlation notation can reduce the computational time of similarity profiles by an average of 17.5%. It is worth noting that computational time was reduced in both short and long time series. For future research, we suggest testing the impact of other distance measurements, e.g., Cosine correlation distance and Manhattan distances. Published under license by AIP Publishing.
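The claimed mathematical kinship between Pearson distance and the Euclidean approach can be illustrated with the standard identity that, for z-normalized series of length n, the squared Euclidean distance equals 2n(1 − ρ). This is a generic sketch of that identity, not the specific rewritten notation used in the paper:

```python
import numpy as np

def pearson_dist(x, y):
    # 1 minus the Pearson correlation coefficient
    return 1.0 - np.corrcoef(x, y)[0, 1]

def pearson_dist_via_euclidean(x, y):
    # identity: for z-normalized series, ||zx - zy||^2 = 2 n (1 - rho),
    # so Pearson distance is a rescaled Euclidean distance on z-scores
    n = len(x)
    zx = (x - x.mean()) / x.std()
    zy = (y - y.mean()) / y.std()
    return np.sum((zx - zy) ** 2) / (2 * n)
```

Because of this identity, Euclidean-distance-based similarity search algorithms can be repurposed for Pearson-based search by z-normalizing the subsequences first.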

... The authors would like to thank the Hyperspectral Image Analysis group and the NSF-funded Center for Airborne Laser Mapping (NCALM) at the University of Houston for providing the ... [table fragment listing distance metrics: Euclidean distance Σ_{i=1}^{n} (x_i − y_i)², cosine distance [46], Pearson distance [47]] ...

Recently, convolutional neural networks (CNNs) have attracted enormous attention in pattern recognition and demonstrated excellent performance in hyperspectral image (HSI) classification. However, the combination of high-dimensional HSI data and limited training samples easily causes over-fitting in deep neural networks. Additionally, the intraclass distance of the embedding features extracted through softmax-based CNNs may be greater than the interclass distance, which makes it difficult to further improve the classification accuracy. To address these issues, this paper proposes a deep prototypical network with hybrid residual attention (DPN-HRA), which can effectively investigate the spectral-spatial information in HSI. Specifically, in order to improve the generalization capability of the model, feature extraction with a hybrid residual attention module is presented to enhance the critical spectral-spatial features and suppress the useless ones in the classification task. Furthermore, a novel discriminant distance-based cross-entropy loss (D²CEL) is proposed to increase the intraclass compactness, so as to obtain superior results. Extensive experiments on three benchmark datasets are carried out to convincingly evaluate the proposed framework. With the generation of optimal prototypes representing each class and more discriminative embedding features, encouraging classification results are achieved compared with state-of-the-art methods.

... The use of these codes to deal with noise and offset issues was already briefly discussed in [9], where hybrid Pearson and Euclidean detection was considered. By substituting the value zero for the weighting parameter γ in [9, Eq. (35)] (and then squaring because of a different notation), it appears that a δ*_min of 2 − 4/n can be obtained by using a single parity bit. ...

Decoders minimizing the Euclidean distance between the received word and the candidate codewords are known to be optimal for channels suffering from Gaussian noise. However, when the stored or transmitted signals are also corrupted by an unknown offset, other decoders may perform better. In particular, applying the Euclidean distance on normalized words makes the decoding result independent of the offset. The use of this distance measure calls for alternative code design criteria in order to get good performance in the presence of both noise and offset. In this context, various adapted versions of classical binary block codes are proposed, such as (i) cosets of linear codes, (ii) (unions of) constant weight codes, and (iii) unordered codes. It is shown that considerable performance improvements can be achieved, particularly when the offset is large compared to the noise.
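Applying the Euclidean distance to normalized words, as described above, can be sketched in a few lines: subtracting each word's mean makes any constant offset b added to the received word cancel out. The function name is an assumption for illustration:

```python
import numpy as np

def offset_immune_distance(r, x):
    # squared Euclidean distance between zero-mean versions of r and x;
    # adding a constant offset b to every symbol of r leaves it unchanged
    return np.sum(((r - r.mean()) - (x - x.mean())) ** 2)
```

Since all-zero and all-one words normalize to the same zero-mean word, this distance cannot separate them, which is exactly why the adapted code designs (cosets, constant-weight codes, unordered codes) are needed.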

... In [8], optimal Pearson codes were presented, in the sense of having the largest number of codewords and thus minimum redundancy among all q-ary Pearson codes of fixed length n. Further, in [9] a decoder was proposed based on minimizing a weighted sum of Euclidean and Pearson distances. In [10], Blackburn investigated a maximum likelihood (ML) criterion for channels with Gaussian noise and unknown gain and offset mismatch. ...

Data storage systems may not only be disturbed by noise. In some cases, the error performance can also be seriously degraded by offset mismatch. Here, channels are considered for which both the noise and offset are bounded. For such channels, Euclidean distance-based decoding, Pearson distance-based decoding, and maximum likelihood decoding are considered. In particular, for each of these decoders, bounds are determined on the magnitudes of the noise and offset intervals which lead to a word error rate equal to zero. Case studies with simulation results are presented confirming the findings.

... In the next theorem, we show that the ML criterion in case the offset has a normal distribution is in fact a weighted average of these two criteria. A hybrid method using a combination of the Euclidean and Pearson measures for detection purposes was already studied in [6] in a heuristic way. Here, we present the optimal balance between the two measures for a Gaussian offset. ...

Besides the omnipresent noise, other important inconveniences in communication and storage systems are formed by gain and/or offset mismatches. In the prior art, a maximum likelihood (ML) decision criterion has already been developed for Gaussian noise channels suffering from unknown gain and offset mismatches. Here, such criteria are considered for Gaussian noise channels suffering from either an unknown offset or an unknown gain. Furthermore, ML decision criteria are derived when assuming a Gaussian or uniform distribution for the offset in the absence of gain mismatch.

... However, Pearson distance detectors are more sensitive to noise. Therefore, hybrid minimum Pearson and Euclidean distance detectors have been proposed [6] to deal with channels suffering from both significant noise and gain/offset. ...

The recently proposed Pearson codes offer immunity against channel gain and offset mismatch. These codes have very low redundancy, but efficient coding procedures were lacking. In this paper, systematic Pearson coding schemes are presented. The redundancy of these schemes is analyzed for memoryless uniform sources. It is concluded that simple coding can be established at only a modest rate loss.

... We should emphasise that the model makes no assumptions on the distribution of the unknown ('nuisance') parameters a and b: if we know something about these distributions, other decoding methods might be appropriate. For example, if a is known to be very close to 1, then decoding based on minimising Euclidean distance is sensible; Immink and Weber [4] have proposed a decoder based on minimising a weighted sum of Euclidean and Pearson distances in some situations. ...

K.A.S. Immink and J.H. Weber recently defined and studied a channel with both gain and offset mismatch, modelling the behaviour of charge leakage in flash memory. They proposed a decoding measure for this channel based on minimising Pearson distance (a notion from cluster analysis). This paper derives a formula for maximum likelihood decoding for this channel, and also defines and justifies a notion of minimum distance of a code in this context.

K-means is one of the ten most popular clustering algorithms. However, k-means performs poorly in the presence of outliers in real datasets. Moreover, the choice of distance metric affects data clustering accuracy. Improving the clustering accuracy of k-means is still an active topic among researchers of the data clustering community, both from the outlier-removal and the distance-metric perspective. Herein, a novel modification of the k-means algorithm is proposed based on Tukey's rule in conjunction with a new distance metric. The standard Tukey rule is modified to remove outliers adaptively by considering whether the data is distributed to the left of, to the right of, or evenly around the input data's mean value. The elimination of outliers is applied in the proposed modification of k-means before calculating the centroids, to minimize the outliers' influence. Meanwhile, a new distance metric is proposed to assign each data point to the nearest cluster. In this research, the modified k-means significantly improves the clustering accuracy and centroid convergence. Moreover, the proposed distance metric's overall performance outperforms most of the distance metrics in the literature. The presented work demonstrates the significance of the proposed technique, improving the overall clustering accuracy up to 80.57% on nine standard multivariate datasets.
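The outlier-removal step can be illustrated with the *standard* (non-adaptive) Tukey rule; the paper's adaptive variant, which accounts for skew around the mean, differs from this sketch:

```python
import numpy as np

def tukey_filter(data, k=1.5):
    # standard Tukey rule: keep points inside [Q1 - k*IQR, Q3 + k*IQR]
    # in every dimension; data has shape (samples, features)
    q1, q3 = np.percentile(data, [25, 75], axis=0)
    iqr = q3 - q1
    mask = np.all((data >= q1 - k * iqr) & (data <= q3 + k * iqr), axis=1)
    return data[mask]
```

Running this filter before each centroid update keeps extreme points from dragging the centroids, which is the intuition behind the modification described above.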

The Shapley distance in a graph is defined based on the Shapley value in cooperative game theory. It is used to measure the cost for a vertex in a graph to access another vertex. In this paper, we establish the Shapley distance between two arbitrary vertices for some special graphs, i.e., the path, tree, cycle, complete graph, complete bipartite graph, and complete multipartite graph. Moreover, based on the Shapley distance, we propose a new index, namely the Shapley index, and then compare the Shapley index with the Wiener index and Kirchhoff index for these special graphs. We also characterize the extremal graphs in which these three indices are equal.

In many channels, the transmitted signals do not only face noise, but offset mismatch as well. In the prior art, maximum likelihood (ML) decision criteria have already been developed for noisy channels suffering from signal-independent offset. In this paper, such an ML criterion is considered for the case of binary signals suffering from Gaussian noise and signal-dependent offset. The signal dependency of the offset signifies that it may differ for distinct signal levels, i.e., the offset experienced by the zeroes in a transmitted codeword is not necessarily the same as the offset for the ones. Besides the ML criterion itself, an option to reduce the complexity is also considered. Further, a brief performance analysis is provided, confirming the superiority of the newly developed ML decoder over classical decoders based on the Euclidean or Pearson distances.

The performance of certain transmission and storage channels, such as optical data storage and nonvolatile memory (flash), is seriously hampered by the phenomena of unknown offset (drift) or gain. We will show that minimum Pearson distance (MPD) detection, unlike conventional minimum Euclidean distance detection, is immune to offset and/or gain mismatch. MPD detection is used in conjunction with T-constrained codes that consist of q-ary codewords, where in each codeword T reference symbols appear at least once. We will analyze the redundancy of the new q-ary coding technique and compute the error performance of MPD detection in the presence of additive noise. Implementation issues of MPD detection will be discussed, and results of simulations will be given.
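The claimed immunity can be checked numerically: the Pearson distance is unchanged when the received word is scaled by a positive gain a and shifted by an offset b. This is a small sanity-check sketch, not the full detector:

```python
import numpy as np

def pearson_distance(r, x):
    # 1 minus the Pearson correlation coefficient; both words must be
    # non-constant (nonzero variance) for the measure to be defined
    rm, xm = r - r.mean(), x - x.mean()
    return 1.0 - np.dot(rm, xm) / (np.linalg.norm(rm) * np.linalg.norm(xm))
```

Subtracting the mean removes any offset b, and the normalization by the norms removes any positive gain a, which is precisely why the constrained codes only need to exclude codewords of zero variance.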

We explore a novel data representation scheme for multi-level flash memory cells, in which a set of n cells stores information in the permutation induced by the different charge levels of the individual cells. The only allowed charge-placement mechanism is a "push-to-the-top" operation which takes a single cell of the set and makes it the top-charged cell. The resulting scheme eliminates the need for discrete cell levels, as well as overshoot errors, when programming cells.
We present unrestricted Gray codes spanning all possible n-cell states and using only "push-to-the-top" operations, and also construct balanced Gray codes. We also investigate optimal rewriting schemes for translating arbitrary input alphabet into n-cell states which minimize the number of programming operations.
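The "push-to-the-top" operation on a permutation state can be sketched in a couple of lines. The list-of-cell-indices representation (top-charged cell first) is an assumed encoding for illustration:

```python
def push_to_top(state, cell):
    # make `cell` the top-charged cell; the relative order
    # of all other cells is preserved
    return [cell] + [c for c in state if c != cell]
```

Since only relative charge order matters, this single primitive suffices to reach any permutation from any other, which is what the Gray-code constructions above exploit.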

Predetermined fixed thresholds are commonly used in nonvolatile memories for reading binary sequences, but they usually result in significant asymmetric errors after a long duration, due to voltage or resistance drift. This motivates us to construct error-correcting schemes with dynamic reading thresholds, so that the asymmetric component of errors is minimized. In this paper, we discuss how to select dynamic reading thresholds without knowing cell level distributions, and present several error-correcting schemes. Analysis based on Gaussian noise models reveals that bit error probabilities can be significantly reduced by using dynamic thresholds instead of fixed thresholds, hence leading to a higher information rate.
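One simple way to derive a reading threshold from the received block itself, rather than using a fixed level, is to place it in the largest gap between the sorted cell levels. This is an illustrative sketch only, not one of the paper's actual schemes:

```python
import numpy as np

def dynamic_threshold_read(levels):
    # threshold at the midpoint of the largest gap between sorted
    # levels; a common drift added to all cells leaves the result intact
    s = np.sort(levels)
    gap = int(np.argmax(np.diff(s)))
    t = (s[gap] + s[gap + 1]) / 2
    return (np.asarray(levels) > t).astype(int)
```

A fixed threshold chosen for the nominal levels would misread every cell once the whole block drifts past it; the data-derived threshold above moves with the drift.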

Codes were designed for optical disk recording systems and future options were explored. The designed code was a combination of dc-free and runlength-limited (DCRLL) codes. The design increased the minimum feature size for replication and provided sufficient rejection of low-frequency components, enabling simple, noise-free tracking. Error-burst-correcting Reed-Solomon codes were suggested for the resolution of read errors. The features of DCRLL and runlength-limited (RLL) sequences were presented, and practical codes were devised to satisfy the given channel constraints. The mechanism of RLL codes suppressed the low-frequency components of the generated sequences. The construction and performance of alternative Eight-to-Fourteen Modulation (EFM)-like codes were studied.

The speaker will describe his adventures during the past eight months when he has been intensively studying as many aspects of SAT solvers as possible, in the course of preparing about 100 pages of new material for Volume 4B of The Art of Computer Programming.

Coding schemes for storage channels, such as optical recording and non-volatile memory (Flash), with unknown gain and offset are presented. In its simplest case, the coding schemes guarantee that a symbol with a minimum value (floor) and a symbol with a maximum (ceiling) value are always present in a codeword so that the detection system can estimate the momentary gain and the offset. The results of the computer simulations show the performance of the new coding and detection methods in the presence of additive noise.
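The estimation step described above can be sketched as follows: given that every codeword contains the floor and ceiling symbols, the momentary gain a and offset b in r = a·x + b follow directly from the extreme received values. Variable and function names are assumptions for illustration:

```python
def estimate_gain_offset(received, floor, ceiling):
    # the codeword is guaranteed to contain a `floor` and a `ceiling`
    # symbol, so the extreme received values pin down a and b
    a = (max(received) - min(received)) / (ceiling - floor)
    b = min(received) - a * floor
    return a, b
```

In a noiseless setting the estimate is exact; with additive noise the extremes are perturbed, which is why the paper evaluates the scheme's error performance under noise.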

In non-volatile memories, reading stored data is typically done through the use of predetermined fixed thresholds. However, due to problems commonly affecting such memories, including voltage drift, overwriting, and inter-cell coupling, fixed threshold usage often results in significant asymmetric errors. To combat these problems, Zhou, Jiang, and Bruck recently introduced the notion of dynamic thresholds and applied them to the reading of binary sequences. In this paper, we explore the use of dynamic thresholds for multi-level cell (MLC) memories. We provide a general scheme to compute and apply dynamic thresholds and derive performance bounds. We show that the proposed scheme compares favorably with the optimal thresholding scheme. Finally, we develop limited-magnitude error-correcting codes tailored to take advantage of dynamic thresholds.

A class of codes and decoders is described for transmitting digital information by means of bandlimited signals in the presence of additive white Gaussian noise. The system, called permutation modulation, has many desirable features. Each code word requires the same energy for transmission. The receiver, which is maximum likelihood, is algebraic in nature, relatively easy to instrument, and does not require local generation of the possible sent messages. The probability of incorrect decoding is the same for each sent message. Certain of the permutation modulation codes are more efficient (in a sense described precisely) than any other known digital modulation scheme. PCM, PPM, orthogonal and biorthogonal codes are included as special cases of permutation modulation.
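The algebraic nature of the maximum-likelihood receiver can be illustrated for codewords that are permutations of an initial vector with distinct components: ML decoding simply matches the rank order of the received samples to the sorted components. A minimal sketch under those assumptions:

```python
import numpy as np

def permutation_decode(r, mu):
    # ML decoding for permutation-of-mu codewords over AWGN: assign the
    # k-th smallest component of mu to the position of the k-th smallest
    # received sample (correlation is maximized by matching orderings)
    r = np.asarray(r)
    mu = np.asarray(mu)
    out = np.empty(len(r), dtype=mu.dtype)
    out[np.argsort(r)] = np.sort(mu)
    return out
```

This is why no local generation of candidate messages is needed: a single sort of the received vector replaces correlation against every codeword.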

F. Sala, "Novel Coding Strategies for Multi-Level Non-Volatile Memories", Master Thesis, UCLA, 2013.