Science topic

# Statistical Pattern Recognition - Science topic

Explore the latest questions and answers in Statistical Pattern Recognition, and find Statistical Pattern Recognition experts.
Questions related to Statistical Pattern Recognition
• asked a question related to Statistical Pattern Recognition
Question
Hi all!
I am wondering what you would consider to be the 10 most essential formulas in statistics, and 10 in signal processing, necessary for statistical pattern recognition and structural health monitoring.
• asked a question related to Statistical Pattern Recognition
Question
I have acquired the coordinates (X and Y) of the placement of objects (cubes) in a defined space (a tray) made by different individuals. I know when the first object was placed, the second one, etc., and I can create a trajectory running from the first object placed to the last one (I attach a figure to be more specific). I am very new to this type of data, and I was wondering how I can analyze coordinates and trajectories (I have 50 trajectories) by comparing them to each other. Mainly, I would like to find similarities between the spatio-temporal placement patterns of the subjects, and check for common placement strategies. What kind of analysis should I run, and what software should I use?
Dear Pedro, and remaining question followers, please correct me if I'm wrong, but as far as I understand, Procrustes analysis as well as 2D convolution approaches do not take into consideration the kinetic nature (or order) of the points defining a shape (or trajectory). Take, for instance, the following two scenarios of 4 points each (all 4 points have exactly the same coordinates, but in a different order; see the attached figure):
Scenario 1: A(x=0, y=0), B(x=0, y=1) , C(x=1, y=1) , D(x=1, y=0)
Scenario 2: A(x=0, y=0), B(x=1, y=1) , C(x=1, y=0) , D(x=0, y=1)
Although the shape defined by the 4 points is exactly the same (actually a square), the trajectories (defined as the time-dependent location/positioning in the 2D plane) or the placement of objects (as in Antonio's question) are quite dissimilar, as there is no geometric transformation able to convert one scenario into the other. Please share your thoughts/feedback on the topic.
Best regards, Luis
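An order-aware distance such as dynamic time warping (DTW) illustrates Luis's point: it is zero for identical trajectories but positive for the two square scenarios, even though a pure shape comparison would treat them as identical. A minimal sketch in Python (the plain O(nm) DTW recursion, not any specific package):

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping distance between two 2-D point sequences.
    Unlike shape comparison, DTW respects the order in which points occur."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(np.subtract(a[i - 1], b[j - 1]))
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Luis's two scenarios: the same four corner points, visited in different orders.
s1 = [(0, 0), (0, 1), (1, 1), (1, 0)]
s2 = [(0, 0), (1, 1), (1, 0), (0, 1)]

print(dtw_distance(s1, s1))      # 0.0: identical trajectories
print(dtw_distance(s1, s2) > 0)  # True: same shape, different visiting order
```

The same idea scales to the 50 real trajectories: compute the pairwise DTW matrix and cluster it (e.g. hierarchical clustering) to look for common placement strategies.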
• asked a question related to Statistical Pattern Recognition
Question
Hi!
We are trying to estimate body mass (W) heritability and cross-sex genetic correlation using MCMCglmm. Our data matrix consists of three columns: ID, sex, and W. Body mass data is NOT normally distributed.
Following previous advice, we first separated weight data into two columns, WF and WM. WF listed weight data for female specimens and “NA” for males, and vice-versa in the WM column. We used the following prior and model combination:
prior1 <- list(R=list(V=diag(2)/2, nu=2), G=list(G1=list(V=diag(2)/2, nu=2)))
modelmulti <- MCMCglmm(cbind(WF,WM)~trait-1, random=~us(trait):animal, rcov=~us(trait):units, prior=prior1, pedigree=Ped, data=Data1, nitt=100000, burnin=10000, thin=10)
The resulting posterior means were suspiciously low (e.g. 0.00002). We calculated heritability values anyway, using the following:
herit1 <- modelmulti$VCV[,'traitWF:traitWF.animal'] /
  (modelmulti$VCV[,'traitWF:traitWF.animal'] + modelmulti$VCV[,'traitWF:traitWF.units'])
herit2 <- modelmulti$VCV[,'traitWM:traitWM.animal'] /
  (modelmulti$VCV[,'traitWM:traitWM.animal'] + modelmulti$VCV[,'traitWM:traitWM.units'])
corr.gen <- modelmulti$VCV[,'traitWF:traitWM.animal'] /
  sqrt(modelmulti$VCV[,'traitWF:traitWF.animal'] * modelmulti$VCV[,'traitWM:traitWM.animal'])
We get heritability estimates of about 50%, which is reasonable, but correlation estimates were extremely low, about 0.04%.
Suspecting the model was wrong, we used the original dataset with all weight data in a single column and tried the following model:
prior2 <- list(R=list(V=1, nu=0.02), G=list(G1=list(V=1, nu=1, alpha.mu=0, alpha.V=1000)))
model <- MCMCglmm(W~sex, random=~us(sex):animal, rcov=~us(sex):units, prior=prior2, pedigree=Ped, data=Data1, nitt=100000, burnin=10000, thin=10)
The model runs, but it refuses to calculate “herit” values, with the error message “subscript out of bounds”. We’d also add that in this case, the posterior density graph for sex2:sex.animal is not shaped like a bell.
What are we doing wrong? Are we even using the correct models?
Eva and Simona
See our published paper on the topic: Cross-sex genetic correlation does not extend to sexual size dimorphism in spiders
• asked a question related to Statistical Pattern Recognition
Question
What is the difference between classifiers and associative memories?
As an associative-memory-like neural network extensively used to solve robot motion control problems, let us cite the biologically inspired Cerebellar Model Articulation Controller (CMAC), invented by Albus in 1975. For further details, see:
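To make the contrast concrete: a classifier maps an input to a discrete label, while an associative memory maps a (possibly corrupted) input back to a stored pattern. A minimal Hopfield-style sketch (a toy illustration of the associative-memory idea, not CMAC itself):

```python
import numpy as np

# Two stored +/-1 patterns; Hebbian weights with zero diagonal.
patterns = np.array([[1, -1, 1, -1, 1, -1],
                     [1, 1, 1, -1, -1, -1]])
W = patterns.T @ patterns
np.fill_diagonal(W, 0)

# A classifier would output a label; the associative memory instead
# completes a corrupted input back to the nearest stored pattern.
probe = np.array([1, -1, 1, -1, 1, 1])  # first pattern with the last bit flipped
recalled = np.sign(W @ probe)
print((recalled == patterns[0]).all())  # True: the stored pattern is recovered
```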
• asked a question related to Statistical Pattern Recognition
Question
John-Tagore Tevet
Let us try to uncover the essence of graphs, something that has so far been circumvented.
1. What is a graph
A graph is an association of elements and the relationships between them, which has a certain structure.
Graphs have been drawn for different purposes: early rock paintings contain schemes showing constellations, and graphs were also used to explain theological tenets.
Example 1. Graph (structural formula) of isobutane C4H10:
The investigation of graphs began after Leonhard Euler solved the problem of routing over the seven bridges between the four banks of Königsberg in 1736 [1].
Example 2. Königsberg’s bridges and corresponding graph:
Even at present, graphs are used mainly for solving routing and flow problems. Already in 1976 it was considered that such a one-sided approach is a factor hindering the study of graphs [2]. Interest in the essence of a graph, in its structure and symmetry properties, has been practically non-existent; the last explorer was evidently Boris Weisfeiler in 1976 [9].
Defining a graph as an object consisting of a node set V and an edge set E, G=(V, E), is a half-truth that begets confusion. What is essential is to explain the properties of the inner organization (inner building), or structure, i.e. the identification of graphs.
A graph is presentable: 1) as a list L of adjacencies; 2) in the form of an adjacency matrix E; 3) graphically, as G, where the elements are called "nodes" and the relations "edges".
Example 3. List of adjacencies L, corresponding adjacency matrix E and for both corresponded graphs GA and GB:
Explanations:
The outward look and the positions of the enumerated elements in a drawing of a graph have no meaning, although on an emotional level they can engender some confusion.
One graph can differ from another in its appearance or in its inner organization (inner building), or structure S, which ordinarily cannot be seen visually. Perhaps it is due to this that the existence of structure has been ignored to the present day.
We can verify here that graphs GA and GB have the same structure and are isomorphic, GA ≅ GB. Ordinarily one distinguishes only the "outward" differences between objects and refuses to see their common structure.

Propositions 1. Structure axioms:
P1.1.    A structure S is presentable as a graph G, and each graph G has its certain structure S.
P1.2.    Isomorphic graphs have the same structure: structure is the complete invariant of isomorphic graphs.

Identification of a graph is based on identifying the binary relations between its elements [3-8]. A binary relation can be a "distance relation", a "circle relation", a "clique relation", etc., and is measurable. Each binary relation is characterized by a corresponding binary sign.
2. Identification of the graph
For the identification of graphs, two mutually complementary ways are used:
Multiplicative identification (products of adjacency matrices);
Heuristic identification.

Propositions 2. Multiplicative identification: multiplying the adjacency matrices:
P2.1.       Multiply the adjacency matrix by itself, E×E×E×… = E^n, and for each degree n record the number p of different multiplicative binary signs e^n_ij, which as a rule grows. Form the sequence vectors ui of the different multiplicative binary signs.
P2.2.       Each time p grows (changes), transpose the rows and columns of E^n according to the obtained frequency vectors ui.
P2.3.       Stop multiplying when p no longer grows, and present the current E^n and the following E^(n+1).
Explanation: Multiplicative signs differentiate the binary signs but do not characterize them.
Example 4. Adjacency matrix E and its transposed products E2, E3 for the graph of Example 3:
     1  2  3  4  5  6 | i
     0  1  0  1  0  1 | 1
     1  0  1  0  1  1 | 2
E    0  1  0  1  0  1 | 3
     1  0  1  0  1  0 | 4
     0  1  0  1  0  1 | 5
     1  1  1  0  1  0 | 6

                              ui
     2  6 | 1  3  5 | 4 |  i  0 1 3 4   k
     4  3 | 1  1  1 | 3 |  2  0 3 2 1   1
     3  4 | 1  1  1 | 3 |  6  0 3 2 1   1
E2   1  1 | 3  3  3 | 0 |  1  1 2 3 0   2
     1  1 | 3  3  3 | 0 |  3  1 2 3 0   2
     1  1 | 3  3  3 | 0 |  5  1 2 3 0   2
     3  3 | 0  0  0 | 3 |  4  3 0 3 0   3

                              ui
     2  6 | 1  3  5 | 4 |  i  0 2 3 6 7 9 10   k
     6  7 |10 10 10 | 3 |  2  0 0 1 1 1 0  3   1
     7  6 |10 10 10 | 3 |  6  0 0 1 1 1 0  3   1
E3  10 10 | 2  2  2 | 9 |  1  0 3 0 0 0 1  2   2
    10 10 | 2  2  2 | 9 |  3  0 3 0 0 0 1  2   2
    10 10 | 2  2  2 | 9 |  5  0 3 0 0 0 1  2   2
     3  3 | 9  9  9 | 0 |  4  1 0 2 0 0 3  0   3
Explanations:
a)      A set of similar relations (or elements) determines their position W in the structure. In group theory a position W is known as a transitivity domain of automorphisms, an equivalence class, or an orbit.
The multiplicative binary signs e^n_ij here distinguish five positions of binary relations WR, and on their basis three positions of elements WV.

Propositions 3. Position axioms:
P3.1.       If structural elements (graph nodes) vi, vj, … have the same position WVk in graph G, then the corresponding subgraphs (Gi=G\vi) ≅ (Gj=G\vj) ≅ … are isomorphic.
P3.2.       If relations (edges) eij, ei*j*, … have the same binary(+)position WR+n in graph G, then the corresponding greatest subgraphs (Gij=G\eij) ≅ (Gi*j*=G\ei*j*) ≅ … are isomorphic.
P3.3.       If relations ("non-edges") eij, ei*j*, … have the same binary(–)position WRn– in graph G, then the corresponding smallest supergraphs (Gij=G∪eij) ≅ (Gi*j*=G∪ei*j*) ≅ … are isomorphic.

The heuristic identification way was elaborated before the multiplicative one.
Propositions 4. Heuristic identification:
P4.1.       Fix an element i and form its neighborhood Ni, where the elements connected with i are divided according to their distance d into entries Cd.
P4.2.       Fix an element j and form its neighborhood Nj by condition P4.1.
P4.3.       Fix the intersection Ni ∩ Nj as a binary graph gij, and record the distance –d between i and j (in case of adjacency, the collateral distance +d), the number n of elements (nodes) in gij, and the number q of adjacencies (edges). Record the heuristic binary sign ±d.n.q.ij of the obtained graph gij.
P4.4.       Carry out P4.1 to P4.3 for each pair i, j ∈ [1, |V|]. This yields the preliminary heuristic structure model SMH.
P4.5.       Record for each row i its frequency vector ui. Transpose the preliminary model SM lexicographically by the frequency vectors ui into partial models SMk.
P4.6.       Within each SMk, transpose the rows and columns lexicographically by the position vectors si into complementary partial models. Repeat P4.6 until no further transposition arises.
Explanation: Heuristic binary signs both differentiate the binary signs and characterize them.
Example 5. For the differently enumerated graphs GA and GB of Example 3, their heuristic binary signs and structure models SMA and SMB with their common product E3:
[The common product E3 here is the same matrix as in Example 4; it is shown with the two enumerations side by side, column iA giving the node numbers of GA and column iB those of GB for each row, with identical frequency vectors ui (over the sign values 0, 2, 3, 6, 7, 9, 10) and identical position indices k = 1, 1, 2, 2, 2, 3.]
Explanations:
The "diverse" graphs GA and GB have equivalent heuristic structure models SMA ≈ SMB and the same multiplicative model E3. This means that the structures are equivalent and all the graphs GA and GB presented in Examples 2 and 4 are isomorphic, GA ≅ GB.
The binary relations are divided into five binary positions WRn: the "adjacent pairs", or "edges", fall into three binary(+)positions (full line, dotted, dashed) that coincide with the heuristic binary signs C, D, E and the corresponding multiplicative binary signs 10, 7, 2; and there are two binary(–)positions with heuristic signs –A and –B and multiplicative signs 9 and 3. On this basis the structural elements are divided into three positions WVk.
The column ui contains the frequency vectors, in which each element i is characterized by its relationships with the other elements. On the basis of the frequency vectors ui, the positions of elements WVk are obtained.
The column si contains the position vectors, which represent the connections of i with the elements in position k.
A principal theoretical algorithm of isomorphism recognition does exist: it consists in rearranging (transposing) the rows and columns of the adjacency matrix EA of graph GA until it coincides with EB of GB. But it has an essential shortcoming: it is too complicated, as the number of steps can reach n! (factorial).
Propositions 5. On the relationships between isomorphism and structural equivalence:
P5.1.    Isomorphism GA ≅ GB is a one-to-one correspondence, a bijection φ: VA → VB, between elements that preserves the structure GS of graphs GA and GB.
P5.2.    Isomorphism recognition does not recognize the structure GS and its properties (positions etc.), but the structure models SM and E^n recognize the structure and its properties exactly, up to isomorphism.
P5.3.    Structural equivalence SMA ≈ SMB and E^nA ≈ E^nB is a coincidence, or bijection φ: WA → WB, on the level of the binary positions WRn and the positions of nodes (elements) WVk.
P5.4.    In the case of large symmetric graphs, the products E^n recognize the binary positions more exactly than the heuristic models SM, where binary signs of higher degree would be needed. That is why it is necessary to treat both together, bearing in mind also that the heuristic binary signs characterize the essence of the relationship itself.
P5.5.    Recognition of the positions by the structure model is more effective than detecting the orbits on the basis of the group Aut G.
Example 6. For recognizing the structure of the isobutane represented in Example 1, the heuristic model SM suffices:
Explanation: The decomposition of the elements C and H into four positions corresponds to reality. The positions are also visually evident in Example 1.
3. List of tasks whose solution is based on identified graphs (structures)
In conclusion, it should be emphasized that the recognition of a graph's structure (organization) is based on the identification (distinction) of the binary relations between its elements. A binary relation can be measured as a "distance relation", a "circle relation", a "clique relation", etc. A binary relation is recognizable by its corresponding binary sign.
The complex of tasks based on recognizing structures is broad, varied, and novel (differing from those set up so far) [3-8]. We list some here.
1.      The relations between structural positions, automorphisms, and group-theoretical orbits.
2.      Structural classification of the symmetry properties of graphs.
3.      Measurement of the symmetry of graphs.
4.      Analysis of different situations of structural equivalence and graph isomorphism.
5.      Positional structures that open up the "hidden sides" of graphs.
6.      Unknown sides of well-known graphs.
7.      Adjacent structures and the reconstruction problem, connected with a general solution of the notorious Ulam conjecture.
8.      Sequences of adjacent structures and their associations: the systems of graph structures.
9.      Probabilistic characteristics of graph systems.
10.    The relations of graph systems to classical attributes.
References
1.       Euler, L. Solutio problematis ad geometriam situs pertinentis. Comment. Academiae Sci. I. Petropolitanae 8 (1736), 128-140.
2.       Mayer, J. Developpements recents de la theorie des graphes. Historia Mathematica 3 (1976), 55-62.
3.       Tevet, J.-T. Semiotic Testing of the Graphs: a Constructive Approach and Development. S.E.R.R., Tallinn, 2001.
4.       Tevet, J.-T. Hidden Sides of the Graphs. S.E.R.R., Tallinn, 2010.
5.       Tevet, J.-T. Semiotic Modeling of the Structure. ISBN 9781503367456, Amazon Books, 2014.
6.       Tevet, J.-T. Süsteem. ISBN 9789949388844, S.E.R.R., Tallinn, 2016.
7.       Tevet, J.-T. Systematizing of Graphs with n Nodes. ISBN 9789949812592, S.E.R.R., Tallinn, 2016.
8.       Tevet, J.-T. What is a Graph and How to Study It. ISBN 9789949817559, S.E.R.R., Tallinn, 2017.
9.       Weisfeiler, B. On Construction and Identification of Graphs. Springer Lecture Notes in Mathematics 558, 1976 (last issue 2006).
It is easy to prove that (quoting Wikipedia):
If A is an adjacency matrix of the directed or undirected graph G, then the matrix An (i.e., the matrix product of n copies of A) has an interesting interpretation: the element (i, j) gives the number of (directed or undirected) walks of length n from vertex i to vertex j. If n is the smallest nonnegative integer, such that for some i, j, the element (i, j) of An is positive, then n is the distance between vertex i and vertex j.
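The quoted property is easy to check numerically; here is a sketch for the path graph 1–2–3 (a toy example, chosen for illustration):

```python
import numpy as np

# Adjacency matrix of the path graph 1 - 2 - 3.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])

A2 = np.linalg.matrix_power(A, 2)
# Entry (1, 3) of A^2 counts the walks of length 2 from vertex 1 to vertex 3.
print(A2[0, 2])  # 1: the single walk 1 -> 2 -> 3
# A[0, 2] == 0 but A2[0, 2] > 0, so the distance between vertices 1 and 3 is 2.
```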
• asked a question related to Statistical Pattern Recognition
Question
I have a data frame called p.1 with several columns of information for each of the points recorded by the Argos system for a penguin.
Initially I assumed that the distance travelled by the penguin in each trip is the sum of the distances from the coast, but that's wrong: the distance travelled per trip is the distance between the first two points, plus the distance between the second and third point, and so on, restarting for every trip. With the following code I could calculate the distances between points:
distancebetwenpoints = spDists(locs1_utm, longlat=FALSE)
p.1$dist = distancebetwenpoints
locs1_utm$dist = distancebetwenpoints
where locs1_utm is the same as p.1 but converted into a SpatialPointsDataFrame.
The problem with this code is that I obtain a huge matrix of the distances between all the points. I tried several alternatives to select just the column I need, but they don't work.
Does someone know how I can calculate the distances of every trip made by the penguin?
In the attached file there are some columns called "todelete"; I created those just to clean the data more easily.
Thank you Veli-Matti, I am going to try that code now. At the moment I am also trying to use the adehabitat package, converting the data frame into an ltraj object to calculate distances, times, and angles, and I will see if I can make it work with your code as well.
Thank you very much.
Daniel G.
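The per-trip distance itself is just the sum of distances between consecutive fixes within each trip. A sketch of the idea in Python/pandas (hypothetical column names `trip`, `x`, `y` for projected UTM coordinates; the same logic carries over to R with `diff()` plus `tapply` or dplyr's `group_by`):

```python
import numpy as np
import pandas as pd

# Hypothetical tracking data: a trip id plus projected (UTM) coordinates.
df = pd.DataFrame({
    "trip": [1, 1, 1, 2, 2],
    "x":    [0.0, 3.0, 3.0, 10.0, 10.0],
    "y":    [0.0, 4.0, 8.0,  0.0,  5.0],
})

def trip_length(g):
    # Distance between consecutive fixes, summed within one trip;
    # diff() gives NaN for the first fix, which sum() skips, so the
    # computation restarts automatically at every trip boundary.
    return np.hypot(g["x"].diff(), g["y"].diff()).sum()

lengths = df.groupby("trip")[["x", "y"]].apply(trip_length)
print(lengths[1])  # 9.0  (5 m from (0,0) to (3,4), then 4 m to (3,8))
print(lengths[2])  # 5.0
```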
• asked a question related to Statistical Pattern Recognition
Question
What is the difference between a random binary sensing matrix and a random Gaussian sensing matrix?
How can I choose the suitable matrix for a certain signal?
A sensing matrix maps an input vector to a measurement vector through a linear weighted summation of the input. What makes a specific matrix good is application dependent. Both distributions more or less satisfy the RIP. However, hardware implementation of the Bernoulli matrix (binary or bipolar) is much easier, especially in the analog domain. A Bernoulli weight is either 0 or 1 (or -1/+1 in the bipolar case), whereas a Gaussian weight is a floating-point number. Multiplication by a floating-point number, either in digital or analog, is resource consuming, while multiplication by a Bernoulli weight is feasible through a simple switch in the analog domain or an instruction in digital. As an example, consider RMPI analog-to-information devices, which compressively sample the signal in analog and then reconstruct it in digital: prior to quantization the sensing matrix must be applied, and by incorporating a Bernoulli matrix the multipliers are implemented as simple switches.
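A minimal sketch of the two matrix types (the sizes and the 1/sqrt(m) normalization are assumptions; both constructions are standard in the compressed-sensing literature):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 64, 256  # m measurements of an n-dimensional signal

# Gaussian sensing matrix: i.i.d. N(0, 1/m) entries.
phi_gauss = rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n))

# Bipolar Bernoulli sensing matrix: entries are +/- 1/sqrt(m).
# Only signs are needed, which is why it maps to simple switches in hardware.
phi_bern = rng.choice([-1.0, 1.0], size=(m, n)) / np.sqrt(m)

# A 5-sparse signal and its compressive measurements.
x = np.zeros(n)
x[rng.choice(n, 5, replace=False)] = rng.normal(size=5)
y1 = phi_gauss @ x
y2 = phi_bern @ x
print(y1.shape, y2.shape)  # (64,) (64,)
```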
• asked a question related to Statistical Pattern Recognition
Question
For image processing, unlike methods which split the image into patches, total variation is defined on the whole image. In this case, for an image of size n by n, what is the complexity of total variation minimization? Is total variation too slow compared to patch-based methods? Thanks.
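Note that evaluating the TV functional itself is cheap: O(n^2) for an n x n image, i.e. linear in the number of pixels. It is the iterative minimization that can be slow, since each iteration touches the whole image. A sketch of one evaluation (anisotropic form, assumed here for simplicity):

```python
import numpy as np

def total_variation(img):
    """Anisotropic TV: sum of absolute horizontal and vertical differences.
    One evaluation costs O(n^2) for an n x n image (linear in pixels);
    it is the iterative minimisation, not the functional, that is costly."""
    return np.abs(np.diff(img, axis=1)).sum() + np.abs(np.diff(img, axis=0)).sum()

img = np.zeros((4, 4))
img[:, 2:] = 1.0  # a single vertical edge
print(total_variation(img))  # 4.0: one unit jump per row
```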
• asked a question related to Statistical Pattern Recognition
Question
I am working on applying sparse representation to classification tasks. In the Sparse Representation Classifier (SRC), the test signal is assigned to the dictionary (class) that gives the minimum residual error. Another class-assignment measure is minimum sparsity. My question is: does the sign of the sparse coefficients play a role in class assignment or not?
Dear Mandal,
Thanks for your answer. I am using class-specific dictionaries, say three dictionaries (D1, D2, D3) representing three classes, and I have a test signal of class D2. How would I assign the test signal based on class-voting assignment?
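The SRC assignment rule with class-specific dictionaries can be sketched as follows (synthetic dictionaries; least squares stands in for the sparse coder). Note that only the residual magnitude matters, so the sign of the coefficients does not by itself change the assignment:

```python
import numpy as np

rng = np.random.default_rng(1)

# Three hypothetical class-specific dictionaries, atoms as columns.
dicts = [rng.normal(size=(20, 10)) for _ in range(3)]

# A test signal synthesised from the second dictionary (class index 1).
y = dicts[1] @ rng.normal(size=10)

def residual(D, y):
    # Least squares stands in for the sparse coder here; SRC assigns y to
    # the class whose dictionary reconstructs it with the smallest residual.
    a, *_ = np.linalg.lstsq(D, y, rcond=None)
    return float(np.linalg.norm(y - D @ a))

residuals = [residual(D, y) for D in dicts]
print(int(np.argmin(residuals)))  # 1: the true class
```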
• asked a question related to Statistical Pattern Recognition
Question
Is correlation between labels important in multi-label classification? Why?
.
i tend to formalize ML classification through the canonical analysis framework
(i admit this is a somewhat larger perspective but i believe it allows to have a simple picture)
in canonical analysis, you have individuals (say areas of land) described by two sets of features (say, the animals living in the area on the one hand, and the plants growing in the area on the other hand)
the aim of canonical analysis is to find linear factors in each description space so that factors are maximally correlated across description spaces ; for the land area example, this means finding (mutually orthogonal) combinations of animal population (linear factors in the animal description space) strongly correlated to (mutually orthogonal) combinations of plant population (linear factors in the plant description space)
if you imagine the correlation matrix of the full description of the areas, it has a block structure
A ; B
tB ; C
with A the correlation matrix of the animal description, C the correlation matrix of the plant description and B the cross correlation matrix of the animal-plant descriptors
while PCA uses all this correlation matrix, canonical analysis concentrates only on the B submatrix (indeed, canonical analysis technically boils down to a svd of B)
.
all this handwaving introduction to indicate that there are two kinds of correlations when you have two description spaces (which is the case in multi-label problems : you have a feature space and a label space) : correlations within each description space (which are not the objective of the analysis : you are not interested in correlations between labels strictly speaking if such correlations have nothing to do with your feature space) and correlations across description spaces (which are the objective of your analysis)
for instance, some labels might well be just correlated noise from the perspective of your description space : PCA would take such correlation into account but without any gain from the classification perspective ; canonical analysis will ignore such correlations and concentrate on cross-correlations between your feature space and your label space
.
• asked a question related to Statistical Pattern Recognition
Question
I am trying to compare the dominant colors in approx. 40,000 images. The papers I find offer the mathematics for comparing two colors. I need to create clusters, or palettes, to represent the colors of such clusters.
Once you have performed pair-wise comparisons and obtained the inter-object distance matrix, it is often useful to visualize the results in the form of a 2D scatterplot using the multi-dimensional scaling (MDS) method. This may inspire you with more ideas beyond k-means.
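A common way to build such palettes is to pool sampled pixels from all images and cluster them; the cluster centres form the palette, and each image can then be described by how its pixels distribute over the palette. A sketch with k-means on synthetic RGB data (a real run would sample pixels from the 40,000 images and likely work in a perceptual space such as CIELAB rather than raw RGB):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)

# Stand-in for pixels sampled across many images: rows are RGB values.
reds = rng.normal([200, 30, 30], 10, size=(500, 3))
blues = rng.normal([30, 30, 200], 10, size=(500, 3))
pixels = np.vstack([reds, blues]).clip(0, 255)

# The k cluster centres form the palette.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(pixels)
palette = km.cluster_centers_.round().astype(int)
print(palette)  # one reddish and one bluish centre
```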
• asked a question related to Statistical Pattern Recognition
Question
I need human gait data for ascending stairs, is there any database for this type of data? I need RAW dataset.
Hi,
I think this link may be useful for you
• asked a question related to Statistical Pattern Recognition
Question
I want MATLAB code I can use to extract features from this cattle image using Fourier descriptors, and also code to apply them as input to an ANN for classification. I don't know how to go about it. Can anyone help me?
function df = dscr_fourier(matrice_contour)
% Compute the Fourier descriptors corresponding to a given contour
%
% df = dscr_fourier(matrice_contour);
%
% matrice_contour - Nx2 matrix, first column the x-coordinates of the
%                   points defining the contour, second column their
%                   y-coordinates
% df - vector of Fourier descriptors
%
map = matrice_contour;
tgt = map(:,1) + 1i*map(:,2);
medie1 = mean(tgt);
tgt0 = tgt - medie1;
% Get the angles in radians, for sorting.
phpb = angle(tgt0);
phpb(phpb<0) = phpb(phpb<0) + 2*pi;
% Sort the angles to obtain the indices, so the same order can be
% applied to the complex numbers.
[phpbs, idx] = sort(phpb);
% Reverse the order.
idx = flipud(idx);
% Reorder the complex numbers top to bottom, i.e. in descending order.
tgt = tgt(idx);
tgt0 = tgt0(idx);
tgt0 = tgt0 + medie1;
vpb = tgt0;
m = length(vpb);
vpb = [vpb; vpb(1)];
l = [0, cumsum(abs(diff(vpb))).'];
b = [diff(vpb)./abs(diff(vpb))].';
L = l(m+1);
for n = 1:m
    cf = (L*(2*pi*n/L)^2)^(-1);
    df(n) = 0;
    for k = 2:m+1
        df(n) = df(n) + b(k-1)*(exp(-1i*n*2*pi*l(k)/L) - exp(-1i*n*2*pi*l(k-1)/L));
    end
    df(n) = cf*df(n);
end
df = df.';
df = df/df(1);  % normalisation, for similarity invariance
df = df(2:m);   % to make the descriptors invariant to changes of scale
df = abs(df);
• asked a question related to Statistical Pattern Recognition
Question
Hi all,
I want to ask: I have a total of 11 classes [A,B,C,D,E,F,G,H,I,J,K], of which 7 classes are used in training, i.e. [A,B,C,D,E,F,G]. The remaining 4 are not used in training, so those 4 classes, [H,I,J,K], are considered reject classes when they appear in testing.
My current system finds the similarity measure between the tested instance and the set of all classes used in training. So, at the moment, my system reports that the tested sample belongs to whichever training class has the highest similarity measure.
Now I want to improve it by implementing a reject-class threshold: the system should report that a tested sample belongs to the reject class when its similarity measure lies under the threshold value.
So, can you please guide me on how to identify threshold values for those reject classes [H,I,J,K]?
How can I do it in MATLAB code? Please let me know.
I am going to use Dr. Vikas option.
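One common heuristic for setting such a reject threshold is to take a low percentile of the genuine (correct-match) similarity scores observed on the training classes, so that a chosen fraction of genuine samples would be rejected. A sketch with synthetic scores (the 5th-percentile choice and the score distributions are assumptions, not the asker's setup):

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical similarity scores of correctly matched training samples,
# one array per trained class.
genuine_scores = {"A": rng.normal(0.8, 0.05, 100),
                  "B": rng.normal(0.75, 0.05, 100)}

# Reject threshold at the 5th percentile of each class's genuine scores,
# so about 5% of genuine samples would be (wrongly) rejected.
thresholds = {c: float(np.percentile(s, 5)) for c, s in genuine_scores.items()}

def classify(scores):
    best = max(scores, key=scores.get)
    return best if scores[best] >= thresholds[best] else "reject"

print(classify({"A": 0.85, "B": 0.60}))  # 'A'
print(classify({"A": 0.30, "B": 0.25}))  # 'reject'
```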
• asked a question related to Statistical Pattern Recognition
Question
I am using this formula for the Gabor kernel;
it returns zero for x, y > 10, due to which, when it is multiplied by a pixel value, the result is 0 and the filter effect applied to the pixel becomes 0. Can anybody help me with this?
I had a try and it works; maybe you need to check the data type.
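The likely cause is that the Gaussian envelope of the kernel underflows far from the origin, so values at large x, y are numerically zero. A sketch with one common real Gabor form (the poster's exact formula is not shown, so the parameters here are assumptions); the practical fix is to truncate the kernel support to a few standard deviations of the envelope and use double precision:

```python
import numpy as np

def gabor(x, y, sigma=2.0, theta=0.0, lam=4.0):
    # One common real Gabor form (assumed here): a Gaussian envelope
    # times a cosine carrier along the rotated x-axis.
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return np.exp(-(xr**2 + yr**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / lam)

print(abs(gabor(0.0, 0.0)))  # 1.0 at the origin
# Far from the origin the Gaussian envelope underflows towards zero:
print(abs(gabor(20.0, 20.0)) < 1e-30)  # True
```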
• asked a question related to Statistical Pattern Recognition
Question
The definitions of FPIR, SEL, and FNIR for an identification problem are as follows, but they look complicated to me (especially the use of rank and the L candidates in the computation).
The concept of selectivity in particular is complicated for me!
Can anyone explain them with a simple identification example?
FPIR (N,T,L)=(Num. nonmate searches where one or more enrolled candidates are returned at or above threshold, T)/(Num. nonmate searches attempted)
SEL(N,T,L)=(Num. nonmate enrolled candidates returned at or above threshold, T)/(Num. nonmate searches attempted)
FNIR(N,R,T,L)=(Num. mate searches with enrolled mate found outside top R ranks or score below threshold, T)/(Num. mate searches attempted)
The following are simple definitions for FPIR and FNIR:
A threshold T is used to classify a test case as either a correct (true, positive) case or a false (negative) case: if the case scores below the threshold T it is classified false (negative), and if it scores at or above T it is classified true (positive).
FPIR is the false positive identification rate: the fraction of test cases classified as true although they are actually false. It is the type I error.
FPIR = Number of false test cases classified above threshold T (i.e. as true) / Number of all false test cases
FNIR is the false negative identification rate: the fraction of test cases classified as false although they are actually true. It is the type II error.
FNIR = Number of true test cases classified below threshold T (i.e. as false) / Number of all true test cases
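With a single candidate per search (L = 1) and a large rank budget R, the definitions reduce to simple counting over the two search types. A toy numeric sketch (the scores are made up for illustration):

```python
# Toy open-set identification with threshold T (single top candidate, L = 1).
T = 0.5

# Non-mate searches: the probe is NOT enrolled, so any candidate returned
# at or above T is a false positive.
nonmate_top_scores = [0.2, 0.7, 0.4, 0.6, 0.1]
fpir = sum(s >= T for s in nonmate_top_scores) / len(nonmate_top_scores)

# Mate searches: the probe IS enrolled; it is a miss (false negative) if the
# mate scores below T (rank-R failures are ignored in this toy example).
mate_scores = [0.9, 0.3, 0.8, 0.45]
fnir = sum(s < T for s in mate_scores) / len(mate_scores)

print(fpir)  # 0.4: 2 of 5 non-mate searches returned a candidate at/above T
print(fnir)  # 0.5: 2 of 4 mate searches missed the enrolled mate
```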
• asked a question related to Statistical Pattern Recognition
Question
Is there an R package to exhaustively search for a subsequence pattern, such as 5 minutes of walking followed by 5 minutes of sitting, in Actigraph accelerometer data? I would like to find such a behaviour (5 minutes walking, then 5 minutes sitting); is there any existing pattern for this?
https://cran.r-project.org/web/packages/GGIR/index.html is your best bet. It is not developed specifically for Actigraph. However, it's relatively brand independent (http://www.ncbi.nlm.nih.gov/pubmed/27183118 )
• asked a question related to Statistical Pattern Recognition
Question
It's easy to understand what bias and variance mean in general in machine learning; the link below makes this very clear. But it's always hard to figure out which classifiers have high/low bias and variance, as each classifier has its own set of tuning parameters that alter this characteristic.
So, how will one determine whether a given classifier has high bias or high variance?
The following article discusses a chance-corrected and maximum-corrected index of predictive accuracy, that is based on the sensitivity of the model in accurately classifying observations in the various classes that are being discriminated. This index can be applied to any classification methodology, and can be used to directly compare model performance.
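A quick practical diagnostic for the original question is to compare training and test accuracy: high bias shows up as both scores being low and close together, high variance as a near-perfect training score with a noticeably lower test score. A sketch with decision trees of different depth on synthetic data (the dataset and model family are illustrative choices, not from the question):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, n_features=10, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

# A depth-1 stump is too simple (high bias): it underfits both sets.
stump = DecisionTreeClassifier(max_depth=1, random_state=0).fit(Xtr, ytr)
# An unbounded tree memorises the training set (high variance):
# perfect training score, with a gap down to the test score.
deep = DecisionTreeClassifier(max_depth=None, random_state=0).fit(Xtr, ytr)

for name, clf in [("stump", stump), ("deep", deep)]:
    print(name, round(clf.score(Xtr, ytr), 2), round(clf.score(Xte, yte), 2))
```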
• asked a question related to Statistical Pattern Recognition
Question
Dear all,
I have a long list of ordered factor-level combinations A = {a1…an}, …, E = {e1…ek}, which is non-exhaustive (i.e. not full-factorial), e.g.:
A   B  C   D  E
...
a1 b1 c1 d1 e1
a1 b1 c1 d1 e2
a2 b1 c1 d1 e1
...
Now I want to merge all entries which differ by only one factor at a time (e.g. E) into a pattern. In the example this should lead to:
A   B   C  D   E
...
a1 b1 c1 d1 {e1,e2}
a2 b1 c1 d1 e1
...
Now my problem is to do that for all factors. In the example, keeping A fixed, one cannot simply merge the two entries to:
{a1,a2} b1 c1 d1 {e1,e2}
since this implies that also the combination
a2 b1 c1 d1 e2
was part of the original factor combinations, which it wasn't.
My first try was to merge only entries where all levels of the fixed factor were present, and replace the corresponding pattern with a wildcard, e.g. for E = {e1,e2,e3}:
a3 b2 c4 e1
a3 b2 c4 e2
a3 b2 c4 e3
becomes
a3 b2 c4 *
since in this case I know that E is not important for the combination of factors A to C. But this approach is unsatisfactory, since it leaves a lot of entries unmerged (e.g. the example at the beginning would not be merged).
So, could someone point me in a direction where a solution to this problem might be found (e.g. graph/subset reduction; I have also thought of bioinformatics methods, treating the factor combinations as strings)?
Any help would be very welcome!
Greetings, David
Hello David et al.,
Seen as a theoretical formula, "how to reduce a list of factors…" could sound like building general types of computing as pattern processes, in a nearly neuronal way. But you introduce the question in a more analytical manner if we admit that "reducing a list" may not only be the economical part of the work but also a matter of finding a processing solution.
Practically, I remember LISP, a language designed for manipulating lists, which you could be interested in. Could a neural network process achieve the task as you wish?
You could also put your question to people interested in set operads related to combinatorics; see, for example, the book Set Operads in Combinatorics and Computer Science, by Miguel A. Mendez, SpringerBriefs in Mathematics, Springer.
But it also seems to me that your question, as a specific practical and direct problem, could progress greatly if it were better defined and explained more deeply for any further help.
Best wishes.
Jean-Yves Tallet
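As a concrete starting point, David's first approach (merge entries only when all levels of the fixed factor are present for an otherwise identical combination, then substitute a wildcard) can be sketched as follows; the tuple layout and the level sets here are illustrative assumptions, not taken from David's actual data:

```python
from itertools import groupby

def merge_wildcard(rows, col, levels):
    """Group rows that agree on every column except `col`; if a group
    covers all `levels` of that column, collapse it to one row with '*',
    otherwise keep the original rows unmerged."""
    key = lambda r: r[:col] + r[col + 1:]
    out = []
    for rest, grp in groupby(sorted(rows, key=key), key=key):
        grp = list(grp)
        if {r[col] for r in grp} == set(levels):
            out.append(rest[:col] + ('*',) + rest[col:])
        else:
            out.extend(grp)
    return out

rows = [('a3', 'b2', 'c4', 'e1'), ('a3', 'b2', 'c4', 'e2'),
        ('a3', 'b2', 'c4', 'e3'), ('a1', 'b1', 'c1', 'e1')]
merged = merge_wildcard(rows, 3, ['e1', 'e2', 'e3'])
```

Repeating this per factor still leaves the partial-merge cases David describes ({e1,e2} out of three levels) open; those would need a set-cover-style formulation rather than this simple grouping.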
• asked a question related to Statistical Pattern Recognition
Question
Is there any stopping criterion I should use to make the algorithm converge, rather than a fixed number of iterations?
• asked a question related to Statistical Pattern Recognition
Question
I am building predictive models in the fraud domain. Does anybody have experience with supervised learning to predict multiple offenders?
And what are the advantages and disadvantages of different ways of labelling (for example multiple offenders versus one-time offenders, or multiple offenders against the rest of the population)?
What are your experiences, and what are the benefits of the different ways of labelling?
Thanks in advance to everybody who can advise me from his/her experience.
Gerard Meester
Netherlands
Fraud is a very general term, and you need to be more specific about the type of fraud you are targeting. I would suggest the following reference:
If you look at the preview, it shows the fraud tree, i.e. the subcategories of fraud. Each of these types of fraud has particular characteristics that you can target. Some will call for outlier detection through clustering, while others will rely on the values within each feature vector for supervised learning.
• asked a question related to Statistical Pattern Recognition
Question
Hi. I am looking for tips or any sort of help on the topic "Power Transmission Alarm Pattern Recognition". A database of alarms and events collected by the ESKOM control centre is available. I am to use this database to find association rules that will enable the prevention of unwanted events in real time. Sequences of alarms are used to infer a particular event. I have attached one of the databases so you can see how it is structured.
Thank you Sayed, but could you give me a hint on how I can extract the sequential alarm patterns from this alarm data?
• asked a question related to Statistical Pattern Recognition
Question
Although several approaches have been developed for handling missing data, the majority are only suitable for restoring incomplete patterns that show random variation. When shift, trend, systematic or cyclic behavior exists alongside random variation, so that the order of the data also contains valuable information, what is the best approach to handling missing data?
Razieh -
It seems that your application may be a univariate time series for which you have no other data to support anything more complex. If your current value is "missing," then it sounds like you really need a forecast. And because your time series has no clearly established patterns, the simplest form of "exponential smoothing" may be as good as you can do, at least for now. It puts the strongest influence on the most recent data that you do have.
Jim
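Jim's suggestion can be sketched as follows; the series values and the smoothing constant alpha are placeholders, not taken from the question:

```python
def exp_smooth_forecast(series, alpha=0.3):
    """Simple exponential smoothing: the one-step-ahead forecast is a
    weighted average of the newest observation and the previous forecast,
    so the most recent data carry the most weight."""
    forecast = series[0]
    for x in series[1:]:
        forecast = alpha * x + (1 - alpha) * forecast
    return forecast
```

The returned value can stand in for the next (missing) observation; larger alpha reacts faster to recent values, smaller alpha smooths more.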
• asked a question related to Statistical Pattern Recognition
Question
Hello everyone!
I'm developing a pattern recognition algorithm for images. So far I have been using the MNIST database, but for some reason I need to switch to another database with higher resolution. It would be highly appreciated if someone could help me find one, or if anyone knows a trick for that, that would be great!
Are you searching for digit databases or general ones?
For the latter I think the ImageNet DB contains more than enough data. ;)
• asked a question related to Statistical Pattern Recognition
Question
Dear colleagues,
I am currently working on classifying certain objects which are (visually) represented as waves/curves. I am wondering whether there is a way to characterize the curves so as to be able to build classification rules which can then be used to classify the curves.
P.S. I have 3 pre-defined classes of curves.
There are several ways, some that you can use without using algorithms from machine learning/soft computing/etc. For example, you can use curvature (see attached graphic) and find the maximum curvature for curves within each of your classes and simply group curves based upon the similarity of the resulting values. Similarly, for a curve gamma we can obtain the value k of the curve at p=gamma(x) using the equation in the second attached image, if the curve is smooth you can obtain global curvature using the integral in the third attached image.
There are many other mathematical ways to classify curves, but I suspect you are using the word colloquially, making many of the mathematical approaches useless (i.e., you really mean a depiction of a line with one or more "bends" in it, not something like the Peano curve or Koch curve).
Also, mathematically curves are generally already classified. The simple classification schemes may be similar to what you used to form your initial 3 classes (e.g., parabolas, hyperbolas, ellipses, etc.).
Then there are statistical analyses of shape, such as flux graphs using AOFs (average outward flux) or other medial-representation-based approaches (e.g., flux-based skeletonization). I've attached a chapter that might have something useful for you.
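As a rough sketch of the curvature feature mentioned above, the curvature of a sampled plane curve can be estimated numerically; the sampling scheme and the use of the maximum curvature as a per-curve feature are assumptions for illustration:

```python
import numpy as np

def discrete_curvature(x, y):
    """Estimate curvature k = |x'y'' - y'x''| / (x'^2 + y'^2)^(3/2)
    at each point of a curve sampled at (x[i], y[i]) with uniform
    parameter spacing (the formula is invariant to that spacing)."""
    dx, dy = np.gradient(x), np.gradient(y)
    ddx, ddy = np.gradient(dx), np.gradient(dy)
    return np.abs(dx * ddy - dy * ddx) / (dx**2 + dy**2) ** 1.5

# Example: a circle of radius 2 has curvature 1/2 everywhere.
t = np.linspace(0, 2 * np.pi, 400)
k = discrete_curvature(2 * np.cos(t), 2 * np.sin(t))
```

The maximum (or mean) of k over each curve can then serve as a scalar feature for grouping curves into the three pre-defined classes.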
• asked a question related to Statistical Pattern Recognition
Question
I am studying seasonal changes in the abundance of a fish species along a disturbance gradient. I sampled three locations in four seasons. My sampling sites at each location were very heterogeneous and the overall data were overdispersed. I plan to analyze the data using a zero-inflated GLMM, considering LOCATION as a fixed factor and sampling site as a random factor. Should I also treat SEASON as a random factor (due to probable autocorrelation), or just nest it within LOCATION?
Thank you. I am actually interested in differences among seasons but the data are highly correlated across seasons and I am not sure about using season as a fixed effect.
• asked a question related to Statistical Pattern Recognition
Question
Suppose I have student data such as city, hobby, age, mother tongue, gender, and UG specialization.
Now I want to form groups of students who have something in common (for example a common hobby, the same mother tongue, the same city, etc.).
Can I use k-means clustering for this?
Note: I stored these fields as numeric values only.
Thanks Wei Chen, I got a solution using RStudio, which has a built-in k-means algorithm. Thanks for your expert advice; I will be in touch in the future.
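For what it's worth, a minimal NumPy sketch of plain k-means (the two-column toy data are made up). One caution for this particular data set: for purely categorical attributes such as hobby or mother tongue, encoding levels as arbitrary integers makes Euclidean distances fairly meaningless, so k-modes or one-hot encoding is usually a better fit than raw numeric codes:

```python
import numpy as np

def kmeans(X, k, iters=50):
    """Plain k-means: repeatedly assign each point to its nearest
    centroid, then move each centroid to the mean of its points.
    Naive initialization (first k points); use k-means++ in practice."""
    centers = X[:k].astype(float).copy()
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels, centers

X = np.array([[0., 0.], [0., 1.], [1., 0.],
              [10., 10.], [10., 11.], [11., 10.]])
labels, centers = kmeans(X, 2)
```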
• asked a question related to Statistical Pattern Recognition
Question
Repeated polygonal shapes or repeated colours are sources of visual patterns. Another important source of patterns is the presence of convex sets and convex hulls in digital images, especially in naturally or artificially camouflaged objects. A set A is convex provided the line segment connecting any two points of A is contained in A. A convex hull is the smallest convex set containing a set of points (see the attached image). Also, see the many convex sets in the natural camouflage of the dragon in the attached image and in
Convex sets have many applications in the study of digital images.   For example, convex sets are used in solving image recovery problems:
and in image restoration:
Convexity recognition is useful in object shape analysis in digital images:
Another important application of convexity is rooftop and building detection in aerial images:
Dear James Peters,
Yes, there are many applications of convexity, or convex sets, which sounds very similar to Christopher Alexander's positive space, one of the 15 structural properties he developed in The Nature of Order: http://en.wikipedia.org/wiki/The_Nature_of_Order
People tend to perceive things from this perspective of convexity in order to see identifiable pieces of a whole. This perspective (focusing on individuals), however, is somewhat contradictory to the holistic (or fractal) perspective. For example, drop a wine glass on the ground, and it will very likely break into many pieces. These pieces are actually a mixture of convex and concave ones. The same mixture appears in the natural cities we identified or detected using massive geographic information; see the related paper: http://arxiv.org/ftp/arxiv/papers/1501/1501.03046.pdf
Having said that, may I draw a tentative conclusion? The perspective of convexity is Euclidean, focusing on individuals and relying on human perception, whereas the fractal or holistic perspective is not necessarily convex only, but mixes convex and concave pieces. For example, the mathematical snowflake is concave in essence, yet it is of course made of many convex triangles.
• asked a question related to Statistical Pattern Recognition
Question
During my research work I came across a constructive demonstration that two symmetric matrices can always be simultaneously diagonalised, provided one is positive definite. I am talking about pages 31–33 of "Introduction to Statistical Pattern Recognition" by Keinosuke Fukunaga.
Now, it is well known that two matrices are simultaneously diagonalisable if and only if they commute [e.g. Horn & Johnson 1985, pp. 51–53].
This should imply that any positive-definite symmetric matrix commutes with any given symmetric matrix. This seems to me an unreasonably strong conclusion.
Also, since Fukunaga's method also works with Hermitian matrices, the same conclusion should hold even in this more general setting.
I seem to be missing something, can someone help me elaborate?
Dear Luca,
you were right to find the statement suspicious. Simultaneously diagonalizable matrices would indeed commute, and it is easy to see that this is not true in general, even if one of the matrices is assumed to be positive definite. (For example, take a diagonal 2x2 matrix with entries 1 and 2, and the 2x2 matrix with all four entries equal to 1.) So two symmetric matrices cannot be diagonalized simultaneously in general.
The book by Fukunaga you mention indeed does something different. Namely, given a positive definite matrix X and a symmetric matrix Y, the author finds a (non-orthogonal) invertible matrix A such that A^t X A and A^t Y A are both diagonal (so he uses the transpose rather than the inverse), and indeed A^t X A is the identity matrix. So this does not actually concern diagonalization of symmetric matrices but of bilinear forms. The result can be expressed invariantly as the fact that, given two symmetric bilinear forms, one of which is positive definite, there is a basis which is orthogonal for both forms. This is not a deep result; it can be deduced from standard linear algebra as follows: there is an orthonormal basis for the positive definite form. The other form is represented with respect to this basis by a symmetric matrix, and then the usual orthogonal diagonalization of symmetric matrices gives you an orthonormal basis for the first form with respect to which the second form is diagonal. (The book uses this argument converted to matrix form.)
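The construction described in Fukunaga's book (whiten with the positive definite form, then rotate with an orthogonal matrix) can be sketched numerically as follows:

```python
import numpy as np

def simultaneous_diagonalize(X, Y):
    """Given symmetric positive definite X and symmetric Y, return an
    invertible A with A.T @ X @ A = I and A.T @ Y @ A diagonal."""
    s, U = np.linalg.eigh(X)            # X = U diag(s) U.T, with s > 0
    W = U @ np.diag(s ** -0.5)          # whitening: W.T @ X @ W = I
    _, V = np.linalg.eigh(W.T @ Y @ W)  # orthogonal V diagonalizes whitened Y
    return W @ V
```

Note that A is not orthogonal in general, which is exactly why this does not contradict the commuting-matrices theorem: A^T, not A^{-1}, appears on the left.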
• asked a question related to Statistical Pattern Recognition
Question
I used PCA and a GA, but I only got a ranking of the features; I was unable to find the best features.
hello,
You should read about feature selection methods, such as filters, wrappers, and hybrid approaches.
I recommend these papers:
"A survey on feature selection methods"
Girish Chandrashekar, Ferat Sahin
Computers and Electrical Engineering 40 (2014) 16–28
and
"A novel feature selection approach for biomedical data classification"
Yonghong Peng , Zhiqing Wu, Jianmin Jiang
Journal of Biomedical Informatics 43 (2010) 15–23
Best Regards
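As a tiny illustration of a filter method (the absolute-correlation criterion here is just one common scoring choice, not taken from the papers above):

```python
import numpy as np

def filter_select(X, y, k):
    """Filter-style feature selection: score each column of X by its
    absolute Pearson correlation with the target y, keep the k best."""
    scores = np.array([abs(np.corrcoef(X[:, j], y)[0, 1])
                       for j in range(X.shape[1])])
    return np.argsort(scores)[::-1][:k]
```

Unlike a plain ranking, choosing k fixes a concrete feature subset; wrappers instead evaluate subsets with the classifier itself, which is slower but accounts for feature interactions.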
• asked a question related to Statistical Pattern Recognition
Question
The following data are available for mining and pattern recognition:
x coordinate of the cursor with system time
y coordinate of the cursor with system time
(system time sampled every 100 ms)
Each user's mouse-movement data are available for 30 minutes.
How can I identify mouse-movement patterns, if they exist, among these users?
The first decision is whether you are trying to find a pattern that you know something about (the supervised case), or to find a pattern without knowing anything about it (the unsupervised case).
I don't have much experience in the unsupervised task, which is what interests you probably.
I would start my search by combining terms like "time series" and "motif discovery".
A quick search gives these papers that seem like they might be what you're looking for:
I don't have enough experience with these types of tools or your type of data to give you a single recommendation.
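To make the "motif discovery" idea concrete, here is a brute-force sketch over one coordinate channel; the window length m and the use of Euclidean distance are assumptions, and real motif-discovery algorithms in the literature use much faster search than this O(n^2) scan:

```python
import numpy as np

def best_motif_pair(series, m):
    """Brute-force motif discovery: return the pair of non-overlapping
    length-m subsequences with the smallest Euclidean distance."""
    n = len(series) - m + 1
    windows = np.array([series[i:i + m] for i in range(n)])
    best, pair = np.inf, None
    for i in range(n):
        for j in range(i + m, n):  # j >= i + m enforces non-overlap
            d = np.linalg.norm(windows[i] - windows[j])
            if d < best:
                best, pair = d, (i, j)
    return pair, best
```

For cross-user comparison, motifs found in one user's trace can be searched for in the others' traces with the same distance measure.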
• asked a question related to Statistical Pattern Recognition
Question
Neger, Rietveld and Janse (2014; attached) recently found that perceptual and statistical learning may rely on the same mental mechanism. Now, fearless of stretching, if we relate that to the issue of symbol grounding in language comprehension (cf. modal/embodied vs. amodal/symbolic accounts), would the previous findings support the mixed proposals that language comprehension is both perceptual and statistical (e.g., Barsalou's Language and Situated Simulation; Louwerse's Symbol Interdependency)?
Thank you very much
Maye, J., Weiss, D. J., & Aslin, R. N. (2008). Statistical phonetic learning in infants: Facilitation and feature generalization. Developmental Science, 11(1), 122-134.
Keidel, J. L., Jenison, R. L., Kluender, K. R., & Seidenberg, M. S. (2007). Does grammar constrain statistical learning? Commentary on Bonatti, Pena, Nespor, and Mehler (2005). Psychological Science, 18(10), 922-923.
ten Cate, C., & Okanoya, K. (2012). Revisiting the syntactic abilities of non-human animals: natural vocalizations and artificial grammar learning. Philosophical Transactions of the Royal Society B: Biological Sciences, 367(1598), 1984-1994.
• asked a question related to Statistical Pattern Recognition
Question
I downloaded part of the connectome fMRI data related to working memory. I can visualize it with Connectome Workbench, but how can I get the fMRI signal with SPM8? There are so many files with the suffix .nii, and SPM8 does not recognize any of them.
Hi Osama, sorry for the slow reply. These files do indeed appear to be NIfTI-2 (the header size is 540, as described here http://nifti.nimh.nih.gov/pub/dist/doc/nifti2.h, which you can check with f=fopen('file'); fread(f,1,'int') in matlab).
SPM12b can read them, though it warns that "code 4 is not an option for units". It seems that the files are one-dimensional (i.e. not images, but perhaps individual time-series), with length 91282. E.g. the following works for me in SPM12b:
N = nifti('cope1.dtseries.nii')
N.dat.dim % [1 1 1 1 91282]
plot(squeeze(N.dat(1,1,1,1,:)))
I've checked that SPM12b is happy with one of the NIfTI-2 test images from here: http://nifti.nimh.nih.gov/pub/dist/data/nifti2/ and it does seem to be, so if you were not expecting these files to be 1D, then you might need to investigate further in the connectome workbench (I can't help you with that I'm afraid).
• asked a question related to Statistical Pattern Recognition
Question
I am working on pattern detection in dermatological images and I would like to know how to extract the patterns and match them.
I think the previous two references provided by @Christos P Loizou and @Lucia Ballerini are new and excellent to start with.
• asked a question related to Statistical Pattern Recognition
Question
What other features such as shape, texture and color can be suggested to classify (recognize) the three dimensional fragments in order to reassemble/reconstruct fragmented objects using their 3D digital images?
Thanks Hasan, but what do you mean by the camera position, orientation, and focal length? I think they will be the same for all images.
• asked a question related to Statistical Pattern Recognition
Question
Image processing in matlab
Or you can try this (computing the entropy of an n-by-n probability matrix p; note the accumulation and the minus sign, and the guard against log of zero):
f = 0;
for i = 1:n
    for j = 1:n
        if p(i,j) > 0
            f = f - p(i,j) * log2(p(i,j));
        end
    end
end
• asked a question related to Statistical Pattern Recognition
Question
Classification using SVM on an image dataset.
bwlabel is used on a binary image and returns the number of connected objects/components. Using a label you can access the corresponding object and compute its features, such as area, centroid, eccentricity, etc. (use regionprops to calculate these), which can help in classification. Regarding the label vector matrix, I did not understand what you meant by it.
• asked a question related to Statistical Pattern Recognition
Question
As bwlabel() works for labeling binary image how can we do the same for greyscale image in matlab?
As the people earlier have correctly pointed out, using k-means you can "segment" the image into K regions. However, you need to decide on K manually so that you have K distinct labels.
If you want to label the regions of a gray-image similar to bwlabel, you need to have a criteria of similarity between adjacent pixel values to have a region-level criteria. Please go through the function "imregionalmax" which finds a region level maxima from a gray-scale image, and works by the neighborhood relationship exactly similar to bwlabel.
You can also achieve the same through any region-growing segmentation algorithm, or even watershed segmentation algorithm.
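A rough sketch of the region-growing idea for a bwlabel-like grayscale labeling; the 4-connectivity and the seed-difference tolerance tol are assumed choices, and a comparison against the running region mean would be a common variant:

```python
import numpy as np
from collections import deque

def graylabel(img, tol=10):
    """bwlabel-style labeling for a grayscale image: grow 4-connected
    regions whose pixels differ from the region's seed by at most tol."""
    rows, cols = img.shape
    labels = np.zeros((rows, cols), dtype=int)
    n = 0
    for sr in range(rows):
        for sc in range(cols):
            if labels[sr, sc]:
                continue
            n += 1
            labels[sr, sc] = n
            q = deque([(sr, sc)])
            while q:  # breadth-first flood fill from the seed
                r, c = q.popleft()
                for nr, nc in ((r+1, c), (r-1, c), (r, c+1), (r, c-1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and labels[nr, nc] == 0
                            and abs(int(img[nr, nc]) - int(img[sr, sc])) <= tol):
                        labels[nr, nc] = n
                        q.append((nr, nc))
    return labels, n
```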
• asked a question related to Statistical Pattern Recognition
Question
How would I specify that group 1 = apple, group 2 = orange, and group 3 = banana?
Thank you
• asked a question related to Statistical Pattern Recognition
Question
I have three 30x20 grids, and I am wondering if the pixel values are similar amongst those grids...
Late answer, but perhaps of interest for later readers; the similarity metric proposed by Warren et al. (Warren, D. L., Glor, R. E., & Turelli, M. 2008. Environmental Niche Equivalency Versus Conservatism: Quantitative Approaches to Niche Evolution. Evolution 62: 2868–2883.). It seems similar to the before mentioned root mean square deviation and can be used as a relative measure of similarity of the grids. To test for significance, you could perhaps create confidence intervals around the similarity index using bootstrapping.
In R you can calculate the Warren similarity index using the Istat function in the SDMTools package or the niche.overlap function in the phyloclim package. If you use Maxent for spatial distribution modeling, the hypothesis.testing function in the phyloclim package may be of interest.
In GRASS GIS I wrote an extension that calculates Warren's similarity: http://grasswiki.osgeo.org/wiki/AddOns/GRASS_6#r.niche.similarity
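For reference, Warren et al.'s I statistic is just 1 minus half the squared Hellinger distance between the two grids after normalizing each to sum to one; a sketch (assuming non-negative grid values):

```python
import numpy as np

def warren_I(grid1, grid2):
    """Warren et al. (2008) similarity: I = 1 - H^2 / 2, where H is the
    Hellinger distance between the grids normalized to sum to 1.
    Assumes non-negative cell values; I = 1 means identical grids."""
    p = grid1 / grid1.sum()
    q = grid2 / grid2.sum()
    H2 = np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)
    return 1 - H2 / 2
```

Bootstrapping the cells (or, for model outputs, the underlying data) then gives a confidence interval for I, as suggested above.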
• asked a question related to Statistical Pattern Recognition
Question
Suppose we have two linearly separable classes. Do we always get a correct line (100% correct classification) with Fisher's discriminant for every linearly separable data set? How about SVM? And when the two classes are not linearly separable, is the Fisher answer optimal? How about SVM? I mean, one line may separate the two classes with 1 misclassification and another line with 2 misclassifications.
Well, theoretically it does find the optimal vector in the feature space such that the two data sets are well separated while "maintaining a small variance within each group."
Is the line you are talking about the separating line, or the hyperplane onto which the projections are made? The hyperplane will definitely be optimal.
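For concreteness, the Fisher direction w = Sw^{-1}(m1 - m2) can be computed as below (the toy data are made up). Note that Fisher's criterion optimizes class separation relative to within-class variance, not the margin, so unlike a hard-margin SVM it is not guaranteed to yield zero training errors on every linearly separable data set:

```python
import numpy as np

def fisher_direction(X1, X2):
    """Fisher's linear discriminant direction w = Sw^{-1} (m1 - m2),
    with Sw the pooled within-class scatter matrix (samples are rows)."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    Sw = (np.cov(X1, rowvar=False) * (len(X1) - 1)
          + np.cov(X2, rowvar=False) * (len(X2) - 1))
    return np.linalg.solve(Sw, m1 - m2)

X1 = np.array([[4., 0.], [5., 1.], [6., 0.], [5., -1.]])
X2 = np.array([[0., 0.], [1., 1.], [2., 0.], [1., -1.]])
w = fisher_direction(X1, X2)
```

Classifying then means projecting a sample onto w and thresholding, e.g. at the midpoint of the projected class means.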
• asked a question related to Statistical Pattern Recognition
Question
I'm preparing for an exam and looking for exercises with solutions on pattern recognition and machine learning, especially in the areas of: SVM, Bayes, decision trees, overfitting and underfitting, VC dimension, PCA, k-NN, k-means, classifier combination, reinforcement learning, MAP and ML.
Hi, try also looking through Numerical Recipes in C++. Although the algorithms are in C++, the background theory and examples are given in general. http://www2.units.it/ipl/students_area/imm2/files/Numerical_Recipes.pdf
• asked a question related to Statistical Pattern Recognition
Question
Genetic information gathered from autistic patients is transformed into multidimensional data. This is huge and requires machine learning techniques to create an automated autism detection system. I wonder if there are publications along this track.
Thank you
• asked a question related to Statistical Pattern Recognition
Question
Assume we fit a GMM on a training set that contains only normal samples. I am wondering if we can use the probability density function (which is a linear combination of component densities) to decide about new (test) samples. In other words, should our final decision about a sample being normal or abnormal be based on the probability density function, the likelihood, or the posterior?
As Eric pointed out, the idea of using a GMM for anomaly detection, one-class classification (OCC), etc. is not new.
Usually, one uses the estimated probability density p(x)=sum_i p(i)*p(x|i), i.e. the linear combination of cluster likelihoods you mentioned in your question above.
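A sketch of that decision rule for a 1-D mixture: evaluate the mixture density of a new sample under the model fitted on normal data and flag it as abnormal when the log-density falls below a threshold (the mixture parameters and the threshold here are illustrative; in practice the threshold is often set from a quantile of the training scores):

```python
import numpy as np

def gmm_logpdf(x, weights, means, variances):
    """Log density of a scalar x under a 1-D Gaussian mixture,
    log sum_i w_i N(x; mu_i, var_i), via log-sum-exp for stability."""
    comp = (np.log(weights)
            - 0.5 * np.log(2 * np.pi * variances)
            - 0.5 * (x - means) ** 2 / variances)
    m = comp.max()
    return m + np.log(np.exp(comp - m).sum())

def is_abnormal(x, weights, means, variances, threshold):
    """Abnormal when the density under the GMM fitted on normal
    data is below the chosen threshold."""
    return gmm_logpdf(x, weights, means, variances) < threshold
```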
• asked a question related to Statistical Pattern Recognition
Question
I thought that as the number of features increases, the performance increases, doesn't it?
For example, if we have skin, eye and hair color, face recognition should perform better than if we just have eye color.
That's why I am so confused about GMMs, where the probability increases even as the size of the input decreases. Consider the following MATLAB code:
train=[1 2 1 2 1 2 100 101 102 99 100 101 1000 1001 999 1003];
No_of_Iterations=10;
No_of_Clusters=3;
[mm,vv,ww]=gaussmix(train,[],No_of_Iterations,No_of_Clusters);
test1=[1 1 1 2 2 2 100 100 100 101 1000 1000 1000];
test2=[1 1 2 2 100 99 1000 999];
test3=[1 100 1000];
[lp,rp,kh,kp]=gaussmixp(test1,mm,vv,ww);
sum(lp)
[lp,rp,kh,kp]=gaussmixp(test2,mm,vv,ww);
sum(lp)
[lp,rp,kh,kp]=gaussmixp(test3,mm,vv,ww);
sum(lp)
The results are as follows:
ans =
-8.0912e+05
ans =
-8.1782e+05
ans =
-5.0381e+05
Why does the probability increase as the test set gets smaller? I expected that with more data the performance would increase, wouldn't it?
.
A small vocabulary problem here (maybe):
In your example, the number of "features" (that is, the dimensionality of the representation space of your data) is constant and equal to one for all three test sets. What changes is the number of "instances" in the test sets, that is, the number of elements ("cardinality") of each test set.
Now, I am not familiar with your MATLAB code, but I assume that sum(lp) is the log-likelihood, that is, the log probability of observing your test set given the model you built on your training set.
Indeed, under the independence hypothesis for the observations in your test set, the probability is just the product of the probabilities of observing each sample (hence the sum(lp) over log-probabilities); since a probability is always less than one, the more elements you have in your test set, the smaller the product of probabilities is.
As simple as that, assuming I understood your question correctly!
.
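A tiny numeric illustration of the point above, using a single Gaussian instead of a mixture: every extra test sample adds another log-probability term, so the summed log-likelihood shrinks as the test set grows (here each term is negative, since the peak density of N(0,1) is below 1):

```python
import numpy as np

def total_loglik(samples, mu=0.0, var=1.0):
    """Summed log-likelihood of i.i.d. samples under one Gaussian:
    sum_i [ -0.5*log(2*pi*var) - 0.5*(x_i - mu)^2 / var ]."""
    x = np.asarray(samples, dtype=float)
    return np.sum(-0.5 * np.log(2 * np.pi * var)
                  - 0.5 * (x - mu) ** 2 / var)
```

This is why likelihoods of test sets of different sizes are not directly comparable; per-sample (normalized) log-likelihoods are.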
• asked a question related to Statistical Pattern Recognition
Question