
Categorical Algebra - Science topic

Questions related to Categorical Algebra
  • asked a question related to Categorical Algebra
Question
4 answers
I am trying to find the correlation/association between two categorical vectors. I tried using the chi-square statistic (Chi2), but it does not have a predefined upper limit. That is when I found Tschuprow's T and Cramér's V. Now I have several questions in this regard, and I would appreciate any help:
  1. Are they good informative indicators for the categorical association?
  2. Are they reliant on p-values and degrees of freedom similar to Chi2?
  3. Are the scores skewed? I read that a Cramér's V above 0.25 indicates a very strong relation, which is not the case with other metrics
  4. Do they have any preconditions that must be met before they can be applied?
  5. Can you recommend any other metrics that measure the data association/correlation/dependency for categorical/nominal values?
  6. I understand that these metrics rely on contingency table counts. Are there any metrics that use a different method?
I am looking for a metric that is as usable and informative as the Pearson or Spearman correlation and whose score is bounded between 0 and 1.
Relevant answer
Answer
Cramér's V ranges from 0 to 1. You can play with some toy data to see how it reacts in different cases. The interpretation of a "large" effect changes depending on how many categories there are (in the dimension with the smaller number of categories). These interpretations are addressed in Cohen (1988), Statistical Power Analysis for the Behavioral Sciences, 2nd edition. Some other effect size statistics to consider are Tschuprow's T and Goodman and Kruskal's lambda.
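For illustration (this sketch is mine, not part of the answer above), Cramér's V can be computed in base R from the chi-square statistic of the contingency table; the function and variable names are just placeholders:
cramers_v <- function(x, y) {
  # Contingency table of the two categorical vectors
  tab <- table(x, y)
  # Pearson chi-square statistic without continuity correction
  chi2 <- suppressWarnings(chisq.test(tab, correct = FALSE))$statistic
  n <- sum(tab)                   # total number of observations
  k <- min(nrow(tab), ncol(tab))  # smaller number of categories
  as.numeric(sqrt(chi2 / (n * (k - 1))))
}
# Example: cramers_v(data$treatment, data$outcome) returns a value in [0, 1]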
  • asked a question related to Categorical Algebra
Question
6 answers
Since J. von Neumann, physicists have stuck to categories of Hilbert spaces to model quantum phenomena. Categories of modules over a ring might offer an alternative if we add axioms (e.g., the existence of particular limits or colimits) that reflect the experimental requirements.
A very general setting for the purpose would be abelian categories. Have there been attempts to make use of them?
Relevant answer
Answer
Of course, it is well known that the category of R-modules is suitable for this setting. Let me add one more reference:
"Continuous Geometry" by J von Neumann, Oxford University Press, 1960.
  • asked a question related to Categorical Algebra
Question
5 answers
I need to perform a winsorization on a big data set. I found an add-in for Excel that does this automatically, but unfortunately it only offers options in percentiles, whereas I was looking for a winsorization based on standard deviations. Does anyone know how to perform this in an automated way?
Relevant answer
Answer
Linda,
You can do it using Stata's winsor2 command. I recommend that you check the code. Doing this in Excel would be much more cumbersome than in Stata.
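If moving outside Excel or Stata is an option, here is a minimal sketch of an SD-based winsorization in R (my addition, not part of the answer; the function name and the default cutoff of k standard deviations are illustrative):
winsorize_sd <- function(x, k = 3) {
  m <- mean(x, na.rm = TRUE)
  s <- sd(x, na.rm = TRUE)
  # Clip values more than k standard deviations away from the mean
  pmin(pmax(x, m - k * s), m + k * s)
}
# Example: winsorize_sd(c(1, 2, 3, 100), k = 2)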
  • asked a question related to Categorical Algebra
Question
7 answers
Hello all, I have a question. I have a categorical response variable (accuracy on a two-alternative forced choice task), and I have divided my participants into quartiles based on their reaction time during the intertrial interval.
I followed the first rule of statistics I was taught: I made a picture, which I'm sharing in the link.
Now, I want to assess the probability of getting these different mean accuracy measures for the different groups. I've considered a few ways to do this, but I figured, perhaps a chi-square calculation would be simplest.
I obtained the expected count for the cells from the average accuracy across the entire condition, i.e. ~85%. Then I divided my observations into 4 parts, corresponding to the number of observations in each quartile of the group, i.e. 63.25. I multiplied that number by the average accuracy to obtain the expected correct count for each cell of my table, and took the observed count for the four cells from the sum of correct observations. I then followed the rest of the procedure for calculating my chi-square statistic, which turned out to be pretty low (0.4604651).
So, if my assumptions are OK, it seems safe to retain my null hypothesis that the time spent on the intertrial interval didn't make much of a difference in terms of accuracy.
Can someone check my assumptions here? As I said, I'm fairly new to stats, and I'm still learning.
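For reference (this sketch is mine, not part of the question or the answers), the calculation described above amounts to a chi-square test of independence on a quartile-by-outcome table of counts. In R, with made-up counts standing in for the real data, it would look like:
# Rows = RT quartiles, columns = correct / incorrect counts (invented numbers)
counts <- matrix(c(54,  9,
                   55,  8,
                   53, 10,
                   52, 12),
                 ncol = 2, byrow = TRUE)
# Expected counts come from the overall accuracy and the per-quartile n,
# as described in the question
chisq.test(counts)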
Relevant answer
Answer
Hi Daniel, 
I agree with both Jochen and Luke that you are losing quite a lot of information with your approach. Now, I can think of two paths you could take to analyze your data, depending on the effort you want to put in:
1) The easiest is just to fit a generalized linear mixed model on your full data; Accuracy would be the DV, the log reaction time (and any experimental condition you may have) the IVs, and the subject would be a random effect.
In R, it would look somewhat like this: 
library(lme4)
fit <- glmer(Accuracy ~ I(log(RT)) + ExpCondition + (1 | ID), data = your_data, family = "binomial")
I suggest using the log RT to make the RT distribution somewhat more normal. Mind you, it will still probably be non-normal.
2) The trickier path would be fitting a hierarchical drift-diffusion model. It's a model that jointly analyzes RTs and accuracy, taking into account their interdependence and their non-normal distributions, and it's specifically tailored to two-alternative forced-choice tasks. You would expect, I think, a difference between experimental conditions in the alpha parameter.
It's much trickier to implement, but I think it's the perfect fit for your experimental design. So, if you have time to dedicate to the project, check it out. The article I'm linking would be a good starting point, I think. 
  • asked a question related to Categorical Algebra
Question
3 answers
Hi, I'm looking for references to research papers specifically from the field of engineering.
Relevant answer
Answer
KR-20 doesn't offer useful insight into score validity, though it is true that the maximum criterion-related validity coefficient for a given measure is equal to the square root of its estimated internal consistency reliability (given by KR-20).
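Written out (my addition, following the classical correction for attenuation), that ceiling is:
r_{xy} \le \sqrt{r_{xx}\, r_{yy}}, \qquad \text{so } \max r_{xy} = \sqrt{r_{xx}} \text{ when the criterion is perfectly reliable } (r_{yy} = 1).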
While the Rasch model suggests that item difficulty is the only specific aspect of items that is important to consider (which is why it is characterized as a one-parameter item response theory model), KR-20 and classical measurement models presume that items are fully interchangeable, and are therefore analogous to a zero-parameter model!  IRT models offer many advantages over classical models, including: (1) ability to check how well the model functions with a given data set (model-data fit); (2) estimates of item parameters that are theoretically independent of the sample of examinees used; (3) estimates of examinee proficiency that are theoretically independent of the set of items used; (4) simplifying many technical issues, like equating; and (5) calibrating both stimuli (items) and objects (examinees) onto a common scale. Moreover, Rasch model parameter estimates tend to converge quickly and accurately even with modest sample sizes.  For these reasons, IRT models have rightfully gained popularity since their introduction over 60 years ago. 
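For reference (my addition, not part of the answer), the Rasch (one-parameter) model mentioned above gives the probability that examinee i with proficiency \theta_i answers item j of difficulty b_j correctly as:
P(X_{ij} = 1 \mid \theta_i, b_j) = \frac{\exp(\theta_i - b_j)}{1 + \exp(\theta_i - b_j)}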
But, score validity is a broad arena, and neither classical reliability estimates nor IRT analysis alone will address the concerns associated with validity.