Science topic
Probability - Science topic
Probability is the study of chance processes or the relative frequency characterizing a chance process.
Questions related to Probability
Suppose A is a set measurable in the Caratheodory sense such for n in the integers, A is a subset of Rn, and function f:A->R
After reading the preliminary definitions in section 1.2 of the attachment where, e.g., a pre-structure is a sequence of sets whose union equals A and each term of the sequence has a positive uniform probability measure; how do we answer the following question in section 2?
Does there exist a unique extension (or method constructively defining a unique extension) of the expected value of f when the value’s finite, using the uniform probability measure on sets measurable in the Caratheodory sense, such we replace f with infinite or undefined expected values with f defined on a chosen pre-structure depending on A where:
- The expected value of f on each term of the pre-structure is finite
- The pre-structure converges uniformly to A
- The pre-structure converges uniformly to A at a linear or superlinear rate to that of other non-equivalent pre-structures of A which satisfies 1. and 2.
- The generalized expected value of f on the pre-structure (an extension of def. 3 to answer the full question) satisfies 1., 2., and 3. and is unique & finite.
- A choice function is defined that chooses a pre-structure from A that satisfies 1., 2., 3., and 4. for the largest possible subset of RA.
- If there is more than one choice function that satisfies 1., 2., 3., 4. and 5., we choose the choice function with the "simplest form", meaning for a general pre-structure of A (see def. 2), when each choice function is fully expanded, we take the choice function with the fewest variables/numbers (excluding those with quantifiers).
How do we answer this question?
(See sections 3.1 & 3.3 in the attachment for an idea of what an answer would look like)
Edit: Made changes to section 3.5 (b) since it was nearly impossible to read. Hopefully, the new version is much easier to process.
Hello!
I have a few questions about a TDDFT calculation that I ran ( # td b3lyp/6-31g(d,p) scrf=(iefpcm,solvent=chloroform) guess=mix) and when I calculation the % probability of some of the excitation states I am getting >100%. What I remember from statistics is that we cannot actually have >100% probability so I am trying to figure out why I have that occurring in my data.
I calculated %probability by 2*(coefficient^2). I have included one of data's oscillator strength information below.
"Excitation energies and oscillator strengths:
Excited State 1: 2.047-A 0.5492 eV 2257.37 nm f=0.1453 <S**2>=0.797
339B -> 341B 0.20758 (8.62%)
340B -> 341B 0.97366 (189.6%)
This state for optimization and/or second-order correction.
Total Energy, E(TD-HF/TD-DFT) = -6185.76906590
Copying the excited state density for this state as the 1-particle RhoCI density.
Excited State 2: 2.048-A 0.6312 eV 1964.25 nm f=0.0730 <S**2>=0.798
339B -> 341B 0.97645 (190.69%)
340B -> 341B -0.20706 (8.57%)
Excited State 3: 2.037-A 0.7499 eV 1653.42 nm f=0.0000 <S**2>=0.787
331B -> 341B 0.98349 (193.45%)
SavETr: write IOETrn= 770 NScale= 10 NData= 16 NLR=1 NState= 3 LETran= 64.
Hyperfine terms turned off by default for NAtoms > 100."
The other three questions I have are:
- what the ####-A means (in bold above) as I have some calculations with various numbers-A and others that have singlet-A. (ZnbisPEB file)
- I obtained a negative wavelength what should I do? I have already read a question on here about something similar, but the only suggestion was to remove the +, which I do not have in my initial gjf file. Should I solve for more states or completely eliminate (d,p)? (Trimer file)
- In another calculation I obtained a negative oscillator strength which I know from some web searches is not theoretically possible and indicates that there is a lower energy state (is that correct?) - how would I fix that? I have included it below, the same basis set as above is used. (1ZnTrimer file)
"Excitation energies and oscillator strengths:
Excited State 1: 4.010-A -0.2239 eV -5538.35 nm f=-0.0004 <S**2>=3.771"
Any clarification would be super helpful. I have also included the out files for the three compounds I am talking about.
Thank you so much!
Could any expert try to examine our novel approach for multi-objective optimization?
The brand new approch was entitled "Probability - based multi - objective optimization for material selection", and published by Springer available at https://link.springer.com/book/9789811933509,
DOI: 10.1007/978-981-19-3351-6.
P(y)=Integral( P(x|y)*P(y)*dx)
function is above if I didnt write wrongly. P(x|y) is conditional probability and it is known but P(y) is not known, thanks.
there may be an iterative solution but is there any analytical solution.
Good day! The question is really complex since CRISPR do not have any exact sequence - so the question is the probability of generation of 2 repeat units, each of 23-55 bp and having a short palindromic sequence within and maximum mismatch of 20%, interspersed with a spacer sequence that in 0.6-2.5 of repeat size and that doesn't match to left and right flank of the whole sequence, in a random sequence.
My lab recently got donated a 5500xl Genetic Analyzer from Applied Biosystems. However they are discontinuing the official reagents for this system come Dec 31, 2017; which is probably why it was given for free.
So I am wondering if anyone can offer help to get this machine running on generic reagents, or any tips/hints/advice; or even if it's worth the effort.
Basically it would be nice to get it sequencing, but if that can't be done are there any salvageable parts (for instance I know it has a high def microscope and a precise positioning system) ?
Here is a link to the machine we have:
We have all the accessory machines that go with it.
Thanks.
Suppose that we have a two-component series system, what is the probability of failure of two components at the same time?
*** both components' failure times are continuous random variables,
*** Is it important that they follow the same distribution or the different ones or the same distribution with different parameters?
In more detail, which conditions should be held that if X is a continuous random, then f(X) is also a continuous RV?
More specially, if X and Y are two continuous RVs, is X-Y a continuous RV?
Bests
If for example the position of an electron in a one-dimensional box is measured at A (give and take the uncertainty), then the probability of detecting the particle at any position B at a classical distance from A becomes zero instantaneously.
In other words, the "probability information" appears to be communicated from A to B faster than light.
The underlying argument would be virtually the same as in EPR. The question might be generalized as follows: as the probability of detecting a particle within an arbitrarily small interval is not arbitrarily small, this means that quantum mechanics must be incomplete.
Yet another formulation: are the collapse of the wave function and quantum entanglement two manifestations of the same principle?
It should be relatively easy to devise a Bell-like theorem and experiment to verify "spooky action" in the collapse of the wave function across a given classical interval.
I want to draw a graph between predicted probabilities vs observed probabilities. For predicted probabilities I use this “R” code (see below). Is this code ok or not ?.
Could any tell me, how can I get the observed probabilities and draw a graph between predicted and observed probability.
analysis10<-glm(Response~ Strain + Temp + Time + Conc.Log10
+ Strain:Conc.Log1+ Temp:Time
,family=binomial(link=logit),data=df)
predicted_probs = data.frame(probs = predict(analysis10, type="response"))
I have attached that data file
I want to see the distribution of an endogenous protein(hereafter as A), and I followed the protocol from Axis-Shield (http://www.axis-shield-density-gradient-media.com/S21.pdf). In order to gain stronger signals, I tried small ultracentrifuge tube (SW60, 4ml).
In this protocol, Golgi is enrich in #1~3, ER is #9~12. But in my experiments,the enrichment of ER (Marker using Calnexin) is usually failed (#3~12), while Golgi (marker using GM130) is good (#1~3).
Here are some questions :
1. the amount of protein loaded on gradient: Should it be considered? I mean, is the capacity of the gradient need to be think out? does it effect the fraction efficiency?
2. Is it necessary to use large tube (12ml)? Previously, to gain stronger signals of A, I switched to smaller tube (4ml). I have searched many papers, and some of them use small tube for fractionation. (Probably they use less protein for loading)?
Thanks a lot for answering

Why the Chi-square cannot be less than or equal to 1 ?
I have camera traps data which was deployed at several sites (islands). The data consist of only N (abundance; independent image) of each species at their respected islands. No occupancy modelling could be run as I do not have the habitat data. Is it possible to calculate the occupancy and probability without the temporal data/repeated sampling (the week the species was detected)? Or would calculating the Naive Occupancy would do? Furthermore, does Occupancy and Probability has a range of high and low?
probability and fuzzy numbers are ranged between 0 to 1. both are explaned in similar manner. what is the crisp difference between these terms?
Hi everyone,
In engineering design, there are usually only a few data points or low order moments, so it is meaningful to fit a relatively accurate probability density function to guide engineering design. What are the methods of fitting probability density functions through small amounts of data or low order statistical moments?
Best regards
Tao Wang
I Will be more than happy if somebody help me in this case. Does it has an specific function in R? or we should utilize quantile -copula methods...? or other???
Imagine there is a surface, with points randomly spread all over it. We know the surface area S, and the number of points N, therefore we also know the point density "p".
If I blindly draw a square/rectangle (area A) over such surface, what is the probability it'll encompass at least one of those points?
P.s.: I need to solve this "puzzle" as part of a random-walk problem, where a "searcher" looks for targets in a 2D space. I'll use it to calculate the probability the searcher has of finding a target at each one of his steps.
Thank you!
Dear colleagues.
In the following question i try to extend the concept of characters in group theory to a wilder class of functions. A character on a group G is a group homomorphism $\phi:G \to S^1$.
For every character $\phi=X+iY$ on a group $G$, we have $Cov(X.Y)=0$.
This is a motivation to consider all $\phi=X+iY: G\to S^1$ with $Cov(X,Y)=0$.
Please see this post:
So this question can be a starting point to translate some concepts in geometric group theory or theory of bamenability of groups in terms of notations and terminologies in statistics and probability theory.
Do you have any ideas, suggestions or comments?
Thank you
For example, dirchlet and multinomial distribution were conjugated. We want to know the probability of variable A occurs with B. Therefore, we train two probability model with enough samples and computational power. First model based on dirchlet distribution. Second model based on multinomial distributions. When we infer the parameter of dirchlet-multinomial and multinomial distributions, will the accuracy of models be different?
I need help with Queuing theory, easy explanation to M/M/C?
What are the parameters of the M/M/C queuing model?
Hi, everyone
In relation with the statistical power analysis, the relationship between effect size and sample size has crucial aspects, which bring me to a point that, I think, most of the time, this sample size decision makes me feel confusing. Let me ask something about it! I've been working on rodents, and as far as I know, a prior power analysis based on an effect size estimate is very useful in deciding of sample size. When it comes to experimental animal studies, providing the animal refinement is a must for researchers, therefore it would be highly anticipated for those researchers to reduce the number of animals for each group, just to a level which can give adequate precision for refraining from type-2 error. If effect size obtained from previous studies prior to your study, then it's much easier to estimate. However, most of the papers don't provide any useful information neither on means and standard deviations nor on effect sizes. Thus it makes it harder to make an estimate without a plot study.
So, in my case, when taken into account the effect size which I've calculated using previous similar studies, sample size per group (4 groups, total sample size = 40 ) should be around 10 for statistical power (0.80). In this case, what do you suggest about the robustness of checking residuals or visual assessments using Q-Q plots or other approaches when the sample size is small (<10) ?
Kind regards,
I suspect this is a well-worn topic in science education and psychology, but these are fields I don't know well. I'd like a source or two to support my sense that probability/statistics are hard for people to understand and correctly interpret because they defy "the way our minds work" (to put it crudely). Any suggestions?
I have a large set of sampled data. How can I plot a normal pdf from the available data set?
In R-studio, there are many commands of Gumbel package. Arguments are also different.
I`m asking about the alpha parameter of the Copula which must be greater than 1. If this is the one used to plot the probability paper, how can I choose the value of alpha?
I am considering to distribute N-kinds of different parts among M-different countries and I wan to know the "most probable" pattern of distribution. My question is in fact ambiguous, because I am not very sure how I can distinguish types or patterns.
Let me give an example. If I were to distribute 3 kinds of parts to 3 countries, the set of all distribution is given by a set
{aaa, aab, aac, aba, abb, abc aca, acb, acc, baa, bab, bac, bba, bbb, bbc, bca, bcb, bcc, caa, cab, cac, cba, cbb, cbc, cca, ccb, ccc}.
The number of elements is of course 33 = 27. I may distinguish three types of patterns:
(1) One country receives all parts:
aaa, bbb, ccc 3 cases
(2) One country receives 2 parts and another country receives 1 part:
aab, aac, aba, abb, aca, acc, baa, bab, bba, bbc, bcb, caa, cac, cbb, cbc, cca, ccb 17 cases
(3) Each county rceives one part respectively:
abc, acb, bac, bca, cab, cba 6 cases
These types may correspond to a partition of integer 3 with the condition that (a) number of summands must not exceed 3 (in general M). In fact, 3 have three partitions:
3, 2+1, 1+1+1
In the above case of 3×3, the number of types was the number of partitions of 3 (which is often noted p(n)). But I have to consider the case when M is smaller than N.
If I am right, the number of "different types" of distributions is the number of partitions of N with the number of summands less than M+1. Let us denote it as
p*(N, M) = p( N | the number of summands must not exceed M. )
N.B. * is added in order to avoid confusion with p(N, M), wwhich is the number of partitions with summands smaller than M+1.
Now, my question is the following:
Which type (a partition among p*(N, M)) has the greatest number of distributions?
Are there any results already known? If so, would you kindly teach me a paper or a book that explains the results and how to approach to the question?
A typical case that I want to know is N = 100, M = 10. In this simple case, is it most probable that each country receives 10 parts? But, I am also interested to cases when M and N are small, for example when M and N is less than 10.
Do we have a mathematical formula to compute the p-value of an observation from the Dirichlet distribution in exact sense at https://en.wikipedia.org/wiki/Exact_test?
I have a data which consists of an excess of zero counts. The independent variables are number of tree, diameter at breast height and basal area, and the dependent variable (predictors) is number of recruits (with many zero counts).
So, I want to use Zero-inflated negative binomial model and Hurdle negative binomial model to analyze. My problem is I do not know the code of these models in R package.
How can I calculate and report degrees of freedom for repeated mesure ANOVA?
I have 48 observations N=48 and 2 factors of 3(P) and 8(LA) levels.
I calculate degrees of freedom as follows:
dF P = a-1= 2
df LA = b-1= 7
df LA*P =(a-1)(b-1)= 14
Error dF P = (a-1) (N-1) = 94
Error dF LA = (b-1) (N-1) = 329
Error dF P*LA = (a-1)(b-1)(N-1) = 658
My JASP analysis gave me these results:
Within Subjects Effects
Cases Sum of Squares df Mean Square F p η²
P 1.927 2 0.964 33.9 < .001 0.120
P*LA 8.450 14 0.604 21.2 < .001 0.528
Residuals 0.454 16 0.028
Can I write P : F(2,14)= 33.9
and P*LA: F(14, 658) =21.2 ???
Or is it P: F(2, 16)=33.9
P*LA: F(14, 16) =21.2 ???
Thanks to anyone who would like to answer
Please consider a set of pairs of probability measures (P, Q) with given means (m_P, m_Q) and variances (v_P, v_Q).
For the relative entropy (KL-divergence) and the chi-square divergence, a pair of probability measures defined on the common two-element set (u_1, u_2) attains the lower bound.
Regarding general f-divergence, what is the condition of f such that a pair of probability measures defined on the common two-element set attains the lower bound ?
Intuitively, I think that the divergence between localized probability measures seems to be smaller.
Thank you for taking your time.
How to prove or where to find the integral inequality (3.3) involving the Laplace transforms, as showed in the pictures here? Is the integral inequality (3.3) valid for some general function $h(t)$ which is increasing and non-negative on the right semi-axis? Is the integral inequality (3.3) a special case of some general inequality? I have proved that the special function $h(t)$ has some properties in Lemma 2.2, but I haven't prove the integral inequality (3.3) yet. Wish you help me prove (3.3) for the special function $h(t)$ in Lemma 2.2 in the pictures.


Hello dear,
A great opportunity for statisticians and mathematicians around the World,
Join the Bernoulli Society and IMS for the first-ever Bernoulli-IMS One World Symposium 2020, August 24-28, 2020! The meeting will be virtual with many new experimental features. Participation at the symposium is free, registration is mandatory to get the passwords for the zoom sessions.
Good luck dear colleagues
In Garman's inventory model, buying order and selling order are poisson process with order size = 1. Buying price and selling price are denoted by pb and ps, that is, the market maker gets pb when she sells a stock to the others, and spends ps to buy a stock from the others.
Garman than calculates the probability of the inventory of the market maker, says Q(k, t+dt) = probability to get 1 dollar x Q(k-1, t) + probability to lose 1 dollar x Q(k+1, t) + no buying or selling order x Q(k, t), where Q(k, t+dt) = probability to have k money at time t+dt.
In the above equation, I think Garman had split the money received and loss by buying or selling a shock in many sub-poisson process, otherwise, getting 1 dollar or losing 1 dollar are impossible, as market maker receive pb dollar and loses ps dollar in each order, but not 1 dollar. Do my statement correct? Thank you very much.
I have a list of chromosomes, say A, B, C, and D. The respective fitness values are 1, 2, 3, and 4. The chromosomes with higher fitness values (C and D) are more likely to be selected for the parent in the next generation. Therefore, how to assign probability in MATLAB such that C and D get a higher probability for parent selection?

Suppose we have statistics N(m1, m2), where m1 is the value of the first factor, m2 is the value of the second factor, N(m1, m2) is the number of observations corresponding to the values of factors m1 and m2. In this case, the probability P(m1, m2) = N(m1, m2) /K, where K is the total number of observations. In real situations, detailed statistics N(m1, m2) is often unavailable, and only the normalized marginal values S1(m1) and S2(m2) are known, where S1(m1) is the normalized total number of observations corresponding to the value m1 of the first factor and S2(m2) is the normalized total number of observations corresponding to the value m2 of the second factor. In this case P1(m1) = S1(m1)/K and P2(m2) = S2(m2)/K. It is clear that based on P1(m1) and P2(m2) it is impossible to calculate the exact value of P(m1, m2). But how to do this approximately with the best confidence? Thanks in advance for any advice.
Dear all. The Normal distribution (or Gaussian) is mostly used in statistics, natural science and engineering. Their importance is linked with the Central Limit Theorem. Is there any ideas how to predict the numbers and parameters of thos Gaussians ? Or any efficient deterministic tool to decompose Gaussian to a finite sum of Gaussian basic functions with parameter estimations ? Thank you in advance.
Hi guys,
Can anyone describe PCE in simple words, please? How can we find a PC basis and what is an appropriate sparse PC basis?
Thanks in advance
Dear all. The Gaussian function is mostly used in statistics, physics and engineering. Some examples include:
1. Gaussian functions are the Green's function for the homogeneous and isotropic
2. The convolution of a function with a Gaussian is also known as a Weierstrass transform
3. A Gaussian function is the wave function of the ground state of the quantum harmonic oscillator
3. The atomic and molecular orbitals used in computational chemistry can be linear combinations of Gaussian functions called Gaussian orbitals
4. Gaussian functions are also associated with the vacuum state in quantum field theory
5. Gaussian beams are used in optical systems, microwave systems and lasers
6. Gaussian functions are used to define some types activation function of artificial neural networks
7. Simple cell response in primary visual cortex has a Gaussian function modeled by a sine wave
8. Fourier transforms of Gaussian is a Gaussian
10. Easy and efficient approximation for signals analysis and fitting (Gaussian process, gaussian mixture model, kalman estimator , ...)
11. Discribe the Shape of the UV−Visible Absorption
12. Used in Time-frequency analysis (Gabor Transform)
13. Central Limit Theorem (CLT) : Sum of independent random variables tends toward a normal distribution
14. The Gaussian function serves well in molecular physics, where the number of particles is closed to the Avogadro number NA = 6.02214076×1023 mol−1 ( NA is defined as the number of particles that are contained in one mole)
15. ...
Why Gaussian is everywhere ?


I was working on the C++ implementation of Dirichlet Distribution for the past few days. Everything is smooth, but I am not able to deduce the CDF (Cumulative Distribution Function) of Dirichlet distribution. Unfortunately, neither Wikipedia nor WolframAlpha shows the CDF formula for Dirichlet Distribution.
Forgetting the context for a second, the overall question is how to compare data that is expressed in a probability.
Scenario 1: Let us say there are two events A and B. The rules are:
- A (union) B = 1
- A (intersection) B = 0
- Probability of A or B is dependent on a dollar amount. Depending on the amount, the probability either A or B happens changes. For e.g. @20,000 chance of A is 80%, then B is 20%.
Scenario 2: we have A, B, and C.
- A (union) B (union) C = 1
- A (intersection) B or C = 0
- Probability dependent on dollar amount. Same as above.
- A and B in scenarios 1 and 2 are same but their probabilities of happening are changed due to the introduction of C.
QUESTION: How can I compare the probability of the events in these two scenarios?
Possible solutions I was thinking of:
1) A is X times as likely to happen as B, then I could plot all events as a factor of B on the same graph to get a sense of how likely all events are compared to a common denominator (event B)
2) Could also get a "cumulative" probability of each event as area under the curve and express as a % or ratio. So if A occupies 80% of the area under the curve, then B should be 20%, so overall A is four times as likely, and similarly in scenario 2.
3) Maybe the way to compare is to take the complement of each event separately, and express as a percentage at each point and graph them.
Any help is greatly appreciated. Please refer to attached pic for some visual understanding of the question as well. I am making a lot of assumptions, which are not true (as concerned with the graphs etc), but theoretically, I am interested in knowing. Thank you!

How to find the distance distribution of a random point in a cluster from the origin? I have uniformly distributed cluster heads following the Poisson point process and users are deployed around the cluster head, uniformly, following the Poisson process. I want to compute the distance distribution between a random point in the cluster and the origin. I have attached the image as well, where 'd1', 'd2', and 'theta' are Random Variables. I want to find the distribution of 'r'.

hello dears, I have just interested in IFRS9 ECL models. I have three question and I appreciate all answers.
1) which models are best for pd, LGD & EAD calculation when I have scarce (about 5-7 years quarterly data) data?
2) can I calculate lifetime PD without macroeconomic variables and then add macro effects?
3) when I use transition matrix approach how have to estimate "stage 2" for earlier period, when IFRS9 was not valid and there was not any classification by stages.
Let us consider two boxes of different energies that are separated by a barrier and box 1 is filled with many electrons. The barrier allows the electrons to tunnel from box 1 to the other side with a 1% probability. In the event that an electron in a given time frame does tunnel through the barrier, does this affect the probability that a different electron teleports through the barrier during the same time frame, or is this probability statistically independent? Therefore in the extreme case, do we know if the probability of ALL electrons tunneling is equal to P(1 electron tunneling)^(Number of electrons). Let us assume that the change in energy due to electrons filling/leaving certain energy levels is negligible.
Dear statisticans and HTA experts ,
I am newbie and working on my first model using a markov model with probabilities varying according to time in state. After synthesis clinical trials, unfortunately I have difficulties to derive transition probabilities from pooled HR. How to calculate shape and scale parameter of Weilbull distribution from this pooled HR?
I really appreciate your helps.
Kind regards
Cuc
Hello,
I have a panel database for firms i (90 firms) across year t (from 2013 to 2019). Some firms witnessed a cross border investment in a year that is between 2013 and 2019. I created a dummy=1, when the firm has received the investment in the corresponding year.
I want to create three groups small (1), medium (2) and large (3) using the revenue generated by the company and then compute how likely one firm can go from group 1 to group 2 or 3, if received foreign investment.
I am not sure which statistical method I should use and how to arrange my data according to this method.
Who has the higher chance to get the jackpot, (i) the one who spins on one slot-fruit machine, (ii) the other who spins on thousands of slot-fruit machines? Is there a summation or a multiplication of probabilities?
The problem is very well described and explained by a plenitude of sources. So, just a very brief reminder:
a guest is given the opportunity to select one of the three closed doors. There is a prize behind exactly one of the doors.
Once the guest has selected a door, the host opens one another (not selected) door to reveal that it hides nothing.
Then the guest can either confirm the initial choice or select another (obviously still closed) door.
Which strategy leads to success with higher probability?
Long story short, if the guest's decision remains unchanged then the probability is 1/3, if the guest changes the selection then the probability is 2/3
The solution can be found elsewhere, along with the software simulators (among them one is mine https://github.com/tms320c/threedoorstrial)
The solution itself is fine no doubt, but let us change the trial a bit.
After the guest has made the first selection, the host removes the open box (let's use boxes instead of the doors). Because everybody knows that the open box contains nothing this action does not change the distribution of the probabilities (or does it? see the questions below).
Now we have two boxes, which are not equal: one contains the prize with the probability of 1/3, while the probability to find the prize in another box is 2/3.
The host says goodbye to the first guest and invites another one. The new guest is given the opportunity to select one of the two closed boxes.
Because new guest does not know the history, so the assumption is that the probability to get the prize is 1/2. Which is totally wrong as we all (and the host, and the first guest) know now.
If the host is a kind of generous person he can tell the story. After that, the new guest can get the prize with 2/3 probability. Is it a correct probability?
I propose to discuss the following questions:
if you are in a situation where you should choose one of two presumably equal boxes/packages/whatever, should you insist on the disclosure of their history? Perhaps, they were part in such trial once upon a time in a galaxy far far away and are not equal in fact.
for how long the boxes are in the non-equilibrium state? The time delay between the two trials (first and second guests) can be arbitrary long.
may the situation lead to the conclusion that probability depends on someone's opinion, and your decision always explicitly depends on someone else's decision, which could be made in a deep past?
may the probability to have a treasure be an internal property of every box that can be modified mentally by humans?
I know that comparison of these terms may require more than one question but I would appreciate it if you could briefly define each and compare with relevant ones.
Hello, I recently found in a test bank that those two can't be increased simultaneously. As we all know, Type I error is when you reject a true null hypothesis (you think there's a treatment effect when there is not), and Type II error is when you accept a false null hypothesis (you think there's no treatment effect when there is). But I'm having trouble understanding the logic of why these two can't be increased at the same time. Thank you.
iam working on cyclostationary spectrum sensing in cognitive radio, can any one help me to find the MATLAB code or the main procedures to compute pd (probability of detection), ROC curves ?
On the traditional statistics tasks simultaneously with Probability Value calculation usually we can also estimate Confidence Interval for this Probability value (with some pre-defined Confidence Level value). For SVM method there are developed a lot of expressions, which allow us not only recognize class of new object, but also to calculate probability, that this object belong for this class. But is it possible to estimate Confidence Interval for this Probability value? Thanks for your answer beforehand. Regards, Sergey.
Is there any probabilistic study on When/If Artificial Intelligence may take over the humanity?
Negative binomial random variable with parameter r and p can be thought of as the number of attempts before rth success. This is a generalised form of geometric random variable with n=1. I am interested in calculating the expectation of max of N negative binomial random variables which are independent and identically distributed (i.i.d.). The difficulty that I am facing is that there is no known closed form formula for cumulative distribution function (CDF) of negative binomial random variables, therefore I can not apply the multiplication rule to the CDF of max of N negative binomial random variables.
I am going to develop a queueing model in which riders and drivers arrive with inter-arrival time exponentially distributed.
All the riders and drivers arriving in the system will wait for some amount of time until being matched.
The matching process will pair one rider with one driver and takes place every Δt unit of time (i.e., Δt, 2Δt, 3Δt, ⋯). Whichever side outnumbers the other, its exceeding portion will remain in the queue for the next round of matching.
The service follows the first come first served principle, and how they are matched in particular is not in the scope of this problem and will not affect the queue modelling.
I tried to formulate it as a double-ended queue, where state indicates the exceeding number in the system.

However, this formulation didn't incorporate the factor Δt in it, it is thereby not in a batch service fashion. I have no clue how I can formulate this Δt (somewhat like a buffer) into this model.
I want to realise a sampling in a target law knowing the equation of this last one. I want to make random sampling in this law. Does exist a general method to make this kind of sampling ? As for example for realise sampling in the Gamma or Gaussian function
- Purpose of the post:
I'm struggling to understand the significance of the fat-tailed distribution especially in career choice. 80000hours career guide argues that the more accurate distribution for career choice is the long-tailed one.
I'm trying to understand how the implication would differ between a normal bell-curve and a long-tailed distribution. My request: are the implications I wrote in "Part 2: Significance of the fat-tail distribution" accurate? Please focus on points 1 and 2.
Other Names: heavy-tailed distribution, long-tailed distribution, , pareto distribution
- Part 1: Description of Fat-Tail Distribution phenomenon in World Problems and Career Success in 80000hours career guide:
The guide is available for free download at https://80000hours.org/book/
"the most effective actions achieve far more than average. These big differences in expected impact mean that it’s really important to focus on the best areas. Of course, making these comparisons is really hard, but if we don’t, we could easily end up working on something with comparatively little impact. This is why many of our readers have changed which problem they work on. "p.60
"Each change took serious effort, but if changing area can enable you to have many times as much impact, and be more successful, then it’s worth it." (p.61)
"the top 10% of the most prolific elite can be credited with around 50% of all contributions, whereas the bottom 50% of the least productive workers can claim only 15% of the total work" (p. 89)
Simonton, Dean K. ʺAge and outstanding achievement: What do we know after a century of research?ʺ Psychological bulletin 104.2 (1988): 251 as cited in p.89 of 80000hours guide
"Areas like research and advocacy are particularly extreme, but a major study still found that the best people in almost any field have significantly more output than the typical person." Hunter, J. E., Schmidt, F. L., Judiesch, M. K., (1990) “Individual Differences
in Output Variability as a Function of Job Complexity”, Journal of Applied as cited in P.90 of 80000hours guide
"Moreover, success in almost any field gives you influence that can be turned into positive impact by using your position to advocate for important problems (p.91).
This all also means you should probably avoid taking a “high impact” option that you don’t enjoy, and lacks the other ingredients of job satisfaction, like engaging work". (p.91)
"Finally, because the most successful people in a field achieve far more than the typical person, choose something where you have the potential to excel. Don’t do something you won’t enjoy in order to have more impact. " (p.95)
📷
Figure Assumed (NOT actual) Bell-Curve for problems (p.59)
📷
Figure: Actual fat-tail distribution for problems (p.60)
📷
Figure: Actual Fat-Tail distribution of Careers (p.90)
- Part 2: Significance of the fat-tail distribution especially for individuals (All points are comparing fat-tail to bell-curve) :
1. It seems to me that in both distributions we have a motivation to aim for the top (obviously the top would be better) but the nature of motivation changes.
The nature of the carrot and stick changes from "aim to be among the best" because
(bell-curve carrot): if you do great things will happen to you.
(bell curve stick): if you don't, you will remain in the average and mean range
To" aim to be among the best" because
(fat-tail carrot): if you do, VERY VERY VERY great things will happen to you.
(fat-tail stick):If you don't, you will remain in the average (BELOW mean)
2. The more variance there is in a statistical sample, the wiser it is to aim to move to the exploration direction in the exploration-exploitation (deliberation-action) spectrum (the relationship between variance and value of new information). In terms of career, the variance is great so the exploration investment should be great as well.
3. Median does not equal average. The average could be misleading and the median could be misleading
4. If all other factors are moderate or weak but you have a good reason to think you can reach the top, go for it because average can be misleading, your rank is as important as your field).
5. Choosing a job at random is not advised because the median is lower than we think.
6. Even though below mean is much more likely in this distribution, probability of being a hyper-performer is higher , which makes the hyper-performer goal a more realistic goal hence, it's easier to become motivated by it.
7. Looking at the outliers becomes more important because they become more influential.
8. Pareto principle (80/20) can encourage generalist over specialization because the effort you put in one field result in a better impact if distributed over many fields but at the same time it can encourage specialization over generalism because the pareto distribution makes you want to aim for the very top because its impact is disproportionate.
9. Does the long-tailed distribution favor an "all-star" "super-star" or "SWAT" team approach of fewer more qualified people (quality over quantity)?
10. Prioritization becomes more important?
11. We often behave as if the bell-curve distribution is true while it's not. So the significance lies in adjusting our behaviour?
12. Why does personal fit become more important under the long-tailed distribution? Because the long-tailed distribution has a higher variance. If success in a specific field had zero variance, then individual differences wouldn't matter much so the person-environment fit would not be v. important because as persons change nothing changes as a result. However, the reality is that there is great variance so individual differences matter a lot (how would you "react" to a field NOT JUST how good is a career).



Can numbers (the Look then Leap Rule OR the Gittins Index) be used to help a person decide when to stop looking for the most suitable career path and LEAP into it instead or is the career situation too complicated for that?
^^^^^^^^^^^^^^^^^
Details:
Mathematical answers to the question of optimal stopping in general (When you should stop looking and leap)?
Gittins Index , Feynman's restaurant problem (not discussed in details)
Look then Leap Rule (secretary problem, fiancé problem): (√n , n/e , 37%)
How do apply this rule to career choice?
1- Potential ways of application:
A- n is Time .
Like what Michael Trick did https://goo.gl/9hSJT1 . Michael Trick A CMU Operations Research professor who applied this to his decide the best time for his marriage proposal., though he seems to think that this is a failed approach.
In our case, should we do it by age 20-70= 50 years --- 38 years old is where you stop looking for example? Or Should we multiply 37% by 80,000 hours to get a total of 29600 hours of career "looking"?
B- n is the number of available options. Like the secretary problem.
If we have 100 viable job options, we just look into the first 37? If we have 10, we just look into the first 4? If we are still in a stage of our lives where we have thousands of career paths?
2- Why the situation is more complicated in the career choice situation:
A- You can want a career and pursue it and then fail at it.
B- You can mix career paths. If you take option c, it can help you later on with option G. for example, if I went as an IRS, the irs will help me later on if I decide to become a writer so there's overlap between the options and a more dynamic relationship. Also the option you choose in selection #1 will influence the likelihood of choosing other options in Selection 2 (For example, if in 2018 I choose to work at an NGO, that will influence my options if I want to do a career transition in 2023 since that will limit my possibility of entering the corporate world in 2023).
C- You need to be making money so "looking" that does not generate money is seriously costly.
D- The choice is neither strictly sequential nor strictly simultaneous.
E- Looking and leaping alternates over a lifetime not like the example where you keep looking then leap once.
Is there a practical way to measure how the probability of switching back and forth between our career options affects the optimal exploration percentage?
F- There is something between looking and leaping, which is testing the waters. Let me explain. "Looking" here doesn't just mean "thinking" or "self-reflection" without action. It could also mean trying out a field to see if you're suited for it. So we can divide looking into "experimentation looking" and "thinking looking". And what separates looking from leaping is commitment and being settled. There's a trial period.
How does this affect our job/career options example since we can theoretically "look" at all 100 viable job positions without having to formally reject the position? Or does this rule apply to scenarios where looking entails commitment?
G- * You can return to a career that you rejected in the past. Once you leap, you can look again.
"But if you have the option to go back, say by apologizing to the first applicant and begging them to come work with you, and you have a 50% chance of your apology being accepted, then the optimal explore percentage rises all the way to 61%." https://80000hours.org/podcast/episodes/brian-christian-algorithms-to-live-by/
*3- A Real-life Example:
Here are some of my major potential career paths:
1- Behavioural Change Communications Company 2- Soft-Skills Training Company, 3- Consulting Company, 4-Blogger 5- Internet Research Specialist 6- Academic 7- Writer (Malcolm Gladwell Style; Popularization of psychology) 8- NGOs
As you can see the options here overlap to a great degree. So with these options, should I just say "ok the root of 8 is about 3" so pick 3 of those and try them for a year each and then stick with whatever comes next and is better?!!
It´s interesting the muscle answer after RPE, but probably it has another result if we experiment a RPE with bone anchorage
I have applied "gaussmix" function in voicebox MATLAB tools to calculate GMM. However, the code gives me error when I run it for 512 GMM components.
No_of_Clusters = 512;
No_of_Iterations = 10;
[m_ubm1,v_ubm1,w_ubm1]=gaussmix(feature,[],No_of_Iterations,No_of_Clusters);
Error using *
Inner matrix dimensions must agree.
Error in gaussmix (line 256)
pk=px*wt; % pk(k,1) effective number of data points for each mixture (could be zero due to underflow)
I need 1024 or 2048 Mixtures for Universal Background Model (UBM) construction. Could anyone give me matlab code to calculate GMM for big number of mixture such as 512 or 2048?
Dear scholars, the points in graph (attached, plotted using R) is from a probability mass function and it reduces exponentially to 0 but up to a certain index point (eg. x = 53). Above the point, the values start to fluctuate. Anyone has any idea why this happens? Any tips on how to fix this using R?
Thanks very much to everyone in advance.
Hello.
I run REST2 simulations, which is one of the replica exchange simulations. I would like to ask one question about a validity of REST2.
I run REST2 simulations of the system of a membrane protein. I prepared 32 replicas and temperatures were assumed to be from 310 to 560 K.
Now, I run about 40ns simulations, but "tunnel" has not been observed yet. I mean, for example, a replica starting at 310 K has not been shifted completely to 560 K. I thought one of the requirements for demonstrating validity of REST2 is "tunnel". However, to my knowledge, most of papers about work with REST2 do not even mention if "tunnel" has completely occurred in the simulation time.
I would like to ask you whether "tunnel" is a necessary and general factor to show the validity of your simulations in REST2. The reason why I ask is that it seems to take much more times for "tunnel" to happen completely. And I actually wonder that "tunnel" can not be observed in REST2 for the big system like membrane proteins.
By the way, I confirmed any acceptance ratios were about 30% and the potential energies of each replica were overlapped. So apart from "tunnel", my REST2 simulations would be not strange in theory.
I would be happy if you gave me a reply.
Thank you in advance.
Could somebody point me towards an efficient implementation of a (displaced) chi-square cumulative distribution?
I have two histogram. One has some sort of non-uniform distribution and the other one looks like exponential or similar to some standard distribution. Visually, The second one is smoother and curve fillting is possible. I need to quantify these two histograms to conclude the second one smoother than first one and it may follow some distribution(not necessarily normal).
What are typical parameters we can compare on histograms? I have tested skewness(not significant), kurtosis and ks test(failed).
Thanks
Take N examples sampled from a multinomial distribution (p1,p2,⋯,pm) for m outcomes, with Ni being the number of examples taking the outcome i. Here I assume p1,p2,⋯,pm are listed in the descending order; if not, we can reorder them to make them descending.
My objective is to make the most frequently occuring outcome, i.e., argmax_i{N_i} as close to 1 as possible.
So my question is what size of N we should take to make the most frequently occuring outcome is less than some predetermined number m0 (1<=m0<=m) ? Furthermore, how N determines the most frequently occuring outcome?
The tool I can think might be useful is Bretagnolle-Huber-Carol inequality. Any clues or references are appreciated. Thanks.
Design-based classical ratio estimation uses a ratio, R, which corresponds to a regression coefficient (slope) whose estimate implicitly assumes a regression weight of 1/x. Thus, as can be seen in Särndal, CE, Swensson, B. and Wretman, J. (1992), Model Assisted Survey Sampling, Springer-Verlang, page 254, the most efficient probability of selection design would be unequal probability sampling, where we would use probability proportional to the square root of x for sample selection.
So why use simple random sampling for design-based classical ratio estimation? Is this only explained by momentum from historical use? For certain applications, might it, under some circumstances, be more robust in some way??? This does not appear to conform to a reasonable data or variance structure.