Bayesian Methods - Science topic

Explore the latest questions and answers in Bayesian Methods, and find Bayesian Methods experts.
Questions related to Bayesian Methods
  • asked a question related to Bayesian Methods
Question
4 answers
When the research sample consists of individual specific cases, researchers face unique challenges due to the limited size or specific nature of the cases involved. However, several general statistical methods can be applied to ensure the reliability and validity of the research and its results. These methods include descriptive statistics, which help summarize the sample’s characteristics using measures such as the mean and standard deviation; inferential statistics, which are used to draw predictions or inferences about the population based on the sample; regression analysis to understand the relationships between variables; and chi-square tests to examine relationships between categorical variables. Correlation analysis can also be used to measure the strength of relationships between variables. Furthermore, Bayesian methods provide a unique approach by leveraging prior knowledge and updating the analysis with new data. The importance of these methods in ensuring the accuracy of research lies in their ability to interpret data systematically, ensure reliability through hypothesis testing and confidence intervals, reduce bias that could affect the results, and allow researchers to draw valid, evidence-based conclusions. They also ensure the replicability and reproducibility of results, which enhances the credibility of the research. Overall, statistical methods remain essential tools that help researchers understand complex data and draw valid, scientifically supported conclusions.
Relevant answer
Answer
The varying perceptions of a statistician’s value, from “critical” to “not needed for 99.9% of applications”: A national study of human research ethics committees
"Currently, much health and medical research is wasted due to inappropriate study design or analysis. Study designs could be improved using an expert review from a qualified statistician. The ethical review process is an ideal stage for this input; however, we do not know how many ethics committees in Australia have access to a qualified statistician. To answer this question, we approached all human research ethics committees in Australia. Sixty percent of committees had access to a qualified statistician, either as a full committee member or as a non-member who could be consulted when needed, but this dropped to 35% after accounting for statistical qualifications. Many committees rely on “highly numerate” researchers in place of qualified statisticians, as they viewed research experience and advanced statistical training as equivalent. Committees without access to statisticians tended to locate responsibility for study design with other parties, including researchers, trial sponsors, and institutions. Some committee chairs viewed formal statistical input as essential to the work of the committee; however, there was also a common belief that statistical review was only applicable to some study designs, and that “simple” or “small” studies did not need review. We encountered a surprising variance in practice and attitudes towards the use of statisticians on research ethics committees. The current high number of research studies receiving approval without statistical review risks approving studies that will in the best-case waste resources and in the worst-case cause harms due to flawed evidence."
  • asked a question related to Bayesian Methods
Question
10 answers
In the domain of clinical research, where the stakes are as high as the complexities of the data, a new statistical aid emerges: bayer: https://github.com/cccnrc/bayer
This R package is not just an advancement in analytics - it's a revolution in how researchers can approach data, infer significance, and derive conclusions.
What Makes `Bayer` Stand Out?
At its heart, bayer is about making Bayesian analysis robust yet accessible. Born from the powerful synergy with the wonderful brms::brm() function, it simplifies the complex, making the potent Bayesian methods a tool for every researcher’s arsenal.
Streamlined Workflow
bayer offers a seamless experience, from model specification to result interpretation, ensuring that researchers can focus on the science, not the syntax.
Rich Visual Insights
Understanding the impact of variables is no longer a trudge through tables. bayer brings you rich visualizations, like the one above, providing a clear and intuitive understanding of posterior distributions and trace plots.
Big Insights
Clinical trials, especially in rare diseases, often grapple with small sample sizes. `bayer` rises to the challenge, effectively leveraging prior knowledge to bring out the significance that other methods miss.
Prior Knowledge as a Pillar
Every study builds on the shoulders of giants. `bayer` respects this, allowing the integration of existing expertise and findings to refine models and enhance the precision of predictions.
From Zero to Bayesian Hero
The bayer package ensures that installation and application are as straightforward as possible. With just a few lines of R code, you’re on your way from data to decision:
# Installation
devtools::install_github("cccnrc/bayer")

# Example usage: Bayesian logistic regression
library(bayer)
model_logistic <- bayer_logistic(
  data = mtcars,
  outcome = 'am',
  covariates = c('mpg', 'cyl', 'vs', 'carb')
)
You then have plenty of functions to further analyze your model; take a look at bayer.
Analytics with An Edge
bayer isn’t just a tool; it’s your research partner. It opens the door to advanced analyses like IPTW, ensuring that the effects you measure are the effects that matter. With bayer, your insights are no longer just a hypothesis — they’re a narrative grounded in data and powered by Bayesian precision.
Join the Brigade
bayer is open-source and community-driven. Whether you’re contributing code, documentation, or discussions, your insights are invaluable. Together, we can push the boundaries of what’s possible in clinical research.
Try bayer Now
Embark on your journey to clearer, more accurate Bayesian analysis. Install `bayer`, explore its capabilities, and join a growing community dedicated to the advancement of clinical research.
bayer is more than a package — it’s a promise that every researcher can harness the full potential of their data.
Explore bayer today and transform your data into decisions that drive the future of clinical research: bayer - https://github.com/cccnrc/bayer
Relevant answer
Answer
Many thanks for your efforts!!! I will try it out as soon as possible and will provide feedback on github!
All the best,
Rainer
  • asked a question related to Bayesian Methods
Question
8 answers
According to Fisher [1], “… probability and likelihood are quantities of an entirely different nature.” Edwards [2] stated, “… this [likelihood] function in no sense gives rise to a statistical distribution.” According to Edwards [2], the likelihood function supplies a natural order of preference among the possibilities under consideration. Consequently, the mode of a likelihood function corresponds to the most preferred parameter value for a given dataset. Therefore, Edwards’ Method of Support, or the method of maximum likelihood, is a likelihood-based inference procedure that utilizes only the mode for point estimation of unknown parameters; it does not utilize the entire curve of the likelihood function [3]. In contrast, a probability-based inference, whether frequentist or Bayesian, requires using the entire curve of the probability density function for inference [3].
The Bayes Theorem in continuous form combines the likelihood function and the prior distribution (PDF) to form the posterior distribution (PDF). That is,
posterior PDF ~ likelihood function × prior PDF (1)
In the absence of prior information, a flat prior should be used according to Jaynes’ maximum entropy principle. Equation (1) reduces to:
posterior PDF = standardized likelihood function (2)
However, “… probability and likelihood are quantities of an entirely different nature [1]” and “… this [likelihood] function in no sense gives rise to a statistical distribution” [2]. Thus, Eq. (2) is invalid.
In fact, Eq. (1) is not the original Bayes Theorem in continuous form. It is called the "reformulated" Bayes Theorem by some authors in measurement science. According to Box and Tiao [4], the original Bayes Theorem in continuous form is merely a statement of conditional probability distribution, similar to the Bayes Theorem in discrete form. Furthermore, Eq. (1) violates “the principle of self-consistent operation” [3]. In my opinion, likelihood functions should not be mixed with probability density functions for statistical inference. A likelihood function is a distorted mirror of its probability density function counterpart; its use in Bayes Theorem may be the root cause of biased or incorrect inferences of the traditional Bayesian method [3]. I hope this discussion gets people thinking about this fundamental issue in Bayesian approaches.
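For reference, Eqs. (1) and (2) can be written out explicitly in standard notation (a sketch only, with \theta the unknown parameter, x the observed data, f the sampling density and \pi the prior):

p(\theta \mid x) = \frac{f(x \mid \theta)\,\pi(\theta)}{\int f(x \mid \theta)\,\pi(\theta)\,d\theta} \qquad (1)

and, with a flat prior \pi(\theta) \propto 1,

p(\theta \mid x) = \frac{f(x \mid \theta)}{\int f(x \mid \theta)\,d\theta} \qquad (2)

i.e., the likelihood standardized to integrate to one over \theta; this is exactly the identification the question argues is problematic.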
References
[1] Fisher R A 1921 On the ‘Probable Error’ of a coefficient of correlation deduced from a small sample Metron I part 4, 3-32
[2] Edwards A W F 1992 Likelihood (expanded edition) Johns Hopkins University Press, Baltimore
[3] Huang H 2022 A new modified Bayesian method for measurement uncertainty analysis and the unification of frequentist and Bayesian inference. Journal of Probability and Statistical Science, 20(1), 52-79. https://journals.uregina.ca/jpss/article/view/515
[4] Box G E P and Tiao G C 1992 Bayesian Inference in Statistical Analysis Wiley New York
Relevant answer
Answer
I reread your post and understand it better now, as I have been rereading some of what I once knew in an instant. I do think that this is a problem with technical terms. A probability density function (pdf) is not a probability. For each statistical distribution, the pdf is a function which, when integrated over an interval, provides the probability that the data X fall in that interval, given a parameter 𝜽.
In contrast, the likelihood is a function of the parameter 𝜽 given the data. The integration is over all parameter values, given the data.
By definition, if f(x|𝜽) is the joint pdf (or pmf for discrete data) of a sample X = (X1, ..., Xn), then the likelihood function for observations X = x is L(𝜽|x) = f(x|𝜽) (see, for example, any textbook of theoretical statistical inference, such as Casella and Berger).
So you are right that the likelihood method does not calculate probabilities per se, but the pdf, which is a function, not a probability, is a crucial part of the likelihood of a parameter (a hypothesis).
The difference is that the pdf treats the parameter or hypothesis as fixed and the data x as random variables, while the likelihood function treats the data as fixed and lets the parameter vary over all possible parameter values.
Similarly, for inference, a single likelihood is a meaningless value, but the ratio of the likelihoods of two different hypotheses (the null and the alternative) is highly meaningful as we try to decide which hypothesis better fits the observed data.
Taking the difference in the natural logs of the likelihood under each hypothesis tells us which is preferred. A log-likelihood difference of 2 means that one hypothesis fits the data more than 7 times better than the other. The meaning of the preference is clear.
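To make the likelihood-ratio point concrete, a minimal R sketch with hypothetical binomial data:

# Hypothetical data: 7 successes in 10 trials
x <- 7; n <- 10
loglik <- function(p) dbinom(x, n, p, log = TRUE)   # log-likelihood of p given the data

# Compare two hypothesized values of p
d <- loglik(0.7) - loglik(0.4)   # difference in log-likelihoods
exp(d)                           # likelihood ratio; a difference of 2 would give exp(2), about 7.4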
Regarding bias in Bayes, it is my view - and that of other people in my main field (molecular phylogenetics) - that much Bayesian bias is the result of improper priors. The wrong prior guarantees bias in the posterior probability.
Your equation rightly shows that the posterior is a function of the likelihood times the prior. Using prior knowledge seems intuitively sensible, but it is not easy to determine the proper function for the prior. Using a uniform distribution renders the posterior probability a likelihood.
I use both likelihood and Bayes methods, but for the wonderful qualities of likelihood estimators (sufficiency, robustness to model violation, low variance, invariance), I prefer likelihoods over probabilities.
  • asked a question related to Bayesian Methods
Question
1 answer
here is the code:
library(babette)
library(seqinr)
library(BeastJar)
library(beastier)
fasta <- read.fasta("nuc.fasta")
get_default_beast2_bin_path(
beast2_folder = get_default_beast2_folder(),
os = rappdirs::app_dir()$os
)
fasta_filename <- "nuc.fasta"
output <- bbt_run(fasta_filename)
############## After this run it gives the following error:
Error in beastier::check_input_filename_validity(beast2_options) :
'input_filename' must be a valid BEAST2 XML file. File 'C:\Users\User\AppData\Local\beastier\beastier\Cache\beast2_9ec3aae162d.xml' is not a valid BEAST2 file. FALSE
Relevant answer
Answer
You appear to have something wrong with your input file. Perhaps the name is not valid in BEAST2. By the way, next time please keep in mind that not everyone reading your question is a phylogeneticist. Regards, David Booth
  • asked a question related to Bayesian Methods
Question
3 answers
I am trying to understand and use the best model selection method for my study. I have inferred models using swarm optimisation methods and used AIC for model selection. On the other hand, I am also seeing a lot of references and discussions about BIC as well. Apparently, many papers have tried and concluded to use both AIC and BIC to select the best models. My question here is: what if I use ABC alongside AIC and BIC? How would this improve my study, and what would be the pros and cons of using ABC, BIC and AIC as model selection methods?
Thanks
Relevant answer
Answer
First let me comment on a couple of things. See the first screenshot for some cautions about doing what you suggested. The second attachment shows what we do. That has worked well for us in our application. Best wishes, David Booth
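For reference, AIC and BIC can be compared directly for fitted models in R; a minimal sketch using the built-in mtcars data as a stand-in, not the swarm-optimised models from the question:

# Two candidate linear models on built-in data
m1 <- lm(mpg ~ wt, data = mtcars)
m2 <- lm(mpg ~ wt + hp, data = mtcars)

# AIC penalises each extra parameter by 2; BIC by log(n), so BIC favours smaller models as n grows
AIC(m1, m2)
BIC(m1, m2)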
  • asked a question related to Bayesian Methods
Question
6 answers
Hello
I have been in constant discussions with my friends and colleagues in recent years. In my experience, I generally use multivariate statistics because most data sets do not meet the assumptions of classical frequentist statistics. However, I know some people who use univariate and Bayesian methods to answer the same hypothesis questions. Given this, what would be the most appropriate way to answer our research questions?
  • asked a question related to Bayesian Methods
Question
3 answers
Dear colleagues,
I will appreciate your help in setting the parameters to use two nodes fossil-calibration for a phylogenetic-chronogram reconstruction using BEAST.
I have the divergence time between the outgroup (two species) and the crown group of species that I would like to date (43 mya), and the divergence time between the two species in the outgroup (8.7 mya).
Is there any way to get the a posteriori mutation rate in such a way that I can use this value later to estimate population divergence times using among-population genetic distances?
Many thanks in advance for your help.
Roberto
Relevant answer
Answer
Technically you can define more than one prior in BEAST2.
However, scientifically there might be some issues. To date your "ingroup", one calibration point might be enough (the split ingroup/outgroup). Of course it would be best to have another calibration point within the "ingroup". Calibrating the outgroup twice (the split outgroup-ingroup and the split of the two outgroup taxa) might bias your results. Be aware that sampling density as well as the underlying data might also bias the dating. You will find plenty of papers about that...
  • asked a question related to Bayesian Methods
Question
8 answers
MCMC sampling is often used to produce samples from Bayesian posterior distributions. However, the MCMC method is generally associated with computational difficulty and a lack of transparency. Specialized computer programs are needed to implement MCMC sampling, and the convergence of MCMC calculations needs to be assessed.
A numerical method known as “probability domain simulation (PDS)” (Huang and Fergen 1995) might be an effective alternative to MCMC sampling. A two-dimensional PDS can be easily implemented with Excel spreadsheets (Huang 2020). It outputs the joint posterior distribution of the two unknown parameters in the form of an m×n matrix, from which the marginal posterior distribution of each parameter can be readily obtained. PDS guarantees that the calculation is convergent. Further study comparing PDS with MCMC is warranted to evaluate the potential of PDS as a general numerical procedure for Bayesian methods.
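For intuition, the grid idea behind a two-parameter "probability domain" calculation can be sketched in a few lines of R; this is a generic grid approximation with a flat prior and simulated data, not the authors' Excel implementation:

# Simulated data and a grid over the two unknown parameters of a normal model
set.seed(1)
y     <- rnorm(20, mean = 5, sd = 2)
mu    <- seq(2, 8, length.out = 200)
sigma <- seq(0.5, 5, length.out = 150)

# Joint posterior on the m x n grid (flat prior), then the two marginals
loglik <- outer(mu, sigma, Vectorize(function(m, s) sum(dnorm(y, m, s, log = TRUE))))
post   <- exp(loglik - max(loglik))
post   <- post / sum(post)
post_mu    <- rowSums(post)   # marginal posterior of mu (up to grid spacing)
post_sigma <- colSums(post)   # marginal posterior of sigma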
Huang H 2020 A new Bayesian method for measurement uncertainty analysis and the unification of frequentist and Bayesian inference, preprint,
Huang H and Fergen R E 1995 Probability-domain simulation - A new probabilistic method for water quality modeling. WEF Specialty Conference "Toxic Substances in Water Environments: Assessment and Control" (Cincinnati, Ohio, May 14-17, 1995),
Relevant answer
Answer
The Excel spreadsheet isn't transparent - it's exactly the opposite, since it provides the result without showing how it's obtained. There's no problem of principle in programming a Monte Carlo sampling using Excel; it's just that the code won't be efficient, since an Excel spreadsheet isn't designed for such calculations when they really get interesting.
  • asked a question related to Bayesian Methods
Question
12 answers
I want to use BEAST to do EBSP analyses with two loci. I open two "input.nex" files in BEAUti to generate an "output.xml" file (in the Trees panel I select Extended Bayesian Skyline Plot for the tree prior), and then run BEAST. I do not know if this is right, and I do not know what to do next. I cannot construct the trend of demographic history in Tracer as I would for a BSP. I got one log file but two trees files (one for each locus), and I do not know how to import both tree files into Tracer.
Relevant answer
Answer
One of the best websites to find the answers to these questions is the following link:
  • asked a question related to Bayesian Methods
Question
5 answers
I'm trying to establish Bayes factor for the difference between two correlation coefficients (Pearson r). (That is, what evidence is there in favor for the null hypothesis that two correlation coefficients do not differ?)
I have searched extensively online but haven't found an answer. I appreciate any tips, preferably links to online calculators or free software tools that can calculate this.
Thank you!
Relevant answer
Answer
Is it possible to do a Bayesian reanalysis, from OR data, which are converted to r-correlation values to estimate the Bayes factor?
  • asked a question related to Bayesian Methods
Question
4 answers
I am completing a Bayesian Linear Regression in JASP in which I am trying to see whether two key variables (IVs) predict mean accuracy on a task (DV).
When I complete the analysis, for Variable 1 there is a BFinclusion value of 20.802, and for Variable 2 there is a BFinclusion value of 1.271. Given that BFinclusion values quantify the change from prior inclusion odds to posterior inclusion odds and can be interpreted as the evidence in the data for including a predictor in the model, can I directly compare the BFinclusion values for each variable?
For instance, can I say that Variable 1 is approximately 16 times more likely to be included in a model to predict accuracy than Variable 2? (Because 20.802 divided by 1.271 is 16.367, and therefore the inclusion odds for Variable 1 are approximately 16 times higher.)
Thank you in advance for any responses, I really appreciate your time!
Relevant answer
Answer
Interesting question, and the answer is yes, with a couple of caveats. First, the "N times more likely" is on the odds (p/(1-p)) scale rather than the probability scale. Also, the Bayes factor is equivalent to the odds ratio only if the predictors had the same prior inclusion odds. The BFinclusion value is just the posterior inclusion odds divided by the prior inclusion odds, and what you seem to want to say is that the posterior inclusion odds of Variable 1 are X times higher than those of Variable 2. If possible, I would just compare the posterior inclusion probabilities of the two variables (which IMO are much more intuitive). I'm not sure what the JASP output includes, but if nothing else, they can be calculated from the Bayes factor and the prior inclusion odds.
To make that a little clearer, let's do a worked example, where both variables have a prior inclusion probability of 0.2 (i.e., they were included in 20% of the models in the full set of testable models, and you have equal prior probabilities for each model). Prior inclusion odds are 0.2/0.8 = 0.25. For the sake of this example, let's say you have the same Bayes factors of 20.8 and 1.2 for variables X and Y, respectively. The posterior odds are then 20.8*0.25 = 5.2 for variable X and 1.2*0.25 = 0.3 for variable Y. The ratio of Bayes factors (20.8/1.2 = 17.333) is the same as the ratio of posterior inclusion odds (5.2/0.3), but this would not be the case if prior inclusion odds were different for the two variables (which is often the case e.g., if one variable is an interaction effect that requires the main effect to be in the model too). However, if you convert to posterior inclusion probabilities, variable X has an inclusion probability of 5.2/(1+5.2) = 0.839 and variable Y has an inclusion probability of 0.3/(1+0.3) = 0.231. The ratio of these probabilities is only about 3.63. Depending what you want "N times more likely" to mean, you can report the ratio of odds, ratio of probabilities, or both. Hope this helps clear things up.
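The arithmetic in the worked example above can be reproduced in a few lines of R (numbers taken from the answer; nothing here is JASP-specific):

bf         <- c(x = 20.8, y = 1.2)        # BF_inclusion for the two predictors
prior_prob <- 0.2                          # common prior inclusion probability
prior_odds <- prior_prob / (1 - prior_prob)

post_odds <- bf * prior_odds               # 5.2 and 0.3
post_prob <- post_odds / (1 + post_odds)   # 0.839 and 0.231

unname(post_odds["x"] / post_odds["y"])    # ratio of posterior odds (= ratio of BFs here)
unname(post_prob["x"] / post_prob["y"])    # ratio of posterior probabilities, about 3.63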
  • asked a question related to Bayesian Methods
Question
7 answers
My intuition is that Bayesian methods are more commonly being used in ecology, particularly given relatively recent user-friendly tutorials and software packages being published in fields of ecology...
e.g.,:
Kery (2010) Introduction to WinBUGS for Ecologists: A Bayesian approach to regression, ANOVA, mixed models and related analyses
Kery and Schaub (2012) Bayesian Population Analysis using WinBUGS: A hierarchical perspective.
Korner-Nievergelt F, et al. (2015) Bayesian data analysis in ecology using linear models with R, BUGS and Stan.
Kery and Royle (2016) Applied Hierarchical Modeling in Ecology: Analysis of distribution, abundance, and species richness in R and BUGS
etc.,
But, are there any review articles or manuscripts with informal reviews that show the trend of Bayesian analysis implementations in ecology over time?
Thank you!
Relevant answer
Answer
True, although their interpretations differ, which is sometimes important depending on your goals. From my (limited) perspective, I'd argue that Bayesian statistics are relatively underused (in ecology) because until recently there were very few intuitive implementations, user-friendly software packages, or courses in university curricula for it. In my own experience, frequentist packages (in R) don't have the programmed functionality to run the types of models I am interested in, and Bayes offers an alternative solution.
Here, given the recent uptick in Bayesian tutorials, software, published examples, and friendly workshops, I am interested in the trend in the application of Bayes in the ecology literature!
  • asked a question related to Bayesian Methods
Question
3 answers
Dear researchers, I want to ask how we can use censored data and Bayesian methods in insurance. Please give me some examples.
Thank you in advance .
  • asked a question related to Bayesian Methods
Question
3 answers
Would an indirect comparison using frequentist or bayesian methods be the best approach? Are there any other tests to run, and if so, what software would be the best?
  • asked a question related to Bayesian Methods
Question
2 answers
Both formal (MCMC) and informal (Generalized Likelihood Uncertainty Estimation, GLUE) Bayesian methods are widely used in the quantification of uncertainty. As far as I know, the GLUE method is extremely subjective and the choice of the likelihood function varies widely. This is confusing. So, what are the advantages of GLUE, and is it worthy of being admired? Is it just because it doesn't need a formal error function? What are the pros and cons of the two methods? What should one pay attention to when constructing a new informal likelihood function (like LOA)?
Relevant answer
Answer
Mr. Peng, I think your problem can be solved if you read Vrugt's and Beven's papers about formal MCMC and GLUE.
  • asked a question related to Bayesian Methods
Question
3 answers
Looking for example data sets and R code for applying Bayesian methods to determine population trends of wildlife using time series abundance or density estimates.
Relevant answer
Answer
Not that I know of
  • asked a question related to Bayesian Methods
Question
6 answers
I recently moved from distance-based techniques to model-based techniques and I am trying to analyse a dataset I collected during my PhD using the Bayesian method described in Hui 2016 (boral R package). I collected 50 macroinvertebrate samples in a river stretch (approximately 10x10 m, so in a very small area) according to a two-axis grid (x-axis parallel to the shoreline, y-axis transversal to the river stretch). For each point I have several environmental variables, relative coordinates inside the grid, and the community matrix (site x species) with abundance data. With these data I would like to create a correlated response model (i.e. including both environmental covariates and latent variables) using the boral R package (this will allow me to quantify the effect of environmental variables as well as latent variables for each taxon). According to the boral manual there are two different ways to implement site correlation in the model: via a random row effect or by assuming a non-independence correlation structure for the latent variables across sites (in this case the distance matrix for sites has to be added to the model). As specified at page 6, the latter should be used when one a priori believes that the spatial correlation cannot be sufficiently well accounted for by the row effect. However, moving away from an independence correlation structure for the latent variables massively increases computation time for MCMC sampling. So, my questions are: which is the best solution for accounting for spatial correlation? How can the random row effect be interpreted? Can it be seen as a proxy for spatial correlation?
Any suggestion would be really appreciated
Thank you
Gemma
Relevant answer
Answer
Thank you very much for your reply. Thank you for this exchange, and for the links and documents too.
Cordially
  • asked a question related to Bayesian Methods
Question
2 answers
I have simulated data using a gamma distribution, which resulted in a normal phenotypic distribution but non-normal (simulated) true breeding values, with a slightly skewed Q-Q plot and a significant Shapiro-Wilk test.
Now the question is that I want to use GBLUP and not a Bayesian approach. However, I am not finding a reference for this. BLUP is robust, but can it be used while ignoring Bayesian methods? Can I use BLUP for such data, and if yes, is there a reason to do so?
Relevant answer
Answer
The best linear unbiased prediction methods typically assume everything is multivariate normal and that all relationships are fully described by a Gaussian copula. Your data do not have that. The functional form to fit your data will not be linear, so you need a different method. The Q-Q plot shows non-normality, so you need to use a non-normal fit. Try a few distributions to see which fits your data better and respects the Shapiro-Wilk and Q-Q results. This can be frequentist or Bayesian, classical or machine learning, parametric or non-parametric. What it cannot be is BLUP.
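A minimal R sketch of the "try a few distributions" advice above, using simulated stand-in data rather than the actual breeding values:

library(MASS)   # for fitdistr()
set.seed(42)
y <- rgamma(500, shape = 2, rate = 1)   # stand-in for the simulated values

shapiro.test(y)         # formal test of normality
qqnorm(y); qqline(y)    # visual check

# Compare a normal and a gamma fit by AIC
fit_norm  <- fitdistr(y, "normal")
fit_gamma <- fitdistr(y, "gamma")
AIC(fit_norm, fit_gamma)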
  • asked a question related to Bayesian Methods
Question
5 answers
Dear Researchers/Scholars,
Suppose we have time series variables X1, X2 and Y1, where Y1 is dependent on the other two. They are more or less linearly related. Data for all these variables are given from 1970 to 2018. We have to forecast values of Y1 for 2040 or 2060 based on these two variables.
What method would you like to suggest (other than a linear regression)?
It is a fact that these series have shown a different pattern since 1990. I want to use the 1990-2018 data as prior information and then find a posterior for Y1. Now, please let me know how to assess this prior distribution.
or any suggestions?
Best Regards,
Abhay
Relevant answer
Answer
Let me play the devil's advocate:
You have data for the past 50 years. However, you say that there is a major break or change in the pattern around 1990, so you want to use only the more recent 30 years ... to predict what will happen 30 or 50 years in the future?
I doubt that this makes any sense. Toss some dice. It will be as reliable as your model predictions.
If "phase changes" like around 1990 can happen, they can happen in the future, too. Additionally, many other things can happen that we are not even aware of today. The uncertainty about such things must be considerd. Further, as you don't have any model that might be justifed by subject matter arguments, there is a universe of possibilities, again adding to the uncertainty of the prediction. If you consider all this, you will almost surely find that the predcition interval 30 or 50 years ahead will be so wide that it can't have any practical benefit.
You can surely grab one possible model, select some subset of your data, and neglect anything else, then you can make a possibly sufficiently precise forecast, which applies to this model fitted on this data, assuming that nothing else happens or can impact the dependent variable. Nice. But typically practically useless. It's a bit different when you has a model, based on a theory. Then you could at least say that this theory would predict this and that. But if you select a model just because the data looks like it's fitting, you actually have nothing.
It's important to think about all this before you invest a lot of work and time in such projects! It may turn out, in the end, that your approach is still good and helpful. But many such "data-driven forecast models" I have seen in my life have benn completely worthless, pure waste. Good enough to give a useful forecast for the next 2-3 years, but not for decades.
  • asked a question related to Bayesian Methods
Question
3 answers
Bayesian methodologies are often referred to as subjective in nature, given the involvement of prior specification, which usually depends on researchers' preferences. On the other hand, frequentists do not require any specification of priors and are said to be objective. However, in the Bayesian paradigm, the process of prior specification has been formalized such that the most appropriate priors for a given case are elicited. Thus, Bayesian methods are completely transparent and just as objective as frequentist methods, if not more so.
Relevant answer
Answer
Where is the objectivity in the frequentist paradigm?
The only interpretation of probability I know that aims to give an objective foundation is that the probability of an event is the limiting relative frequency of the event occurring in an infinitely long series of trials/observations.* However, this interpretation lacks a physical (i.e. "objective") basis (as infinite series can in principle not be observed, and any result from a finite series is arbitrarily unrepresentative as an estimate, because it resembles only an infinitely small part of that infinite series the probability should refer to), and it is mathematically (i.e. logically) impracticable. Only two cases need be considered: in this infinite series, the event will either happen an infinite number of times, or it will happen only finitely often. In the second case, the limit is 0, and in the first case, the limit is either undefined or 1 (depending on whether you want to define Inf/Inf=1). Apart from that, there are many other points that demonstrate that the limiting frequency definition is actually not a definition; it is neither objective nor logically sound. See http://www.joelvelasco.net/teaching/3865/hajek%20-%20mises%20redux%20redux.pdf and http://www.oxfordhandbooks.com/view/10.1093/oxfordhb/9780199607617.001.0001/oxfordhb-9780199607617-e-17.
After all, any frequentist method starts with assuming a particular probability model. Then, the probability of the observed data can be evaluated w.r.t. that given model. That's "objective", as anyone following this method arrives at the same conclusions from the same data. However, the conclusions refer to a "subjectively chosen" model, and it gives just the probability (whatever that is!) of the data, whereas we are typically interested in statements about the likely cause of these observations (i.e. the probabilities of hypotheses - a term that is not even thinkable in a frequentist world, because hypotheses are not events that would occur in a series of trials that could be repeated an infinite number of times!).
---
*another candidate might be the disputed propensity interpretation, but as being a metaphysical concept I wouldn't connect it to "objectivity".
  • asked a question related to Bayesian Methods
Question
8 answers
Is the Bayesian method the most popular method in reliability prediction, and if not, why?
Relevant answer
Answer
Bayesian methods are suitable for industrial sites where there is a good "a priori" understanding of deterioration, failure and rupture mechanisms. This understanding comes from decades of accumulated experience, which is the basis of knowledge. If knowledge is instead poor, as may be the case in a very new industry, a Bayesian approach could be misleading ...
  • asked a question related to Bayesian Methods
Question
15 answers
When we use the Bayesian method to do prediction of a time series, a typical way is to build the prior and use the time series data to update and predict. This process only uses the current time series data, which could be a kind of waste if we have a lot of historical time series. I am wondering if there is a way we can make good use of the entire library of historical time series data. Let's take an example to make the question clear: we were planning to repeat an experiment 10 times. We have finished the first 9 runs and acquired 100 data points in each experiment from beginning to end. Now the 10th experiment is half done and we want to predict the next 50 data points. One may do the prediction using the first 50 data points obtained in the 10th experiment, but is there any other way we can employ all the data we have acquired to do the prediction? Thanks!
Relevant answer
Answer
Hey guys, we just published a paper answering this question very well. Our paper proposed a reference based prediction approach using Kalman filtering. Please add your comments to the paper.
citation: Cai, Haoshu, et al. "A combined filtering strategy for short term and long term wind speed prediction with improved accuracy." Renewable Energy (2018).
  • asked a question related to Bayesian Methods
Question
3 answers
Hello!
I am trying to run a DAPC analysis on my genome-wide dataset, including 188736 genotypes for 188 individuals from 18 different geographic populations. I already know there is some genetic structure in the dataset; at least 2 or 3 groups could be defined. However, when running the "find.clusters()" function in order to define the most plausible number of groups that could explain my dataset, I obtain strange plots of "Cumulative variance explained by PCA" and "Value of BIC vs number of clusters". The cumulative variance should be higher in the first PCs and decrease as we look at the next ones, showing a curve in the graph up to 100%. Furthermore, the BIC graph should show an elbow at some point with the smallest level of BIC, not such a perfectly straight line.
Do you have any idea why I am obtaining these results and what they actually mean? Could it be because the number of genotypes and samples is so high that the function cannot work with them?
Thank you in advance.
André
Relevant answer
Answer
I am not sure if there are recommendations for maximum dataset size. Obviously you are looking for a cumulative-variance plateau; however, the first figure suggests that there isn't one for any number of retained PCs. Something you could try is to randomly reduce your dataset by half (or even more) and see whether these figures change. That would be my next step.
Good luck, and please let me know whether that worked, interesting to know!
Matthijs
  • asked a question related to Bayesian Methods
Question
7 answers
Hi all,
I have conducted a mediation analysis using AMOS. The model included 5 predictors, 4 mediators and 1 outcome.
My outcome is binary, therefore I conducted Bayesian statistics using the Markov Chain Monte Carlo option. I want to control for multiple comparisons using false discovery rate (FDR) methods. Since there is no p-value with Bayesian analysis, is it correct that the posterior p-values can be used instead?
If yes, can someone guide me on where to find these posterior p-values for each mediation estimate, as I can only find the posterior predictive p-value under "fit measures".
Open to trying multiple comparison correction methods other than FDR.
Relevant answer
Answer
There is nothing like a type-I error in a Bayesian understanding. That concept doesn't exist. The Bayesian analysis gives you estimates, not tests. If you are concerned that your estimates are too much influenced by your sample, you may use more conservative priors (peaking narrowly at zero).
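If it helps, such conservative priors can be written down directly; a minimal sketch in brms syntax (hypothetical variable names and data, not the AMOS model from the question):

library(brms)
# Narrow, zero-centred prior on all regression coefficients shrinks estimates toward zero
cons_prior <- prior(normal(0, 0.5), class = "b")
fit <- brm(outcome ~ mediator1 + predictor1, data = mydata,
           family = bernoulli(), prior = cons_prior)
summary(fit)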
  • asked a question related to Bayesian Methods
Question
3 answers
I have done quite a bit of reading (the manual, the publication, papers that have used the method); however, I still have a question regarding sigma² and obtaining reliable results. The publication states:
"It is important to note that when the method fails to identify the true model, the results obtained give hints that they are not reliable. For instance, the posterior probability of models with either factor is nonnegligible (>15%) and the estimates of sigma² are very high (upper bound of the HPDI well over 1)."
This is one of my latest result:
Highest probability model: 5 (Constant, G3)
          mean    mode    95% HPDI
a0       -4.80   -4.90    [-7.67 ; -1.81]   (Constant)
a3        8.32    8.70    [ 5.02 ; 11.4 ]   (G3)
sigma²    37.9    25.6    [ 11.9 ; 75.9 ]
Am I to conclude that these results are not reliable? What might cause such a large sigma² and unreliable results? I ran the program for a long time so I do not think that's the issue. This problem continues to happen with many other trials that I've done. Does any one have any advice or recommendations? Thanks!
Relevant answer
Answer
Thanks Mabi
  • asked a question related to Bayesian Methods
Question
2 answers
I need the code or command for estimating the parameters of a model with endogeneity through the Bayesian method of moments or Bayesian two-stage least squares. If anyone has code, a command, or a package in any software or language for estimating the endogenous predictors of a model, please mail me; I would be very glad.
Relevant answer
Answer
Thank you, sir, for your wonderful suggestion.
The Bayesian Generalized Method of Moments is one of the best estimation procedures for estimating endogenous independent variables. With the endogeneity problem we do not have full information for the usual likelihood, so instead of the usual likelihood we use a pseudo-likelihood and multiply it with the specified priors. If someone could mail me the command or code for the pseudo-likelihood in the Bayesian approach, I would be very thankful.
  • asked a question related to Bayesian Methods
Question
5 answers
Hello everyone ,
I have a question about Bayesian hierarchical models and simultaneous equations. I want to build two models: one for individual tree growth, and another for individual tree mortality. Both models can be built with Bayesian methods through SAS PROC MCMC or R2WinBUGS.
Obviously, there is some relationship between growth and mortality, so I want to know: can I use simultaneous equations or SUR to estimate the two Bayesian models together?
Could you give me some advice, whether a paper or a code example? Thanks a lot.
Relevant answer
Answer
CHECK THE FILE GIVEN BELOW
  • asked a question related to Bayesian Methods
Question
3 answers
When doing Bayesian analyses, is there a certain effective sample size that is considered the minimum acceptable sample size, within psychology?
Relevant answer
Answer
Hi all,
although the question is 2 months old, the most important answer, in my opinion, seems to be missing, which is a reference to this article:
Schönbrodt, F. D., Wagenmakers, E. J., Zehetleitner, M., & Perugini, M. (2017). Sequential hypothesis testing with Bayes factors: Efficiently testing mean differences. Psychological Methods, 22(2), 322.
Basic message (also brought up against standard null-hypothesis testing): even if your "t-test" (or any other classic 'standard method') produces a significant result, while the Bayes factor for this result is inconclusive (around 1, roughly between 1/3 and 3), the significant result is not worth mentioning (unreliable). What you then do: just go on sampling until your Bayes factor is conclusive, either for or against your hypothesis. :) (Makes things easy, doesn't it?)
Hope this helps.
Best, René
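A minimal R sketch of the sequential stopping rule described above, using the BayesFactor package and simulated data (the thresholds of 3 and 1/3 are conventional, and the cap on n is arbitrary):

library(BayesFactor)
set.seed(1)
x <- rnorm(20, mean = 0.4)                 # initial batch of observations
repeat {
  bf10 <- extractBF(ttestBF(x))$bf         # Bayes factor for a one-sample t-test
  if (bf10 > 3 || bf10 < 1/3 || length(x) >= 500) break
  x <- c(x, rnorm(10, mean = 0.4))         # otherwise, collect 10 more observations
}
c(n = length(x), BF10 = bf10)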
  • asked a question related to Bayesian Methods
Question
4 answers
Using R's brms package, I've run two Bayesian analyses, one with "power" as a continuous predictor (the 'null' model) and one with power + condition + condition x power. The WAICs for the two models are nearly identical: a difference of -.017. This suggests that there are no condition differences.
But, when I examine the credibility intervals of the condition main effect and the interaction, neither one includes zero: [-0.11, -0.03 ] and [0.05, 0.19]. Further complicating matters, when I use the "hypothesis" command in brms to test if each is zero, the evidence ratios (BFs) are .265 and .798 (indicating evidence in favor of the null, right?) but the test tells me that the expected value of zero is outside the range. I don't understand!
I have the same models tested on a different data set with a different condition manipulation, and again the WAICs are very similar, the CIs don't include zero, but now the evidence ratios are 4.38 and 4.84.
I am very confused. The WAICs for both models indicate no effect of condition, but the CIs don't include zero. Furthermore, the BFs indicate a result consistent with the WAIC (no effect) in the first experiment but not in the second experiment.
My guess is that this has something to do with my specification of the prior, but I would have thought that all three metrics would be affected similarly by my specification of the prior. Any ideas?
Relevant answer
As Gelman says in the BDA book, model selection is still an ongoing subject of research. I would trust the posterior intervals more, given that they are grounded in the probability axioms.
If the answer is useful, please recommend it to others.
  • asked a question related to Bayesian Methods
Question
10 answers
Hello, 
I'm using dominant genetic markers to measure genetic differentiation among populations and genetic diversity within populations. Since I'm using dominant markers, I had to calculate allelic frequencies using a Bayesian method (Zhivotovsky 1999). There are a lot of measures in the literature (e.g. FST; F'ST; GST; G'ST; PhiST; Phi'ST; D) and I want to know which I should choose. I also want to compare my results with published works on similar species.
Thank you
P.S. Sorry for my bad english
Relevant answer
Answer
Hi João 
Thank you for the paper. I sent you two papers with regard to genetic differentiation which may be helpful.
Good Luck
  • asked a question related to Bayesian Methods
Question
2 answers
I have both Reaction Time (RT) and Accuracy Rate (ACC, ranging from 0 to 1) data for my 2*2 designed experiment. I want to know whether there is an interaction between the two factors. A two-way ANOVA could easily be used for my purpose, but I would like to calculate the Bayes factor (BF) to further confirm the result. I use the same R code with the MCMCregress function to do this. The RT variable seems to give BF results compatible with the ANOVA. However, the BFs for ACC are pretty small (less than 1) for all my data, even when the interaction effects are very significant based on the ANOVA. The code goes as follows:
library(MCMCpack)   # provides MCMCregress, BayesFactor and PostProbMod

# Full model: main effects plus interaction
model1.ACC <- MCMCregress(ACC ~ cond + precond + cond*precond,
                          data = matrix.bayes,
                          b0 = 0, B0 = 0.001,
                          c0 = 0.001, d0 = 0.001,
                          delta0 = 0, Delta0 = 0.001,
                          marginal.likelihood = "Chib95", mcmc = 50000)

# Reduced model: main effects only
model2.ACC <- MCMCregress(ACC ~ cond + precond,
                          data = matrix.bayes,
                          b0 = 0, B0 = 0.001,
                          c0 = 0.001, d0 = 0.001,
                          delta0 = 0, Delta0 = 0.001,
                          marginal.likelihood = "Chib95", mcmc = 50000)

BF.ACC <- BayesFactor(model1.ACC, model2.ACC)
mod.probs.ACC <- PostProbMod(BF.ACC)
print(BF.ACC)
print(mod.probs.ACC)
Is there anything wrong in my code? Any guidance will be appreciated.
Relevant answer
Answer
Thank you so much, Chiedu.
  • asked a question related to Bayesian Methods
Question
2 answers
To see complete details, please find the attached file. Thanks.
Relevant answer
Answer
Another option is to try the acceptance-rejection method. It works as long as you have a PDF. However, it is quite inefficient, especially for large m.
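For illustration, a minimal R sketch of the acceptance-rejection method, drawing from a Beta(2, 5) target with a uniform proposal (a generic example, not tied to the attached problem):

set.seed(1)
target <- function(x) dbeta(x, 2, 5)
M <- optimize(target, c(0, 1), maximum = TRUE)$objective   # envelope constant for the uniform proposal
n <- 10000
x <- runif(n)                         # proposals from Uniform(0, 1)
u <- runif(n)                         # acceptance variables
accepted <- x[u <= target(x) / M]     # keep proposals falling under the scaled target density
length(accepted) / n                  # acceptance rate, roughly 1/M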
  • asked a question related to Bayesian Methods
Question
10 answers
I have prepared some sets of climatic data, such as temperature and precipitation, and have found a strong correlation between these climatic data and my disease incidence data. There are several ways to demonstrate this correlation, such as the following methods. Which one is most appropriate to publish in this case? Please also let me know if there is any other accepted and reliable method for this case.
These are some methods that I know up to now:
-Linear Regression
-Pearson, Kendall and Spearman coefficients
-ANOVA
-Bayesian Interactive Model
-Principal Component Analysis
Relevant answer
Answer
Mohsen -
If you have weather-related regressor data for predicting disease, that seems very interesting, but I suspect you would also have omitted variable bias in a regression, as it seems hard to believe that weather-related independent variables are all you need. 
It seems you need to concentrate on regression, and the kind depends upon your type of data, best variables to use, and whether or not there may be substantial nonlinearity.   Also, you don't want to overfit or underfit.  Further, heteroscedasticity is often a natural and important part of the error structure as well.
I suggest you consider your subject matter, and perhaps graphical analyses, and from these considerations develop candidate models.  Then you could leave out part of your data for testing, and see which model performs best. 
It may be interesting to see individual regressor variables used to check correlations or regressions, but graphs may show nonlinearity between continuous variables, or other phenomena may be found graphically, and the interaction of regressor variables may be important. So if I understand your situation - and I'm sure I don't understand all of it, as I worked mostly with simple one-regressor continuous variables - it appears best if you were to concentrate on comparing candidate-model performances.
Cheers  - Jim
PS -  If you have count data you might want to look at some discussion such as that at the beginning of the document found at the following link, and consider from there where you want to go:
  • asked a question related to Bayesian Methods
Question
3 answers
Can we use Bayesian processing methods to achieve the reconstruction of a linear scrambler? What should be focused on when using an optimization algorithm to solve the problem?
Relevant answer
It depends on the type of optimization problem. If the problem is modeled by a continuous function and its derivatives are available, you can try to use a gradient method.
If the above is not possible, I suggest a nature-inspired algorithm like those mentioned by Ali.
Regards.
  • asked a question related to Bayesian Methods
Question
1 answer
When using Bayesian methods to estimate mixture models, such as latent class analysis (LCA) and growth mixture models (GMMs), researchers seem to use common priors for class-specific parameters. Of course, this makes sense when using so-called "noninformative" priors, yet Monte Carlo studies often indicate that such priors provide biased and variable estimates, even with relatively large samples.
Confirmatory approaches to mixture modeling, using (weakly) informative priors, perform much better than "noninformative" priors. However, care must be taken when selecting prior distributions (e.g., through careful elicitation from experts or previous research findings).
But consider a scenario in which latent classes are unbalanced (e.g., a two-class model with .80/.20 class proportions).  To my knowledge, most researchers use the same priors for parameters in each class, regardless of differences in relative class size.  Does anybody know of research in which priors for class-specific parameters have been adjusted to equate their informativeness, dependent on the number of observations in each class?  I would be happy to hear of any research where such an approach has been used.
Relevant answer
Answer
Hi,
I wouldn't call this approach 'weakly informative'. It only applies to the case where the number of states is known. In addition, it imposes the explicit assumption that the latent states are identifiable: the 'k-th component' of the mixture is rarely a well-defined object, at least a priori. But I agree that if these restrictions are indeed true, then the specific prior choice would perform reasonably well.
However, there are some studies that use a similar prior setup. For example, Diebolt and Robert (1994) place a prior ordering on the mixture weights (and this is actually a generalization of the scenario mentioned above). Richardson and Green (1997) consider a prior ordering of the means. It is well known that in general these assumptions lead to serious biases in the resulting estimates; therefore it is advised to avoid them and to use flat priors with common prior parameters (see papers related to the label switching problem, beginning with Stephens, 2000).
  • asked a question related to Bayesian Methods
Question
4 answers
I'm using a Bayesian network for reliability analysis of structures. The network is quite big and results in NaN after inference. Different algorithms for inference have been tried, but I always obtain the same NaN results. I found the only way to avoid this problem is to reduce the BN size; does anyone know another solution to avoid this? Thanks a lot
Relevant answer
Answer
I am afraid that I do not fully agree with either of the above answers.
The main reason to get a NaN answer is to ask for an inference with contradictory evidence: P(X|y) with P(y)=0. In that case it is not a question of the method you are using; the problem comes from your incoherent hypotheses. Jaynes, in his book "Probability Theory: The Logic of Science" (Cambridge, 2003), describes this problem in detail and offers interesting considerations on the fact that NaN responses are an interesting way to discover incoherences.
Another possibility is indeed that the precision of the probabilistic coding you are using is not sufficient. There is no completely satisfactory answer to this problem. Improving the coding is of course a possibility (for instance using log-probabilities), but any code has its limits. Picking a threshold is not a completely satisfying solution either, as it changes the arithmetic: you cannot satisfy both \epsilon+\epsilon=\epsilon and 1-\epsilon=1.
Best wishes :)
  • asked a question related to Bayesian Methods
Question
15 answers
I'd like to use approximate Bayesian computation to compare three different demographic scenarios (bottleneck vs. constant population vs. population decline) for several species with microsatellites. Is there any tutorial or paper that gives an outline of the whole process (from simulation to parameter estimation and model selection) on how to do this in R or on the command line (not in a point-and-click way)?
Relevant answer
Answer
Dear Martin,
You can use the R package 'abc' from Csillery et al (http://onlinelibrary.wiley.com/doi/10.1111/j.2041-210X.2011.00179.x/abstract). You can easily find on the internet an R vignette explaining the different functions of the package through examples (i.e., different methods of parameter estimation/model comparison, posterior predictive checking, cross-validation procedures...). The only thing that you cannot do with this package is simulate the microsatellite data, but you can use other command-line programs to simulate the data (e.g. simcoal, ms) and run the ABC analysis with this package.
This is what we have done in the following papers for instance (http://onlinelibrary.wiley.com/doi/10.1111/mec.13401/full and http://onlinelibrary.wiley.com/doi/10.1111/mec.13345/full), where we used the flexible ABCtoolbox framework of Wegmann et al 2010 (BMC Bioinformatics, 11, 116) to build a pipeline of different programs (i.e. simcoal, PDGSpider, arlequin, ADZE and R) that allowed us to calculate many different summary statistics (including the M-ratio/GW statistic) for our simulations. Then, we analysed the output with the abc package.
A similar approach was also used by Marino et al 2013 (http://onlinelibrary.wiley.com/doi/10.1111/mec.12458/full).
You can also call the ms program of Hudson (2002; Bioinformatics, 18, 337–338) as an external program from R, then analyse the output with the package 'abc' or others. This was done in http://onlinelibrary.wiley.com/doi/10.1111/mec.12321/full, and an R script for launching ms from R is available in the Dryad entry associated with this last article.
Hoping that you find this useful,
Ivan
  • asked a question related to Bayesian Methods
Question
1 answer
Are there any recent recommendations for measures of convergence and diagnostic criteria in a mixed treatment comparison? Moreover, apart from the standard ranking, OR/mean treatment effects, and probability of being the best, are there any other outputs which would be interesting to report?
Relevant answer
Answer
The idea of rate of convergence may come from differences in the known information, which will ultimately affect the posterior probability in a Bayesian framework.
  • asked a question related to Bayesian Methods
Question
7 answers
Dear all,
I recently started my PhD and the main topic of my project is Bayesian estimation with small samples. To get a clear overview of the field, I am conducting a comprehensive literature review on the topic and am trying to ensure that I have found all the relevant work in this area. I'm wondering if you know about any papers in which simulation studies are used to study the performance of Bayesian estimation (in all kind of models), when the sample size is small.
If you know about a paper in which these three topics (simulation study, Bayesian estimation and small samples) are discussed, or if you are working on such a paper right now (especially in press or recently accepted papers that would not appear in databases that I am using for my search), could you please let me know?
I would appreciate any help!
Kind regards,
Sanne Smid
Relevant answer
Answer
Hi :)
When dealing with small samples, statistical estimates may become quite diffuse.
One way to improve estimates is to use formalised judgement.
We've recently conducted a simulation study of how Bayesian estimation can help incorporate judgement into modelling, especially when small samples are used.
Perhaps our work will be of some interest to you:
  • asked a question related to Bayesian Methods
Question
5 answers
The usual Bayesian analysis takes a prior and updates it to a posterior, using some given data (and a probability model -> likelihood). The prior represents what we already know (believe) about the model. There is nothing like an absolutely uninformative prior, but there can be priors with very high information content (like sharp peaks at a particular hypothesis).
Now consider the experiments to measure the speed of light by Michelson and Morley. The data show systematic differences between runs and experiments. Given that we know about this variability but performed only a single experiment, how could we include the knowledge about the variability between experiments? I suspect that this should give wider credible intervals.
As I understand it, "non-informative priors" like Jeffreys' prior still assume that the data are "globally representative", so the posterior tells me what to expect given my particular experiment, but it ignores my knowledge that a further experiment would likely give a different estimate.
I know that a possible solution would be to actually perform several experiments, but here I want to consider the case where I have some knowledge about the variability between experiments but cannot actually perform several experiments (due to constraints on availability or money).
Another example: a hospital investigates the effect of a particular treatment of its patients. The data is representative for this hospital, but it is well known that the effects will vary between hospitals. If this information is available (at least approximately), is it possible to consider this in the analysis of the data that was obtained only from a single hospital?
Relevant answer
Answer
This seems to me like a good example in which to apply a random-effects model. A Bayesian version of this model, applied to discrepant measurement results for a fundamental constant (the Planck constant, in this case), was published recently by Bodnar et al. (http://dx.doi.org/10.1088/0026-1394/53/1/S46).
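To make the effect on the credible interval concrete, here is a minimal sketch (not the Bodnar et al. model) in which the between-experiment standard deviation tau is treated as known from prior knowledge; all numbers are purely illustrative.

# Minimal sketch: a single experiment gives an estimate m with standard error s;
# prior knowledge says experiments differ with between-experiment SD tau.
# Under a flat prior for the true value and known tau, the posterior SD is
# inflated from s to sqrt(s^2 + tau^2), widening the interval.
m   <- 299850   # observed speed of light, km/s (illustrative numbers)
s   <- 30       # within-experiment standard error
tau <- 60       # assumed between-experiment SD from prior knowledge
ci_naive  <- m + c(-1, 1) * 1.96 * s                   # ignores between-experiment variation
ci_random <- m + c(-1, 1) * 1.96 * sqrt(s^2 + tau^2)   # random-effects style interval
rbind(ci_naive, ci_random)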
  • asked a question related to Bayesian Methods
Question
4 answers
In the Bayesian approach, the prior probability is often computed from the class proportions in the training set. Is that really suitable? Is there any method to calculate the prior probability without using the training data?
Relevant answer
Answer
From Wikipedia: A class' prior may be calculated by assuming equiprobable classes (i.e., priors = 1 / (number of classes)), or by calculating an estimate for the class probability from the training set (i.e., (prior for a given class) = (number of samples in the class) / (total number of samples)).
Yes it's suitable as it gives you an accurate estimator of your prior information. If you don't want to use it, you can assume equiprobable classes.
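As a quick illustration of the two options, assuming a vector of training labels 'y' (toy labels here):

# Two ways to set class priors from a vector of training labels 'y'
y <- factor(c("spam", "ham", "ham", "spam", "ham"))   # toy labels
prior_empirical <- table(y) / length(y)               # class frequency in the training set
prior_uniform   <- rep(1 / nlevels(y), nlevels(y))    # equiprobable classes
names(prior_uniform) <- levels(y)
prior_empirical; prior_uniform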
  • asked a question related to Bayesian Methods
Question
2 answers
In Crystal Structure Prediction of organic molecular crystals, almost no prior information is used. Usually, only a rough accounting of space group prior probability is included by preferentially sampling the most common space groups, but even this is done to save computer time, not because it is considered a scientifically sound thing to do!
Sometimes CSD database frequencies of nearest neighbor interactions or hydrogen bonding motifs are used to score crystal structures, but this seems to be applied in an ad hoc manner, without mathematical rigor. In the latest blind test, only scoring functions based on energy were used.
Why don't we consider prior probabilities of crystal structure properties and perform the CSP as a Bayesian update? The scoring function would then be a probability evaluated by Bayes' theorem: the probability of a predicted structure would be proportional to its prior probability times the energy-based likelihood provided by the crystal structure prediction. An obvious advantage of this method is that it captures kinetic effects during nucleation and growth.
Relevant answer
Answer
Thanks, Tee. That paper is an excellent introduction and example of what I want to do. 
We should use statistics from large databases of crystals to create a prior probability distribution. This could include things like how common different space groups are, favorable and unfavorable close contacts and density (since crystals tend to be close-packed).
We then perform a CSP calculation and obtain a set of crystals, ranked by energy.  Combining the prior probability with the new energy-based evidence with Bayes' theorem should give a better scoring.
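As a toy illustration of that update (all numbers hypothetical), a Boltzmann factor can serve as the energy-based likelihood and database frequencies as the prior:

# Toy sketch of Bayes' theorem as a CSP scoring function: combine a prior taken
# from database statistics (e.g. space-group frequencies) with an energy-based
# likelihood via a Boltzmann factor. All numbers are hypothetical.
kT         <- 2.5                                  # roughly kJ/mol at ~300 K
rel_energy <- c(0.0, 1.2, 2.8, 3.5)                # energies of 4 candidate structures (kJ/mol)
prior_sg   <- c(0.35, 0.10, 0.35, 0.05)            # prior from space-group frequencies
prior_sg   <- prior_sg / sum(prior_sg)
likelihood <- exp(-rel_energy / kT)                # energy-based evidence
posterior  <- prior_sg * likelihood
posterior  <- posterior / sum(posterior)           # normalise over the candidate set
rank(-posterior)                                   # Bayesian ranking of the candidates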
  • asked a question related to Bayesian Methods
Question
3 answers
Attached are details of a new framework for meta-analysis that we created, which is easy for the mainstream researcher to implement. In addition, it avoids many assumptions involved in the Bayesian method and thus may actually be the more accurate framework. Does anyone with Bayesian interests have any thoughts on using this as a replacement for the Bayesian framework in network meta-analysis?
See this paper:
Relevant answer
Answer
Thanks! Attached is the GPM output of the same data reported by Senn [1] using the RE model. In addition to Senn's output (iteratively re-weighted least squares), the Bayesian and multivariate frequentist output from the same data are attached (from a paper by Rucker). Of note, only the GPM framework is accessible to the mainstream researcher.
1. Senn S, Gavini F, Magrez D, Scheen A. Issues in performing a network meta-analysis. Stat Methods Med Res. 2013; 22(2):169-89.
  • asked a question related to Bayesian Methods
Question
4 answers
In Bayesian networks continuous data is often made discrete, for example:
< 21.5 becomes 0
>= 21.5 and < 43 becomes 1
>= 43 becomes 2
If you run an optimisation script to find the best thresholds, and not define them using equal count/width, will that lead to overfitting? I personally would expect that it will not lead to overfitting.
Relevant answer
Answer
If you wish to learn a Bayes net from data, I think that overfitting is much more likely to come from the number of discretization intervals than from the cutpoints of these intervals. Fortunately, the scores used by search approaches penalize this; see:
Friedman, N., Goldszmidt, M., et al.: Discretizing continuous attributes while learning Bayesian networks. In: ICML, pp. 157–165 (1996)
Monti, S., Cooper, G.F.: A multivariate discretization method for learning Bayesian networks from mixed data. In: UAI, pp. 404–413 (1998)
But you are right, some discretizations may induce overfitting: basically, if you discretize a random variable using two intervals, and you put its smallest value in the database in the first interval and all the other values in the second interval, you may come up with overfitting. Of course, nobody would do that but finding the right discretization method is really not obvious. Personally, I would not advocate using entropy-based approaches because I do not think they are appropriate for Bayes net learning (the kind of information they exploit is not necessarily what is important for a Bayes net).
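For reference, the two "default" schemes mentioned in the question (equal width and equal count) look like this in R; the variable and number of intervals are arbitrary, and any optimisation of the cutpoints themselves should be judged with a penalised score or held-out data to guard against overfitting.

# Two common discretisation schemes for a continuous variable before learning
# a Bayes net; 'x' is a toy variable and k an arbitrary number of intervals.
set.seed(7)
x <- rgamma(500, shape = 2, scale = 10)
k <- 3
eq_width <- cut(x, breaks = k)                          # equal-width bins
eq_freq  <- cut(x, breaks = quantile(x, probs = seq(0, 1, length.out = k + 1)),
                include.lowest = TRUE)                  # equal-frequency bins
table(eq_width); table(eq_freq)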
  • asked a question related to Bayesian Methods
Question
1 answer
Dear colleagues,
For my last manuscript I calculated outliers from Fst values using Mcheza (basically LOSITAN for dominant markers, using DFDIST). The reviewers asked me to verify the outliers using Bayesian methods, so I did the re-calculation with BayeScan, as suggested by the reviewers. The total number of detected outliers was slightly lower with BayeScan (24 vs. 28, out of 284 loci). However, while almost all outliers apparently under positive selection were identified by both approaches (14 out of 17), none of the outliers under apparent balancing selection identified in Mcheza (11) were identified in BayeScan (7), and vice versa. I assumed that the Bayesian approach would be more conservative and identify fewer outlying loci. I did not expect a completely different set of loci under balancing selection, especially when the set of loci under positive selection is almost identical. Any idea how that might have happened or how to interpret it?
Thanks in advance!!!
Regards,
Klaus
Relevant answer
Answer
how about interpretation of these IR spectra?
  • asked a question related to Bayesian Methods
Question
2 answers
I'm very new to SEM and I'm concerned about the reliability of Bayesian estimation of a categorical variable in AMOS; and, if it is reliable, how can I interpret the results? How can I know which category is associated with the outcome in the latent variable? It is assumed that only that categorical variable measures the latent construct.
Relevant answer
Answer
Hi Matteo
I guess if you're not experienced with Bayesian analysis (I'm not, really) you could either read up on this, or alternatively you can use the WLSMV estimator (and declare your predictor as categorical) in Mplus or lavaan in R (free). These two programs also allow you to mix observed and unobserved variables in your model if your predictor is effectively a single measured variable.
Cheers
Tom
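In case it helps, here is a minimal lavaan sketch along those lines, with simulated ordinal indicators (the variable names and data are made up); declaring the items in 'ordered' together with estimator = "WLSMV" treats them as categorical.

# Minimal lavaan sketch: one latent factor measured by three ordinal items.
# Data are simulated only so the example runs; with real data, replace 'dat'
# with your data frame and list your own items in 'ordered'.
library(lavaan)
set.seed(1)
n    <- 300
eta  <- rnorm(n)                                   # latent trait
cuts <- c(-Inf, -0.5, 0.5, Inf)                    # thresholds giving 3 categories
dat  <- data.frame(
  y1 = cut(eta + rnorm(n, sd = 0.7), cuts, labels = FALSE),
  y2 = cut(eta + rnorm(n, sd = 0.7), cuts, labels = FALSE),
  y3 = cut(eta + rnorm(n, sd = 0.7), cuts, labels = FALSE)
)
model <- 'f =~ y1 + y2 + y3'
fit <- cfa(model, data = dat, ordered = c("y1", "y2", "y3"), estimator = "WLSMV")
summary(fit, standardized = TRUE)                  # loadings and thresholds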
  • asked a question related to Bayesian Methods
Question
6 answers
In Bayesian inference we can model prior knowledge using a prior distribution. There is a lot of information available on how to construct flat or weakly informative priors, but I cannot find good examples of how to construct a prior from historical data.
How, would I, for example, deal with the following situations:
1) A manager has to decide whether to stop or pursue the redesign of a website. A pilot study is inconclusive about the expected revenue. But the manager has had five projects in the past, with recorded revenue increases of factor 1.1, 1.2, 1.1, 1.3, and 1.0. How can one add the manager's optimism as a prior to the data from the pilot study?
2) An experiment (N = 20) is conducted where response time is measured in two conditions, A and B. The same experiment has been done in dozens of studies before, and all studies have reported their sample size, the average response time and standard deviation of A and B. Again: How to put that into priors?
I would be very thankful for any pointers to accessible material on the issue.
--Martin
Relevant answer
Answer
Straightforward: use the old data and some reasonable "flat" prior and calculate the posterior for those data. This will be the prior for your new data.
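For example (1) in the question, a simple version of this idea is to summarise the five past revenue factors as a normal prior and combine it with the pilot estimate via a conjugate normal-normal update; the pilot numbers below are hypothetical and the pilot standard error is treated as known for simplicity.

# Prior from historical data (example 1): normal prior from the five past
# revenue factors, updated with a hypothetical pilot estimate.
past      <- c(1.1, 1.2, 1.1, 1.3, 1.0)
prior_mu  <- mean(past)                  # 1.14
prior_sd  <- sd(past)                    # spread across past projects, ~0.11
pilot_est <- 1.05                        # hypothetical pilot estimate of the uplift
pilot_se  <- 0.15                        # hypothetical pilot standard error
w         <- (1 / pilot_se^2) / (1 / pilot_se^2 + 1 / prior_sd^2)
post_mu   <- w * pilot_est + (1 - w) * prior_mu
post_sd   <- sqrt(1 / (1 / pilot_se^2 + 1 / prior_sd^2))
c(posterior_mean = post_mu, posterior_sd = post_sd)
# For example (2), the reported means/SDs of the earlier studies can be pooled
# (e.g. a precision-weighted average) into a normal prior in the same way.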
  • asked a question related to Bayesian Methods
Question
2 answers
Is it possible to use both kinds of features (nominal and numerical) when designing a training set for the naive Bayes machine learning algorithm? Can we use feature frequencies when designing feature vectors for naive Bayes? And when features are represented in the Bayes algorithm, which approach is better for classifying large documents: binary (terms x and y represented as 0/1) or term frequency (e.g. the terms 'geography' and 'chemistry' represented as 10 and 20)?
Relevant answer
Answer
Hello,
In any statistical estimation problem where probability theory applies, you can form joint densities for discrete and continuous random variables. This applies to machine learning algorithms also. For a Bayes' classifier, you are casting the problem as a hypothesis test on a mixed space of discrete and continuous RVs. Since you wish to perform classification rather than estimation, you need to treat the continuous variables as a discrete class. You can do this by testing their membership with respect to a set, e.g. like thresholding. Alternatively you can apply a generalised likelihood ratio test where you maximise the likelihood over the continuous random variables. This leaves you with a likelihood function that has discrete variables.
  • asked a question related to Bayesian Methods
Question
4 answers
Please see my attached question about the prior distribution.
Relevant answer
Answer
You might consider a hierarchical model, where you have n independent Dirichlet distributions with common parameter (like in your reply to Hamideh Sadat Fatemi Ghomi), and then put a prior on the common parameter.  This would typically require MCMC to estimate. But you could also use empirical Bayes to come up with a point estimate for the common prior and solve this model exactly.
  • asked a question related to Bayesian Methods
Question
6 answers
As the models used in Kalman filtering are also Gaussian processes, one would expect that there would be a connection between GP regression and Kalman filtering. If so, could one claim that GP regression is more general than Kalman filtering? In that case, aside from computational efficiency, what would be the other advantages of using Kalman filtering?
Relevant answer
Answer
Thank you Rolf and Paresh for your useful comments.
I read the references that you have cited. However, I am still not sure if these references fully answer my questions. So let me rephrase my questions.
The Kalman filter recursively produces estimates of unknown variables based on the system's dynamics model, known control inputs to the system and multiple sequential measurements.
In Gaussian process regression, the same process can be implemented in order to estimate the unknown variables. The system dynamics can be encoded in a Gaussian kernel and hence incorporated in the covariance matrix, and the inputs and sequential measurements can then, through the use of Bayes' rule, be used to estimate the unknown variables.
My question is the following: is there any problem that can be solved by Kalman filtering which cannot be solved by Gaussian process regression? Here I am not thinking about computational efficiency, but about whether Kalman filtering can offer insights that are not already offered by Gaussian process regression.
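For reference, here is a minimal 1-D Kalman filter sketch for a local-level model; for this model, batch GP regression with a random-walk (Brownian-motion-type) covariance would give the corresponding smoothing posterior, the practical difference being that the filter is a simple O(T) recursion suited to online updating. All settings are illustrative.

# Minimal 1-D Kalman filter for a local-level model
#   x_t = x_{t-1} + w_t,  w_t ~ N(0, q)
#   y_t = x_t + v_t,      v_t ~ N(0, r)
set.seed(3)
n_t <- 100; q <- 0.1; r <- 1
x <- cumsum(rnorm(n_t, sd = sqrt(q)))      # latent state
y <- x + rnorm(n_t, sd = sqrt(r))          # noisy observations
m <- numeric(n_t); P <- numeric(n_t)
m_prev <- 0; P_prev <- 10                  # diffuse initial state
for (t in 1:n_t) {
  m_pred <- m_prev; P_pred <- P_prev + q   # predict
  K      <- P_pred / (P_pred + r)          # Kalman gain
  m[t]   <- m_pred + K * (y[t] - m_pred)   # update
  P[t]   <- (1 - K) * P_pred
  m_prev <- m[t]; P_prev <- P[t]
}
plot(y, col = "grey"); lines(x, lty = 2); lines(m, col = "blue")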
  • asked a question related to Bayesian Methods
Question
5 answers
I would like to know whether the Bayesian approach has been successfully employed in studies of forest ecology and conservation. Relevant literature would be appreciated. Thanks.
Relevant answer
Answer
There are now many examples of Bayesian analyses in forest ecology and conservation (a quick google scholar search turns up >19,000 hits with "Bayesian analysis" and "forest").  I've enclosed a prominent example, now about 10 years old, from a large-scale collaborative project in which I was involved.
How "successful" has Bayesian statistics been in forest ecology and conservation?  My opinion is that one should always strive to conduct statistical analyses in the simplest manner possible to address the question of interest.  If a simple t-test will suffice, then that is preferable.  Bayesian statistics, which requires not only intensive computations but also "deep thinking" about model structure and prior distributions, is almost always much more complex than alternatives.  Whether this additional complexity is warranted depends on the questions posed and the data available.  My observation has been that the added complexity of a Bayesian approach sometimes obscures poor model assumptions, occasionally resulting in publications with both hard-to-decipher statistical methods and poorly substantiated results.  I can't view this as a "success".  However, there are many problems (e.g., estimating mortality rates for rare tree species, as in the paper enclosed) for which Bayesian statistics is an extremely useful, if not indispensable, tool.
  • asked a question related to Bayesian Methods
Question
3 answers
We have a pool of hairs (around 10 for each reference sample, each sample coming from one individual) representing, for example, 15 individuals.
We can characterize each of them by microscopy for several morphological characters, and by microspectrophotometry for colour information.
For each hair, these methods yield one set of discrete/qualitative data (morphological characters) and one set of continuous/quantitative data (colour information). We can analyse them separately; that is not a problem.
But how can we analyse the two sets in a pooled matrix (combining qualitative and quantitative data) following a standardized protocol (one that could be reused later in the same way)?
The questions we need to answer are:
- to test whether all hairs coming from the same person cluster in the same group;
- for an unknown sample (of at least one hair), to find the group to which it is closest;
- and of course, to obtain a statistical estimate of the validity of the clusters, or of the similarity between unknown hairs and the closest clusters.
What is the best way to do this, and what easy-to-use software would be best (like XLSTAT)?
Thank you for your suggestions and ideas.
Relevant answer
Answer
I think the Mahalanobis-Taguchi strategy is appropriate for your problem.
We treat each reference sample as its own class and the unknown individual as another class; this is similar to a one-class SVM.
I have developed an optimal linear discriminant function (LDF) based on the minimum number of misclassifications (MNM).
We discriminate the data consisting of the reference samples together with the unknown individual, evaluate the discriminant scores obtained from the LDFs, and choose the reference class with the largest t-value.
I am willing to collaborate.
  • asked a question related to Bayesian Methods
Question
2 answers
I am curious what people think of this approach, ranging from "great idea", to "bad form, mixing two different paradigms", to "not ready to use yet, as it hasn't been applied very often", etc.
Relevant answer
Answer
See attached file
  • asked a question related to Bayesian Methods
Question
3 answers
I am looking for a good introduction (book/article/chapter) to using Bayesian methods/stats for the design and analysis of psychology experiments. Any suggestions are much appreciated.
Relevant answer
Answer
You might take a look at the work by Trafimow, who proposed replacing null hypothesis significance testing with Bayesian statistical inference. The main paper is:
Trafimow, D. (2003). Hypothesis testing and theory evaluation at the boundaries: Surprising insights from Bayes’s theorem. Psychological Review, 110, 526–535.
There have been a few follow up papers on it as well.  For example, see http://www.socsci.uci.edu/~mdlee/lee_wagenmakers.pdf
  • asked a question related to Bayesian Methods
Question
1 answer
Hello, I am doing a model-based population structure analysis using a Bayesian method in R. I have SNP data from 100 diploid individuals and am using K = 1, 2, 3, 4, 5, 6 and 7. The problem is that for K values up to 3 I see only one colour in the figure generated in R, but for K = 4, 5, 6 and 7 I see a minor colour change at the tips of 5-6 vertical lines. I thought there was only one population, as the majority show only one colour, but when I checked the lnProb values for each K, the highest lnProb value was for K = 7. Please share your thoughts and suggestions. I have attached two figures.
Relevant answer
Answer
Which R package are you using? The output from the package should give you, for each individual, the probability of belonging to each population; check those values before looking at the graphic. I would also suggest extending your K range (try up to 10) and checking the highest lnProb value again before drawing any conclusions.
  • asked a question related to Bayesian Methods
Question
2 answers
Hello everyone,
I'm trying to perform a BSSVS analysis (Bayesian stochastic search variable selection), but when I run BEAST (versions 1.7.5, 1.8.0, 1.8.1 and 1.8.2) under Windows 7 (64-bit) using BEAGLE 2.1, the program displays the error message:
"Underflow calculating likelihood. Attempting a rescaling...
Underflow calculating likelihood. Attempting a rescaling...
State 1000467: State was not correctly restored after reject step.
Likelihood before: -9781.963977882762 Likelihood after: -9022.104627201184
Operator: bitFlip(Locations.indicators) bitFlip(Locations.indicators)"
Alternatively, I tried to run it on an iMac (Yosemite), but it ran for only about 1 million generations, after which the same problem occurred.
Does anyone know how I can solve this problem?
The details of BEAGLE are as follows.
Thanks.
Best regards,
Raindy
BEAGLE resources available:
0 : CPU
Flags: PRECISION_SINGLE PRECISION_DOUBLE COMPUTATION_SYNCH EIGEN_REAL EIGEN_COMPLEX SCALING_MANUAL SCALING_AUTO SCALING_ALWAYS SCALERS_RAW SCALERS_LOG VECTOR_SSE VECTOR_NONE THREADING_NONE PROCESSOR_CPU FRAMEWORK_CPU
1 : Intel(R) HD Graphics 4600 (OpenCL 1.2 )
Global memory (MB): 1624
Clock speed (Ghz): 0.40
Number of multiprocessors: 20
Flags: PRECISION_SINGLE COMPUTATION_SYNCH EIGEN_REAL EIGEN_COMPLEX SCALING_MANUAL SCALING_AUTO SCALING_ALWAYS SCALERS_RAW SCALERS_LOG VECTOR_NONE THREADING_NONE PROCESSOR_GPU FRAMEWORK_OPENCL
Relevant answer
Answer
Thanks. It can  work in BEAST v1.7.4 without the BEAGLE library.
  • asked a question related to Bayesian Methods
Question
5 answers
I have a hierarchical Bayes probit model, but I'm seeking answers in general for any random-coefficients model. Depending on the level at which I calculate DIC (summing the DICs per unit vs. the DIC for the average respondent), I get different answers when comparing the model to a regular probit model. The by-unit sum is worse than the regular probit; the other is better. The hit rate for the HB model is superior to the regular probit, so I don't see why the sum of the by-unit DICs is worse than the regular probit.
Relevant answer
Answer
Kalinda,
I misunderstood your calculations earlier. If you computed pd(i) for each outcome i and then summed them together, you obtained the correct value for pd.
But there's an important issue here: pd is not a measure for each outcome, but a property of the model. It's a measure of effective number of parameters, taking into account the partial pooling of coefficient estimates in multilevel models.
Using a notation similar to R functions:
pd = meanS( sumN( -2 * log p(yi | parss) ) ) - sumN( -2 * log p(yi | pars^) )
Where:
  • parss is a vector of simulated values of pars (i.e., in a MCMC chain);
  • sumN(-2 * log p(|)) is the model deviance, computed for each simulated parameter value and averaged (first term in the right side) and for the average parameter value (second term in the right side).
The problem might be the choice of the likelihood function, p(y|pars). What you choose as pars in the likelihood function changes the "focus" of the model. Spiegelhalter has some nice slides with examples of model focus in the attached link.
Simply put, the model focus is the level of the model for which you want to predict new cases. In your case, for the first level (responses within individuals?):
p(yi|ai, bi, se) = dnorm(yi, mean=ai+bi*xi, sd=se)
Where each parameter is either the posterior expected value (par^) or a simulated posterior value from the MCMC chain (pars). In case you want to focus on the second level (new individuals?), you must integrate over the first level parameters that depend on the second level. This can be achieved with:
p(yi|ai, g0, g1, se, sb) = dnorm(yi, mean=ai+(g0+g1*zi)*xi, sd=sqrt(se^2 + sb^2))
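To make the computation concrete, here is a sketch of pd and DIC from MCMC draws for whichever focus you choose; 'draws' and 'deviance_fun' are hypothetical placeholders for your own posterior sample and likelihood, and the toy normal example at the end is only for illustration.

# pd and DIC from MCMC output for a chosen model focus. 'draws' is a matrix of
# posterior draws (rows = iterations) of the parameters entering the likelihood
# at that focus; 'deviance_fun' returns -2 * sum(log p(y_i | pars)).
dic_from_draws <- function(draws, deviance_fun) {
  dev_draws <- apply(draws, 1, deviance_fun)      # deviance at each simulated value
  dev_mean  <- deviance_fun(colMeans(draws))      # deviance at the posterior mean
  pd        <- mean(dev_draws) - dev_mean         # effective number of parameters
  DIC       <- mean(dev_draws) + pd               # = dev_mean + 2 * pd
  c(pd = pd, DIC = DIC)
}
# Toy illustration: normal likelihood, fake "posterior" draws of (mu, log_sigma)
y     <- rnorm(50, 1, 2)
draws <- cbind(mu = rnorm(2000, mean(y), 0.3), log_sigma = rnorm(2000, log(sd(y)), 0.1))
dev_f <- function(p) -2 * sum(dnorm(y, p[1], exp(p[2]), log = TRUE))
dic_from_draws(draws, dev_f)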
  • asked a question related to Bayesian Methods
Question
3 answers
I have panel data of 200 regions over 20 years. My goal is to estimate a dynamic spatial (space-time) panel model. I would like to employ an extension of model used in Debarsy/Ertur/LeSage (2009): “Interpreting dynamic space-time panel data models” and in Parent/LeSage: “Spatial dynamic panel data models with random effects,” Regional Science & Urban Economics. 2012, Volume 42, Issue 4, pp. 727-738.  See the attached word-file for more information (formulas).
I got three questions:
1.) Is it possible to add lagged exogenous covariates?
Referring to Anselin (2008) in "The Econometrics of Panel Data" (page 647), this would result in an identification problem, since Y_t-1 already includes X_t-1.
2.) I want to use a “twoways” (region and year) fixed effects specification instead of a random effects. Does that lead to any complications?
In my view, it should be possible to de-mean the data first and then apply the MCMC sampler in the usual fashion. Is that correct?
3.) As a last step, I try to add a non-dynamic spatial error term (SAR). Note that the spatial weights (row-stand.) are different for the spatial lag (durbin-part) and spatial error. Is that possible?
Relevant answer
Answer
Your study requires a model re-specification procedure until the identification problem for the specific variables is resolved.
  • asked a question related to Bayesian Methods
Question
4 answers
I have a question concerning the time scale (x-axis) of Bayesian Skyline Plot (BSP) analysis as implemented in BEAST. I am not very familiar with the model underlying this analysis.
I ran BSP analyses on three different datasets (one per population) for the mitochondrial COI (537 bp). The first population included 97 sequences, the second 167 sequences, and the third 48 sequences. The programs (BEAUti + BEAST + Tracer) produced three BSPs with different time scales: the first population, 0 to 450 kya; the second, 0 to 1800 kya; and the third, 0 to 350 kya. My question is: how is the oldest value on the x-axis obtained? It seems to be related to the number of sequences in the original dataset, doesn't it? If so, what is the relation between the number of sequences and the length of the x-axis (time) of the BSP?
Any suggestion will be greatly appreciated.
Ciao
Ferruccio
Relevant answer
Answer
The paper by Atkinson et al (Proc Roy Soc B, 276:367-373, 2009) shed light on the problem addressed in my question. The answer is that Bayesian Skyline Plots are truncated to the median estimate of each population's TMRCA. Problem fixed! Thanks!
  • asked a question related to Bayesian Methods
Question
8 answers
I'm working on a method to determine the prior-predictive value (PPV, also known as the evidence or normalization) in Bayes' formula which, in the case of a multi-modal posterior, should work significantly better than common thermodynamic integration methods.
I would like to test my method on a real example.
Does someone know of an application of Bayes' formula where a high-dimensional, multi-modal posterior hinders an accurate calculation of the PPV, with the data and the explicit form of the likelihood available?
Relevant answer
Answer
  • asked a question related to Bayesian Methods
Question
4 answers
I would like to receive suggestions on applied research if possible. Individually there is plenty of work in each area, but I haven't found much research combining the two. Thanks in advance for your contribution.
Relevant answer
Answer
Hi George,
Please ask Etienne Rivot (Etienne.Rivot@agrocampus-ouest.fr) and Sakari Kuikka (sakari.kuikka@helsinki.fi). They should be able to provide you with some answers. All the best, Pierre
  • asked a question related to Bayesian Methods
Question
6 answers
I get the error message "Fatal exception during plot; Illegal range values, can't calibrate" when I try a Bayesian skyline reconstruction in the Tracer program. I would like to know what is wrong. Is the MCMC run not long enough?
Relevant answer
Answer
Dear Stew,
Thank you for your answer and review paper.
Yes, the genealogy is very shallow in my data (one dominant haplotype occurs in more than 60% of individuals, and the others, most of them singletons and a few haplotypes shared by multiple individuals, are derived from the dominant one by one or two substitutions).
The haplotype network shows a typical star shape, indicating that the population expanded recently.
I once tried BEAST-Tracer to reconstruct a BSP with these haplotype data (omitting similar sequences). I could get a skyline plot, but it was very short and not very informative, because we do not have a reliable time calibration.
The reviewer recommended a BSP, and I tried running the MCMC over several days with all sequence data including 540 individuals, but I am afraid that it either cannot be loaded in Tracer, or it will not be very informative because of the lack of a good calibration for divergence times.
Best regards,
Noriko
  • asked a question related to Bayesian Methods
Question
4 answers
I am investigating different methods for "blind image restoration" and "blind PSF estimation". What are the advantages of non-Bayesian methods, and how can one improve these methods in order to obtain a more precise PSF and image? In Bayesian methods, how can we assign probabilities to f, h and n, and recover the original image from these probabilities? Where can I find code for simulating these methods? I want to simulate a method on standard images and then improve it to get better results. Which method do you think is best to start with?
Relevant answer
Answer
You can also find a useful introduction to blind processing in:
A. Hyvärinen, J. Karhunen, E. Oja, Independent Component Analysis, New York, Wiley, 2001
  • asked a question related to Bayesian Methods
Question
3 answers
I have selected representative strains from each cluster in my maximum likelihood tree for divergence date estimation, but I made sure I covered the entire time period of detection of these strains (1975-2012). Will my exclusion of some of the strains affect the estimated evolutionary rate and times of divergence at the nodes in the BEAST analysis?
Relevant answer
Answer
Since it is a Bayesian method, anything related to the data influences the output. Excluding some taxa should not influence the estimate much, but their distribution across the years may. If you have more taxa from a specific time range, this may pull the estimate in one direction rather than another. The point estimate may not change, but you may get a broader 95% HPD.
  • asked a question related to Bayesian Methods
Question
9 answers
Is the HPD interval the best interval that we could use as an interval estimator?
From the posterior distribution we can form many Bayesian credible intervals/regions. The HPD interval is the shortest among all Bayesian credible intervals at a given level, and much of the literature says that we can use the HPD interval as the interval estimator.
What requirements should an interval satisfy to be a good interval estimator?
Relevant answer
Answer
It is perhaps also worth noting that the HPD approach does not always result in an interval estimate. When the posterior density is multimodal, the HPD region can be a non-interval set estimator, in contrast to quantile limits. As such, the HPD is a truer summary of your posterior. Gelman et al.'s Bayesian Data Analysis, 2nd edition, has a nice short discussion in Section 2.3.
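For the common unimodal case, the HPD interval can be computed directly from posterior draws as the shortest window containing the desired mass; a minimal sketch (with an arbitrary skewed posterior as the example) is below, and coda::HPDinterval() gives an equivalent empirical interval.

# Shortest 95% interval (HPD, assuming a unimodal posterior) versus the
# equal-tailed quantile interval, computed from posterior draws.
hpd_interval <- function(draws, prob = 0.95) {
  s <- sort(draws)
  m <- floor(prob * length(s))                  # number of gaps inside each candidate window
  widths <- s[(m + 1):length(s)] - s[1:(length(s) - m)]
  i <- which.min(widths)                        # shortest window containing prob of the draws
  c(lower = s[i], upper = s[i + m])
}
set.seed(9)
draws <- rgamma(10000, shape = 2, rate = 1)     # skewed posterior as an example
hpd_interval(draws)                             # shortest interval
quantile(draws, c(0.025, 0.975))                # equal-tailed interval
# coda::HPDinterval(coda::mcmc(draws)) gives an equivalent empirical HPD interval.
# For a genuinely multimodal posterior the true HPD region can be a union of
# disjoint intervals, which this sample-based shortcut will not reveal.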
  • asked a question related to Bayesian Methods
Question
55 answers
I'm not a statistician, just an interested MD, so please suggest something relatively understandable. WinBUGS, for example, is out of the question: it has to have a GUI, not only a command line. I have some experience in Object Pascal and Java, but not much. I'd appreciate any advice for a beginner. I'm planning to start with observational longitudinal data (register and survey) and, maybe later, move towards network meta-analysis.
Relevant answer
Answer
I use GeNIe/SMILE (University of Pittsburgh).
SMILE is a fully platform independent library of C++ classes implementing graphical probabilistic and decision-theoretic models, such as Bayesian networks. SMILE is released as a dynamic link library (DLL). There are also several SMILE wrappers, such as SMILE.NET (.NET interface - very good), jSMILE (Java interface) etc. Versions available for Windows, Unix (Solaris), Linux, Mac etc.
GeNIe is an environment for the creation of decision-theoretic models. Its main features are a graphical editor for creating and modifying network models and the SMILE engine. You may develop models in GeNIe and create a custom interface for them using SMILE.
The system has thorough, good documentation.
  • asked a question related to Bayesian Methods
Question
3 answers
As my question may sound a bit bulky, please consider the following example:
You want to estimate the latent abilities of p participants who have answered i items in a questionnaire. The individual answers to the different items can be ascribed with the following model:
Answer_i,p = u_i + u_p;   u_i ~ Normal(0, sd) and u_p ~ Normal(0, sd)
where u_i and u_p are randomly varying parameter manifestations of item difficulty and participant ability, respectively.
Suppose you expect (from previous studies) that participant ability is driven by some covariate C that you can assess and/or experimentally manipulate. A regression of u_p on C does not suggest any significant association between participant ability and this covariate, although the effect (beta) points numerically to the expected direction.
Therefore you try to inform your psychometric model by including a (fully standardized) covariate C as a second level predictor of participant ability:
Answer_i,p = u_i + u_p;  u_i ~ Normal(0, sd), u_p ~ Normal(beta*C, sd)
Surprisingly, the beta of C now differs significantly from zero, as the estimates of participant ability have changed. Thus the estimates of participant ability become informed by a covariate, which did not have any predictive value before.
I have repeatedly encountered this phenomenon. Nonetheless, I must admit that it remains quite elusive to me. Moreover, this seems to introduce some arbitrariness into the interpretation of predictors for latent abilities. Is the covariate indeed informative with regard to participant ability, or is it not?
Does anyone have a solution to this issue? Probably such situations should be dealt with by appropriate model comparison strategies (i.e., one should refrain from interpreting estimates of such nested effects in isolation)?
Relevant answer
Answer
I wouldn't worry about statistical significance. In the example you gave, C accounts for some of the person ability in the top model, and some of the response accuracy in the bottom one. Depending on how the software has done the estimation, the u_p may differ. They (what statisticians call the conditional modes, but which people often refer to as the person residuals) are estimates that, for most models, borrow information from each other. This means some accuracy values that are very different from the others will be estimated as not so different. They are often called shrunken estimates, and a great paper on this is http://statweb.stanford.edu/~ckirby/brad/other/Article1977.pdf (it has a baseball example, and it is the playoff season here in the US, go Dodgers). If, ignoring significance, the two models tell the same story, that would be fine.
This doesn't get exactly to your question about what to do, but that will depend on your purpose. Do you want to measure ability, or ability after conditioning on something.
  • asked a question related to Bayesian Methods
Question
2 answers
Which method is more efficient (fast and reliable) for a corpus covering not more than 20 categories?
Relevant answer
Answer
Rafael, many thanks for sharing your insights and the links.
  • asked a question related to Bayesian Methods
Question
6 answers
Can anyone compare particle filter and Gibbs sampling methods for approximate inference (filtering) and learning (parameter estimation) tasks in general DBNs containing both discrete and continuous hidden random variables? Are both methods applicable? Which one is more computationally efficient? Which one is suitable for which task?
Relevant answer
Answer
I think it depends on your model as well as on which inference task that you are interested in. If you are trying to solve the (on-line) filtering problem, then particle filters would be preferable for sure. Also for off-line inference tasks, smoothing and parameter learning, particle filters are well suited for dynamical models. If you haven't already, I would recommend having a look at particle MCMC,
which tries to get the best of both worlds.
If you have both continuous and discrete latent variables you can consider marginalising (i.e. Rao-Blackwellising) over the discrete states. This will reduce the Monte Carlo variance, but at the cost of increased computational complexity. Whether or not it is worth it depends on the model. This technique can be used both with particle filtering/PMCMC and with Gibbs sampling.
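For concreteness, here is a minimal bootstrap particle filter sketch for a 1-D linear-Gaussian toy model (where a Kalman filter would of course be exact); the model and settings are illustrative only, and for a model with additional discrete states the discrete part could be marginalised out as suggested above.

# Minimal bootstrap particle filter for a 1-D state-space model
#   x_t = 0.9 * x_{t-1} + w_t,  w_t ~ N(0, 1)
#   y_t = x_t + v_t,            v_t ~ N(0, 1)
set.seed(11)
n_t <- 100; n_p <- 1000
x <- numeric(n_t); y <- numeric(n_t)
x[1] <- rnorm(1); y[1] <- x[1] + rnorm(1)
for (t in 2:n_t) { x[t] <- 0.9 * x[t - 1] + rnorm(1); y[t] <- x[t] + rnorm(1) }
particles <- rnorm(n_p)                                 # initial particles
filt_mean <- numeric(n_t)
for (t in 1:n_t) {
  particles <- 0.9 * particles + rnorm(n_p)             # propagate through the dynamics
  w <- dnorm(y[t], mean = particles, sd = 1)            # weight by the likelihood
  w <- w / sum(w)
  particles <- sample(particles, n_p, replace = TRUE, prob = w)   # resample
  filt_mean[t] <- mean(particles)
}
plot(x, type = "l"); lines(filt_mean, col = "blue")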
  • asked a question related to Bayesian Methods
Question
3 answers
I am asking about estimating the location parameter of the logistic distribution by Bayesian methods. I need a proper prior distribution for µ. How do I know whether the prior should be normal or logistic? And how do I find the resulting posterior distribution for µ?
Relevant answer
Answer
See chapter 10 of:
Hoff, Peter D. A first course in Bayesian statistical methods. New York: Springer, 2009.
Except for the normal regression model, standard conjugate prior distributions for generalized linear models do not exist. The book recommends obtaining the posterior distribution in your situation with a Markov chain Monte Carlo (MCMC) method, the Metropolis algorithm.
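Along the lines of that recommendation, here is a minimal random-walk Metropolis sketch in R for the location µ of a logistic likelihood with the scale treated as known and a weakly informative normal prior; the data, prior and tuning values are all illustrative choices, not a recommendation from the book.

# Random-walk Metropolis for the location mu of a logistic likelihood with
# known scale s and a weakly informative Normal(0, 10) prior.
set.seed(5)
x <- rlogis(100, location = 2, scale = 1)     # toy data
s <- 1                                        # scale treated as known here
log_post <- function(mu) sum(dlogis(x, location = mu, scale = s, log = TRUE)) +
                         dnorm(mu, 0, 10, log = TRUE)
n_iter <- 10000
mu     <- numeric(n_iter)
mu[1]  <- median(x)                           # starting value
for (i in 2:n_iter) {
  prop  <- mu[i - 1] + rnorm(1, 0, 0.3)       # random-walk proposal
  ratio <- log_post(prop) - log_post(mu[i - 1])
  mu[i] <- if (log(runif(1)) < ratio) prop else mu[i - 1]
}
mu <- mu[-(1:1000)]                           # discard burn-in
c(post_mean = mean(mu), quantile(mu, c(0.025, 0.975)))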
  • asked a question related to Bayesian Methods