
# Stochastic Modeling - Science topic

Questions related to Stochastic Modeling
Question
In robust optimization, random variables are modeled as uncertain parameters belonging to a convex uncertainty set and the decision-maker protects the system against the worst case within that set.
In the context of nonlinear multi-stage max-min robust optimization problems:
What are the best robustness models such as Strict robustness, Cardinality constrained robustness, Adjustable robustness, Light robustness, Regret robustness, and Recoverable robustness?
How to solve max-min robust optimization problems without linearization/approximations efficiently? Algorithms?
How to approach nested robust optimization problems?
For example, the problem can be security-constrained AC optimal power flow.
To tractably reformulate robust nonlinear constraints, you can use the Fenchel duality scheme proposed by Ben Tal, Hertog and Vial in
"Deriving Robust Counterparts of Nonlinear Uncertain Inequalities"
Also, you can use Affine Decision Rules to deal with the multi-stage decision making structure. Check for example: "Optimality of Affine Policies in Multistage Robust Optimization" by Bertsimas, Iancu and Parrilo.
Question
In fact, I'm working on simulation optimization, and I'm wondering whether there is some traditional simulation in the literature that researchers usually test their methods on. Also, if anyone has stochastic simulations implemented in Julia, Python, C, or any other language and wants to share them with us.
Thank you Robert Boer for your interest in my question. In fact, I am working in the field of simulation optimization, where I am trying to optimize a stochastic objective function that cannot be represented analytically; the only way to evaluate it is stochastic simulation. I am currently using a single-server queue simulation to test the optimization methods (finding the best priorities between the queues to minimize the waiting time), but I wonder whether there is a more realistic example to test on.
Question
X~Poi(lambda) and Y~Poi(2*lambda), with X and Y independent. I am to find the MLE of lambda. I am forming the joint density and extracting the marginal for X. Is this necessary? I already know f(x).
As we know f(x,lambda)
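For what it's worth, no marginal extraction is needed: the joint likelihood of the independent pair gives log L(lambda) = -3*lambda + (x+y)*log(lambda) + const, so the MLE is lambda_hat = (x+y)/3. A quick numerical check in Python (the observations x = 4, y = 7 are made up for illustration):

```python
import math

def log_likelihood(lam, x, y):
    # log of P(X=x; lam) * P(Y=y; 2*lam) for independent Poissons
    return (-lam + x * math.log(lam) - math.lgamma(x + 1)
            - 2 * lam + y * math.log(2 * lam) - math.lgamma(y + 1))

def mle_lambda(x, y):
    # Closed form from d/d(lam) [-3*lam + (x+y)*log(lam)] = 0
    return (x + y) / 3.0

x, y = 4, 7
lam_hat = mle_lambda(x, y)
# lam_hat should beat nearby values of lambda in likelihood
best = all(log_likelihood(lam_hat, x, y) >= log_likelihood(lam_hat + d, x, y)
           for d in (-0.5, -0.1, 0.1, 0.5))
```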
Question
I want to generate Figure 14 from Gillespie's 1977 paper.
Exact Stochastic Simulation of Coupled Chemical Reactions
Can anyone help me with code to generate this figure? (MATLAB, Python)
Yamen s Alharbi: Yes, but if you have something, I would appreciate it.
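Gillespie's direct method itself is only a few lines; here is a minimal Python sketch on the simplest possible network, a single decay reaction X → ∅. This is not the reaction set behind Fig. 14 of the 1977 paper; for that system you would add one propensity per reaction channel and pick the channel with a second uniform draw.

```python
import math
import random

def gillespie_decay(x0, c, t_max, seed=1):
    """Gillespie's direct method for the single reaction X -> 0 with
    stochastic rate constant c (one channel, so only the exponential
    waiting time needs to be sampled)."""
    rng = random.Random(seed)
    t, x = 0.0, x0
    times, counts = [t], [x]
    while t < t_max and x > 0:
        a0 = c * x                               # total propensity
        t += -math.log(1.0 - rng.random()) / a0  # Exp(a0) waiting time
        x -= 1                                   # fire the single reaction
        times.append(t)
        counts.append(x)
    return times, counts

times, counts = gillespie_decay(x0=100, c=1.0, t_max=10.0)
```

With several reactions, `a0` becomes the sum of all propensities and the fired channel is chosen with probability proportional to its propensity.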
Question
The birth and death probabilities are p_i and q_i respectively, and 1 - (p_i + q_i) is the probability of no change in the process. Zero ({0}) is an absorbing state and the state space is {0, 1, 2, ...}. What are the conditions for {0} to be recurrent (positive or null)? Is the set {1, 2, 3, ...} transient? What can we say about the duration of the process until absorption, and about the stationary distribution if it exists?
Every comment is appreciated.
Since {0} is absorbing by assumption, it is always a recurrent state, and every state in {1, 2, ...} is transient (assuming the death probabilities q_i are positive, each state has a positive-probability path into {0} from which there is no return).
Question
I have a two-dimensional stochastic equation of the form
dX = AX dt + BX dV1 + CX dV2
where A, B, and C are matrices, and dV1 and dV2 are mutually correlated colored noises. My question is: if I want to use a Runge-Kutta scheme, do I need to normalize the colored noise so that it has magnitude 1/sqrt(dt) like white noise, or can we use Kloeden and Platen's stochastic Runge-Kutta scheme directly in the colored-noise case?
I am answering my own question; maybe someone will find it helpful.
The answer is: we need to normalize the correlated noise so that it behaves like white noise, i.e. N(0, 1/sqrt(dt)). For instance, if V1 is a colored noise, we normalize it as V1/(std(V1)*sqrt(dt)) ~ N(0, 1/sqrt(dt)).
Question
Stochastic Modelling with Optimal Control
I suggest Stochastic Differential Equations - An Introduction with Applications, by Bernt Øksendal, https://www.springer.com/gp/book/9783540047582
Question
I need random-walk code for a 20*20 matrix in MATLAB, but without repetition: the walker may visit each cell (room) only once.
I do not know why anyone would use MATLAB or python when C compilers are free and you can do anything in C. I have written hundreds of programs and millions of lines of code. I have implemented a random walk in C. If you provide more details, I'd be glad to write the code and send it to you if I don't already have it. My book on Monte Carlo Methods (https://www.amazon.com/dp/B07BHWRDSD) is free on these days: 12/18/19, 12/26/19, 1/3/20, 1/11/20. My book on particle tracking that includes random walk with validation (https://www.amazon.com/dp/B07XRWKS6H) is free on these days: 12/25/19, 1/2/20, 1/10/20, 1/18/20. This simulation of diffusion from the MC text is free online.
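Whatever the language, the core of a non-repeating walk is a visited-set check before each move. A minimal Python sketch on a 20x20 grid (the walk simply stops when it traps itself, which a naive self-avoiding walk usually does before filling the grid; a MATLAB or C port is mechanical):

```python
import random

def self_avoiding_walk(n=20, seed=7):
    """Random walk on an n x n grid that never revisits a cell.
    Stops when no unvisited neighbor remains; returns the path."""
    rng = random.Random(seed)
    pos = (rng.randrange(n), rng.randrange(n))
    visited = {pos}
    path = [pos]
    while True:
        r, c = pos
        moves = [(r + dr, c + dc) for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                 if 0 <= r + dr < n and 0 <= c + dc < n
                 and (r + dr, c + dc) not in visited]
        if not moves:
            return path          # walker is trapped (or grid exhausted)
        pos = rng.choice(moves)
        visited.add(pos)
        path.append(pos)

path = self_avoiding_walk()
```

If you need walks that cover the whole grid, backtracking (depth-first search with random move ordering) is the usual extension.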
Question
Dear all,
I am interested in studying Stochastic Programming problems but I know that one of the many difficulties faced by students is how to reduce the scenarios especially when they encounter a large scenario-tree.
'SCENRED' is a useful solver of GAMS which can reduce the number of scenarios but I could not find any useful materials or manuals of SCENRED on the Internet to show me how we can write GAMS code to reduce scenario numbers properly in Scenred environment.
I would appreciate if you could give me further details about SCENRED solver and its implementation in GAMS platform by giving a simple example.
Question
I have tried ARIMAX modeling in R but did not find clear references. If anyone has tried this in R before, please give me some suggestions or references that would be useful.
you'll get help there immediately regarding any problem with R software.
Question
If an existing microgrid energy management system is deterministic, how can it be redesigned as a real-time probabilistic or stochastic model?
How does one gain expertise in this kind of modeling?
Question
I am trying to solve stochastic inhomogeneous differential equations.
The stochastic inhomogeneous part of the equation is the source of the problem: it is the noise.
I want to model the power density of the noise (especially thermal and diffusion noise) in the time domain as a mathematical process in differential form.
I have attached an instance of the equation in question.
The right-hand side of this equation is the stochastic process that we know as electrical noise.
One model for noise is the Wiener process, which is not differentiable, while white noise is defined as the derivative of the Wiener process.
So we have to integrate both sides of the equation to solve it numerically.
I am looking for a stochastic process that is differentiable.
That would be easier for me, because it would be simpler than the integral form for numerical simulation.
Thank you for looking at my problem.
Question
This map was produced by the CA-Markov module to predict the future land-use map of a study area. The input land-use maps were from 2004 and 2008, with 9 land-use types. Three of them (housing, commercial, and industrial) had 10 classes of growth probability (created using the weight-of-evidence method), with class 1 the least probable and class 10 the most probable. The other 6 classes were extracted from the 2008 land-use map as they were. So the Markov-chain transition-area matrix, the basis land-use map (2008), and a raster group file of the 9 classes (as explained) were the other input files, and the module was asked to project 4 years ahead (to 2012).
Why does the result look like this?
Thank you
I am also facing the same problem. How can I resolve this error?
Question
I want to improve the performance of my MEMS gyro to specification. As we know, the measurement errors of a MEMS gyroscope usually contain deterministic errors and stochastic errors. Focusing only on the stochastic part, we have:
y(t) = w(t)+b(t)+n(t)
where:
{w(t) is "True Angular Rate"}
{b(t) is "Bias Drift"}
{n(t) is "Measurement Noise"}
The bias drift and other noises are usually modeled in a filtering system to compensate for the outputs of gyroscope to improve accuracy. In order to achieve a considerable noise reduction, there's another solution that the true angular rate and bias drift are both modeled to set as the system state vector to design a KF.
Now, if I want to model the true angular rate, how could I do this? I only have a real dynamic test of the gyro that includes the above terms, and I don't know how to determine the parameters required by the different models (such as random walk, first-order Gauss-Markov, or AR) for modeling the true angular rate from an unknown true angular rate signal!
You can also model the scaling errors and angular displacement, so the full model would be
y(t) = S R w(t) + b(t) + n(t),
where S is the matrix of scaling factors and R is the matrix for angular displacement. In practice, however, the biggest contributor to error is the bias b(t). Errors due to scaling and angular displacement are usually low nowadays, because the manufacturing quality of gyro sensors is now quite good.
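On the parameter-identification part of the question: for a first-order Gauss-Markov (equivalently AR(1)) model, b_{k+1} = exp(-dt/tau) * b_k + w_k, the correlation time tau can be estimated from the recorded signal's lag-1 autocorrelation, since phi = exp(-dt/tau). A sketch validated on synthetic data with a known tau (the synthetic series is a stand-in for a real static gyro record):

```python
import math
import random

def fit_gauss_markov_tau(series, dt):
    """Estimate the correlation time tau of a first-order Gauss-Markov
    (AR(1)) process from its lag-1 autocorrelation phi = exp(-dt/tau)."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[k] - mean) * (series[k + 1] - mean) for k in range(n - 1))
    den = sum((s - mean) ** 2 for s in series)
    return -dt / math.log(num / den)

# Synthetic bias-drift record with known correlation time tau
rng = random.Random(0)
dt, tau, sigma = 0.01, 2.0, 0.1
phi = math.exp(-dt / tau)
b, series = 0.0, []
for _ in range(100_000):
    b = phi * b + sigma * rng.gauss(0.0, 1.0)
    series.append(b)
tau_hat = fit_gauss_markov_tau(series, dt)
```

For a fuller noise characterization (angle random walk, bias instability, rate random walk together), Allan-variance analysis of a long static recording is the standard complement to this kind of fit.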
Question
One of the main stability theories for stochastic systems is stochastic Lyapunov stability theory, the analogue of Lyapunov theory for deterministic systems.
The main idea is that for the stochastic system
dx = f(x)dt + g(x)dw_t
the differential operator LV (the infinitesimal generator, playing the role of the derivative of the Lyapunov function) must be negative definite.
There is another assumption in this theory:
f(0) = g(0) = 0,
which implies that at the equilibrium point (here x_e = 0) the disturbance vanishes automatically.
What I want to know is: is this a reasonable assumption?
That is, in an engineering context, is it reasonable to assume that the disturbance vanishes at the equilibrium point?
From my practical experience, if f(0) = 0 then g(0) does not equal 0, because of sensor noise.
Question
Stochastic Modeling and Solution approaches from scratch !!
Good Intro: Grimmet's "Probability and Random Process" covers modeling, and has a full companion of exercises with solutions.
From operations research: Resnick's "Adventures in Stochastic Processes" is great (good coverage of Markov Chains/Processes, Renewal Theory, Queues, etc.)
For continuous processes (SDE): Oksendal's "Stochastic Differential Equations: An Introduction with Applications" is a succinct intro.
For core financial modeling: Shreve's "Stochastic Calculus for Finance II: Continuous-Time Models" is a classic (also known as "Baby Shreve"; it's a more applied and approachable alternative to "Big Shreve", Karatzas & Shreve's "Brownian Motion and Stochastic Calculus").
Modeling with Jump Processes: Cont & Tankov's "Financial Modelling with Jump Processes", very accessible intro to modelling with jump diffusions and Levy processes.
Question
My work is mainly experiment-based research. Moving a step further in the advanced analysis, can you please help me with the following questions?
1- Do you think this topic is linked with dynamic systems analysis? if yes: how this analysis should be done?
2- What kind of theoretical analysis (based on differential equations formulation) could be added to my research (especially to the vortex's stability and/or stochastic factors)?
3- What's your best suggestion for making sure that the results obtained (from experiments) are dependable? (Validation by CFD?)
Every single answer is important to me.
Thank you very much.
Vortex flows are ubiquitous at all scales of matter organization, from quantum systems to large structures of the universe. In the most general mathematical sense, it is useful to look at these structures in a unified way. When trying to organize my ideas in this field, I have encountered a book on the general theory of vortices that I recommend as a valuable source of information placing the subject in a multidisciplinary context; for the synopsis please see:
Regarding the research suggestions, I agree with the previous comments, but I can add some specific answers:
>>Do you think this topic is linked with dynamic systems analysis? if yes: how this analysis should be done?<<
The answer is definitely yes. You can consider the following paper as an illustration of the methods derived from Dynamical Systems Theory.
>>What kind of theoretical analysis (based on differential equations formulation) could be added to my research (especially to the vortex's stability and/or stochastic factors)?<<
The theory of stable and unstable manifolds discussed in the reference above. It is also useful to consult a book by Ottino: The kinematics of mixing: stretching, chaos and transport
>>What's your best suggestion for making sure that the results obtained (from experiments) are dependable? (Validation by CFD?)<<
The best way to obtain reliable results is to set an experiment as carefully as possible. CFD calculations are generally validated by experiment. However, the use of CFD to validate the experimental results is very useful (I always look at numerical simulations as a parallel experiment).
Question
More precisely, in stochastic nonlinear model predictive controllers.
Thanks for the opportunity.
It is my pleasure to help as best I can.
Question
I have historic time series of 40 years of many weather variables. Call each variable's time series A, B, C ... Z for simplicity.
I want to use all 40 year time series for training with the intention of reproducing stochastic and synthetic time series.
Now I can use simple Markov chain or Monte Carlo approaches for the individual variables with great success. However, the relationships between the variables will not be maintained.
I need all variables to relate, such that A has a strong connection to B, but not to C etc.
So when I stochastically generate A, I want that to influence B and not C.
What is the best method to simulate complex inter-dependencies?
Stretch goal: how can this be done in Python 3??
Thanks for any and all help!
Best,
Jamie
Yes, there is definitely some correlation going on in many of these plots, so I think copulas could work to some degree of accuracy. As for the exact setup, I do not quite follow how the variables are supposed to tie together. Generally, copulas can handle many variables, so it may be easier to include even seemingly uncorrelated ones, just in case there is some small correlation. But, then again, I do not really follow all the variables here.
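To make the copula route concrete in Python 3, here is a minimal Gaussian-copula sketch with NumPy/SciPy: map each historic series to normal scores via its ranks, estimate the latent correlation, draw correlated Gaussians, and map back through the empirical quantiles. The three synthetic series below (A and B strongly linked, C unrelated) are stand-ins for the weather variables:

```python
import numpy as np
from scipy.stats import norm

def gaussian_copula_sample(data, n_samples, seed=0):
    """Sample synthetic rows that keep each column's empirical marginal
    and the dependence between columns, via a Gaussian copula."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    n, d = data.shape
    # 1. Normal scores of each column's ranks
    ranks = data.argsort(axis=0).argsort(axis=0)
    z = norm.ppf((ranks + 0.5) / n)
    # 2. Correlation of the latent Gaussian
    corr = np.corrcoef(z, rowvar=False)
    # 3. Correlated Gaussian draws, mapped back through empirical quantiles
    g = rng.multivariate_normal(np.zeros(d), corr, size=n_samples)
    u = norm.cdf(g)
    synthetic = np.empty_like(g)
    for j in range(d):
        synthetic[:, j] = np.quantile(data[:, j], u[:, j])
    return synthetic

# Stand-in weather history: A and B strongly linked, C independent
rng = np.random.default_rng(1)
A = rng.gamma(2.0, 1.0, size=5000)
B = A + 0.1 * rng.normal(size=5000)
C = rng.normal(size=5000)
synth = gaussian_copula_sample(np.column_stack([A, B, C]), 5000, seed=2)
```

A Gaussian copula only captures pairwise, symmetric dependence; for tail dependence (e.g. extremes of rainfall and wind co-occurring), other copula families are worth considering.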
Question
I need help in understanding the role of (random) sampling in implementation of a control system in Simulink. I need a basic, general example to visualize the role of the sampler in a control system, and the way it can be programmed (to be random/event-triggered etc).
Any help in this regard is very much appreciated
Thank you in advance
Hi Samira,
Referring to the Examples 9.3 and 9.4 in Prof. Lewis' book (Optimal and Robust Estimation: With an Introduction to Stochastic Control Theory, 2e), the attached MATLAB example (m-file) shows how to simulate a stochastic control system.
Hope this helps!
Question
Can anyone recommend me a good and comprehensive book/article on stochastic domain growth?
Let me suggest a source of some basic methods: a paper by Andreas C. Aristotelous and Richard Durrett
and some medical applications (models of tumor growth) in papers by K.A. Rejniak, A.R.A. Anderson et al., e.g.
Question
Explain with simple example
Stochastic modeling is a probabilistic approach to mathematical modeling: the randomness of a system can be analyzed using a stochastic model.
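One simple, concrete illustration: the same growth model written deterministically and stochastically. The deterministic version returns a single trajectory; the stochastic version draws its growth rate at every step, so repeated runs differ (the 5% rate and 3% volatility below are arbitrary illustrative numbers):

```python
import random

def deterministic(p0, r, steps):
    """Fixed growth rate r: one and only one possible trajectory."""
    p = p0
    for _ in range(steps):
        p *= (1 + r)
    return p

def stochastic(p0, r, sigma, steps, seed):
    """Growth rate drawn each step around r: every run is different."""
    rng = random.Random(seed)
    p = p0
    for _ in range(steps):
        p *= (1 + rng.gauss(r, sigma))
    return p

d = deterministic(100.0, 0.05, 50)
runs = [stochastic(100.0, 0.05, 0.03, 50, seed=s) for s in range(5)]
```

The deterministic model answers "what will happen"; the stochastic model answers "what is the distribution of what could happen", which is why it is run many times.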
Question
Hi
I am trying to find the steady-state solution of a stochastic differential equation
dy = Ay dt + B1 y dV1 + B2 y dV2
where A, B1, and B2 are operators, and dV1 and dV2 are colored noises.
Is there any way, or any literature, in which the steady-state solution (dy/dt = 0) of a stochastic differential equation has been found? Your help will be appreciated.
Dear Arif, see attached pdf, Chap.4, point 4.5 . Gianluca
Question
I tried the curve fitting toolbox in Matlab but it was limited to 2 independent variables. I read about the linear regression function in Matlab but I am not sure if it can produce the equation governing the relation.
One possibility is the MATLAB function lsqcurvefit; see its documentation,
or type "help lsqcurvefit" in MATLAB.
It allows you to fit any nonlinear function with as many parameters as you want, as far as I understand it. I used it recently to fit 2 parameters while holding another one constant, sometimes changing my guess for the constant parameter.
However, a warning: in my experience, fitting nonlinear functions with 2 or more parameters will very likely end in some local minimum. That means your results may depend strongly on the initial guesses for the fit parameters! I recommend always plotting the resulting fit curve against your data and, with a bit of trial and error, improving the parameter guesses until you are satisfied with the solution (judged by, e.g., the look of the plot or your own weighted measure of deviation between fit curve and data).
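If MATLAB is not a hard requirement, `scipy.optimize.curve_fit` in Python handles any number of independent variables when they are passed as one stacked array and unpacked inside the model function; restarting from a few initial guesses is a cheap guard against local minima. A sketch on synthetic data with known coefficients (the model and numbers are illustrative):

```python
import numpy as np
from scipy.optimize import curve_fit

# Model with three independent variables; X is a (3, N) stack
def model(X, a, b, c, d):
    x, y, z = X
    return a * x + b * y + c * z + d

rng = np.random.default_rng(0)
x, y, z = rng.uniform(0, 1, (3, 200))
true = (2.0, -1.0, 0.5, 3.0)
obs = model((x, y, z), *true) + 0.01 * rng.normal(size=200)

# Try a few starting points and keep the best fit (multi-start guard)
best_p, best_cost = None, np.inf
for p0 in ([1, 1, 1, 1], [0, 0, 0, 0], [-1, 2, 0, 5]):
    p, _ = curve_fit(model, (x, y, z), obs, p0=p0)
    cost = np.sum((model((x, y, z), *p) - obs) ** 2)
    if cost < best_cost:
        best_p, best_cost = p, cost
```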
Question
Hi
Your project sounds really interesting. For your interest, we have developed a novel methodology, referred to as HMM-GP (paper attached), for artificially generating synthetic daily streamflow sequences. HMM-GP, a suite of stochastic modelling techniques, integrates a hidden Markov model (HMM) with the generalised Pareto (GP) distribution. The HMM retains the key statistical characteristics of the observed (input) streamflow records in the synthetic (output) series, but reorders the magnitude, spacing, and frequency of streamflow sequences to simulate realistic alternative (artificial) flow scenarios. These synthetic series can be used in a range of hydrological/hydraulic applications. Moreover, within the HMM-GP framework, a generalised Pareto distribution fitted to values above the 99th percentile allows highly accurate simulation of extreme flows/events.
I would be very happy to hear from you if you have any comments/questions for me.
Best wishes
Sandhya
Question
I am working on a system identification problem. The question of convexity comes up, and I am wondering whether transfer function models and state space models suffer from the local-minimum issue?
I am not sure I understood the question. Convexity and local minima relate to optimization problems and the functions being minimized. Models like transfer functions or state-space representations are not themselves subject to these issues. If the problem is that you are using a numerical optimization algorithm for parameter estimation and are minimizing a function that has local minima, then yes, this can be a problem, but it comes from the optimization algorithm, not from the model itself. You can change the identification method, as Jose suggested, or you can change the cost function.
Question
If we train a data model once on a dataset using a machine learning algorithm, save the model, and then train it again using the same algorithm and the same dataset and data ordering, will the first model be the same as the second?
I would propose a classification of ML algorithms based on their "determinism" in this respect. At one extreme we would have:
(i) those which always produce an identical model when trained on the same dataset with the records presented in the same order, and at the other extreme:
(ii) those which produce a different model each time, with very high variability.
Two reasons why a resulting model varies could be (a) a random element somewhere in the machine learning algorithm itself, such as a random walk, or (b) sampling from a probability distribution to assign a component of an optimization function. More examples would be welcome!
Also, it would be great to compile an inventory of the main ML algorithms based on their "stability" with respect to retraining under the same conditions (i.e. same data in the same order), e.g. decision-tree induction vs. support vector machines vs. neural networks. Any suggestion of an initial list and ranking would be great!
for quite a comprehensive list of methods.
There is an element of chance in the training process. In some software you can get reproducible answers by using something like set.seed() in the R language; using the same seed with the same data will then give the same result, and you can report the software you used along with the seed. In general the different outcomes will be close together, but, as with sampling, you will occasionally get outliers (depending on the seed you choose).
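The set.seed() point translates directly to Python: any stochastic procedure becomes exactly reproducible once its random generator is seeded. A sketch using a bootstrap confidence interval as the stochastic procedure (the data values are arbitrary):

```python
import numpy as np

def bootstrap_mean_ci(data, n_boot=2000, seed=42):
    """95% bootstrap CI for the mean; fixing `seed` makes this
    stochastic procedure exactly reproducible (cf. set.seed() in R)."""
    rng = np.random.default_rng(seed)
    means = [rng.choice(data, size=len(data), replace=True).mean()
             for _ in range(n_boot)]
    return np.percentile(means, [2.5, 97.5])

data = np.array([3.1, 2.7, 3.3, 2.9, 3.0, 3.4, 2.8, 3.2])
run1 = bootstrap_mean_ci(data, seed=42)
run2 = bootstrap_mean_ci(data, seed=42)   # same seed: identical output
run3 = bootstrap_mean_ci(data, seed=7)    # different seed: different output
```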
Question
Why is the threshold value set in the exponential phase? Is there any problem if it is set in the linear phase for real-time PCR?
Your question is not clear.
Question
What is the difference between ARMAX model and Linear regression with ARMA errors? and how does the estimation of the 2 models differ?
An ARMAX model regresses the dependent variable on its own lags and on lagged independent variable(s). A linear regression with ARMA errors, on the other hand, regresses the dependent variable on the independent variable(s) and lets the errors (residuals) follow an ARMA model. ARMAX models are time series models and are estimated with time series methods. The regression-with-ARMA-errors model is estimated with a regression approach, after which an ARMA model is fitted to the residuals (or both parts are estimated jointly by maximum likelihood).
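To make the estimation difference concrete: one classical way to fit a regression with AR(1) errors is Cochrane-Orcutt iteration (OLS on quasi-differenced data, re-estimate rho from the residuals, repeat), which is exactly the two-stage logic described above. A self-contained NumPy sketch on synthetic data with known parameters:

```python
import numpy as np

def cochrane_orcutt(y, x, n_iter=20):
    """Fit y = b0 + b1*x + u with u_t = rho*u_{t-1} + e_t
    (linear regression with AR(1) errors) by iterated quasi-differencing."""
    rho = 0.0
    for _ in range(n_iter):
        # OLS on the quasi-differenced data
        ys = y[1:] - rho * y[:-1]
        xs = x[1:] - rho * x[:-1]
        X = np.column_stack([np.ones_like(xs), xs])
        b = np.linalg.lstsq(X, ys, rcond=None)[0]
        b0, b1 = b[0] / (1 - rho), b[1]   # undo the intercept transform
        # Re-estimate rho from the residuals of the original regression
        u = y - b0 - b1 * x
        rho = (u[:-1] @ u[1:]) / (u[:-1] @ u[:-1])
    return b0, b1, rho

# Synthetic data with known coefficients and AR(1) errors
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
e = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.6 * u[t - 1] + e[t]
y = 1.0 + 2.0 * x + u
b0, b1, rho = cochrane_orcutt(y, x)
```

An ARMAX fit would instead include lagged y terms in the mean equation itself, which is the structural difference the answer above describes.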
Question
Nash-Sutcliffe coefficient is considered to be a very reliable measure of goodness of fit for hydrological models. But can it be used for other than hydrological models?
Nash, J. E.; Sutcliffe, J. V. (1970). "River flow forecasting through conceptual models part I — A discussion of principles". Journal of Hydrology. 10 (3): 282–290. doi:10.1016/0022-1694(70)90255-6
Sure, it can. It shows the fraction of variability explained by the model: 1 (100%) means a perfect model, 0 means the model does no better than always returning the mean value, and a negative NSE means the model is worse than a model that only returns the mean.
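Since the definition only involves observed and simulated series, nothing ties it to hydrology; a minimal implementation makes the three reference points (1, 0, negative) explicit:

```python
def nash_sutcliffe(observed, simulated):
    """NSE = 1 - SSE / total variance of the observations.
    1 = perfect fit, 0 = no better than the observed mean,
    negative = worse than always predicting the mean."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst

obs = [1.0, 2.0, 3.0, 4.0, 5.0]
perfect = nash_sutcliffe(obs, obs)          # perfect model
mean_only = nash_sutcliffe(obs, [3.0] * 5)  # always predict the mean
```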
Question
Why are triangular distributions usually used as input variables for Monte Carlo simulation? How valid are they for this purpose? To me, bell-shaped distributions better replicate most real-world activities.
The triangular distribution is used when you have no idea what the distribution is, but you do have some idea of the minimum possible value, the maximum possible value, and the most likely value of the variable.
For example, say you are interested in how many birthday cards to order this year to maximise profit. You have no idea what the distribution is, but you have a gut feeling that you have never sold more than 2000, never fewer than 500, and in most years about 1500. Here you could use the triangular distribution with minimum 500, maximum 2000, and mode 1500. The normal distribution will not do here, since the minimum, maximum, and modal values suggest the distribution is skewed.
You don't have to use the triangular distribution; you can also use the beta distribution, as in PERT (Program Evaluation and Review Technique) analysis. Again, the minimum, maximum, and modal values are used to derive a beta distribution, which can then be used for Monte Carlo simulation.
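The birthday-card example can be run as an actual Monte Carlo in a few lines; NumPy ships a triangular sampler, and the theoretical mean (min + mode + max)/3 gives a quick sanity check:

```python
import numpy as np

# Birthday-card demand: never below 500, never above 2000, typically 1500
rng = np.random.default_rng(0)
demand = rng.triangular(left=500, mode=1500, right=2000, size=100_000)

expected = demand.mean()               # theory: (500 + 1500 + 2000) / 3
p_over_1800 = (demand > 1800).mean()   # chance of demand above 1800 cards
```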
Question
What is a reasonable range of model time steps for rainfall-runoff modelling? What significant difference could there be when using, say, a 5-minute versus a 60-minute model time step?
The appropriate temporal resolution of a hydrological model will indeed depend on hydroclimatological and geophysical characteristics. This means for instance that for large and slowly responding catchments you can work with a coarser temporal resolution than for a small and quickly responding catchment. If one would have a temporal autocorrelogram of the runoff (if that's the variable of interest), one could estimate the appropriate temporal resolution of the rainfall-runoff model (see for instance the two publications below).
Furthermore, the appropriate temporal resolution will obviously also depend on the aim of the study (high flows, low flows, water balance study etc.) and the availability of data. There should always be a balance between data availability, study aim and model complexity (including spatial and temporal scales).
Question
Hi all!
I am looking for some suggestion on some raster data simulation.
I need to generate a map of a simulated value (ranging 0-1) with clustered spatial autocorrelation.
At the moment I am using the rMatClust() function in spatstat package, generating a Matern Cluster Process of random point pattern, and then transforming the density of points into a raster...
It works perfectly but I'm sure there must be a more 'elegant' way to do it.
Any suggestion?
See the R-INLA website and on-line resources. The SPDE tutorial (http://www.r-inla.org/examples/tutorials/spde-tutorial) has a worked example of simulated dataset, with code available at http://www.math.ntnu.no/inla/r-inla.org/tutorials/spde/R/spde-tutorial-functions.R
Question
Looking for problems faced in industries in the field of manufacturing systems that I can attempt to address as part of my research.
You can work on modeling job-shop scheduling, an NP-hard problem.
Question
1. If A1, A2, ..., An are stochastic matrices, then beyond the fact that A1 A2 ... An is stochastic, what other properties does the product have?
2. What is the classic book to consult on stochastic matrices?
@ Jiangbo
Can you please explain the term "continued product of different stochastic matrices"? My guess is that you want to analyse ways of solving the transition matrices of non-homogeneous Markovian processes on discrete state spaces in continuous time; the word "continued" would then be justified. Indeed, the solution of P'(t) = L(t) \circ P(t) with non-constant intensities L(t) requires limit procedures for the chronological products of the matrices exp(L(t_j) dt_j), which are stochastic. Such procedures are investigated in relation to Feynman path integrals and stochastic differential equations. Hopefully these tips help a little :)
Regards
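On question 1, two properties are easy to check numerically: a finite product of row-stochastic matrices is again row-stochastic, and consequently it keeps the all-ones vector as a right eigenvector with eigenvalue 1. A quick NumPy verification:

```python
import numpy as np

def is_row_stochastic(M, tol=1e-10):
    """Nonnegative entries and every row summing to 1."""
    return bool((M >= -tol).all()) and np.allclose(M.sum(axis=1), 1.0)

def random_stochastic(n, rng):
    """Random n x n row-stochastic matrix (rows normalized to sum 1)."""
    M = rng.random((n, n))
    return M / M.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
mats = [random_stochastic(4, rng) for _ in range(5)]
P = np.linalg.multi_dot(mats)                    # A1 A2 ... A5
ones_preserved = np.allclose(P @ np.ones(4), np.ones(4))
```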
Question
I'm developing a wind-speed map, and I need to find the appropriate wind speed at each contour point for a specific return period. How should I use Monte Carlo simulation to get the wind speed, given that I have nested equations for the wind formula and have developed probability distribution functions for the needed parameters?
Perform the following 4 steps:
1. Define a domain of possible inputs.
2. Generate inputs randomly from a probability distribution over the domain.
3. Perform a deterministic computation on the inputs.
4. Aggregate the results.
I hope this helps :)
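The four steps above can be sketched in code. Every modeling choice below (the Gumbel reference-speed distribution, the normal shear exponent, the power-law profile, the 80 m height, the 50-year return period) is a hypothetical stand-in for the question's nested wind formula and fitted PDFs; only the shape of the computation is the point:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# 1-2. Draw inputs from their (hypothetical) fitted distributions
v_ref = rng.gumbel(loc=25.0, scale=4.0, size=n)    # reference-height speed
alpha = rng.normal(loc=0.14, scale=0.02, size=n)   # power-law shear exponent

# 3. Deterministic (nested) computation on each input sample
v_site = v_ref * (80.0 / 10.0) ** alpha            # speed at 80 m hub height

# 4. Aggregate: 50-year return level = (1 - 1/50) quantile of annual maxima
v_50yr = np.quantile(v_site, 1.0 - 1.0 / 50.0)
```

Repeating the same recipe at each contour point, with that point's fitted parameters, gives the map values.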
Question
Dear all,
We know, owing to a result in R. M. Dudley's book (Real Analysis and Probability, Theorem 2.8.2), that for any separable metric space (S, d) there is a metric e on S, defining the same topology as d, such that (S, e) is totally bounded.
I would like to know whether a function f that is uniformly continuous for the metric d remains uniformly continuous for the constructed metric e.
Regards,
Nathalie
In general, no. Consider an unbounded uniformly continuous function f on an (unbounded) separable metric space (S, d). If f remained uniformly continuous on (S, e), it would extend to the completion of (S, e), which is compact, and hence f would be bounded: a contradiction.
Question
Most of the literature on the recursive maximum likelihood estimates of parameters of a partially observed model seems to be in discrete time, i.e. on Hidden Markov Models (HMMs).
There is quite a strong result for HMMs in
Tadic, V. B. (2010). Analyticity, Convergence, and Convergence Rate of Recursive Maximum-Likelihood Estimation in Hidden Markov Models. IEEE Transactions on Information Theory, 56(12), 6406–6432. http://doi.org/10.1109/TIT.2010.2081110
I'm wondering whether there is similar work in continuous-time models. If they exist, I can't seem to find them. Maybe the problem is still open.
Thank you for any hints!
Yes, I think these might help you:
"Efficient descriptor-vector multiplications in stochastic automata networks"
"Efficient vector-descriptor product exploiting time-memory trade-offs"
"SANGE – Stochastic Automata Networks Generator: a tool to efficiently predict events through structured Markovian models"
Question
Does every probability distribution necessarily have scale, location, and shape parameters? A probability distribution is often characterized by location and scale parameters, and those two are what typically appear in modeling applications; so what is the role of the shape parameter?
There is some confusion about location and scale parameters.
First we have the "natural" parameters such as moments (expected value, variance, third centred moment etc). Second we have the parameters that define the density function of the family of distributions (or the discrete probability function).
Any family of distributions has natural parameters that describe the central tendency, the width and skewness and kurtosis.
A parameter of a family of distributions is called a location parameter if adding a constant a to the random variable changes the parameter to its old value + a.
A parameter is called a scale parameter if multiplying the random variable by a constant c changes the parameter to c times its old value.
For example, for X normally distributed with mu and sigma:
mu is a location parameter and sigma is a scale parameter.
For Y1 = a + X, the expected value of Y1 is mu + a.
For Y2 = c times X, the standard deviation of Y2 equals c times sigma.
Distributions that have location or scale parameters with these properties are special.
The moments above give a rough picture of the distribution of any random variable. Only a few families of distributions have the same shape throughout (for all values of their functional parameters); in general a shape parameter is what allows the skewness, and likewise the kurtosis, to vary within the family.
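The distinction is easy to see numerically with `scipy.stats`, where every distribution takes generic `loc` and `scale` arguments, while shape parameters (such as the gamma's `a`) are distribution-specific and control features like skewness that no shift or rescaling can change:

```python
from scipy.stats import norm, gamma

# Normal: mu is a location parameter, sigma a scale parameter
shifted = norm(loc=2.0, scale=1.0)   # Y1 = X + 2  -> mean shifts by 2
scaled = norm(loc=0.0, scale=3.0)    # Y2 = 3 * X  -> std scales by 3

# Gamma: 'a' is a shape parameter; changing it changes the skewness,
# which no shift or rescaling of the variable can do
skew_a2 = float(gamma(a=2.0).stats(moments='s'))
skew_a20 = float(gamma(a=20.0).stats(moments='s'))
```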
Question
Hello all,
I would like to ask about the following benchmark functions:
Ackley
Bohachevsky 1
Camel 3 hump
Drop wave
Exponential
Paviani
Are these benchmark functions unimodal or multimodal ?
Thanks to all who contribute to the answer.
great my dear
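For at least one of the listed functions the answer is easy to verify numerically: Ackley is multimodal, with the global minimum f = 0 at the origin and local minima near the integer lattice points. Evaluating f(1,1) < f(0.5,0.5) shows the surface rises and then falls again moving away from the origin, which a unimodal function cannot do:

```python
import math

def ackley(x, a=20.0, b=0.2, c=2 * math.pi):
    """Standard Ackley function; global minimum f = 0 at the origin."""
    d = len(x)
    s1 = sum(xi * xi for xi in x) / d
    s2 = sum(math.cos(c * xi) for xi in x) / d
    return -a * math.exp(-b * math.sqrt(s1)) - math.exp(s2) + a + math.e

origin = ackley([0.0, 0.0])   # global minimum
mid = ackley([0.5, 0.5])      # ridge between basins
near = ackley([1.0, 1.0])     # dips again in a neighboring basin
```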
Question
I'm interested in studying the statistical properties of pedestrian flow through subway systems (not within a single subway station, but through the system as a whole, i.e. from station A to station B, etc.). In particular, I want to fit a model to determine the distribution of paths people take through the subways.
I am given a dataset of the topology of a subway system (which stations connect to which) and time series of how many people enter and exit each subway station during each time interval. I want to model this as a network where each node is a subway station with a decay rate to each of its neighboring stations.
I wrote a little more about my ideas on this problem in the included document and am hoping for some advice on how to approach this problem. I'm taking an introductory statistics course now and (this is independent of my course work) this is the first time I've had to imagine a model of my own and fit it.
I'm hoping for some advice from the experts here on how to approach this problem
February 14, 2017, 3:42 PM EST (GMT-5 hours), Gulf Specimen Marine Laboratory & Aquarium, Panacea, Florida, USA
Dear Akiva,
Friesen, M.R.P., R. Gordon & R.D. McLeod (2014). Exploring emergence within social systems with agent based models [invited]. In: Interdisciplinary Applications of Agent-Based Social Simulation and Modeling. D.F. Adamatti, G.P. Dimuro & H. Coelho, (Eds.) Hershey, Pennsylvania, USA IGI Global: 52-71.
Yours, -Dick Gordon <DickGordonCan@gmail.com>
Question
Linking EnKF to HSPF
Dear Amir,
I suggest the following links and attached files on these topics.
-Journal of Hydrology | Vol 519, Part D, Pgs 2661-3692, (27 November ...
-ensemble kalman filters: Topics by WorldWideScience.org
Best regards
Question
For multidimensional optimization problems, I want to find which of the variables (to be optimized) are highly influential on the objective function value, so that I can have an expectation on which parameters would be accurately identified by the optimizer.
Given a domain for each variable, the aim is to identify which of the variables (in their respective domain) has the most influence on the overall objective function value.
The complexity also arises from the fact that I am working on black-box optimization: the objective function is not algebraic but a finite element computation. So I want to reduce the dimension of the problem.
Thanks for the help.
Dear Dr. Chakraborty,
I strongly suggest the so-called analysis of variance (ANOVA) approach.
This method is used to evaluate the influence of each design variable and their interactions on the objective function. Based on the result of the ANOVA, you can successfully reduce the number of design variables by eliminating those that have a small effect on the overall objective function.
Look at the attached paper.
Hope it can help,
Fabrizio
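To make the screening idea above concrete, here is a minimal sketch in Python. Everything in it is a hypothetical stand-in for your actual setup (the `black_box` function plays the role of the finite element computation, and the bin count and sample size are arbitrary): it bins random samples on one input at a time and uses the variance of the bin means as a crude ANOVA-style main effect for that input.

```python
import random

def black_box(x):
    # hypothetical expensive model: x[0] dominates, x[2] is nearly inert
    return 10.0 * x[0] + 2.0 * x[1] ** 2 + 0.1 * x[2]

def main_effect(samples, outputs, var, n_bins=10):
    """Variance of bin-averaged outputs when binning on one input (ANOVA main effect)."""
    bins = [[] for _ in range(n_bins)]
    for x, y in zip(samples, outputs):
        idx = min(int(x[var] * n_bins), n_bins - 1)  # inputs lie in [0, 1)
        bins[idx].append(y)
    means = [sum(b) / len(b) for b in bins if b]
    grand = sum(means) / len(means)
    return sum((m - grand) ** 2 for m in means) / len(means)

rng = random.Random(0)
samples = [[rng.random() for _ in range(3)] for _ in range(5000)]
outputs = [black_box(x) for x in samples]
effects = [main_effect(samples, outputs, v) for v in range(3)]
```

Variables whose main effect sits at the noise floor (here `x[2]`) are candidates for elimination before the real optimization run.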
Question
I have question regarding simulating under mentioned 1D Stochastic Differential Equation in R using Sim.DiffProc package:
dx1 = (b1*x1 − d1*x1) dt + sqrt(b1*x1 + d1*x1) dW1(t)
I have taken this equation from book: Modeling with Ito Stochastic Differential Equations by E. Allen. In the deterministic and diffusion part of equation, b1 and d1 are model parameters representing birth and death rates (for single population approximation of two interacting populations compartment model). Relevant lines of my code are as under (note that i,ve used theta's to represent parameters in my code):
Code (1):
> fx <- expression( theta1*x1-theta2*x1 ) ## drift part
> gx <- expression( (theta3*x1+theta4*x1)^0.5 ) ## diffusion part
> fitmod <- fitsde(data=mydata,drift=fx,diffusion=gx,start = list(theta1=1,
+ theta2=1,theta3=1,theta4=1),pmle="euler")
Or should I model it like this
Code (2):
> fx <- expression( theta1*x1-theta2*x1 )
> gx <- expression( (theta1*x1+theta2*x1)^0.5 )
> fitmod <- fitsde(data=mydata,drift=fx,diffusion=gx,start = list(theta1=1,
+ theta2=1),pmle="euler")
I am not clear whether to use theta1, theta2, theta3, theta4 as in the first case above, or only the two parameters theta1 and theta2 (as in the second case), because in the original model the parameters b1 and d1 (birth and death rates) appearing in the deterministic part are the same as those appearing in the diffusion part.
I cannot find a single example in the Sim.DiffProc package documentation where a parameter is repeated across the drift and diffusion parts as I have done in the second case.
Thanking in anticipation and best regards.
I would use code (2) above, with the two parameters theta1 and theta2.
Also, it is very easy to code this directly without using any packages by applying the Euler–Maruyama approximation method (which is described in E. Allen's book).
Also, see the book by Linda J.S. Allen, which contains all the code for the example problems given in the book, so you may copy it into R directly and run it:
Linda J.S. Allen, An Introduction to Stochastic Processes with Applications to Biology, Second Edition
Also, the following papers contain more examples of somewhat more complicated stochastic differential equations which have been solved in MATLAB (similarly to R) using the Euler–Maruyama approximation:
A.S. Ackleh and S. Hu, Comparison between Stochastic and Deterministic Selection-Mutation Models. Mathematical Biosciences and Engineering, 4(2007), 133-157.
A.S. Ackleh, K. Deng and Q. Huang, Stochastic Juvenile-Adult Models with Application to a Green Tree Frog Population. Journal of Biological Dynamics, 5(2011), 64-83.
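Following the Euler–Maruyama suggestion above, here is a minimal sketch in Python for the SDE dX = (b1 − d1)X dt + sqrt((b1 + d1)X) dW from the question (the parameter values, step count and the non-negativity clipping are illustrative choices, not taken from E. Allen's book):

```python
import math
import random

def euler_maruyama(b1, d1, x0, T, n_steps, rng):
    """Simulate dX = (b1 - d1)*X dt + sqrt((b1 + d1)*X) dW by Euler-Maruyama."""
    dt = T / n_steps
    path = [x0]
    x = x0
    for _ in range(n_steps):
        drift = (b1 - d1) * x
        diffusion = math.sqrt(max((b1 + d1) * x, 0.0))  # guard against a negative argument
        x = x + drift * dt + diffusion * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        x = max(x, 0.0)  # a population cannot go negative
        path.append(x)
    return path

rng = random.Random(42)
path = euler_maruyama(b1=0.6, d1=0.5, x0=100.0, T=10.0, n_steps=1000, rng=rng)
```

Averaging many such paths lets you check the simulation against the deterministic mean x0·exp((b1 − d1)t).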
Question
I would like to emulate an FE model of a stochastic, heterogeneous, anisotropic material subjected to a loading. Due to the high computational cost of this model I would like to replace the FE model by a metamodel. Can anyone give me some suggestions?
Dear Matthias
you should definitely try out Kriging and polynomial chaos expansions as surrogate models.
You can find surrogate modelling in the software UQLab (www.uqlab.com), which is Matlab-based, free of use for academics, and gathers state-of-the-art algorithms for this purpose (but also global sensitivity analysis, reliability, etc.)
Best regards
Bruno
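A rough illustration of the surrogate idea, independent of UQLab: the toy below replaces an "expensive" 1D function by a Gaussian radial-basis interpolant (a much-simplified cousin of Kriging; the stand-in function, node spacing and shape parameter `eps` are all hypothetical choices).

```python
import math

def solve(A, b):
    """Naive Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def rbf_surrogate(xs, ys, eps=10.0):
    """Fit a Gaussian radial-basis interpolant to the training runs (xs, ys)."""
    phi = lambda r: math.exp(-(eps * r) ** 2)
    A = [[phi(abs(xi - xj)) for xj in xs] for xi in xs]
    w = solve(A, ys)
    return lambda x: sum(wi * phi(abs(x - xi)) for wi, xi in zip(w, xs))

expensive = lambda x: math.sin(3.0 * x)   # stand-in for the costly FE model
xs = [i / 10.0 for i in range(11)]        # 11 "FE runs" on [0, 1]
model = rbf_surrogate(xs, [expensive(x) for x in xs])
node_err = max(abs(model(x) - expensive(x)) for x in xs)
mid_err = max(abs(model(0.05 + 0.1 * i) - expensive(0.05 + 0.1 * i)) for i in range(10))
```

The surrogate reproduces the training runs exactly and stays close to the true response between them, which is the whole point: further optimization or UQ loops then query `model` instead of the FE solver.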
Question
My research aims to integrate graphical models (e.g. Bayesian networks) and hydrological models (e.g. the HBV hydrology model). The problem is how to transform the deterministic equations involved in hydrological models into probabilistic relationships between the involved variables (in the form of CPTs for BNs). I'm testing Bayesian networks in order to create a probabilistic model dealing with flood forecasting, and I'm quite surprised that I didn't find any previous developments in this context.
Thank you Firoz Ahmad. But my research does not concern classical flood forecasting systems (based on deterministic equations). These systems generally describe the rainfall-runoff transformation using approximate and simplified representations. They are necessarily uncertain, and their performance is affected by several sources of uncertainty coming from the approximations involved, but also from errors in input data, incomplete knowledge of initial conditions, and uncertainty tied to their parameters. Quantifying these uncertainties is crucial for decision making and for interpreting the results. My idea is to develop a framework to estimate these uncertainties and their propagation through a flood forecasting system using Bayesian networks.
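One common way to turn a deterministic equation into a CPT is Monte-Carlo discretisation: sample the parent variable, push the samples through the equation plus an error term representing model uncertainty, and count which child bin each result lands in. A minimal sketch (the rainfall-runoff-like relation, bin counts and noise level below are invented purely for illustration):

```python
import random

def build_cpt(func, n_parent_bins, n_child_bins, noise_sd, n_samples, rng):
    """Monte-Carlo discretisation of y = func(x) + noise into P(y_bin | x_bin)."""
    counts = [[0] * n_child_bins for _ in range(n_parent_bins)]
    for _ in range(n_samples):
        x = rng.random()                                   # parent sampled on [0, 1)
        y = func(x) + rng.gauss(0.0, noise_sd)             # deterministic law + error term
        xb = min(int(x * n_parent_bins), n_parent_bins - 1)
        yb = min(max(int(y * n_child_bins), 0), n_child_bins - 1)
        counts[xb][yb] += 1
    # normalise each row of counts into a conditional distribution
    return [[c / sum(row) for c in row] for row in counts]

# hypothetical rainfall-runoff-like relation, squashed to [0, 1]
cpt = build_cpt(lambda x: x ** 2, n_parent_bins=5, n_child_bins=5,
                noise_sd=0.05, n_samples=50000, rng=random.Random(4))
```

The noise standard deviation is where the various uncertainty sources (input errors, parameter uncertainty, structural error) can be injected; with zero noise the CPT degenerates towards a deterministic lookup table.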
Question
I know that the polynomial chaos expansion can deal with many distributions, such as the normal, beta and gamma distributions. But if they occur simultaneously in the equations, how can I handle that case?
Hi,
PCE is an alternative to the Monte-Carlo method and was initially created to deal with Gaussian variables. But it can also be used with non-Gaussian random variables.
The question is are your variables independent or not?
There are some techniques for correlated non-Gaussian random variables.
Question
Hello everyone,
Using exploratory methods to track changes in the syntactic preferences of constructions over time, I was wondering if anybody has ever conceived of time (e.g. decades) as a continuous variable in statistical analysis.
For instance, I have a corpus that covers the period between the 1830's and the 1920's (10 decades) and I would like to divide my dataset into, say, 5 clusters of decades.
Discrete time:
- 1830-1840
- 1850-1860
- 1870-1880
- ...
Continuous time:
- 1830-1850
- 1850-1870
- 1870-1890
- ...
What do you think? Knowing that this could be feasible only in exploratory analysis, not in predictive analysis (regression models).
Thanking you in advance!
Thank you Ann Christina Foldenauer. It helps a lot, indeed! The thing is that I created clusters of decades precisely because the second outcome of the binary variable is underrepresented in my dataset (as is often the case in linguistic studies) when the time variable is continuous. I've never used mixed models, only (multiple) linear and binary logistic models. Basically I created the clusters of decades in order to create Multiple Correspondence Analysis maps in which the configuration of the categories would not be affected by data sparseness.
Question
I am looking for a stochastic model that is clearly in use for extreme rainfall generation.
The rainfall will be used for hydraulic models and drainage system models.
A few concepts were developed in the past, but there does not seem to be any toolkit or software with a manual for them, such as:
Neyman-Scott (NS), Bartlett-Lewis, or the DRIP model
thanks
you can find a Matlab code implementing Neyman-Scott here on RG:
we can support you a bit...as we developed this code :)
Question
I have 450,000 data points from a gyroscope and I want to model its stochastic noises. For bias-instability modeling I need the time correlation, and I do not know how to compute it.
Dear Amir Taha,
see the formula for time correlation in the attached file. Here R(t) is the autocorrelation function.
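For completeness, the sample autocorrelation R(τ) can also be estimated directly from the data. The sketch below does this in Python on a first-order Gauss-Markov (AR(1)) process, which is a common stand-in for gyroscope bias instability (the correlation parameter and sample size are illustrative only, not fitted to any real gyroscope):

```python
import random

def autocorr(x, lag):
    """Sample autocorrelation R(lag), normalised so that R(0) = 1."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag)) / (n - lag)
    return cov / var

# hypothetical bias-instability-like noise: first-order Gauss-Markov (AR(1)) process
rng = random.Random(1)
phi = 0.9                     # one-step correlation; sets the correlation time
x = [0.0]
for _ in range(20000):
    x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
r1 = autocorr(x, 1)           # should recover roughly phi
```

Fitting an exponential R(τ) ≈ exp(−τ/τc) to the estimated autocorrelation then yields the correlation time τc used in Gauss-Markov bias models.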
Question
Hello Researchers,
I have developed a "stochastic solar model" for the purpose of long-term distribution system planning. I am aware of the indices researchers commonly use to validate solar models, such as RMSE, MBE, etc.
But I am struggling to find similar literature (or similar solar prediction models) together with the solar data sets that were used for validating the models. I'm also unsure whether it is logical to use the aforementioned indices for validating a "stochastic model", because the index values are not constant across runs.
Kindly let me know your suggestions in this regard. Thanks in advance!
Check the attached article; it may help. Best wishes.
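For what it's worth, the RMSE and MBE indices mentioned in the question are easy to compute per realisation; for a stochastic model one would typically report their distribution over many runs rather than a single trace. A minimal sketch (the observation and prediction values are made up for illustration):

```python
import math

def rmse(pred, obs):
    """Root mean square error between predictions and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def mbe(pred, obs):
    """Mean bias error: positive means the model over-predicts on average."""
    return sum(p - o for p, o in zip(pred, obs)) / len(obs)

obs = [3.0, 5.0, 4.0, 6.0]     # toy observed irradiance values
pred = [2.5, 5.5, 4.0, 7.0]    # toy model output for one stochastic run
err_rmse = rmse(pred, obs)
err_mbe = mbe(pred, obs)
```

Reporting, say, the mean and spread of `err_rmse` over an ensemble of runs addresses the "indices are not constant" concern directly.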
Question
A continuum equation is used to analyze a model of stochastic surface growth based on a Poisson distribution. In this stochastic growth, the flat surface keeps getting rougher as time proceeds, but the correlation length is always zero during the growth process. I am not able to understand why the correlation length is always zero.
I can tell you about Asymmetric Simple Exclusion (ASEP) and related stochastic particle processes, which also have a surface growth representation. There the stationary evolution sees independent GRADIENTS of the surface (which are just the one-site occupation numbers in the particle picture) at any fixed time moment, so no correlation length there. BUT: the real interest is in the space-time correlations E(gradient at time 0, space 0 * gradient at time t, space x). These are far from trivial. Maybe you can take a look at such a quantity in your model?
Question
I am working on High Frequency Trading/ Algorithmic trading. I have to simulate limit order book. Can anyone help me?
try http://arxiv.org/abs/1603.05313 for the data & software
Question
I need to apply Brownian motion model for continuous character (wing length) and I apply the below code
WL.BM<-fitContinuous(tree, mysorteddata$WL, model="BM")
and I got this error:
Warning: no tip labels, order assumed to be the same as in the tree
Fitting BM model:
Error in solve.default(phyvcv) :
Lapack routine dgesv: system is exactly singular.
I appreciate any helps and advice
Regards
Aram
In the estimation, the lower bound could be too small (e.g. negative), making the phylogenetic variance-covariance matrix singular (not positive semidefinite). Try using the bounds option (bounds = list(a = c(X, 0))), with X being the lowest bound you are willing to accept. Also, check this thread... http://permalink.gmane.org/gmane.comp.lang.r.phylo/2376
Question
Hi,
I am currently building a multivariate hydrologic model. It uses several variables (e.g. runoff, evaporation, precipitation and groundwater) to forecast another variable (lake water level). The procedure inevitably involves some data reconstruction. I have some doubts about the degree of accuracy of such a model, even when the data are carefully treated.
The final model would be in the form of regressive-stochastic model based on method of moments.
1. Is it acceptable to do such modelling on synthetic data?
2. What are the criteria for such a model?
3. Are there any similar works on such treatment?
note:
• The reconstruction is done such that the periodicity, similarities, trends, persistence and moments of the distribution are preserved to an acceptable extent.
• A validation set is used for testing the degree of accuracy.
• The degree of reconstruction in data sets varies from just one variable in the middle to 120 data either in past, future or in the middle of the time series.
Babak
It is OK to build a model with some missing data that you have estimated using standard options, as long as you document what you are doing and list the potential limits. I did not necessarily follow you on some things, like the synthetic data. If you have a model based on real data, with a few quality-controlled estimates and well-quantified, supportable values over time, a model may well help determine relationships or predict change. However, if the maximum rainfall in your data is 250 mm, don't expect perfection for a storm outside that boundary with 350 mm; a prediction may be reasonably good, or not. Sometimes we try things and they work out great, and other times not so well. But you have the basic elements of the water balance, so you should be in the ballpark, as we say.
As you know, rain can have a lot of spatial variability, and evapotranspiration is not always easy with vegetation and land-use changes. I assume by runoff you mean a recording stream gauge of water entering the lake, and by groundwater a good geology map and some wells measuring shallow and deep groundwater. Don't add a bunch more monitoring or data-collection points before you try it, but if you come up with abnormalities, such as huge losses or gains you cannot explain, you may need extra measurements to find out what is happening.
Question
I am investigating the relationship between one continuous variable and one ordinal variable. Correlation seems appropriate for this first step. Now I want to explore how a third continuous variable influences the strength and direction of the relationship between the first two variables.
Hi Roy, which type of regression are you using for your first step? A Pearson correlation is difficult, because of the ordinal scaled variable.
One way could be to use an ordinal regression analysis where you enter the control variable as the first predictor. Then the regression coefficient of the second variable is the association between the two variables with the third one controlled for. Another way would be a partial regression, but you have to look if there are alternatives for ordinal scaled variables.
When using Peter's approach keep in mind that you want to control both variables for your third one and with this approach you just control one of them!
Instead of controlling, a better way would be to build an interaction-term out of the two interval scaled variables. Here, you have to deal with ordinal regression again.
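A sketch of the residual-based control idea discussed above, on rank-transformed data (Spearman-style, so the ordinal variable is handled by midranks; the simulated data are purely illustrative): the raw association between x and y is strong, but nearly vanishes once the third variable z is partialled out of both.

```python
import random

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def ranks(a):
    """Midrank transform, so Pearson on ranks = Spearman (handles the ordinal variable)."""
    order = sorted(range(len(a)), key=lambda i: a[i])
    r = [0.0] * len(a)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and a[order[j + 1]] == a[order[i]]:
            j += 1  # extend the tie group
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2.0
        i = j + 1
    return r

def variance(a):
    m = sum(a) / len(a)
    return sum((x - m) ** 2 for x in a) / len(a)

def partial_corr(x, y, z):
    """Correlation of x and y with z partialled out (residuals of simple regressions)."""
    def residuals(a, c):
        beta = pearson(a, c) * (variance(a) / variance(c)) ** 0.5
        ma, mc = sum(a) / len(a), sum(c) / len(c)
        return [ai - ma - beta * (ci - mc) for ai, ci in zip(a, c)]
    return pearson(residuals(x, z), residuals(y, z))

rng = random.Random(7)
z = [rng.gauss(0, 1) for _ in range(3000)]               # the controlling variable
x = [zi + rng.gauss(0, 0.5) for zi in z]                 # continuous, driven by z
y = [round(zi + rng.gauss(0, 0.5)) for zi in z]          # ordinal (integer levels), driven by z
raw = pearson(ranks(x), ranks(y))                        # Spearman-style raw association
controlled = partial_corr(ranks(x), ranks(y), ranks(z))  # association with z controlled
```

Since x and y here are related only through z, the controlled coefficient drops to near zero, which is exactly the diagnostic you want for your third variable.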
Question
Hello,
I want to estimate genetic divergence corrected with the nucleotide substitution model HKY. This model was selected as the most parsimonious by jModelTest. I tried Mega 5.1, but HKY is not one of the models included for computing distances.
The HKY model is equivalent to the F84 model (Felsenstein). Both add a base composition bias term to simpler models with a transition/transversion bias parameter. Check to see if Mega uses F84 instead of HKY.
Otherwise, download SeaView (free), which does provide an HKY distance, as does PAUP (which offers both HKY and F84).
If that doesn't work, tell me which models are included in MEGA and I'll let you know which model to use. If you can use something simpler than GTR without changing the topology or branch lengths, then you will save power and still get an optimal fit as far as the phylogeny goes.
Question
Most of what I am able to uncover is really old and vague.
Here are a few recent reviews, the Hofmann and Hahn paper reviews a number of studies from 2005-2011:
Hofmann H; Hahn S. Characteristics of nursing home residents and physical restraint: a systematic literature review. Journal of Clinical Nursing. 23(21-22):3012-24, 2014
Huang HC; Huang YT; Lin KC; Kuo YF. Risk factors associated with physical restraints in residential aged care facilities: a community-based epidemiological survey in Taiwan. Journal of Advanced Nursing. 70(1):130-43, 2014
Rakhmatullina M; Taub A; Jacob T. Morbidity and mortality associated with the utilization of restraints : a review of literature. Psychiatric Quarterly. 84(4):499-512, 2013
Question
The functional is defined from a convex closed bounded subspace A of a banach space E to the real space R
No; consider the norm (or its square) on an infinite dimensional Hilbert space.
Dirk.
Question
Hello,
I am using a semi-empirical approach to generate synthetic ground motion. This approach is a combination of the Empirical Green's Function and Stochastic Simulation methods. The method doesn't consider any slip value. Do any other models consider slip of the fault? This slip is a variable parameter along the fault length, and we didn't have any ground motion records at the location where the maximum slip occurred for the 2002 Denali earthquake. Similarly, the effect of asperities or boulders has not been included in developing synthetic accelerograms, as far as I know.
Dear Kamyar,
Thank you very much for your response.
Regards,
Chenns
Question
I've solved the Kuramoto model for 100 oscillators and have calculated the phase of each oscillator (theta) at each time step. Now, using these phases, I need to generate two or more signals and measure the synchronization among them as a function of coupling strength with my own method, to check the capability of my method.
Dear Elman
thanks a lot for your response, help and guidance
Best Wishes
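In case it is useful to others reading this thread: from the phases one can form signals s_i(t) = sin(θ_i(t)), and synchronization is conveniently quantified by the Kuramoto order parameter r = |(1/N) Σ_j e^{iθ_j}|. A minimal sketch (the coupling values, frequency spread and step sizes below are arbitrary illustrations): r approaches 1 for strong coupling and stays near 1/√N for uncoupled oscillators.

```python
import cmath
import math
import random

def kuramoto(K, n=100, dt=0.01, steps=2000, seed=3):
    """Euler-integrate the mean-field Kuramoto model; return the final order parameter r."""
    rng = random.Random(seed)
    omega = [rng.gauss(0.0, 0.2) for _ in range(n)]          # natural frequencies
    theta = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    for _ in range(steps):
        # mean-field form: r * e^{i psi} = (1/N) sum_j e^{i theta_j}
        z = sum(cmath.exp(1j * t) for t in theta) / n
        r, psi = abs(z), cmath.phase(z)
        theta = [t + dt * (w + K * r * math.sin(psi - t)) for t, w in zip(theta, omega)]
    z = sum(cmath.exp(1j * t) for t in theta) / n
    return abs(z)

r_strong = kuramoto(K=2.0)   # well above the synchronization threshold
r_zero = kuramoto(K=0.0)     # uncoupled baseline
```

Sweeping K and plotting r(K) against the same measure computed by your own method on the signals sin(θ_i(t)) gives a direct capability check.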
Question
I am trying to build a multivariate time series model (i.e. multivariate ARIMA). The general form of the model is a budget model:
W(t) = f [ X(t), Y(t), Z(t), T(t) ]
where W(t) shows the jump and drop that I mentioned. There is a jump and a following drop in the historical dependent variable (i.e. the output). But this is not just a single data point; the behavior persists over a medium period (several years of a jump and then a following drop). I have attached the diagram of the time series to this post. Since the independent variables are supposed to cause or track the jump and drop, and since I am going to make them independent of each other: do I have to remove the jump from the data or not?
I think changes in input variables should track or result in that jump and drop and it is not proper to remove the jump and the drop.
Dear Babak
you don't have to remove the jumps and drops from the data, as your model will automatically incorporate these changes into the model system parameters. These features may be low-frequency trends that are important clues in your data you don't want to lose. We often have these in computer simulation models for solar thermal and solar power systems in smart microgrid configurations.
Regards
Gerro
Question
What is the state of the art in the statistical modelling of internet traffic generated by different internet activities such as Skype, BitTorrent and browsing? Are there accurate models in the literature underlying these activities?
The following paper studies statistical modelling of VoIP flow of Skype packets:
Statistical analysis and modeling of Skype VoIP flows, N. M. Markovich, U. R. Krieger, Computer Communications, vol. 33, suppl. 1, November 2010, pp. S11–S21.
I attached a pdf file of the above paper.
Best of luck,
Question
I am using a birth and death process.
I prefer Papoulis, "Probability, Random Variables, and Stochastic Processes" 2nd ed., McGraw-Hill 1984
Question
Suppose we have different candidate models proposed for a time series based on the ACF and PACF. The basic equation has a white noise term. In MATLAB, you have the "randn" command to generate normal random numbers, and the parameters can be estimated with the "armax" command.
After parameter estimation (calibration), validation involves comparing the observed data with the predicted values. The problem I am facing is figuring out what length the white noise should be generated at (the data length or a larger population). Secondly, should the white noise sequence be kept fixed across all candidate model validations, or is it allowed to change? If it changes, then the performance indicators such as RMSE, ML, AIC and BIC will also change.
So what should I do?
I'd say that if your sample size / number of repetitions is small enough that you're worried about dramatic shifts in your summary statistics, then you should be looking at doing ensemble-type runs to ensure that you are in the realm of suitably-sized numbers.  I would assume that you would want your results to be robust enough that they are reproducible without specifying your RNG scheme or seed.
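The ensemble point above can be illustrated in a few lines (the zero-prediction "model" and the σ = 1 noise are toy choices, not your ARMA setup): any single realisation gives a noisy RMSE, while the ensemble mean settles near the theoretical value, so model comparisons based on the ensemble are robust to the particular noise sequence.

```python
import math
import random

def rmse(pred, obs):
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

rng = random.Random(0)
n, n_runs = 500, 200
# toy "model vs. observation" gap that is pure white noise with sigma = 1,
# so the theoretical RMSE is 1
run_rmse = []
for _ in range(n_runs):
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
    run_rmse.append(rmse([0.0] * n, noise))
ensemble_mean = sum(run_rmse) / n_runs   # stabilises near 1
spread = max(run_rmse) - min(run_rmse)   # individual runs differ noticeably
```

In other words: rather than worrying about preserving one white-noise sequence across candidates, report each candidate's indicators averaged over an ensemble of realisations.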
Question
Hello anyone
I'm trying to perform an analysis of BSSVS (a Bayesian Stochastic Search Variable Selection), but when I run BEAST (including Version 1.75, 1.8.0, 1.8.1 and 1.8.2) under Windows 7(64bit),  using BEAGLE 2.1, the program displays the error message:
"Underflow calculating likelihood. Attempting a rescaling...
Underflow calculating likelihood. Attempting a rescaling...
State 1000467: State was not correctly restored after reject step.
Likelihood before: -9781.963977882762 Likelihood after: -9022.104627201184
Operator: bitFlip(Locations.indicators) bitFlip(Locations.indicators)"
Alternatively, I tried to run it on an iMac (Yosemite), but it worked only for 1 million generations, after which the same problem occurred.
Does anyone know how I can solve this problem?
The details of the BEAGLE  as follows.
Thanks.
Best regards,
Raindy
BEAGLE resources available:
0 : CPU
Flags: PRECISION_SINGLE PRECISION_DOUBLE COMPUTATION_SYNCH EIGEN_REAL EIGEN_COMPLEX SCALING_MANUAL SCALING_AUTO SCALING_ALWAYS SCALERS_RAW SCALERS_LOG VECTOR_SSE VECTOR_NONE THREADING_NONE PROCESSOR_CPU FRAMEWORK_CPU
1 : Intel(R) HD Graphics 4600 (OpenCL 1.2 )
Global memory (MB): 1624
Clock speed (Ghz): 0.40
Number of multiprocessors: 20
Flags: PRECISION_SINGLE COMPUTATION_SYNCH EIGEN_REAL EIGEN_COMPLEX SCALING_MANUAL SCALING_AUTO SCALING_ALWAYS SCALERS_RAW SCALERS_LOG VECTOR_NONE THREADING_NONE PROCESSOR_GPU FRAMEWORK_OPENCL
Thanks. It works in BEAST v1.7.4 without the BEAGLE library.
Question
Hi All,
I was wondering: what is the voxel-based Monte Carlo method?
Thank you very much.
The term voxel refers to a "volumetric pixel", giving you a virtual representation of a 3D cube, and here it refers to doing a fully 3D Monte Carlo simulation (as opposed to pseudo-3D variants and hybrids). The size and discretisation of the voxels let you trade off the Monte Carlo simulation's accuracy against its computing time. In my experience the term is rather general and does not refer to any specific field, although a famous area of application is indeed the medical field.
Best regards, yours Marcus
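A toy sketch of the voxel idea (the grid size, per-voxel absorption probability and the lattice random walk below are gross simplifications of a real photon-transport code): photons step from voxel to voxel, and absorbed energy is tallied per voxel, which is exactly the kind of 3D dose map a voxel-based Monte Carlo simulation produces.

```python
import random

def voxel_mc(n_photons, grid=(10, 10, 10), absorb_p=0.1, seed=5):
    """Random-walk photons on a voxel grid, tallying absorbed photons per voxel."""
    rng = random.Random(seed)
    tally = {}
    for _ in range(n_photons):
        x, y, z = grid[0] // 2, grid[1] // 2, 0        # photons enter at the top centre
        while 0 <= x < grid[0] and 0 <= y < grid[1] and 0 <= z < grid[2]:
            if rng.random() < absorb_p:                # absorbed in the current voxel
                tally[(x, y, z)] = tally.get((x, y, z), 0) + 1
                break
            axis = rng.randrange(3)                    # crude isotropic step to a neighbour
            step = rng.choice((-1, 1))
            if axis == 0:
                x += step
            elif axis == 1:
                y += step
            else:
                z += step
    return tally

tally = voxel_mc(20000)
absorbed = sum(tally.values())   # the rest of the photons escaped the grid
```

Refining the grid and replacing the lattice step with sampled scattering angles and path lengths is what turns this toy into a genuine voxel-based Monte Carlo transport code.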
Question
Hello, I am currently working on models where energy can be produced using either a clean or dirty technology and investment (in knowledge) reduces the average cost of the clean technology or backstop. A steady state involves using both the dirty and clean technologies when their marginal costs are equal.
I am thinking of including a stochastic process for change in energy prices such that investment in the backstop is feasible only when energy prices are above a certain level (that is to say, investment in knowledge now reduces the future average cost of the backstop but there is also a huge fixed cost in actually using the backstop). Theoretically, I believe that this would involve switching back and forth between clean and dirty technologies. I am looking for any ideas in how to model this. I am attaching my recent publication (basically including stochasticity as I said in my current model).
I am interested in collaborating! any ideas?
Supratim
Hi Joaquim, thanks for your answer! I am not aware of the Baum-Welch algorithm but would definitely look into it.
I would definitely be in touch if I need more help regarding the stochastic derivation of my model.
Supratim
Question