Science topic

# Stochastic Modeling - Science topic

Explore the latest questions and answers in Stochastic Modeling, and find Stochastic Modeling experts.

Questions related to Stochastic Modeling

In robust optimization, random variables are modeled as uncertain parameters belonging to a convex uncertainty set and the decision-maker protects the system against the worst case within that set.

In the context of nonlinear multi-stage max-min robust optimization problems:

Which robustness models work best here: strict robustness, cardinality-constrained robustness, adjustable robustness, light robustness, regret robustness, or recoverable robustness?

How can max-min robust optimization problems be solved efficiently without linearization or approximation? Which algorithms apply?

How to approach nested robust optimization problems?

For example, the problem can be security-constrained AC optimal power flow.

In fact, I'm working on simulation optimization, and I'm wondering whether there is some traditional simulation in the literature that researchers usually test their methods on. Also, if anyone has stochastic simulations implemented in Julia, Python, C, or any other language and is willing to share them with us, I would appreciate it.
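Commonly used testbeds in the simulation-optimization literature include simple queueing and inventory systems. As one example, here is a self-contained Python sketch of an M/M/1 queue (via the Lindley recursion), which has a known analytical answer to check against; all parameter names are illustrative:

```python
import random

def mm1_mean_wait(arrival_rate, service_rate, n_customers, seed=0):
    """Estimate the mean waiting time in an M/M/1 queue via the
    Lindley recursion: W_{n+1} = max(0, W_n + S_n - A_{n+1})."""
    rng = random.Random(seed)
    wait, total = 0.0, 0.0
    for _ in range(n_customers):
        total += wait
        service = rng.expovariate(service_rate)
        interarrival = rng.expovariate(arrival_rate)
        wait = max(0.0, wait + service - interarrival)
    return total / n_customers

# Utilization rho = 0.5; queueing theory gives E[W] = rho/(mu - lambda) = 1.0
w = mm1_mean_wait(arrival_rate=0.5, service_rate=1.0, n_customers=100_000)
```

Because the stationary mean wait is known in closed form, this makes a convenient first benchmark before moving to problems without analytical answers.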

X~Poi(lambda) and Y~Poi(2*lambda), with X and Y independent. I need to find the MLE of lambda. I am forming the joint density and extracting the marginal for X. Is this necessary? I already know f(x).
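For what it's worth: since X and Y are independent, no joint-then-marginal construction is needed; the likelihood is just the product of the two Poisson pmfs. A sketch, treating one observation of each:

```latex
L(\lambda) = \frac{e^{-\lambda}\lambda^{x}}{x!}\cdot\frac{e^{-2\lambda}(2\lambda)^{y}}{y!}
\;\Rightarrow\;
\ell(\lambda) = -3\lambda + (x+y)\ln\lambda + \text{const},
\qquad
\ell'(\lambda)=0 \;\Rightarrow\; \hat\lambda = \frac{X+Y}{3}.
```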

I want to generate Figure 14 from Gillespie's 1977 paper.

Exact Stochastic Simulation of Coupled Chemical Reactions

If anyone can help me with the code to generate the Figure. (MATLAB, PYTHON)
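I can't vouch for the exact reaction set behind Figure 14, but Gillespie's direct method itself is short. Below is a minimal Python sketch for a simple reversible isomerization X1 <-> X2; substitute the paper's reactions, rate constants and initial counts to reproduce the figure (plotting omitted):

```python
import math
import random

def gillespie_ssa(x0, c1, c2, t_max, seed=0):
    """Gillespie's direct method for the reversible isomerization
    X1 <-> X2 (forward rate c1, backward rate c2).
    Returns the jump times and the trajectory of X1."""
    rng = random.Random(seed)
    x1, x2 = x0
    t = 0.0
    times, traj = [0.0], [x1]
    while t < t_max:
        a1, a2 = c1 * x1, c2 * x2               # reaction propensities
        a0 = a1 + a2
        if a0 == 0.0:
            break
        t += -math.log(1.0 - rng.random()) / a0  # exponential waiting time
        if rng.random() * a0 < a1:               # pick which reaction fires
            x1, x2 = x1 - 1, x2 + 1
        else:
            x1, x2 = x1 + 1, x2 - 1
        times.append(t)
        traj.append(x1)
    return times, traj

times, traj = gillespie_ssa((1000, 0), c1=1.0, c2=1.0, t_max=5.0)
```

The same loop structure carries over directly to MATLAB.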

The birth and death probabilities are p_i and q_i respectively, and 1-(p_i+q_i) is the probability of no change in the process. Zero ({0}) is an absorbing state and the state space is {0,1,2, ...}. What are the conditions for {0} to be recurrent (positive or null)? Is the set {1,2,3,...} transient? What can we say about the duration of the process until absorption, the stationary distribution if it exists, and so on?

Every comment is appreciated.
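A sketch of the standard facts for this chain (assuming p_i, q_i > 0 for all i >= 1; please verify against a text such as Karlin and Taylor):

```latex
\rho_n = \prod_{i=1}^{n}\frac{q_i}{p_i}, \qquad
\Pr(\text{absorb at }0 \mid X_0=k) =
\begin{cases}
1, & \sum_{n\ge 1}\rho_n = \infty,\\[4pt]
\dfrac{\sum_{n\ge k}\rho_n}{1+\sum_{n\ge 1}\rho_n}, & \sum_{n\ge 1}\rho_n < \infty.
\end{cases}
```

Since {0} is absorbing it is trivially positive recurrent; every state in {1,2,3,...} is transient (0 is reachable from each of them, and return is then impossible); and the unique stationary distribution is the point mass at 0. Expected absorption times, when finite, follow from first-step analysis of T_k = E[time to reach 0 from k].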

I have a two dimensional stochastic equation like this

dX=AXdt+BXdV1+CXdV2

where A, B and C are matrices, and dV1 and dV2 are correlated with each other and are called colored noises. My question is: if I want to use a Runge-Kutta scheme, do I need to normalize the colored noise and make it of length 1/sqrt(dt) like white noise, or can we use Kloeden and Platen's Runge-Kutta scheme directly in the colored noise case?
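One common workaround (a sketch of the general idea, not a statement about the Kloeden-Platen schemes specifically): represent each colored noise as an Ornstein-Uhlenbeck process driven by white noise, augment the state with the two OU components, and integrate the augmented system with a standard white-noise scheme, so no ad hoc 1/sqrt(dt) normalization is needed. A minimal Euler-type Python sketch with illustrative matrices and parameters:

```python
import numpy as np

def simulate_colored(A, B, C, x0, tau, corr, dt, n_steps, seed=0):
    """Euler scheme for dX = A X dt + B X v1 dt + C X v2 dt, where
    v1, v2 are correlated OU (colored) noises with correlation time tau,
    themselves driven by correlated white-noise increments."""
    rng = np.random.default_rng(seed)
    # Cholesky factor producing white-noise pairs with correlation `corr`
    L = np.linalg.cholesky(np.array([[1.0, corr], [corr, 1.0]]))
    x = np.array(x0, dtype=float)
    v = np.zeros(2)                            # OU states (the colored noises)
    path = [x.copy()]
    for _ in range(n_steps):
        dW = L @ rng.standard_normal(2) * np.sqrt(dt)
        v = v - (v / tau) * dt + dW / tau      # OU update
        x = x + (A @ x) * dt + (B @ x) * v[0] * dt + (C @ x) * v[1] * dt
        path.append(x.copy())
    return np.array(path)

A = np.array([[-1.0, 0.0], [0.0, -1.0]])
B = 0.1 * np.eye(2)
C = 0.1 * np.eye(2)
path = simulate_colored(A, B, C, x0=[1.0, 1.0], tau=0.5, corr=0.6,
                        dt=1e-3, n_steps=5000)
```

Because v1 and v2 have finite correlation time, the noisy terms enter as ordinary (if rough) drift terms, which is why a plain white-noise scheme on the augmented system is admissible.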

Stochastic Modelling with Optimal Control

I need MATLAB code for a random walk on a 20*20 matrix, but without repetition: we can visit each room only once.
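One way to do this (shown as a Python sketch; the same logic ports line-for-line to MATLAB): keep a set of visited cells and, at each step, move uniformly at random to an unvisited neighbor, stopping when none remains. Note that such a self-avoiding walk can get stuck before covering all 400 cells; that is inherent to the method.

```python
import random

def self_avoiding_walk(n=20, seed=0):
    """Random walk on an n x n grid that never revisits a cell.
    Stops when the current cell has no unvisited neighbor."""
    rng = random.Random(seed)
    pos = (rng.randrange(n), rng.randrange(n))
    visited = {pos}
    path = [pos]
    while True:
        r, c = pos
        moves = [(r + dr, c + dc)
                 for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                 if 0 <= r + dr < n and 0 <= c + dc < n
                 and (r + dr, c + dc) not in visited]
        if not moves:
            return path
        pos = rng.choice(moves)
        visited.add(pos)
        path.append(pos)

path = self_avoiding_walk()
```

If you need the walk to cover every cell exactly once, that is a Hamiltonian-path problem and requires backtracking rather than this greedy scheme.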

Dear all,

I am interested in studying Stochastic Programming problems but I know that one of the many difficulties faced by students is how to reduce the scenarios especially when they encounter a large scenario-tree.

'SCENRED' is a useful GAMS tool that can reduce the number of scenarios, but I could not find any useful materials or manuals for SCENRED on the Internet showing how to write GAMS code to reduce the number of scenarios properly in the SCENRED environment.

I would appreciate if you could give me further details about SCENRED solver and its implementation in GAMS platform by giving a simple example.

I have tried ARIMAX modeling in R but did not find clear references. If anyone has tried this before in R, please give me some suggestions or references that would be useful.

If an existing microgrid energy management system is deterministic, how can it be redesigned as a real-time probabilistic or stochastic model?

How does one gain expertise in this kind of modeling?

I am trying to solve stochastic inhomogeneous differential equations.

The stochastic inhomogeneous part of the equation is the source of the problem: it is the noise.

I want to model the power density of the noise (especially thermal and diffusion noise) in the time domain as a mathematical process in differential form.

This map was produced by the CA-Markov module to predict the future land use map of a study area. The input land use maps were from 2004 and 2008, with 9 land use types. Three of them (housing, commercial and industrial) had 10 classes of growth probability (created using the weights-of-evidence method), with class 1 the least probable and class 10 the most probable to grow. The other 6 classes were extracted directly from the 2008 land use map (as they were). So the Markov chain transition-area matrix, the base land use map (the 2008 land use map) and a raster group file of the 9 classes (as explained) were the other input files, and the model was asked to project 4 years ahead (to 2012).

Why does the result look like this?

Thank you

I want to improve the performance of my MEMS gyro relative to its specification. As we know, the measurement errors of a MEMS gyroscope usually contain deterministic errors and stochastic errors. I focus only on the stochastic part, so we have:

y(t) = w(t)+b(t)+n(t)

where:

{w(t) is "True Angular Rate"}

{b(t) is "Bias Drift"}

{n(t) is "Measurement Noise"}

The bias drift and other noises are usually modeled in a filtering system to compensate the gyroscope outputs and improve accuracy. To achieve a considerable noise reduction, another solution is to model both the true angular rate and the bias drift and set them as the system state vector when designing a KF.

Now, if I want to model the true angular rate, how can I do this? I only have a real dynamic test of the gyro that includes the terms above, and I do not know how to determine the parameters required by the different models (such as random walk, first-order Gauss-Markov, or AR) for modeling the true angular rate from an unknown true angular rate signal.
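For the first-order Gauss-Markov / AR(1) candidates, the parameters can be estimated by lagged least squares on the mean-removed signal. A hedged Python sketch, with a synthetic AR(1) series standing in for the gyro record (the numbers are illustrative):

```python
import random

def fit_ar1(x):
    """Estimate AR(1) parameters phi and noise std for
    x[t] = phi * x[t-1] + e[t], by least squares on lagged pairs."""
    mean = sum(x) / len(x)
    y = [v - mean for v in x]
    num = sum(y[t] * y[t - 1] for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for t in range(1, len(y)))
    phi = num / den
    resid = [y[t] - phi * y[t - 1] for t in range(1, len(y))]
    sigma = (sum(r * r for r in resid) / (len(resid) - 1)) ** 0.5
    return phi, sigma

# Synthetic stand-in for a drift record: AR(1) with phi=0.95, sigma=0.1
rng = random.Random(0)
x, v = [], 0.0
for _ in range(20000):
    v = 0.95 * v + rng.gauss(0.0, 0.1)
    x.append(v)
phi_hat, sigma_hat = fit_ar1(x)
```

For a first-order Gauss-Markov process with correlation time T sampled at interval dt, phi = exp(-dt/T), so the correlation time can be recovered as T = -dt/ln(phi_hat). Allan variance analysis is the usual complementary tool for separating the gyro noise terms.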

One of the main stability theories for stochastic systems is stochastic Lyapunov stability theory; it parallels Lyapunov theory for deterministic systems.

The main idea is that for the stochastic system

dx = f(x)dt + g(x)dw_t

the differential operator LV (the infinitesimal generator applied to the Lyapunov function, i.e., its stochastic derivative) must be negative definite.

There is another assumption in this theory:

f(0) = g(0) = 0

which implies that at the equilibrium point (here x_e = 0) the disturbance vanishes automatically.

What I want to know is: is this a reasonable assumption?

That is, in an engineering context, is it reasonable to assume that the disturbance vanishes at the equilibrium point?

Stochastic Modeling and Solution Approaches from Scratch!

My work is mainly experiment-based research. Moving a step further in the advanced analysis, can you please help me with the following questions?

1- Do you think this topic is linked with dynamic systems analysis? If yes, how should this analysis be done?

2- What kind of theoretical analysis (based on differential equation formulations) could be added to my research (especially regarding the vortex's stability and/or stochastic factors)?

3- What's your best suggestion for making sure that the results obtained (from experiments) are dependable? (Validation by CFD?)

Every single answer is important to me.

Thank you very much.

More precisely, in stochastic nonlinear model predictive controllers.

I have historic time series of 40 years of many weather variables. Call each variable's time series A, B, C ... Z for simplicity.

I want to use all 40 year time series for training with the intention of reproducing stochastic and synthetic time series.

Now I can use simple Markov chain or Monte Carlo approaches for individual variables with great success. However, the relationships between the variables will not be maintained.

I need all variables to relate, such that A has a strong connection to B, but not to C etc.

So when I stochastically generate A, I want that to influence B and not C.

What is the best method to simulate complex inter-dependencies?

**Stretch goal:** how can this be done in Python 3?

Thanks for any and all help!

Best,

Jamie
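One standard way to preserve cross-variable dependence is a vector autoregression (VAR): every variable is regressed on the lagged values of all variables, and simulation then propagates the dependencies automatically. A minimal Python 3 sketch with synthetic stand-ins for the weather series (a VAR only captures linear dependence; copula- or resampling-based generators are common alternatives):

```python
import numpy as np

def fit_var1(data):
    """Least-squares fit of a VAR(1): x_t = M @ x_{t-1} + e_t.
    `data` has shape (T, k). Returns M and the residual covariance."""
    X, Y = data[:-1], data[1:]
    M, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ M
    return M.T, np.cov(resid.T)

def simulate_var1(M, cov, x0, n, seed=0):
    """Generate a synthetic series that keeps the fitted dependencies."""
    rng = np.random.default_rng(seed)
    L = np.linalg.cholesky(cov)          # correlated innovations
    out = [np.asarray(x0, float)]
    for _ in range(n):
        out.append(M @ out[-1] + L @ rng.standard_normal(len(x0)))
    return np.array(out)

# Synthetic stand-in: A drives B, while C is unrelated to both
rng = np.random.default_rng(1)
a = np.zeros(4000); b = np.zeros(4000); c = np.zeros(4000)
for t in range(1, 4000):
    a[t] = 0.8 * a[t - 1] + rng.standard_normal()
    b[t] = 0.5 * a[t - 1] + rng.standard_normal()
    c[t] = 0.8 * c[t - 1] + rng.standard_normal()
M, cov = fit_var1(np.column_stack([a, b, c]))
synth = simulate_var1(M, cov, [0.0, 0.0, 0.0], 4000)
```

In the simulated output, A and B remain correlated while A and C stay essentially independent, which is exactly the "A influences B but not C" behaviour described above.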

I need help in understanding the role of (random) sampling in the implementation of a control system in Simulink. I need a basic, general example to visualize the role of the sampler in a control system, and the way it can be programmed (to be random, event-triggered, etc.).

Any help in this regard is very much appreciated

Thank you in advance

Can anyone recommend a good and comprehensive book or article on stochastic domain growth?

Please explain with a simple example.

Hi

I am trying to find the steady-state solution of a stochastic differential equation

dy = Ay dt + B1 y dV1 + B2 y dV2

where A, B1 and B2 are operators, and dV1 and dV2 are colored noises.

Is there any way, or any literature, where the steady-state solution (dy/dt = 0) of a stochastic differential equation has been found? Your help will be appreciated.

I tried the curve fitting toolbox in MATLAB, but it is limited to 2 independent variables. I read about the linear regression function in MATLAB, but I am not sure whether it can produce the equation governing the relation.
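In case it helps: ordinary least squares handles any number of independent variables and directly yields the coefficients of a governing linear relation. A minimal Python sketch with made-up data (MATLAB's backslash operator does the same job):

```python
import numpy as np

# Illustrative data: z depends linearly on three independent variables
rng = np.random.default_rng(0)
x1, x2, x3 = (rng.uniform(-1, 1, 200) for _ in range(3))
z = 2.0 * x1 - 0.5 * x2 + 3.0 * x3 + 1.0 + rng.normal(0, 0.01, 200)

# Design matrix with an intercept column; lstsq returns the coefficients
A = np.column_stack([x1, x2, x3, np.ones_like(x1)])
coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)
# coeffs approximates [2.0, -0.5, 3.0, 1.0], i.e. the governing equation
```

For nonlinear relations, the same design-matrix trick works with any basis functions (powers, interactions) of the inputs.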

Hi

Your project sounds really interesting. For your interest, we have developed a novel methodology, referred to as HMM-GP (paper attached), for artificially generating synthetic daily streamflow sequences. HMM-GP, a suite of stochastic modelling techniques, integrates a Hidden Markov Model (HMM) with the generalised Pareto (GP) distribution. The HMM retains the key statistical characteristics of the observed (input) streamflow records in the synthetic (output) streamflow series but re-orders the magnitude, spacing and frequency of the streamflow sequences to simulate realistically possible alternative (artificial) flow scenarios. These synthetic series could be utilised in a range of hydrological/hydraulic applications. Moreover, within the HMM-GP modelling framework, a generalised Pareto distribution fitted to values above the 99th percentile allows accurate simulation of extreme flows/events.

I would be very happy to hear from you if you have any comments/questions for me.

Best wishes

Sandhya

I am working on a system identification problem. The question of convexity comes up, and I am wondering whether transfer function models and state space models suffer from the local minimum issue.

If we train a data model once on a dataset using a machine learning algorithm, save the model, and then train it again using the same algorithm and the same dataset and data ordering, will the first model be the same as the second?

I would propose a classification of ML algorithms based on their "determinism" in this respect. On one extreme we would have:

(i) those which always produce an identical model when trained on the same dataset with the records presented in the same order, and on the other end we would have:

(ii) those which produce a different model each time with a very high variability.

Two reasons why a resulting model varies could be (a) the machine learning algorithm itself contains a random element somewhere, or (b) a probability distribution is sampled to assign a component of an optimization function. More examples would be welcome!

Also, it would be great to do an inventory of the main ML algorithms based on their "stability" with respect to retraining under the same conditions (i.e. same data in same order). E.g. decision tree induction vs support vector vs neural networks. Any suggestions of an initial list and ranking would be great !


Why is the threshold value set in the exponential phase? If it is set in the linear phase for real-time PCR, is there any problem?

What is the difference between an ARMAX model and linear regression with ARMA errors? And how does the estimation of the two models differ?
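A sketch of the two forms in backshift notation (B), which makes the difference visible; the symbols are generic:

```latex
\text{ARMAX: } A(B)\,y_t = C(B)\,x_t + D(B)\,\varepsilon_t
\;\Longleftrightarrow\;
y_t = \tfrac{C(B)}{A(B)}\,x_t + \tfrac{D(B)}{A(B)}\,\varepsilon_t,
\qquad
\text{regression with ARMA errors: } y_t = \beta\,x_t + u_t,\;
\phi(B)\,u_t = \theta(B)\,\varepsilon_t.
```

In the ARMAX form the same AR polynomial A(B) filters both the regressor effect and the noise, so the coefficients on x_t lose their direct interpretation; in the regression-with-ARMA-errors form, beta keeps its usual regression meaning and only the disturbance is ARMA. That is also why the latter is typically estimated by GLS or maximum likelihood, while ARMAX models are usually fitted by prediction-error methods.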

The Nash-Sutcliffe coefficient is considered a very reliable measure of goodness of fit for hydrological models. But can it be used for models other than hydrological ones?

Nash, J. E.; Sutcliffe, J. V. (1970). "River flow forecasting through conceptual models part I — A discussion of principles". Journal of Hydrology. 10 (3): 282–290. doi:10.1016/0022-1694(70)90255-6

Why are triangular distributions usually used as input variables for Monte Carlo simulation? How valid are triangular distributions for this purpose? To me, bell-shaped distributions are better replications of most real-world activities.
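Part of the answer is practical: a triangular distribution needs only three elicitable numbers (minimum, most likely, maximum), which is often all an expert can supply. A PERT-style beta distribution is a common bell-shaped alternative built from the same three numbers. A quick Python comparison with illustrative values:

```python
import random

rng = random.Random(0)
low, mode, high = 10.0, 14.0, 30.0

# Triangular: defined entirely by (min, mode, max)
tri = [rng.triangular(low, high, mode) for _ in range(100_000)]

# PERT-style beta: a bell-ish shape from the same three numbers
# (standard PERT parameterisation with lambda = 4)
a = 1 + 4 * (mode - low) / (high - low)
b = 1 + 4 * (high - mode) / (high - low)
pert = [low + (high - low) * rng.betavariate(a, b) for _ in range(100_000)]

tri_mean = sum(tri) / len(tri)     # theory: (low + mode + high)/3 = 18.0
pert_mean = sum(pert) / len(pert)  # theory: (low + 4*mode + high)/6 = 16.0
```

Note that the triangular distribution puts more weight in the tails than the PERT shape, which can matter when the simulation output is tail-sensitive.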

What is a reasonable range of model time steps for rainfall-runoff modelling? What could be the significant difference when using, say, a 5-minute versus a 60-minute model time step?

Hi all!

I am looking for some suggestion on some raster data simulation.

I need to generate a map of a simulated value (ranging 0-1) with clustered spatial autocorrelation.

At the moment I am using the rMatClust() function in spatstat package, generating a Matern Cluster Process of random point pattern, and then transforming the density of points into a raster...

It works perfectly but I'm sure there must be a more 'elegant' way to do it.

Any suggestion?
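One alternative that skips the point-process detour entirely: smooth white noise until the desired correlation length appears, then rescale to [0, 1]. A Python sketch (a crude stand-in for a proper Gaussian random field; note that np.roll makes the edges wrap around):

```python
import numpy as np

def autocorrelated_raster(n=100, smooth_passes=30, seed=0):
    """Generate an n x n raster of values in [0, 1] with clustered
    spatial autocorrelation, by repeatedly averaging white noise with
    its 4-neighbourhood (a crude Gaussian-random-field approximation)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n, n))
    for _ in range(smooth_passes):
        z = 0.2 * (z + np.roll(z, 1, 0) + np.roll(z, -1, 0)
                   + np.roll(z, 1, 1) + np.roll(z, -1, 1))
    return (z - z.min()) / (z.max() - z.min())   # rescale to [0, 1]

r = autocorrelated_raster()
```

More smoothing passes give larger clusters; for rigorous control of the covariance structure, an explicit Gaussian random field simulator (e.g. spectral or Cholesky methods) would be the more "elegant" route.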

Looking for problems faced in industries in the field of manufacturing systems that I can attempt to address as part of my research.

1. If A_1, A_2, ..., A_n are stochastic matrices, then besides the fact that the product A_1 A_2 ... A_n is stochastic, what other properties does it have?

2. What is the classic book to consult on stochastic matrices?
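On the first point: products of row-stochastic matrices stay row-stochastic (nonnegative rows summing to 1), so the all-ones vector remains a right eigenvector with eigenvalue 1 and the spectral radius stays 1; for strictly positive factors, long products converge toward rank one (weak ergodicity). A commonly recommended classic reference is Seneta's Non-negative Matrices and Markov Chains. A small Python sanity check:

```python
import random

def random_stochastic(n, rng):
    """Random row-stochastic matrix: nonnegative rows summing to 1."""
    m = [[rng.random() for _ in range(n)] for _ in range(n)]
    return [[v / sum(row) for v in row] for row in m]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

rng = random.Random(0)
p = random_stochastic(4, rng)
for _ in range(4):               # multiply several stochastic matrices
    p = matmul(p, random_stochastic(4, rng))
row_sums = [sum(row) for row in p]   # each row still sums to 1
```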

I'm developing a wind speed map and I need to find the appropriate wind speed at each contour point for a specific return period. How should I use Monte Carlo simulation to get the wind speed, given that I have nested equations for the wind formula and have developed probability distribution functions of the needed parameters?

Dear all,

we know, owing to a result appearing in the book of R. M. Dudley (Real analysis and Probability - Theorem 2.8.2), that for any separable metric space (S, d), there is a metric e on S, defining the same topology as d, such that (S, e) is totally bounded.

I would like to know whether a function f that is uniformly continuous for the metric d will remain uniformly continuous for the constructed metric e.

Regards,

Nathalie

Most of the literature on the recursive maximum likelihood estimates of parameters of a partially observed model seems to be in discrete time, i.e. on Hidden Markov Models (HMMs).

There is quite a strong result for HMMs in

Tadic, V. B. (2010). Analyticity, Convergence, and Convergence Rate of Recursive Maximum-Likelihood Estimation in Hidden Markov Models. IEEE Transactions on Information Theory, 56(12), 6406–6432. http://doi.org/10.1109/TIT.2010.2081110

I'm wondering whether there is similar work in continuous-time models. If it exists, I can't seem to find it. Maybe the problem is still open.

Thank you for any hints!

Is it necessary for every probability distribution to have scale, location and shape parameters? A probability distribution is often characterized by location and scale parameters, and these are the ones typically used in modeling applications; so what is the role of the shape parameter?

Hello all,

I would like to ask about the following benchmark functions:

Ackley

Bohachevsky 1

Camel 3 hump

Drop wave

Exponential

Paviani

Are these benchmark functions unimodal or multimodal ?

Thanks to all who contribute to the answer.

I'm interested in studying the statistical properties of pedestrian flow through subway systems (not within a single subway station, but through the system as a whole, i.e., from station A to station B, etc.). In particular, I want to fit a model to determine the distribution of paths people take through the subways.

Given a dataset of the topology of a subway system (which stations connect to which) and time series of how many people enter and exit each subway station during each time interval, I want to imagine this as a network where each node is a subway station with a decay rate to each of its neighboring stations.

I wrote a little more about my ideas on this problem in the included document. I'm taking an introductory statistics course now, and (independently of my course work) this is the first time I've had to devise a model of my own and fit it. I'm hoping for some advice from the experts here on how to approach this problem.

For multidimensional optimization problems, I want to find which of the variables (to be optimized) are highly influential on the objective function value, so that I can form an expectation of which parameters will be accurately identified by the optimizer.

Given a domain for each variable, the aim is to identify which of the variables (in their respective domain) has the most influence on the overall objective function value.

The complexity also arises from the fact that I am working on black-box optimization, and the objective function is not algebraic but a finite element computation. So I want to reduce the dimension of the problem.

Thanks for the help.

I have question regarding simulating under mentioned 1D Stochastic Differential Equation in R using Sim.DiffProc package:

dx1 = (b1*x1 − d1*x1) dt + Sqrt(b1*x1 + d1*x1) dW1(t)

I have taken this equation from the book Modeling with Itô Stochastic Differential Equations by E. Allen. In the deterministic and diffusion parts of the equation, b1 and d1 are model parameters representing birth and death rates (for a single-population approximation of a two-population compartment model). The relevant lines of my code are as follows (note that I've used thetas to represent the parameters in my code):

Code (1):

> fx <- expression( theta[1]*x1-theta[2]*x1 ) ## drift part

> gx <- expression( (theta[3]*x1+theta[4]*x1)^0.5 ) ## diffusion part

> fitmod <- fitsde(data=mydata,drift=fx,diffusion=gx,start = list(theta1=1,

+ theta2=1,theta3=1,theta4=1),pmle="euler")

Or should I model it like this

Code (2):

>fx <- expression( theta[1]*x1-theta[2]*x1 )

> gx <- expression( (theta[1]*x1+theta[2]*x1)^0.5 )

> fitmod <- fitsde(data=mydata,drift=fx,diffusion=gx,start = list(theta1=1,

+ theta2=1),pmle="euler")

I am not clear whether to use theta[1], theta[2], theta[3], theta[4] as in the first version above, or to code it using only theta[1] and theta[2] (as in the second version), because in the original model the parameters b1 and d1 (birth and death rates) appearing in the deterministic part are the same as those appearing in the diffusion part.

I cannot find a single example in the Sim.DiffProc package documentation where parameters are repeated the way I have done in the second version.

Thanking in anticipation and best regards.

Saad Sharjeel.

I would like to emulate an FE model of a stochastic, heterogeneous, anisotropic material subjected to loading. Due to the high computational cost of this model, I would like to replace the FE model with a metamodel. Can anyone give me some suggestions?

My research aims to integrate graphical models (e.g., Bayesian networks) and hydrological models (e.g., the HBV hydrology model). The problem is how to transform the deterministic equations involved in hydrological models into probabilistic relationships between the involved variables (in the form of CPTs for BNs). I'm testing Bayesian networks in order to create a probabilistic model for flood forecasting, and I'm quite surprised that I didn't find any previous developments in this context.

I know that polynomial chaos expansion can deal with many distributions, such as the normal, beta and gamma distributions. But if they occur simultaneously in the equations, how can I handle that situation?

Hello everyone,

Using exploratory methods to track changes in the syntactic preferences of constructions over time, I was wondering if anybody has ever conceived of time (e.g. decades) as a continuous variable in statistical analysis.

For instance, I have a corpus that covers the period between the 1830's and the 1920's (10 decades) and I would like to divide my dataset into, say, 5 clusters of decades.

Discrete time:

- 1830-1840

- 1850-1860

- 1870-1880

- ...

Continuous time:

- 1830-1850

- 1850-1870

- 1870-1890

- ...

What do you think? Bear in mind that this may be feasible only in exploratory analysis, not in predictive analysis (regression models).

Thanking you in advance!

I am looking for a stochastic model in established use for extreme rainfall generation.

The rainfall will be used for hydraulic models and drainage system models.

A few concepts were developed in the past, but there does not seem to be any toolkit or software with a manual for them, such as the Neyman-Scott (NS), Bartlett-Lewis, or DRIP models.

thanks

I have 450,000 data points from a gyroscope and I want to model its stochastic noises. For bias instability modeling I need the time correlation, and I do not know how to compute it.

Hello Researchers,

I have developed a "stochastic solar model" for the purpose of long-term distribution system planning. I am aware of the indices researchers commonly use to validate solar models, such as RMSE and MBE.

But I find it challenging to locate similar literature (or similar solar prediction models) together with the solar data sets used to validate those models. I am also unsure whether it is logical to use the aforementioned indices to validate a "stochastic model", because the index values are not constant.

Kindly let me know your suggestions in this regard. Thanks in advance!

A continuum equation is used to analyze a model of stochastic surface growth based on the Poisson distribution. In this stochastic growth, the flat surface keeps getting rougher as time proceeds, but the correlation length is always zero during the stochastic growth process. I am not able to understand why the correlation length is always zero.

I am working on High Frequency Trading/ Algorithmic trading. I have to simulate limit order book. Can anyone help me?

I need to apply a Brownian motion model for a continuous character (wing length), and I apply the code below:

WL.BM<-fitContinuous(tree, mysorteddata$WL, model="BM")

and I got this error:

Warning: no tip labels, order assumed to be the same as in the tree

Fitting BM model:

Error in solve.default(phyvcv) :

Lapack routine dgesv: system is exactly singular.

I appreciate any help and advice.

Regards

Aram

Hi,

I am now working on a multivariate hydrologic model. It uses several variables (e.g., runoff, evaporation, precipitation and groundwater) to forecast another variable (lake water level). The procedure inevitably involves some data reconstruction. I have some doubts about the accuracy of such a model when the data have been carefully reconstructed.

The final model would be in the form of regressive-stochastic model based on method of moments.

- Is it acceptable to do such modelling on synthetic data?
- What are the criteria for this model?
- Are there any similar works on such treatment?

note:

- Reconstruction is done such that the periodicity, similarities, trends, persistence and moments of the distribution are preserved to an acceptable extent.
- A validation set is used for testing the degree of accuracy.
- The degree of reconstruction in data sets varies from just one variable in the middle to 120 data either in past, future or in the middle of the time series.

Yours sincerely,

Babak

I am investigating the relationship between one continuous variable and one ordinal variable. Correlation seems appropriate for this first step. Now I want to explore how a third continuous variable influences the strength and direction of the relationship between the first two variables.

Hello,

I want to estimate genetic divergence corrected with the nucleotide substitution model HKY. This model was selected as the most parsimonious with jModelTest. I tried MEGA 5.1, but HKY is not one of the models included for computing distances.

Most of what I am able to uncover is really old and vague.

The functional is defined from a convex, closed, bounded subspace A of a Banach space E to the real space R.

Markov chains are a rather popular mathematical tool in econometric studies, but are there any studies about using them in economic activities?

Hello,

I am using a semi-empirical approach to generate synthetic ground motion. This approach is a combination of the empirical Green's function and stochastic simulation methods. This method does not consider any slip value. Do any other models consider the slip of the fault? This slip is a variable parameter along the fault length. We did not have any ground motion records at the location where the maximum slip occurred for the 2002 Denali earthquake. Similarly, as far as I know, the effect of asperities or boulders has not been included in developing synthetic accelerograms.

I've solved the Kuramoto model for 100 oscillators and have calculated the phase of each oscillator (theta) at each time step. Now, using these phases (thetas), I need to generate two or more signals to measure the synchronization among them as a function of coupling strength with my own method, in order to check the capability of my method.

I am trying to model a multi variate time series model. (i.e. Multivariate ARIMA). The general format of the model is in the shape of a budget model like:

W(t) = f [ X(t), Y(t), Z(t), T(t) ]

where W(t) shows the jump and drop that I mentioned. There is a jump and a following drop in the historical dependent variable (i.e., the output). This is not just one data point; the records persist in this behavior for a medium period (i.e., several years after the jump, followed by a drop). I have attached the diagram of the time series to this post. Since the independent variables are supposed to cause or track the jump and drop, and since I am going to make them independent of each other, do I have to remove the jump from the data or not?

I think changes in the input variables should track or account for that jump and drop, so it does not seem proper to remove the jump and the drop.

What is the state of the art in the statistical modelling of the internet traffic generated by different internet activities such as Skype, BitTorrent and browsing? Are there accurate models in the literature underlying these activities?

I am using a birth and death process.

Suppose we have different candidate models proposed for a time series based on the ACF and PACF. The basic equation has a white noise term. In MATLAB, you have the "randn" command to generate normal random numbers, and the parameters can be estimated with the "armax" command.

After parameter estimation (calibration), validation involves comparing the observed data with the predicted values. The problem I am facing is figuring out what length of white noise should be generated (the data length or a larger population). Secondly, should the white noise sequence be kept the same for the validation of all candidate models, or is it allowed to change? If it is changed, then performance indicators such as RMSE, ML, AIC and BIC will also change.

So what should i do?

Hello anyone

I'm trying to perform a BSSVS analysis (Bayesian stochastic search variable selection), but when I run BEAST (versions 1.7.5, 1.8.0, 1.8.1 and 1.8.2) under Windows 7 (64-bit), using BEAGLE 2.1, the program displays the error message:

**Underflow calculating likelihood. Attempting a rescaling...**

**Underflow calculating likelihood. Attempting a rescaling...**

**State 1000467: State was not correctly restored after reject step.**

**Likelihood before: -9781.963977882762 Likelihood after: -9022.104627201184**

**Operator: bitFlip(Locations.indicators) bitFlip(Locations.indicators)**

Alternatively, I tried to run it under an iMac (Yosemite), but it worked only for 1 million generations, after which the same problem still occurred.

Does anyone know how I can solve this problem?

The details of the BEAGLE as follows.

Thanks.

Best regards,

Raindy

**BEAGLE resources available:**

0 : CPU

Flags: PRECISION_SINGLE PRECISION_DOUBLE COMPUTATION_SYNCH EIGEN_REAL EIGEN_COMPLEX SCALING_MANUAL SCALING_AUTO SCALING_ALWAYS SCALERS_RAW SCALERS_LOG VECTOR_SSE VECTOR_NONE THREADING_NONE PROCESSOR_CPU FRAMEWORK_CPU

1 : Intel(R) HD Graphics 4600 (OpenCL 1.2)

Global memory (MB): 1624

Clock speed (Ghz): 0.40

Number of multiprocessors: 20

Flags: PRECISION_SINGLE COMPUTATION_SYNCH EIGEN_REAL EIGEN_COMPLEX SCALING_MANUAL SCALING_AUTO SCALING_ALWAYS SCALERS_RAW SCALERS_LOG VECTOR_NONE THREADING_NONE PROCESSOR_GPU FRAMEWORK_OPENCL

Hi All,

I was wondering: what is the voxel-based Monte Carlo method?

Thank you very much.

Hello, I am currently working on models where energy can be produced using either a clean or dirty technology and investment (in knowledge) reduces the average cost of the clean technology or backstop. A steady state involves using both the dirty and clean technologies when their marginal costs are equal.

I am thinking of including a stochastic process for change in energy prices such that investment in the backstop is feasible only when energy prices are above a certain level (that is to say, investment in knowledge now reduces the future average cost of the backstop but there is also a huge fixed cost in actually using the backstop). Theoretically, I believe that this would involve switching back and forth between clean and dirty technologies. I am looking for any ideas in how to model this. I am attaching my recent publication (basically including stochasticity as I said in my current model).

I am interested in collaborating! any ideas?

Supratim

I wish to know whether I could adopt a time-varying stochastic frontier model when the values of each cross section are not available for the complete time period selected for the analysis. Could someone help me in this regard?

Thanking you.

When performing analytical math on random graphs and point processes, we usually assume infinite sets in order to enhance tractability. However, in practical systems and simulations we do not have an "infinite" space. A suggested workaround is to use toroidal distances, which wrap the flat simulation map around on itself, similar to the famous "Snake" game.

Can the toroidal distance measure enhance the simulation accuracy of random graphs?

Is there any other method to counteract the edge effect?

The toroidal distance is described in the appendix of:

C. Bettstetter, "On the minimum node degree and connectivity of a wireless multihop network," in Proceedings of the 3rd ACM international symposium on Mobile ad hoc networking & computing, 2002, pp. 80-91.
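For concreteness, the toroidal (wrap-around) distance amounts to taking each coordinate difference the short way around the map; a Python sketch:

```python
import math

def toroidal_distance(p, q, width, height):
    """Euclidean distance on a torus: each coordinate difference is
    taken the 'short way round' the wrapped simulation area."""
    dx = abs(p[0] - q[0])
    dy = abs(p[1] - q[1])
    dx = min(dx, width - dx)    # wrap horizontally if shorter
    dy = min(dy, height - dy)   # wrap vertically if shorter
    return math.hypot(dx, dy)

# Two nodes near opposite edges of a 100 x 100 map are actually close
d = toroidal_distance((1.0, 50.0), (99.0, 50.0), 100.0, 100.0)
```

Wrapping removes the edge effect entirely, at the cost of changing the geometry; the common alternatives are guard zones (simulate a larger area, measure only the interior) or explicit edge-correction factors.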