
Prediction - Science topic

Explore the latest questions and answers in Prediction, and find Prediction experts.
Questions related to Prediction
  • asked a question related to Prediction
Question
7 answers
Can artificial intelligence already predict our consumer behaviour, and will it soon be able to predict which shop we will go to and what we will buy tomorrow?
With the help of artificial intelligence, how can systems for monitoring citizens' consumer behaviour, based on GPS geolocation and the information contained in smartphones, be improved?
The lockdowns and national quarantines introduced during the coronavirus (Covid-19) pandemic caused a sharp decline in the sales and turnover of traditional, physically operating shops and service establishments. The lockdowns imposed on selected service industries and on traditional retail also accelerated e-commerce, i.e. the sale of products and services via the Internet. Once the pandemic was no longer interpreted in terms of high health and economic risk, a significant proportion of traditional retail and physical service establishments returned to traditional business, customer service and product or service sales. At the same time, new ICT and Industry 4.0 solutions, including artificial intelligence technologies, are being implemented in the information systems that support the sales processes of companies, enterprises, service establishments and shops offering their products or services in both traditional and online formats, including tools for activating potential consumers, getting customers interested in new product or service offers, and encouraging customers to visit stationary shops and service establishments.
In this area, technology start-ups have been developing rapidly over the past few years. Using anonymous mobile user identifiers and the precise location and usage data available in the various applications installed on smartphones, they are able to pinpoint where a smartphone user is at any given time and to diagnose whether he or she is making a purchase in a specific stationary shop, or walking down the street past an establishment providing specific services and perhaps considering using them. When such a start-up holds data on a specific Internet user drawn from a number of different applications and, on the basis of this data collected on Big Data Analytics platforms, has built an information-rich profile of the interests and purchasing preferences of a kind of digital avatar of that user, then, combined with analysis of current customer behaviour and GPS-based geolocation, it can make real-time predictions about the subsequent behaviour and/or purchasing decisions of individual potential customers of specific product or service offers.
Some technology start-ups conducting this kind of analytics, based on large sets of customer data, geolocation, the use of specific apps and social media on the smartphone, and knowledge of the psychology of consumer behaviour, are able first to locate consumers in real time relative to specific shops and service establishments, and then to display, on advertising banners appearing in specific smartphone applications, information about the current offer, including a price or other promotion for a specific product available in the shop where the Internet user and potential customer is currently located.
Thanks to this type of technological solution, it is increasingly common that when a smartphone user is near specific stands, shop shelves or shops in a shopping centre and is thinking about buying a specific product, an advertisement appears on the smartphone at that very moment with information about a price or other promotion for that product or a similar, highly substitutable one. At that moment, while the customer is in a specific shop or part of a shop, online advertisements are displayed on his or her smartphone, e.g. on social media, in the Google ecosystem, in web browsers or in other applications the potential customer has installed.
When such technological solutions are complemented by artificial intelligence that analyses the consumer behaviour of individual customers of different product and service offers, it becomes possible to create intelligent analytical systems capable of predicting who will visit a specific shop, when they will do so and what they plan to buy there. Statistically, a citizen has several applications installed on his or her smartphone that provide technology-based analytics companies with data about his or her current location. Therefore, thanks to artificial intelligence, it may not be long before Internet users receive messages and see online advertisements on their smartphones showing the products and services they are about to buy, or will only think about buying tomorrow. Perhaps the artificial intelligence involved in this kind of analytics is already capable of predicting our consumer behaviour in real time and will soon be able to predict which shop we will go to and what we will buy tomorrow.
In view of the above, I would like to address the following question to the esteemed community of scientists and researchers:
With the help of artificial intelligence, how can monitoring systems for citizens' consumer behaviour based on GPS geolocation and information contained in smartphones be improved?
Can artificial intelligence already predict our consumer behaviour, and will it soon be able to predict which shop we will go to and what we will buy tomorrow?
Can artificial intelligence already predict our consumer behaviour?
What do you think about this topic?
What is your opinion on this subject?
Please answer,
I invite you all to discuss,
The above text is entirely my own work written by me on the basis of my research.
I have not used other sources or automatic text generation systems such as ChatGPT in writing this text.
Copyright by Dariusz Prokopowicz
Thank you very much,
Best regards,
Dariusz Prokopowicz
Relevant answer
Answer
Yes, artificial intelligence can predict consumer behavior. AI can analyze data from various sources such as social media, online shopping history, and search engine queries to predict consumer behavior. AI can also be used to personalize marketing campaigns and improve customer experience.
  • asked a question related to Prediction
Question
3 answers
Is it possible to build a highly effective forecasting system for future financial and economic crises based on artificial intelligence technology in combination with Data Science analytics, Big Data Analytics, Business Intelligence and/or other Industry 4.0 technologies?
Is it possible to build a highly effective, multi-faceted, intelligent forecasting system for future financial and economic crises based on artificial intelligence technology in combination with Data Science analytics, Big Data Analytics, Business Intelligence and/or other Industry 4.0 technologies, as part of a system for forecasting complex, multi-faceted economic processes? Could such a system be built so as to reduce the impact of the self-fulfilling prophecy paradox and to increase the scale of the paradox of a predicted crisis not occurring because of the pre-emptive anti-crisis measures applied?
What do you think about the involvement of artificial intelligence in combination with Data Science, Big Data Analytics, Business Intelligence and/or other Industry 4.0 technologies for the development of sophisticated, complex predictive models for estimating current and forward-looking levels of systemic financial, economic risks, debt of the state's public finance system, systemic credit risks of commercially operating financial institutions and economic entities, forecasting trends in economic developments and predicting future financial and economic crises?
Research and development work is already underway to teach artificial intelligence to 'think', i.e. to reproduce the conscious thought process realised in the human brain. Thinking, awareness of one's own existence, the ability to think abstractly and critically, and the ability to separate knowledge acquired in the learning process from its processing in abstract, conscious thought are just some of the abilities attributed exclusively to humans. However, as part of technological progress and improvements in artificial intelligence technology, attempts are being made to create "thinking" computers or androids, and in the future there may be attempts to create an artificial consciousness that is a digital creation but functions in a similar way to human consciousness.
At the same time, as part of improving artificial intelligence technology, creating its next generations and teaching it to perform work requiring creativity, systems are being developed to process the ever-increasing amount of data and information stored on Big Data Analytics platform servers and taken, for example, from selected websites. In this way, it may be possible in the future to create "thinking" computers which, based on online access to the Internet, data downloaded according to the needs of the tasks performed and real-time processing of that data, will be able to develop predictive models and specific forecasts of future processes and phenomena, using models composed of algorithms resulting from previously applied machine learning.
When such technological solutions become possible, the question arises of how to take into account, in the intelligent, multifaceted forecasting models being built, the paradoxes long known to affect forecasted phenomena, which are to appear only in the future and which are not certain to appear at all. Among the various paradoxes of this kind, two in particular can be pointed out: the paradox of the self-fulfilling prophecy, and the paradox of a predicted crisis not occurring because of the pre-emptive anti-crisis measures applied. If these two paradoxes were taken into account in the intelligent, multi-faceted forecasting models being built, their effects could be correlated asymmetrically and inversely.
In view of the above, in the future, once artificial intelligence has been appropriately improved by teaching it to "think" and to process huge amounts of data and information in real time in a multi-criteria, creative manner, it may be possible to build a highly effective, multi-faceted, intelligent system for forecasting future financial and economic crises, i.e. a system for forecasting complex, multi-faceted economic processes built in such a way as to reduce the impact of the self-fulfilling prophecy paradox and to increase the scale of the paradox of a predicted crisis being averted by pre-emptive anti-crisis measures. Multi-criteria processing of large data sets with the involvement of artificial intelligence, Data Science, Big Data Analytics, Business Intelligence and/or other Industry 4.0 technologies makes it possible to operate effectively and increasingly automatically on large sets of data and information, and thus increases the possibility of developing advanced, complex forecasting models for estimating current and future levels of systemic financial and economic risks, the indebtedness of the state's public finance system, the systemic credit risks of commercially operating financial institutions and economic entities, forecasting economic trends and predicting future financial and economic crises.
In view of the above, I address the following questions to the esteemed community of scientists and researchers:
Is it possible to build a highly effective, multi-faceted, intelligent forecasting system for future financial and economic crises based on artificial intelligence technology in combination with Data Science, Big Data Analytics, Business Intelligence and/or other Industry 4.0 technologies, as part of a system for forecasting complex, multi-faceted economic processes, in such a way as to reduce the impact of the self-fulfilling prophecy paradox and to increase the scale of the paradox of a forecasted crisis not occurring because of the pre-emptive anti-crisis measures applied?
What do you think about the involvement of artificial intelligence in combination with Data Science, Big Data Analytics, Business Intelligence and/or other Industry 4.0 technologies to develop advanced, complex predictive models for estimating current and forward-looking levels of systemic financial risks, economic risks, debt of the state's public finance system, systemic credit risks of commercially operating financial institutions and economic entities, forecasting trends in economic developments and predicting future financial and economic crises?
What do you think about this topic?
What is your opinion on this subject?
Please respond,
I invite you all to discuss,
Thank you very much,
Warm regards,
Dariusz Prokopowicz
Relevant answer
Answer
Dear Dariusz,
When you face a big-data problem and want to forecast or predict its future, you have many choices of methods. First you should reduce your data, so you need dimensionality-reduction methods such as t-SNE, PCA, LSA, SVD or LDA, all of which are categorized as unsupervised learning. Then you should use neural networks to predict the future of your data set, i.e. supervised learning algorithms that contain a regression layer, such as LSTM, GRU or BiLSTM; if you use Python, you also have XGBoost for predicting future data. Note that it does not matter what your data are: bitcoin prices, wind-speed data or data on a specific illness.
You can search Google, GitHub, etc. for practical problems like yours, with their scripts and Python or MATLAB code.
Best regards,
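As a rough illustration of the workflow sketched above (unsupervised dimensionality reduction followed by a supervised forecasting model), here is a minimal sketch in R; the matrix X and the target series y are hypothetical placeholders, and a plain linear model stands in for the LSTM/GRU/XGBoost step:
# Hypothetical data: 200 time steps of 30 features and one target series
set.seed(1)
X <- as.data.frame(matrix(rnorm(200 * 30), nrow = 200))
y <- cumsum(rnorm(200))
# Unsupervised step: reduce the 30 features to 5 principal components
pca    <- prcomp(X, center = TRUE, scale. = TRUE)
scores <- as.data.frame(pca$x[, 1:5])
# Supervised step: predict y(t+1) from the components at time t
train        <- scores[1:150, ]
train$target <- y[2:151]
fit  <- lm(target ~ ., data = train)               # stand-in for an LSTM/GRU/XGBoost regressor
pred <- predict(fit, newdata = scores[151:199, ])  # one-step-ahead forecasts for y[152:200]
head(pred)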
  • asked a question related to Prediction
Question
3 answers
A number of people have asked on ResearchGate about acceptable response rates and others have asked about using nonprobability sampling, perhaps without knowing that these issues are highly related.  Some ask how many more observations should be requested over the sample size they think they need, implicitly assuming that every observation is at random, with no selection bias, one case easily substituting for another.   
This is also related to two different ways of 'approaching' inference: (1) the probability-of-selection-based/design-based approach, and (2) the model-based/prediction-based approach, where "prediction" means estimation for a random variable, not forecasting. 
Many may not have heard much about the model-based approach.  For that, I suggest the following reference:
Royall(1992), "The model based (prediction) approach to finite population sampling theory." (A reference list is found below, at the end.) 
Most people may have heard of random sampling, and especially simple random sampling where selection probabilities are all the same, but many may not be familiar with the fact that all estimation and accuracy assessments would then be based on the probabilities of selection being known and consistently applied.  You can't take just any sample and treat it as if it were a probability sample.  Nonresponse is therefore more than a problem of replacing missing data with some other data without attention to "representativeness."  Missing data may be replaced by imputation, or by weighting or reweighting the sample data to completely account for the population, but results may be degraded too much if this is not applied with caution.  Imputation may be accomplished various ways, such as trying to match characteristics of importance between the nonrespondent and a new respondent (a method which I believe has been used by the US Bureau of the Census), or, my favorite, by regression, a method that easily lends itself to variance estimation, though variance in probability sampling is technically different.  Weighting can be adjusted by grouping or regrouping members of the population, or just recalculation with a changed number, but grouping needs to be done carefully. 
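As a concrete illustration of the imputation-by-regression idea mentioned above, here is a minimal sketch in R; the variable names, sample size and model are hypothetical, and in a real survey the choice of model, the weighting and the variance estimation would all need much more care:
# Hypothetical survey: 'income' is missing for nonrespondents,
# while auxiliary variables 'age' and 'region' are known for everyone
set.seed(2)
n   <- 500
dat <- data.frame(age = rnorm(n, 45, 12),
                  region = factor(sample(1:4, n, replace = TRUE)))
dat$income <- 20000 + 800 * dat$age + rnorm(n, 0, 5000)
dat$income[sample(n, 100)] <- NA                  # 100 nonrespondents
# Fit a regression model on the respondents only
fit <- lm(income ~ age + region, data = dat, subset = !is.na(income))
# Impute model predictions for the nonrespondents
miss <- is.na(dat$income)
dat$income[miss] <- predict(fit, newdata = dat[miss, ])
mean(dat$income)                                  # estimate after regression imputation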
Recently work has been done which uses covariates for either modeling or for forming pseudo-weights for quasi-random sampling, to deal with nonprobability sampling.  For reference, see Elliott and Valliant(2017), "Inference for Nonprobability Samples," and Valliant(2019), "Comparing Alternatives for Estimation from Nonprobability Samples."  
Thus, methods used for handling nonresponse, and methods used to deal with nonprobability samples are basically the same.  Missing data are either imputed, possibly using regression, which is basically also the model-based approach to sampling, working to use an appropriate model for each situation, with TSE (total survey error) in mind, or weighting is done, which attempts to cover the population with appropriate representation, which is mostly a design-based approach. 
If I am using it properly, the proverb "Everything old is new again," seems to fit here if you note that in Brewer(2014), "Three controversies in the history of survey sampling," Ken Brewer showed that we have been all these routes before, leading him to have believed in a combined approach.  If Ken were alive and active today, I suspect that he might see things going a little differently than he may have hoped in that the probability-of-selection-based aspect is not maintaining as much traction as I think he would have liked.  This, even though he first introduced 'modern' survey statistics to the model-based approach in a paper in 1963.  Today it appears that there are many cases where probability sampling may not be practical/feasible.
On the bright side, I have to say that I do not find it a particularly strong argument that your sample would give you the 'right' answer if you did it infinitely many times when you are doing it once, assuming no measurement error of any kind, and no bias of any kind, so relative standard error estimates there are of great interest, just as relative standard error estimates are important when using a prediction-based approach, and the estimated variance is the estimated variance of the prediction error associated with a predicted total, with model misspecification as a concern.
In a probability sample, if you miss an important stratum of the population when doing say a simple random sample because you don't know the population well, you could greatly over- or underestimate a mean or total.  If you have predictor data on the population, you will know the population better.  (Thus, some combine the two approaches: see Brewer(2002) and Särndal, Swensson, and Wretman(1992).)
..........         
So, does anyone have other thoughts on this and/or examples to share for this discussion: Comparison of Nonresponse in Probability Sampling with Nonprobability Sampling?    
..........         
Thank you.
References:
Brewer, K.R.W.(2002), Combined Survey Sampling Inference: Weighing Basu's Elephants, Arnold: London and Oxford University Press
Brewer, K.R.W.(2014), "Three controversies in the history of survey sampling," Survey Methodology, Dec 2013 (Waksberg Award invited paper):
Elliott, M.R., and Valliant, R.(2017), "Inference for Nonprobability Samples," Statistical Science, 32(2):249-264,
Royall, R.M.(1992), "The model based (prediction) approach to finite population sampling theory," Institute of Mathematical Statistics Lecture Notes - Monograph Series, Volume 17, pp. 225-240.   Information is found at
The paper is available under Project Euclid, open access: 
Särndal, C.-E., Swensson, B., and Wretman, J.(1992), Model Assisted Survey Sampling, Springer-Verlag
Valliant, R.(2019), "Comparing Alternatives for Estimation from Nonprobability Samples," Journal of Survey Statistics and Methodology, Volume 8, Issue 2, April 2020, Pages 231–263, preprint at 
Relevant answer
Answer
This is a very interesting perspective, James R Knaub , and one that you could well share on Frank Harrell's Datamethods discussion forum : https://discourse.datamethods.org
Other than that, I'm going to have a look at those references over a largeish pot of coffee before I say anything stupid (stupid plus references allows you to cover your retreat better!)
  • asked a question related to Prediction
Question
2 answers
I am trying to use machine learning algorithms to predict whether a pipe has broken or not and I also want to predict the time to failure of a particular pipe. So, I need a dataset that contains the pipe installation year, the date of recorded failure for failed pipes and also some other parameters such as pipe length, operating pressure, type of material and pipe diameter among others.
Relevant answer
Answer
Hi Sir,
The other implementations should have cited their datasets, which can be referred to for benchmarking purposes.
2 good datasets I could find are:
Best of luck!
  • asked a question related to Prediction
Question
20 answers
In my country, a dozen or more years ago, there were real winters with snow and frost after the autumn. During the last few years, by contrast, winter looked like autumn: no snow and above-zero temperatures. I think that the greenhouse effect, i.e. the warming of the Earth's climate, has already begun. This is also confirmed by the numerous climatic cataclysms and weather anomalies which in 2018 appeared in many places on Earth. In some parts of the Earth there are fires over huge forest areas, such as in Scandinavia, California in the USA, Australia, the Iberian Peninsula, Africa, etc. In addition, there have been weather anomalies, e.g. snow and floods in October and November in the south of Europe.
In addition, there have been tornadoes in many places on Earth, and so on.
Perhaps these problems will get worse. It is necessary to improve security systems and crisis-response services and to improve the prediction of these anomalies and climatic cataclysms, so that people can manage to shelter from or cope with an imminent cataclysm. One of the technologies that can help in more precise forecasting of these cataclysms is the processing of large collections of historical and current information on this subject in cloud computing technology and Big Data database systems.
Therefore, I am asking you: Will new data processing technologies in Big Data database systems allow for accurate prediction of climate disasters?
Please, answer, comments. I invite you to the discussion.
Relevant answer
Answer
Despite a small amount of uncertainty, scientists find climate models of the 21st century to be pretty accurate because they are based on well-founded physical principles of earth system processes. This basis solidifies the confidence of the scientific community that human emissions are changing the climate, which will impact the entire planet.
  • asked a question related to Prediction
Question
4 answers
At the US Energy Information Administration (EIA), for various establishment surveys, Official Statistics have been generated using model-based ratio estimation, particularly the model-based classical ratio estimator.  Other uses of ratios have been considered at the EIA and elsewhere as well.  Please see
At the bottom of page 19 there it says "... on page 104 of Brewer(2002) [Ken Brewer's book on combining design-based and model-based inferences, published under Arnold], he states that 'The classical ratio estimator … is a very simple case of a cosmetically calibrated estimator.'" 
Here I would like to hear of any and all uses made of design-based or model-based ratio or regression estimation, including calibration, for any sample surveys, but especially establishment surveys used for official statistics. 
Examples of the use of design-based methods, model-based methods, and model-assisted design-based methods are all invited. (How much actual use is the GREG getting, for example?)  This is just to see what applications are being made.  It may be a good repository of such information for future reference.
Thank you.  -  Cheers. 
Relevant answer
Answer
In Canada they have a Monthly Miller’s Survey, and an Annual Miller’s Survey.  This would be a potential application, if used as I describe in a paper linked below. As in the case of a survey at the US Energy Information Administration for electric generation, fuel consumption and stocks for electric power plants, they collect data from the largest establishments monthly, and from the smallest ones just annually.  After the end of the year, for a given data item, say volume milled for a type of wheat, they could add the twelve monthly values for each given establishment, and with the annual data collected, there is then an annual census.  To predict totals each month, the previous annual census could be used for predictor data, and the new monthly data would be used for quasi-cutoff sample data, for each data item, and with a ratio model, one may predict totals each month for each data item, along with estimated relative standard errors.  Various techniques might apply, such as borrowing strength for small area predictions, adjustment of the coefficient of heteroscedasticity, and multiple regression when production shifts from say, one type of grain to another, as noted in the paper. 
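To make the ratio-model prediction of totals a little more concrete, here is a minimal sketch in R of the classical ratio estimator applied to a hypothetical quasi-cutoff sample, with previous-census values x as the predictor; the numbers are simulated, and this is only an illustration of the idea, not the exact procedure used in any official survey:
# Hypothetical population of N establishments:
#   x = previous annual census value (known for all units)
#   y = current monthly value (reported only by the sampled, mostly largest, units)
set.seed(3)
N <- 200
x <- rlnorm(N, meanlog = 4, sdlog = 1)
y <- 0.08 * x * (1 + rnorm(N, 0, 0.15))
sampled <- x >= quantile(x, 0.70)          # quasi-cutoff sample: the largest 30% report monthly
# Classical ratio estimator: b = sum(y_sample)/sum(x_sample), i.e. weighted least squares
# regression through the origin with variance proportional to x
b     <- sum(y[sampled]) / sum(x[sampled])
T_hat <- sum(y[sampled]) + b * sum(x[!sampled])   # observed part + predicted unobserved part
T_hat
sum(y)                                     # the "true" total in this simulation, for comparison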
Here are the mill surveys: 
Canadian Mill surveys: 
Monthly Miller’s Survey: 
Annual Miller’s Survey: 
This survey information is found on page 25 of the paper below, as of this date.  There will likely be some revisions to this paper.  This was presented as a poster paper at the 2022 Joint Statistical Meetings (JSM), on August 10, 2022, in Washington DC, USA.  Below are the poster and paper URLs. 
Poster:
The paper is found at
.........................     
If you can think of any other applications, or potential applications, please respond. 
Thank you. 
  • asked a question related to Prediction
Question
3 answers
I am working on landslide hazard and risk zonation. I trained several landslide conditioning factors in Python/R and SPSS. I have calculated the ROC/AUC and confusion matrix of the model. I would like to know how I can generate the final landslide prediction maps from those trained and evaluated machine learning (ML) models.
Relevant answer
Answer
It sounds like you have trained multiple ML models and obtained multiple prediction results. You can take the predictions from each model and pool them together using blending/voting/stacking methods.
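For instance, here is a minimal sketch in R of pooling the predicted probabilities from several trained models by simple averaging (soft voting) and attaching the result to grid coordinates for mapping; the three probability vectors and the grid are hypothetical placeholders for your own model outputs and mapping units:
# Hypothetical grid of mapping units with landslide probabilities from three trained models
set.seed(4)
grid <- data.frame(x = runif(1000), y = runif(1000))
p_model1 <- runif(1000)
p_model2 <- runif(1000)
p_model3 <- runif(1000)
# Soft voting: average the predicted probabilities over the models
grid$p_ensemble <- (p_model1 + p_model2 + p_model3) / 3
# Classify into susceptibility zones and export for mapping in a GIS
grid$zone <- cut(grid$p_ensemble, breaks = c(0, 0.25, 0.5, 0.75, 1),
                 labels = c("low", "moderate", "high", "very high"))
write.csv(grid, "landslide_susceptibility.csv", row.names = FALSE)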
  • asked a question related to Prediction
Question
10 answers
It is a 3-class classification problem in which the test data have almost equal class probabilities, i.e. each class occurs about 33% of the time. Training a model yields an accuracy of 45-48% on out-of-sample test data. Is this result significant in terms of prediction? Here accuracy is computed as the percentage of correctly identified classes out of all cases. In similar problems modelled as 2-class classification, the maximum accuracy reported in the literature is around 69%; but in the present case the classes are "up", "down" and "no-change" instead of just "up" and "down".
Relevant answer
Answer
As a rough rule of thumb, an accuracy of 60% or more is often considered acceptable; anything below 60% is usually not regarded as a good classification model.
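Rather than relying on a fixed 60% rule of thumb, one can also test whether the observed accuracy is significantly better than the 33.3% chance level of a balanced three-class problem; here is a minimal sketch in R, where the test-set size of 300 is a hypothetical placeholder for your own number of out-of-sample cases:
# Hypothetical test set of 300 out-of-sample cases, 47% classified correctly
n_test    <- 300
n_correct <- round(0.47 * n_test)
# Exact binomial test of H0: accuracy = 1/3 (random guessing over three balanced classes)
binom.test(n_correct, n_test, p = 1/3, alternative = "greater")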
  • asked a question related to Prediction
Question
4 answers
I use a conditional logit model with income, leisure time and interaction terms of the two variables with other variables (describing individual's characteristics) as independent variables.
After running the regression, I use the predict command to obtain probabilities for each individual and category. These probabilities are then multiplied with the median working hours of the respective categories to compute expected working hours.
The next step is to increase wage by 1%, which increases the variable income by 1% and thus also affects all interaction terms which include the variable income.
After running the modified regression, I again use the predict command and should obtain slightly different probabilities. My problem now is that the probabilities are exactly the same, so there would be no change in expected working hours, which indicates that something went wrong.
On the attached images with extracts of the two regression outputs one can see that indeed the regression coefficients of the affected variables are very, very similar and that both the value of the R² and the values of the log likelihood iterations are exactly the same. To my mind these observations should explain why probabilities are indeed very similar, but I am wondering why they are exactly the same and what I did possibly wrong. I am replicating a paper where they did the same and where they were able to compute different expected working hours for the different scenarios.
Relevant answer
Answer
Either something went wrong, or you performed the same test twice. Did you use the same version of the software as the original study?
  • asked a question related to Prediction
Question
5 answers
I am just starting to try out the Google Colab version of AlphaFold2 for protein 3D structure prediction via this link:
Pretty much a newbie, so still trying to figure out how best to interpret the results and put them into proper words for a report/presentation. Also, is there a way to download the predicted 3D structure that is displayed?
Thanks in advance.
  • asked a question related to Prediction
Question
1 answer
Like protein metal predictor or simulation programs
Relevant answer
Answer
Are you looking for a software that predicts the binding mode/site between metals and protein? If that’s what you mean, molecular docking is what you are looking for. Here is an article that can help to get started. Hope this helps.
  • asked a question related to Prediction
Question
6 answers
I have recently been working on using machine learning for yield prediction, however, I was exploring what inputs would be better at predicting yield. I am confused by only three papers that use historical yields as an input to predict yields for the new year. From the test results this does improve the prediction accuracy substantially. But does this count as data leakage? If not, what is the rationale for doing so? What are the limitations? (It seems that the three papers are from the same team.)
Relevant answer
Answer
Pinery Lee: Machine learning algorithms' predictions will assist farmers in deciding which crop to cultivate in order to maximize production, taking into account aspects such as temperature, rainfall, acreage, and so on.
  • asked a question related to Prediction
Question
8 answers
I am trying to tweak my machine learning model optimizer, and I would love to test it in the healthcare domain, especially for rare illnesses.
So, does anyone know of any de-identified electronic health records for epilepsy, Parkinson's or other rare-disease patients (perhaps those who are treated with warfarin)?
Please guide me on how to get these datasets.
I have already spoken with many research authors, but have not yet received any responses.
Relevant answer
Answer
That's quite painful.
The thing is that you understand their privacy concerns, and you have procedures in place so that you can assure your compliance with them.
But still, it is not easy to get those patient records.
I wrote this post after searching all the suggested resources, though I really thank you for taking the care to respond.
  • asked a question related to Prediction
Question
2 answers
Hi, I want to predict post-translational modification, specifically phosphorylation. I found lots of websites such as Phosida and PhosphoSitePlus. I am just curious whether there is any Python code for phosphorylation prediction. If you have one, could you share the GitHub link?
Relevant answer
Answer
Shaban Ahmad thank you
  • asked a question related to Prediction
Question
3 answers
I have predicted the solubility of a compound using a web server, in units of mol/L, and the value was 0.00126. I want to know whether this value means the compound is soluble in water or not, and whether it would be better to compare it with other compounds on the market.
Thanks
Relevant answer
Answer
We have to create in-vitro conditions to determine solubility, which is vital for bioavailability; mere water solubility is only for data generation.
I mean a mildly alkaline or acidic medium.
  • asked a question related to Prediction
Question
11 answers
I have been doing research on different issues in the finance and accounting discipline for about 5 years. It is becoming difficult for me to find topics that could lead to projects, a series of research articles and working papers over the next 5-10 years. There are few journals that keep their research articles updated in line with current and future research demand. Therefore, I am looking for journal(s) that can guide me in designing a research project that can contribute over the next 5-10 years.
Relevant answer
Answer
You don't need to look for any journals.
All you need to do is narrow your search to topics listed in "special issues" and "calls for papers". Top publishers, e.g. Elsevier, Wiley, T&F, Emerald, etc., often advertise calls for papers and special issues of journals. The topics in a special issue or call for papers can give you some hints on current and future research trends. I think this is standard practice in academia.
I hope this advice helps.
  • asked a question related to Prediction
Question
1 answer
Dear colleagues,
I would like to ask anybody who works with neural networks to check my loop for the test sample.
I have 4 sequences (monthly data, 22 values in each sequence, with the goal of predicting prov), and I would like to construct the forecast for each next month using a training sample size of 5 months.
That means I need to shift the window each time by one month, with 5 elements:
train<-1:5, train<-2:6, train<-3:7, ..., train<-17:21. So I need to get 17 forecasts (one per window) as the output.
The loop and code (corrected so that the 5-month training window slides forward by one month on each step, and the predictions are de-normalised back to the original scale):
# Packages
library(neuralnet)
# Data: 22 monthly observations of the 4 sequences
prov <- c(25,22,47,70,59,49,29,40,49,2,6,50,84,33,25,67,89,3,4,7,8,2)
temp <- c(22,23,23,23,25,29,20,27,22,23,23,23,25,29,20,27,20,30,35,50,52,20)
soil <- c(676,589,536,499,429,368,370,387,400,423,676,589,536,499,429,368,370,387,400,423,600,605)
rain <- c(7,8,2,8,6,5,4,9,7,8,2,8,6,5,4,9,5,6,9,2,3,4)
mydata <- data.frame(prov, temp, soil, rain)
# Min-max normalisation, and its inverse for reporting predictions on the original scale
normalize   <- function(x) (x - min(x)) / (max(x) - min(x))
denormalize <- function(x, minval, maxval) x * (maxval - minval) + minval
maxmindf <- as.data.frame(lapply(mydata, normalize))
minprov  <- min(mydata$prov)
maxprov  <- max(mydata$prov)
window  <- 5                     # training sample size (months)
d       <- nrow(maxmindf)
results <- data.frame(month = (window + 1):d, actual = NA, prediction = NA)
# Rolling window: train on months i..(i+4), predict month i+5 (17 one-step-ahead forecasts)
for (i in 1:(d - window)) {
  trainset <- maxmindf[i:(i + window - 1), ]
  testset  <- maxmindf[i + window, , drop = FALSE]
  nn <- neuralnet(prov ~ temp + soil + rain, data = trainset,
                  hidden = c(3, 2), linear.output = FALSE,
                  threshold = 0.01, stepmax = 1e+06)
  pred <- compute(nn, testset[, c("temp", "soil", "rain")])$net.result
  results$actual[i]     <- denormalize(testset$prov, minprov, maxprov)
  results$prediction[i] <- denormalize(pred, minprov, maxprov)
}
results   # months 6..22, actual vs predicted prov on the original scale
Could you please tell me what I can add to trainset and testset (using the loop), and how to display all predictions in a loop so that the results are shown with a shift by one and a training sample of 5?
I am very grateful for your answers
  • asked a question related to Prediction
Question
25 answers
I would like to know whether there is a direct relationship between quantum computer technology and artificial intelligence. Can you provide your explanation with examples for more understanding?
Relevant answer
Answer
Yes, definitely.
Quantum computing and artificial intelligence are directly related to each other, much as quantum mechanics is related to physics.
The development of quantum computing would definitely help to make machines more intelligent (artificial intelligence).
Quantum computing conjures up many meanings, but one is computing so fast that its time can hardly be counted.
Recognition of a person among millions of people without taking time to think is, in humans, a kind of quantum-computing intelligence that could be implemented in machines.
  • asked a question related to Prediction
Question
4 answers
Dear colleagues,
I have 400 monthly data points and I need to construct the forecast for each next month using a learning (training) sample of 50.
It means, I need to shift each time by one month with 50 elements.
train<-1:50, train<-2:51, train<-3:52,...,train<-351:400.
Could you tell me please,which function can I write in the program for automatic calculation?
Maybe, for() loop?
I am very grateful for your answers
Relevant answer
Answer
embed( data, 50 )
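To spell that out, here is a minimal sketch in R of the rolling window: embed() builds the matrix of lagged windows, and an explicit for() loop refits a model on each 50-month window and forecasts the next month. The series y and the AR(1) model are hypothetical placeholders for your own data and forecasting method:
# Hypothetical monthly series of length 400
set.seed(6)
y <- as.numeric(arima.sim(list(ar = 0.6), n = 400))
window   <- 50
n        <- length(y)
forecast <- rep(NA, n)                       # forecast[i] is the prediction of y[i]
for (i in (window + 1):n) {
  train <- y[(i - window):(i - 1)]           # shift the 50-month training window by one each step
  fit   <- arima(train, order = c(1, 0, 0))  # any forecasting model can be refitted here
  forecast[i] <- predict(fit, n.ahead = 1)$pred
}
# Alternatively, embed(y, window + 1) returns each window together with its next value, one row per shift
dim(embed(y, window + 1))                    # 350 rows x 51 columns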
  • asked a question related to Prediction
Question
4 answers
I want to predict water in my project. I need to know which of them has more advantages.
Relevant answer
Answer
A Bayesian network is a graphical model; it consists of a collection of random variables that are represented as nodes in a directed graph, with the graph's edges representing the variables' interdependence.
In theory, a Dynamic Bayesian Network (DBN) functions identically to a Bayesian Network (BN): given a directed network (the structure), you may learn conditional probability tables (the parameters) from a dataset.
The primary distinction is that a DBN reflects a time-dependent phenomenon; therefore, whereas a conventional BN may have a node representing variable "A" influencing variable "B", a DBN may have variable "A" at time=1 influencing variable "A" at time=2.
  • asked a question related to Prediction
Question
8 answers
There are different empirical equations and techniques, such as fuzzy logic, ANN, etc., for predicting blast-induced ground vibration. In addition to these, is there any software for predicting blast-induced ground vibration?
Relevant answer
Answer
  • asked a question related to Prediction
Question
5 answers
I have the following dataset:
SQ - SEX - Weight - letter - duration - trail - quantity
1 - male - 15KG - abc - Year 1 - 2 - quantity 1
- Year 2 - 3 - quantity 2
2 female - 17KG - cde - Year X - 4 - quantityx
- 16KG - Year Y - 6 - quantityy
- Year Z - 3 - quantityz
.... etc...
I want to build a prediction model that predicts the quantity, but using classic machine learning models (not deep learning ones such as LSTM or RNN), i.e. linear regression, SVM, etc., such that:
for individual n at a certain duration (duration A), what will the quantity be?
n - male - 25KG - xlm - 34 - A - ?
What is the best way to treat and pre-process the duration, trail and quantity features before fitting them, so as to preserve their correlation with the target quantity?
Relevant answer
Answer
Aggregation with a rolling window may help you to rearrange your column values accordingly.
  • asked a question related to Prediction
Question
9 answers
I am trying to predict peak demand using machine learning techniques. Current articles treat this as a time-series prediction problem and use a 7-day lag to predict peak demand. The ML model I am trying to apply considers new features for this prediction, and I applied it without the prior-week lag values. I was challenged as to why I did not use lag values for a time-series prediction problem like this.
The objective of my project was to evaluate whether adding new features would improve the daily peak demand prediction and assess the effects of the new features. If I use new features to predict daily demand, should I also consider the previous seven days' lags as a new feature? Is it correct to combine several COVID-19 related features with the lag demand for peak demand prediction for an unstable situation like COVID-19?
Ps:
1- The model I used for prediction is LightGradient Boosting.
2- Data trained and tested during COVID-19 situation (2020 & 2021)
3- The weekly trends of my target value in 2020 and 2021 are as below figures.
Relevant answer
Answer
Pleasure Negin Zarbakhsh
Choose a high number of lags and fit a penalized model (e.g. using LASSO, ridge or elastic-net regularization). The penalization should reduce the influence of irrelevant lags, allowing the selection to be done more effectively. You can also experiment with various lag combinations.
Another option is the Fisher score, one of the most popular supervised feature-selection approaches: it returns the variables ranked in decreasing order of their Fisher score, and the variables can then be chosen based on the circumstances.
Kind Regards
Qamar Ul Islam
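A minimal sketch in R of the penalized-lag suggestion above, assuming the glmnet package is available: build a block of lagged demand values plus the extra (e.g. COVID-related) features and let a LASSO-penalised regression shrink the irrelevant lags towards zero. The demand series and the two extra features are simulated, hypothetical placeholders for your own data:
# Hypothetical daily peak demand and two extra features (e.g. COVID-related indicators)
library(glmnet)
set.seed(7)
n      <- 400
demand <- 100 + 10 * as.numeric(arima.sim(list(ar = 0.7), n = n))
covid1 <- rnorm(n)
covid2 <- rnorm(n)
# embed() puts demand[t] in column 1 and demand[t-1], ..., demand[t-7] in columns 2..8
lagmat <- embed(demand, 8)
X <- cbind(lagmat[, 2:8], covid1[8:n], covid2[8:n])   # 7 lags + the extra features
colnames(X) <- c(paste0("lag", 1:7), "covid1", "covid2")
y <- lagmat[, 1]
# LASSO with cross-validation; coefficients of irrelevant lags are shrunk to zero
fit <- cv.glmnet(X, y, alpha = 1)
coef(fit, s = "lambda.min")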
  • asked a question related to Prediction
Question
3 answers
I am using QGIS version 2.8.3 with the MOLUSCE plugin to get the land-use prediction map, but I got an error while creating a change map in the Area Change tab.
Here attached the link for details.
Relevant answer
  • asked a question related to Prediction
Question
1 answer
Dated: 10-June-2020.
Perhaps!
Perhaps so, because this year the radiation and greenhouse-gas feedback processes on different timescales (one of the main factors in monsoon dynamics), which make monsoon predictability erratic, are not expected to add much uncertainty to the prediction system, owing to the substantial reduction in greenhouse-gas emissions. This implies a possible advantage for predictive models this season. Recall that the ability of models to predict the SW monsoon is higher when the initial conditions are taken from February, March and April (this year these were the main lockdown months in the world, when the atmosphere was not being loaded with emitted gases) than from months closer to the SW monsoon. On the other hand, this can also be a test bed for models that show near-accurate long-range forecasting when initialised with the early months mentioned above.
Overall, it may also turn out that NATURE can be predicted correctly if it is not disturbed; BUT if we keep disturbing it, then prediction may not be that easy or precise.
If yes, then "commendations" for the accurate predictability of the monsoon system should be higher this year, I think. This may also be attributed to Nature's natural tendency being stronger this year, apart from better-resolved and improved interannual and climate-system predictability in the modelling systems, etc.
Nature is in its NATURAL swing. Enjoy, and try to be safe! But we should also be ready for monsoon predictability in the years to come, when emissions will again be dumped into the Earth system and will certainly obstruct prediction. Consistency in the accuracy of the prediction should be addressed responsibly.
What's your take on that?
Relevant answer
Answer
I think yes.
  • asked a question related to Prediction
Question
9 answers
Let us consider a sales record (invoice/receipt) like this:
Gender | Age | Street | Item 1 | Count 1 | Item 2 | Count 2 | ... | Item N | Count N | Total Price (Label)
Male | 22 | S1 | Milk | 2 | Bread | 5 | ... | - | - | 10 $
Female | 10 | S2 | Cofee | 1 | - | - | ... | - | - | 1 $
....
We want to predict the total price of an invoice based on the buyer's demographic information (such as gender, age, job) and also the items bought and their counts. Note that we assume we do not know each item's price, and that prices change over time (so we will also have a date in our dataset).
The main question is how we can use this dataset, which contains some transactional data (items) whose combination order is not important. For example, if somebody buys item1 and item2, that is equal to someone else buying item2 and item1, so the values of our item columns should not differ with the order of their values.
This dataset contains both multivariate and transactional data. My question is how we can predict the label more accurately.
Relevant answer
Answer
Hi Dr Behzad Soleimani Neysiani. I agree with Dr Qamar Ul Islam.
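One common way to make the item columns order-invariant is a bag-of-items (count) encoding: one column per distinct item holding the purchased quantity, so that "item1 then item2" and "item2 then item1" produce identical rows. A minimal sketch in R, assuming the reshape2 package is available and using hypothetical data:
# Hypothetical transactions in "long" form: one row per item within a receipt
library(reshape2)
long <- data.frame(
  receipt = c(1, 1, 2, 3),
  gender  = c("male", "male", "female", "male"),
  age     = c(22, 22, 10, 40),
  item    = c("Milk", "Bread", "Coffee", "Milk"),
  count   = c(2, 5, 1, 3),
  price   = c(10, 10, 1, 6)          # total price of the receipt (the label), repeated per row
)
# Bag-of-items: one column per item, value = quantity bought, so item order no longer matters
wide <- dcast(long, receipt + gender + age + price ~ item, value.var = "count", fill = 0)
wide   # ready for lm(), SVM, etc.: predict price from the demographic columns and the item counts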
  • asked a question related to Prediction
Question
13 answers
Hi,
I am currently looking for a dataset in which I can get historical weather data (like temperature, precipitation, wind speed) for every day in every city from 2005 to today.
The data will be used for a prediction project.
Where can I find these kinds of data, or anything related?
Thank you very much.
p/s: To clarify, what I mean is I have a table with 2 columns, "date" and "city", and I want to fill the third(or how many it takes) column with weather information of that date+city combination. A lot of websites provide weather information but since my dataset is quite large, I need a way to automate the process, either a data set or a crawler-friendly website with enough information.
Relevant answer
Answer
ECMWF ERA5 dataset
  • asked a question related to Prediction
Question
7 answers
I am doing my MS thesis, titled "Time series crop yield estimation using satellite images". Below are my aim and objectives, but my supervisor said the objectives are not correct. I don't know what I should change. Can anyone help me rewrite my objectives?
Aim: The aim of this study is to develop a model for wheat yield prediction using satellite imagery before the harvest time.
Objectives:
1.It is mandatory for the planners of any regime to have an accurate and precise estimate of a crop to cope with the shortage crises of the crop, as Pakistan faced a very serious crisis of wheat’s shortage in 2007
2.An accurate estimate of a crop gives a significant relief to the country’s exchequer in terms of saving foreign exchange
3.The main purpose of this research is, therefore, the scientific construction of a model employing all the information available via remote sensing in order to get a good and trustworthy estimate of wheat crops.
Relevant answer
Answer
You should start with a problem statement - come up with the main research question. Then you will have to break it into pieces as 3-4 research questions, which when answered would answer the main problem/research question. The research questions can be converted into objectives. All the best
  • asked a question related to Prediction
Question
4 answers
What is the best way to classify a tabular dataset using MATLAB?
How can text data be predicted or categorized using a convolutional neural network? Also, how can deep learning be used for the classification of data in a tabular dataset (for example, numerical or textual data)?
Can we use regression or classification for prediction? Which is the best approach?
Relevant answer
Answer
The below Python framework is a repository for different deep learning based text classifications, and might be helpful
  • asked a question related to Prediction
Question
21 answers
I have 27 features and I am trying to predict continuous values. When I calculated the VIFs (Variance Inflation Factors), only 8 features had a VIF below 10 and the remaining features ranged from 10 to 250. Therefore, I am facing a multicollinearity issue.
My work is guided by two aims:
1- ML models should be used to predict the values using regression algorithms.
2- To determine the importance of features( interpreting the ML models).
A variety of machine learning algorithms have been applied, including Ridge, Lasso, Elastic Net, Random Forest Regressor, Gradient Boosting Regressor, and Multiple Linear Regression.
The Random Forest Regressor and Gradient Boosting Regressor show the best performance (lowest RMSE) while using only 10 of the 27 features, selected on the basis of the feature-importance results.
As I understand it, if I face multicollinearity issues, I can address them using regularized regression models such as LASSO. When I applied LASSO to my model, the evaluation result was not as good as for the Random Forest Regressor and Gradient Boosting Regressor. Moreover, none of my coefficients became zero when I examined the feature importances.
Moreover, I want to analyse which feature is affecting my target value and I do not want to omit my features.
I was wondering if anyone could help me determine which of these algorithms would be good to use and why?
Relevant answer
Answer
Dear Negin,
one of the first things that I would do is to first analyse the data prior to modelling: is the collinearity of the features due to some causal process that you can hypothesise on? If so, is there a set of root variables that you can identify? Factor analysis in this case might be a good thing to look into.
Another thing to try before modelling might be clustering and selecting a subset of representatives from the clusters and input those into an algorithm for selection, separating the features into groups of collinear features and taking one of each according to some criteria. Another thing might be to use PCA on the clustered features to combine and reduce the number of features.
The choice of algorithm depends on how you model your data. If you have a linear relationship, the usual penalised least-squares approaches could be fine; but if the Random Forest performs better on a similar number of features and without overfitting (a common thing with this ensemble approach), the relationship might not be linear, so it gets a better score using the trees. For interpreting the results, the TreeInterpreter package in Python or LIME could perhaps help you better understand what the random forest is doing.
I hope this is helpful and lots of luck!
Asier
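As a small illustration of the first steps described above (checking collinearity, clustering correlated features and summarising each cluster with a principal component), here is a hedged sketch in R; the simulated feature matrix X is a hypothetical stand-in for your 27 features:
# Hypothetical feature matrix with groups of collinear columns and one target
set.seed(9)
n  <- 300
z1 <- rnorm(n); z2 <- rnorm(n)
X  <- data.frame(a1 = z1 + rnorm(n, 0, 0.1), a2 = z1 + rnorm(n, 0, 0.1),
                 b1 = z2 + rnorm(n, 0, 0.1), b2 = z2 + rnorm(n, 0, 0.1),
                 c1 = rnorm(n))
y  <- 2 * z1 - z2 + rnorm(n)
# 1. Cluster the features by correlation distance to find the collinear groups
d  <- as.dist(1 - abs(cor(X)))
cl <- cutree(hclust(d), k = 3)
cl                                    # which features fall into the same collinear group
# 2. Replace each cluster by its first principal component
first_pc <- function(cols) prcomp(X[, cols, drop = FALSE], scale. = TRUE)$x[, 1]
Xred     <- sapply(split(names(cl), cl), first_pc)
# 3. Fit on the reduced, nearly orthogonal features
summary(lm(y ~ Xred))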
  • asked a question related to Prediction
Question
15 answers
Please suggest if any specific software is used.
Relevant answer
Answer
I am on holiday, away from the lab computer, so I cannot check the options in Origin, but anyone who has experience using it can provide help. Sorry for being unable to help during this period.
  • asked a question related to Prediction
Question
4 answers
In my current project, I want to answer if various cognition items (ratio, 30+ of them, may get reduced based on a separate factor analysis) predict moral outrage - in other words, do increases in item 1 (then item 2, item 3, etc) predict increases in outrage in a significant way. Normally, this would be a simple regression. But then I complicated my design, and I'm having a hard time wrapping my head around my potential analyses and whether it will actually answer my stated question, or if I'm over-thinking things.
Currently, I'm considering a set-up where participants will see a random selection of 3 vignettes (out of 5 options) and answer the cognition items and moral outrage questions about each. This complicates matters because 1) there is now a repeated-measures component that may (or may not?) need to be accounted for, and 2) I'm not sure how my analyses would work if the vignette selection is random (thus, all vignettes will show up the same number of times, but in different combinations for different people). I am anticipating that different vignettes will not be equal in their level of the DV (which is on purpose; I want to see if these patterns are general, not just at very high or very low levels of outrage).
When originally designing this, I had wanted to average the 3 vignette scores together for each subject, treating them as single, averaged item values to use in a multiple regression. But I've been advised by a couple people that this isn't an option, because the variance between the vignettes needs to be accounted for (and the vignettes can't be shown to be equivalent, and thus can't be collapsed down in analysis).
One potential analysis to combat this is a nested, vignette-within-individual multilevel design, where I see if the pattern of cognition items to outrage is consistent between vignettes (level 1) and across subjects (level 2), to account for/examine any vignette-by-cognition/MO pattern interactions. And this makes sense, as MLMs can be used to compare patterns, rather than single scores.
But I can't wrap my head around what part of this set-up/the output I would look at to actually answer my question: generally, which, if any, of these cognition items predicts outrage (regardless of vignette, or across many scenarios)? And can this approach work when the vignettes combinations differ between subjects?
Or is this the incorrect analysis approach and another, simpler one would be more fitting? For example, is the averaging approach workable in another format? What if all vignettes were done by all subjects (more arduous on the subjects, but possible if the strength of the analysis/results would be compromised/overly-complicated)?
Confirmation that my current analysis approach will indeed work, help with what part of the output would answer my actual RQ, or suggestions for an alternative approach, would be appreciated.
Relevant answer
Answer
You have many answers there.
Do not complicate your research design too much: complicate it only to address a specific issue.
The analysis should be made at different levels, as if it were a split-plot design.
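If it helps, here is a hedged sketch in R (using lme4 and hypothetical variable names) of the kind of crossed-random-effects model the question describes: cognition items as fixed-effect predictors of outrage, with random intercepts for both participants and vignettes, so vignette-to-vignette variation is modelled rather than averaged away. The fixed-effect estimates for the cognition items then address the overall question of whether these items predict outrage across scenarios:
# Assumed long format: one row per participant x vignette, with hypothetical column names
library(lme4)
# Tiny simulated stand-in for the real data: 60 participants, each seeing 3 of 5 vignettes
set.seed(10)
dat <- do.call(rbind, lapply(1:60, function(s) {
  v <- sample(1:5, 3)
  data.frame(subject = s, vignette = v, cog1 = rnorm(3), cog2 = rnorm(3))
}))
dat$subject  <- factor(dat$subject)
dat$vignette <- factor(dat$vignette)
subj_eff <- rnorm(60); vign_eff <- rnorm(5)
dat$outrage <- 0.5 * dat$cog1 + subj_eff[as.integer(dat$subject)] +
               vign_eff[as.integer(dat$vignette)] + rnorm(nrow(dat), 0, 0.5)
# Crossed random intercepts: repeated measures within participants,
# vignette differences modelled instead of collapsed by averaging
m <- lmer(outrage ~ cog1 + cog2 + (1 | subject) + (1 | vignette), data = dat)
summary(m)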
  • asked a question related to Prediction
Question
3 answers
Are you interested in the application of complex systems to the global history of humankind? I'm working on such a project, and I'm interested in discussions with like-minded people.
I published several articles on that in "The Complex Systems" journal (thecomplexsystems.com). A short overview of my work is in my blog (vtorvich.com) and the description of my book "Subsurface History of Humanity: Direction of History" on Amazon.
Relevant answer
Answer
All classifications are up to people.
Everybody will tell you that the history of humanity started around 5,150 years ago. Of course, not the same wording would be used. The phrase would be like this. The history of mankind is recorded history. In other words, our history began only when humanity invented writing.
The reason is simple and very convincing. The magic word is convenience. It is much easier to work with historical facts and artifacts if you have written records about them. It is hard to work with only archaeological or similar data. History with existing written records is a comfort zone for everybody. Any choice of a date before the first writing as the beginning of the history of humankind will throw researchers and the public out of this comfort zone.
When did the Agricultural Revolution happen? Well, it began many thousands of years before writing was invented. If the history of mankind began in 3150 BC, then that revolution is thrown from the history of humanity into prehistory.
I consider humankind's history, as the one, which started in 42000 BC. Why exactly at this date?
You could read it in my book - https://www.amazon.com/dp/B08WZCVDTD.
  • asked a question related to Prediction
Question
25 answers
If artificial intelligence is implemented for online mobile banking, can this banking segment be deprived of human capital altogether?
Please reply
Best wishes
Relevant answer
Answer
Dariusz Prokopowicz In my experience, bank employees are needed less and less: banking applications, mobile and online services, and even financial and credit analyses are increasingly performed using artificial intelligence.
  • asked a question related to Prediction
Question
12 answers
After 30 years, much will change. 30 years is a long period for the continuation of the current fourth technological revolution, known as Industry 4.0.
The current technological revolution known as Industry 4.0 is motivated by the development of the following factors:
Big Data database technologies, cloud computing, machine learning, Internet of Things, artificial intelligence, Business Intelligence and other advanced data mining technologies.
On the basis of these new technological solutions, innovatively organized analyses of large information sets stored in Big Data database systems and in cloud computing environments have been developing dynamically in recent years, serving applications in areas such as machine learning, the Internet of Things, artificial intelligence and Business Intelligence.
The development of information processing technology in the era of the current technological revolution defined by Industry 4.0 is determined by the application of new information technologies in the field of e-commerce and e-marketing.
Added to this are additional areas of application of advanced technologies for the analysis of large data sets, such as Medical Intelligence, Life Science, Green Energy, etc. Processing and multi-criteria analysis of large data sets in Big Data database systems is carried out according to the 4V concept, i.e. Volume (the amount of data), Value (the large values of specific parameters of the analyzed information), Velocity (the high speed at which new information appears) and Variety (the high variety of the information).
The advanced information processing and analysis technologies mentioned above are used more and more often for the marketing purposes of various business entities that advertise their offer on the Internet or analyze the needs reported in this area by other entities, including companies, corporations, financial and public institutions. More and more commercial business entities and financial institutions conduct marketing activities on the Internet, including on social media portals.
More and more companies, banks and other entities need to conduct multi-criteria analyses of large data sets downloaded from the Internet describing the markets on which they operate, as well as the contractors and clients with whom they cooperate. On the other hand, there are already specialized technology companies that offer this type of analytical service and develop customized reports resulting from multi-criteria analyses of large data sets obtained from various websites and from entries and comments on social media portals.
Do you agree with my opinion on this matter?
In view of the above, I am asking you the following question:
What are the known futurological visions of technology development until around 2050?
Please reply
I invite you to the discussion
Best wishes
Relevant answer
Answer
Future technology development: the advent of Biotechnology courses a few decades ago appeared to provide a better alternative for young students' career options. The applications of Biotechnology are vast, as it caters to agriculture, animal husbandry, fisheries, health, pharmaceuticals, etc. Samal, K. C., Mohanty, A., Patnaik, L., & Sahoo, J. P. (2021). Career Options and Future Prospects in Biotechnology. Biotica Research Today, 3(3), 135-138.
Robotics, space tech, AI, BC, BD...Wedler, A., Schuster, M. J., Müller, M. G., Vodermayer, B., Meyer, L., Giubilato, R., ... & Reill, J. (2021). German Aerospace Center's advanced robotic technology for future lunar scientific missions. Philosophical Transactions of the Royal Society A, 379(2188), 20190574.
Rusakova, E. P., & Inshakova, A. O. (2021). INDUSTRIAL AND MANUFACTURING ENGINEERING IN DIGITAL LEGAL PROCEEDINGS IN THE ASIA-PACIFIC REGION: A NEW LEVEL OF QUALITY BASED ON DATA, BLOCKCHAIN AND AI. International Journal for Quality Research, 15(1).
  • asked a question related to Prediction
Question
8 answers
My main goal is to use Neural Networks to forecast Sunspot Numbers. Requesting the option of ANN or RNN seems simple enough. However, which is best to learn and utilize for a complete beginner? If there is a GitHub repository for similar Space Science topics based on Neural Networks, please link me to it. I'd be extremely appreciative.
Relevant answer
Answer
Ashok Silwal Are you performing the right kind of multi-step forecasting? If each value is closely related to its neighbours, then choose an RNN, which makes it possible to model dynamic time-series systems; for accuracy, you may want to create separate models for each output.
Good luck
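To make the suggestion concrete for a beginner, here is a minimal Keras sketch of one-step-ahead sunspot forecasting with a small LSTM; the file name, window length and layer sizes are illustrative assumptions, not recommendations.

```python
import numpy as np
from tensorflow import keras

# series: a 1-D numpy array of (scaled) monthly sunspot numbers.
series = np.load("sunspots.npy")          # hypothetical file
window = 24                               # use the last 24 months as input

# Build (samples, timesteps, features) windows and one-step-ahead targets.
X = np.stack([series[i:i + window] for i in range(len(series) - window)])
y = series[window:]
X = X[..., np.newaxis]                    # add the "features" axis

model = keras.Sequential([
    keras.layers.Input(shape=(window, 1)),
    keras.layers.LSTM(32),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=20, batch_size=32, validation_split=0.1)

# Forecast the next value from the most recent window.
next_value = model.predict(series[-window:].reshape(1, window, 1))
print(next_value)
```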
  • asked a question related to Prediction
Question
8 answers
Will the development of computerized business analytics of large collections of economic information collected in Big Data database systems improve the forecasting of future economic processes?
Please reply
I invite you to the discussion
Thank you very much
Dear Colleagues and Friends from RG
The key aspects and determinants of applications of data processing technologies in Big Data database systems are described in the following publications:
I invite you to discussion and cooperation.
Best wishes
Relevant answer
Answer
More than 2 years have passed since I asked the above question. During these two years, has there been significant progress in analytics based on Big Data Analytics technology, towards using such analytics to forecast complex climate, natural, social and economic processes?
Regards,
Dariusz Prokopowicz
  • asked a question related to Prediction
Question
4 answers
There has been a debate on the topic "Why the sunspot number needs re-examination?". What is the reason behind this controversial topic? Which model is currently the best model to predict the Sunspot Number in Solar Cycle 25?
Relevant answer
Answer
Here is a discussion of a revised SSN:
  • asked a question related to Prediction
Question
3 answers
There is a lot of research on AI-based air pollution forecasting, but very few studies offer a reasonable explanation in this regard.
I want to know what the reasons for the performance drop might be.
Is it a problem of data length, or some other issue?
Relevant answer
It's very simple. All forecasting methods are based on the search for patterns in the retrospective data and on the assumption (hypothesis) that these patterns will be valid in the future for the forecast period. In other words, it is assumed that the training sample is representative for a certain period in the future. This period is called the period of ergodicity. But this is an incorrect assumption. Sometimes the patterns in the modeled domain change. The period of ergodicity is violated. New patterns are being formed, although the old ones may remain. Therefore, the point of violation of ergodicity is called the bifurcation point. It is necessary to predict not only based on the patterns of the past period, but also to predict the risks of violating these patterns. I did it back in 1994: http://lc.kubagro.ru/aidos/aidos02/7.4.htm (see Figure 7.2).
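A small, hedged illustration of the point above, using a naive persistence baseline on hypothetical data: monitor whether the error of a model fitted on the retrospective period stays stable on later data; a persistent jump in the rolling error suggests that the old patterns no longer hold (a possible bifurcation point) and that the model should be refit or its risk reported.

```python
import numpy as np

# y: hypothetical hourly pollution series; the first part is the training period.
y = np.load("pm25.npy")                    # hypothetical file
train_end = 5000

# Naive one-step persistence forecast: predict the previous observation.
pred = y[:-1]
err = np.abs(y[1:] - pred)                 # absolute one-step errors

train_err = err[:train_end]
baseline = train_err.mean() + 3 * train_err.std()   # simple control limit

# Rolling mean error after the training period; flag windows where the
# error exceeds the control limit, i.e. where old patterns seem broken.
window = 168                               # one week of hourly data
rolling = np.convolve(err[train_end:], np.ones(window) / window, mode="valid")
drift_points = np.where(rolling > baseline)[0]
print("first suspected pattern change at offset:",
      drift_points[0] if drift_points.size else "none detected")
```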
  • asked a question related to Prediction
Question
5 answers
Given the nature of temporal data, strategies like k-fold cross-validation are not appropriate, since we cannot remove the dimension of time. In this discussion we want to explore ideas about testing models for temporal data.
Relevant answer
Answer
Pooia Lalbakhsh you may want to employ deep learning models such as LSTM and GRU.
Good luck
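As a concrete alternative to plain k-fold, here is a minimal sketch of walk-forward (expanding-window) validation with scikit-learn's TimeSeriesSplit, which always trains on the past and tests on the future; the feature matrix and the Ridge model are placeholders.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error

# X, y: hypothetical time-ordered features and targets (oldest rows first).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = X[:, 0] * 2.0 + rng.normal(size=500)

scores = []
for train_idx, test_idx in TimeSeriesSplit(n_splits=5).split(X):
    # Each split trains only on earlier observations and tests on later ones,
    # so the temporal order is never violated.
    model = Ridge().fit(X[train_idx], y[train_idx])
    scores.append(mean_absolute_error(y[test_idx], model.predict(X[test_idx])))

print("walk-forward MAE per fold:", np.round(scores, 3))
```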
  • asked a question related to Prediction
Question
6 answers
Good day scholars, I have read a lot of articles on LSTM time-series forecasting capabilities.
However, I want to know if LSTM can be used for multi-output time-series forecasting. For example, I have x, y, z variables with 1000 time steps, and I want to use LSTM to forecast all the variables (x, y, z) at future time steps. Any recommendation or suggestion will be highly appreciated.
Thanks
Relevant answer
Answer
Yes, it definitely can be used for this purpose.
To get more insights about recurrent neural networks, and LSTM in particular, you can read "Goodfellow I, Bengio Y, Courville A, Bengio Y. Deep learning. Cambridge: MIT Press; 2016 Nov 18.", which is available online for free (on the authors' website).
You can also take a look at the TensorFlow and Keras tutorials on LSTM (or Matlab's Deep Learning Toolbox help if you use Matlab) to see how easily you can implement them.
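A minimal Keras sketch of the idea (shapes, file name and layer sizes are illustrative assumptions): a single LSTM that takes windows of all three series x, y, z and predicts the next step of all three at once.

```python
import numpy as np
from tensorflow import keras

# data: hypothetical array of shape (1000, 3) holding the x, y, z series.
data = np.load("xyz_series.npy")           # hypothetical file
window = 20

# Inputs are windows of all three variables; targets are the next row (x, y, z).
X = np.stack([data[i:i + window] for i in range(len(data) - window)])
Y = data[window:]

model = keras.Sequential([
    keras.layers.Input(shape=(window, 3)),
    keras.layers.LSTM(64),
    keras.layers.Dense(3),                 # one output unit per variable
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, Y, epochs=20, batch_size=32, validation_split=0.1)

# One-step forecast of (x, y, z) from the latest window; for multi-step
# horizons, feed predictions back in or train a Dense(3 * horizon) head.
print(model.predict(data[-window:][np.newaxis, ...]))
```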
  • asked a question related to Prediction
Question
8 answers
Stock market prediction is a vibrant and exciting topic around the globe, owing to its potential to mint money through its seemingly magical prediction possibilities and, furthermore, the rewarding academic recognition associated with it.
But what is the feasibility of having a FUTURES* (derivative) prediction mechanism in place?
There is wide-ranging literature out there featuring stocks, indices and options. Why are there no articles related to futures market prediction? If there is a possibility, let me know your insights.
Futures are restricted by their expiration. But if they can be predicted, then, as far as my limited knowledge goes, the chance of earning a handsome return is wide open.
Relevant answer
Answer
As per my predictions, the Nifty will be around 15,000 next year and the share market will go as high as 52,000 in 2021. ... The market will climb as 2021 proceeds; in the last two months of 2021 the share market will be at its peak.
  • asked a question related to Prediction
Question
20 answers
Apparently, in the financial markets and in the macroeconomic determinants of the economic situation of particular sectors and of the entire economies of developed countries, there are symptoms that suggest a high probability of an economic slowdown from 2020 in individual countries and, consequently, in the entire global economy.
Therefore, I am asking you: Do you know the forecasts of the global economic development that would suggest a high probability of deceleration (or possibly acceleration) of economic growth from 2020 in individual countries and, consequently, in the entire global economy?
What are the symptoms of potential changes in the financial markets and / or the scope of macroeconomic determinants of the economic situation in particular sectors and entire economies?
If you know the results of prognostic research in this area, please send links to websites or scientific publications in which this type of prognostic issue is addressed.
I wish you the best in New Year 2019.
Best wishes
Relevant answer
Answer
9 May MMXXI
Please read attached article, THE ART OF GREED...
Cordially...
ASJ
  • asked a question related to Prediction
Question
5 answers
Hi everyone. Lately, using all the tools and benefits that Artificial Intelligence can offer as a technology, I have been searching for possible applications in the automotive business.
Below are some simple examples of basic scenarios that I have already identified; I would like to enhance them or discover new ones that could help the automotive industry take proper business actions:
1. Based on historical CRM Opportunities, taking into account the Lead Source (TV, Web, Phone), Customer Gender, Customer Age, Customer Geographical Area, Customer Follow-Up Times, Model of Interest and Model Price, predict the probability of converting the opportunity into an Invoice (a minimal sketch of this scenario follows the list below).
2. Based on historical Service Document Turnover (Service Quote -> Service Schedule -> Service Order -> Service Invoice), predict the possibility of a new open (un-invoiced) Document.
3. Based on historical Vehicle Document Turnover (Vehicle Quote -> Vehicle Order -> Vehicle Invoice), predict the possibility of a new open (un-invoiced) Document.
4. Based on the historical time technicians have spent fixing vehicles, taking into account Model Code, Vehicle Mileage, Job Qualification Code, Parts Number and Labor Number, predict the expected workshop load for the upcoming schedule based on open Service Schedules.
What do you think?
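Scenario 1, for instance, is a standard binary classification task. A minimal scikit-learn sketch (hypothetical file and column names) of predicting whether a CRM opportunity converts into an invoice might look like this:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
from sklearn.pipeline import Pipeline
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Hypothetical export of historical CRM opportunities with an "invoiced" flag.
df = pd.read_csv("crm_opportunities.csv")
categorical = ["lead_source", "customer_gender", "geo_area", "model_of_interest"]
numeric = ["customer_age", "follow_up_times", "model_price"]

X = df[categorical + numeric]
y = df["invoiced"]

pipeline = Pipeline([
    ("encode", ColumnTransformer(
        [("cat", OneHotEncoder(handle_unknown="ignore"), categorical)],
        remainder="passthrough")),
    ("clf", GradientBoostingClassifier()),
])

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    stratify=y, random_state=0)
pipeline.fit(X_train, y_train)
print("ROC AUC:", roc_auc_score(y_test, pipeline.predict_proba(X_test)[:, 1]))
```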
Relevant answer
Answer
Dear Stavros Koureas, companies are utilizing rapidly expanding data availability. They are, for example, drastically increasing their marketing efficiency through programmatic advertising with AI at its core.
  • asked a question related to Prediction
Question
17 answers
I am trying to predict the occurrence of individual aquatic plants (48 species) with Random Forest (RF) models, using six explanatory variables. The datasets are highly unbalanced: at minimum 2.5% of observations are presences, but this can go up to 25% (of 2000 observations). Not surprisingly, the accuracy (~70%) and Cohen's kappa (~0.2) are not very satisfactory. Moreover, the True Negative (TN) rate is high (~80%) while the True Positive (TP) rate is low (~15%). I tried multiple things, from changing the cut-off to 40-45%, which works somewhat (still not satisfactory). Additionally, I subsampled my dataset (also down-sampling), built an RF model with 50 trees, repeated this 20 times and combined these 20 RF models into one RF model (somewhat circular reasoning, as this is what down-sampling does), but this results in similar performance. Changing the mtry, node size (85-100% of the lowest class) or maximum number of observations ending in the terminal node (0-15% of the lowest class) also does not improve performance. However, the latter two "smooth" the patterns, but do not improve performance or the distinction between TN and TP. The best option seems to be setting the cut-off to 45%, node size to 90% and maximum observations to 10%.
First, my guess is that the low performance is of course due to the unbalanced dataset, where the pattern of absences is simply better captured than that of the presences. However, I cannot resolve this with the data I currently have (am I sure that I cannot resolve this? not really). This would mean I need more data (which I want anyhow). Second, TN are easier to predict in general. For example, fish need water; if there is no water, the model predicts no fish (easy peasy). However, if there is water the model predicts fish, but the presence of water does not necessarily mean there is fish. For aquatic plants, if flow velocity is > 0.5 m/s, species of vascular plants are often absent and mosses are present. Yet, if flow velocity is < 0.5 m/s, this does not mean vascular plants are present or mosses are absent. Third, the predictor variables may not be suitable, and in general the species seem to be distributed widely along the gradients of these predictors (you do not need an ML model to tell you this if you look at the boxplots). Moreover, correlations between predictors are also present (while not an issue for prediction, it is an issue for inference); for some species this is more apparent than for others, and some species occur everywhere along these gradients. Although this idea seems to float around, relatively few articles actually discuss it (excluding articles addressing the high misclassification rates of Trophic Macrophyte Indices in general):
Even using different model types does not really work (SVM, KNN, GLM, [binomial]). Naive Bayes seem to work, but the prior ends up extremely low for some species thus the model hardly predicts presence. However I turn or twist (organize) the data, I cannot obtain a satisfactory prediction. Are there any statistic or machine learning experts who have any tips or tricks to improve model performance, besides increasing the datasets?
P.S. Perhaps I should start a contest on Kaggle.
Relevant answer
Answer
Methods to Boost the Accuracy of a Model
Add more data. Having more data is always a good idea. ...
Treat missing and Outlier values. ...
Feature Engineering. ...
Feature Selection. ...
Multiple algorithms. ...
Algorithm Tuning. ...
Ensemble methods.
Regards,
Shafagat
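In addition to the general list above, for the heavily imbalanced presence/absence problem described in the question it often helps to combine class weighting with a tuned probability cut-off, and to judge the model by a threshold-free metric rather than accuracy. A minimal sketch (hypothetical data files) below:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.metrics import roc_auc_score, f1_score

# X: six environmental predictors, y: presence (1) / absence (0) for one species.
X = np.load("predictors.npy")              # hypothetical files
y = np.load("presence.npy")

# class_weight="balanced" up-weights the rare presences inside each tree.
rf = RandomForestClassifier(n_estimators=500, class_weight="balanced",
                            random_state=0)

# Out-of-sample predicted probabilities via stratified cross-validation.
cv = StratifiedKFold(5, shuffle=True, random_state=0)
proba = cross_val_predict(rf, X, y, cv=cv, method="predict_proba")[:, 1]
print("ROC AUC (threshold-free):", roc_auc_score(y, proba))

# Choose the probability cut-off that maximises F1 instead of using 0.5.
thresholds = np.linspace(0.05, 0.95, 19)
f1 = [f1_score(y, (proba >= t).astype(int)) for t in thresholds]
print("best cut-off:", thresholds[int(np.argmax(f1))], "F1:", max(f1))
```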
  • asked a question related to Prediction
Question
11 answers
Hello all,
I am a new user of Python and Machine learning!
Hereunder, I am trying to explain my data and model and then ask my question.
I have a few independent variables: ambient temperature, solar intensity, PCM melting temperature (PCM is a kind of material that we glue to the back of the PV panel in this experiment) and wind speed. My only dependent variable is the power output of a photovoltaic panel.
All independent variables change during a day (from 8 AM to 5 PM) as "Time" passes. For instance, ambient temperature increases from 8 AM to 3 PM and gradually drops from 3 PM to 5 PM.
My question is: can I consider time (defined in hours, e.g. 8, 9, ..., 13, 14, ..., 17) as another independent variable when using machine learning techniques (in Python) like linear regression, Bayesian linear regression and SVM to predict the behaviour of the system?
I think that because time here shows its effects directly through the temperatures and solar intensity, I can disregard "time" as a separate independent variable.
I am quite confused here. Any help and suggestion would be much appreciated.
Thanks a lot.
Relevant answer
Answer
Dear Mohammad Rezvanpour
I found the page below very useful if you work in Matlab. It's easy to understand.
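On the question itself, one option is to include clock time as a feature in a form the models can use (for example, mapped onto the daily cycle) and then check empirically whether it adds anything beyond temperature and solar intensity, as the question suspects it may not. A minimal sketch with hypothetical column names:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Hypothetical measurements: one row per reading between 8:00 and 17:00.
df = pd.read_csv("pv_measurements.csv")    # hypothetical file
base = ["ambient_temp", "solar_intensity", "pcm_melt_temp", "wind_speed"]

# Encode the hour on a 24-hour circle instead of as a raw number.
df["hour_sin"] = np.sin(2 * np.pi * df["hour"] / 24)
df["hour_cos"] = np.cos(2 * np.pi * df["hour"] / 24)

y = df["power_output"]
without_time = cross_val_score(LinearRegression(), df[base], y, cv=5).mean()
with_time = cross_val_score(LinearRegression(),
                            df[base + ["hour_sin", "hour_cos"]], y, cv=5).mean()

# If the two scores are essentially equal, time adds nothing beyond what the
# temperature and irradiance features already capture, as suspected above.
print(f"R^2 without time: {without_time:.3f}, with time: {with_time:.3f}")
```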
  • asked a question related to Prediction
Question
16 answers
Here is the situation: I am trying to predict the energy consumption (load) of households using artificial intelligence (machine learning) techniques.
Problem: the data is only available for 40% of the households. Is it possible to predict the energy consumption for the remaining 60% of households based on the available data (features) for the 40%?
Relevant answer
Answer
I think it is possible, but I cannot guess the accuracy of the resulting forecasting model.
  • asked a question related to Prediction
Question
3 answers
Dear All!
Is there software with which I can make NMR predictions for compounds in deuterated acetonitrile, acetone or methanol? In MestReNova I can only make predictions in chloroform, DMSO or water.
Thank you so much for your help!
Relevant answer
Answer
Mrs/Miss Haraźna,
Do you know how reliable these so-called predictions of NMR spectra are?
There is plenty of software for the prediction of mass spectra as well. However, comparative analysis with experiment shows a dramatic lack of agreement between theory and experiment.
Such software is useful only for educational purposes, because not all of the important, actually observed NMR and mass spectrometric phenomena can be accounted for. Thus, the so-called predicted spectra mainly serve to illustrate the fundamental basic knowledge.
An additional comment: RG represents a forum for the exchange of knowledge at a highly specialized professional level. Very frequently, participants are unable to distinguish between highly specialized technical information and less specialized information of a general or popular character; the latter is typical of the public press. Since comments on RG are inaccessible to the mass reader and to communities as a whole, RG does not represent a forum for the distribution of knowledge at a general public level.
  • asked a question related to Prediction
Question
2 answers
Dear researchers,
Any recommendation for a FREE online webserver/software for metabolomic approaches and dermal toxicity prediction?
It would be better if it came with guidance on how to interpret the results generated by the webserver.
This is because I would like to generate a report and will have to interpret the results.
Thank you.
Relevant answer
Answer
ADMET toxicity tools in general.
A list of tools is available at the following link:
  • asked a question related to Prediction
Question
5 answers
PS: by "Predict the energy level of battery operated sensor node", I want to mean that I am willing to predict the amount of energy the sensor node will have at a given point of time. In other words, I am interested in building some regression model for time series forecasting of the amount of energy a sensor's battery will have over the next few hours/days.
The parameters which I could thought of are as follows.
1. L_volt and H_volt of battery
2. L_curr and H_curr in the circuit
3. Energy consumption rate
4. Ambient humidity, temperature and altitude
5. Battery discharge Cycle.
6. The magnitude of the solar panel's lux, current and voltage.
Please let me know some other parameters, which I can add to the above list.
I thank you all in advance.
Relevant answer
Answer
Predict the energy level of battery operated sensor node
  • asked a question related to Prediction
Question
4 answers
Hi all!
I've been using a local copy of the Mitoprot software for prediction of gene localization.
I've updated my laptop and now need to download Mitoprot again, but the download link seems to be broken. I can't find it anywhere else either. Does anyone have a copy or knows where I can get it?
Thanks!
Relevant answer
Answer
You may be able to download from ftp://ftp.biologie.ens.fr/pub/molbio/
  • asked a question related to Prediction
Question
74 answers
What kind of scientific research dominates in the field of futurology in literature and film?
Please provide your suggestions for a question, problem or research thesis on the topic of futurology in literature and film.
Please reply. I invite you to the discussion
Best wishes
Relevant answer
Answer
John Carpenter's They Live is a great piece of futurological art.
  • asked a question related to Prediction
Question
8 answers
Hi,
I want to predict operons on entire bacterial genomes (already annotated). I used to use operon-mapper ( https://academic.oup.com/bioinformatics/article/34/23/4118/5040321), which is great but it has been "under maintenance" for weeks now and I really need one now.
MicrobesOnline doesn't support private genome hosting anymore and I think that the DOOR website is now down.
If you know any other tool, online or not, that would allow the genome-wide prediction of operons in complete, annotated bacterial genomes, that would be of great help to me.
Thanks
Julian
Relevant answer
Answer
You can try annotating your genome with tools like Prokka, PGAP or RAST, and look at the predicted genes and other features.
  • asked a question related to Prediction
Question
9 answers
I am calculating the Akaike Information Criterion (AIC) for a forecasting model. What should the value of K be for data that exhibit nonlinear behaviour? Usually, for a linear model it is taken as 3, while for a polynomial it is taken as 4. So in my case, do I need to keep it as 4?
Relevant answer
Answer
Muhammad Ali Musarat Do you have a specific example of a forecasting model?
See the example here:
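A small worked example may help, with the caveat that it only illustrates how K is counted rather than prescribing it for your model: for least-squares fits with Gaussian errors, K is the number of estimated coefficients plus one for the error variance, so a straight line gives K = 3, a quadratic K = 4, and a cubic K = 5; you keep counting parameters in the same way for other nonlinear models.

```python
import numpy as np

# Hypothetical data with a nonlinear trend.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 80)
y = 0.5 * x**2 - 2 * x + rng.normal(scale=3.0, size=x.size)

def aic_for_poly(degree):
    """AIC for a least-squares polynomial fit under Gaussian errors:
    AIC = n*ln(RSS/n) + 2K, with K = (degree + 1) coefficients + 1 for sigma^2."""
    coeffs = np.polyfit(x, y, degree)
    rss = np.sum((y - np.polyval(coeffs, x)) ** 2)
    k = degree + 2
    return x.size * np.log(rss / x.size) + 2 * k

for d in (1, 2, 3):
    print(f"degree {d}: K = {d + 2}, AIC = {aic_for_poly(d):.1f}")
```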
  • asked a question related to Prediction
Question
6 answers
Hi, I am interested in knowing the DNA sequence of the TF corresponding to the DNA-binding domain.
Do you know any tool developed recently that can help?
Relevant answer
Answer
1) If you already know the AA sequence of the DNA binding domain, you can find out the start and stop site of this domain in its entire protein sequence. For example from 101 AA- 150 AA, which is translated by the 301 nt -450 nt in the coding region of its cDNA.
or 2) You can reverse translate the AA sequence of its DNA binding domain into degenerate nucleotide sequence, and find it out from the cDNA of your interest or BLAST it against the cDNA of your interest. For example, I have a cDNA sequence with ORF: ATGACCGTTGCCAGCAAATGCgcgtgcgatgaatttggccatattaaactgACGGATCGATACGTACAGTAA, if I need to find out what the exact cDNA sequence corresponding to the AA sequence ACDEFGHIKL, I first reverse translate the AA sequence into gcntgygaygarttyggncayathaarytn (if Snap gene can't do the reverse translation, you can use online tool https://www.bioinformatics.org/sms2/rev_trans.html ), this sequence can be used to search (if Snap gene software allows you search sequence with ambiguity), blast or align against the cDNA sequence.
  • asked a question related to Prediction
Question
4 answers
I am predicting future rates in EViews via "Automatic ARIMA Forecasting". When entering the window, "C" is already present in the regressors section. Do I need to keep it when introducing an independent variable, or should I remove "C" in the presence of an independent variable?
Relevant answer
Answer
Waqas Farooq
noted, thank you.
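For readers not using EViews, a minimal statsmodels sketch (hypothetical series and regressor names) of the analogous choice: the "C" term is the constant/intercept, which is normally kept alongside an exogenous regressor unless there is a reason to force the mean through zero; fitting both versions and comparing information criteria is one way to decide.

```python
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Hypothetical monthly series and one exogenous regressor.
df = pd.read_csv("rates.csv", parse_dates=["date"], index_col="date")
y, X = df["rate"], df[["regressor"]]

# trend="c" keeps the constant (EViews' "C") together with the regressor ...
with_const = ARIMA(y, exog=X, order=(1, 0, 1), trend="c").fit()
# ... trend="n" drops it; compare information criteria to decide.
no_const = ARIMA(y, exog=X, order=(1, 0, 1), trend="n").fit()

print("AIC with constant:", with_const.aic, " without:", no_const.aic)
```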
  • asked a question related to Prediction
Question
10 answers
I am interested in finding the RUL of a bearing which is currently in operation. I could only find papers that use run-to-failure datasets, which are not what I am interested in.
Can anyone suggest methods or papers to obtain the RUL for an incomplete dataset? If I use vibration data, won't it be different for different bearings?
Relevant answer
Answer
In my experience, in the case of an incomplete dataset, the only reliable RUL prediction method is based on physical features and a pre-defined degradation model incorporated into a filtering framework, such as Kalman filtering or particle filtering.
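A heavily simplified sketch of that filtering idea, with hypothetical numbers: a linear degradation model tracked by a hand-written Kalman filter and extrapolated to an assumed failure threshold, which yields an RUL estimate from partial data on a still-running bearing.

```python
import numpy as np

# Hypothetical health indicator (e.g. RMS vibration) observed so far.
rng = np.random.default_rng(1)
t = np.arange(200)
health = 1.0 + 0.01 * t + rng.normal(scale=0.05, size=t.size)
threshold = 4.0                      # assumed failure level of the indicator

# State = [level, degradation rate]; constant-rate linear degradation model.
F = np.array([[1.0, 1.0], [0.0, 1.0]])   # state transition (per time step)
H = np.array([[1.0, 0.0]])               # we only observe the level
Q = np.diag([1e-4, 1e-6])                # process noise (assumed)
R = np.array([[0.05 ** 2]])              # measurement noise (assumed)

x = np.array([health[0], 0.0])           # initial state
P = np.eye(2)

for z in health[1:]:
    # Predict, then update with the new measurement (standard Kalman steps).
    x, P = F @ x, F @ P @ F.T + Q
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (np.array([z]) - H @ x)
    P = (np.eye(2) - K @ H) @ P

level, rate = x
rul = (threshold - level) / rate if rate > 0 else np.inf
print(f"estimated level={level:.2f}, rate={rate:.4f}, RUL approx. {rul:.0f} steps")
```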
  • asked a question related to Prediction
Question
2 answers
I want to generate some nice prediction plots from my MRQAP model. I've laid out my process below, and would be very grateful to get anyone's insight, as I'm not seeing much written about this online.
I am building my own regression models on network data in R, using the quadratic assignment procedure with Dekker and colleagues' (2007) double-semi-partialling method. In other words, I am predicting the weight of an edge given its respective node traits. This approach uses node permutations of residuals to adjust for the interdependence of observations in the network. (Regression with networks involves huge heteroskedasticity, because the observations are literally connected.)
Traditionally, this method (MRQAP with DSP) just produces a p-value, and the original standard errors are suspect. So, I am using Doug Altman's method to back-transform p-values into new standard errors that better reflect the actual error range (read more here; thanks to @Andrew Paul McKenzie Pegman: https://www.bmj.com/content/343/bmj.d2090). This at least allows me to make nice dot-and-whisker plots of beta coefficients with their confidence intervals (estimate ± se*1.96, etc.). However, I'd still really like to make predictions.
There seem to be two logical routes to make predictions from an MRQAP model.
First, you could just make predictions normally.
This relies on your observed residuals in the model to calculate the standard error for your predictions. I think this might even work, because the homoskedasticity assumption in regression is really about covariate standard error and p-values, not prediction; this means that a heteroskedastic model can still produce solid predictions (see Matthew Drury's & Jesse Lawson's helpful notes here: https://stats.stackexchange.com/questions/303787/using-model-with-heteroskedasticity-for-predictions). However, I would love some external verification on this. Any sources I can draw on to be confident I can use this for visualizing predicted effects from networks?
Second, you could simulate the predictions, like in Zelig/Clarify.
Simulation requires building a multivariate normal distribution, where each vector has a mean of one of your model coefficients, and where the vectors share the same general correlation structure as your variance-covariance matrix. Then, you make a sample from this multi-variate distribution (eg. grab a row of observations from each vector), use these as your coefficients, and generate a set of predictions. You then repeat this about 1000 times, grabbing different sets of slightly-differing coefficients.
In other words, this approach comes with a few assumptions: 1) Your coefficients might be slightly off, but if they're wrong, they follow a normal distribution. 2) The distribution for each coefficient is related to the other coefficients in specific, empirically observed ways. 3) These distributions don't necessarily have standard deviations that reflect the nice new standard deviations generated from our DSP p-values! Ordinarily, I'd think that you'd want a multivariate normal distribution where each assumptions 1 (normal) and 2 (correlated) apply, but where you've also constrained each coefficient's distribution to reflect the standard errors from DSP. But there doesn't seem to be a good way to do this, since standard error doesn't directly factor into making a multivariate normal distribution (to my knowledge). You mostly just need the mean (coefficients) and a variance-covariance matrix.
To any kind souls out there who have read this far, what would you recommend? Should I just use normal prediction? Should I simulate with a multivariate normal distribution? Should I make some weird third multivariate-normal-distribution-that-somehow-resembles-my-standard-errors-made-indirectly-from-MRQAP-DSP?
Any thoughts would be appreciated!
Relevant answer
Answer
Thanks Muhammad Ali for your feedback. I'm afraid these papers don't seem to specifically answer how to handle predictions, but I could be wrong. (Dekker and colleagues' 2007 piece is certainly foundational, since they developed the technique I'm using, double-semi-partialling.) Any thoughts out there would still be very helpful.
Tentatively, for those interested, I've fallen on the following conclusion:
Heteroskedasticity is the big problem in network regression models. But, this is because it inflates type II error for coefficient p-values. Heteroskedasticity does not invalidate model predictions; for example, machine learning models, which are less concerned with coefficient p-values and more with prediction, do not worry about heteroskedasticity as much.
As a result, I have concluded that the standard methods should be fine. Simulations, like used in Zelig, are even better, because the multivariate normal distribution helps us adjust for sampling error too. But, as a safeguard, we probably should only present predictions when varying a coefficient that MRQAP-DSP found to be statistically significant.
Feel free to be in touch if you have thoughts about this; would love to get your input.
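For readers following this thread, here is a minimal sketch of the simulation route described above (Zelig/Clarify style); the coefficient vector and variance-covariance matrix are hypothetical stand-ins for whatever your fitted network regression produces. Coefficient sets are drawn from a multivariate normal, predictions are generated for a scenario of interest, and the spread is summarised for plotting.

```python
import numpy as np

# Hypothetical model output: point estimates and their variance-covariance
# matrix (in practice taken from your fitted network regression).
beta_hat = np.array([0.50, 1.20, -0.30])            # intercept, x1, x2
vcov = np.array([[0.010, 0.002, 0.000],
                 [0.002, 0.020, 0.001],
                 [0.000, 0.001, 0.015]])

# Scenario: vary x1 over its range while holding x2 at a fixed value.
x1_grid = np.linspace(0, 5, 50)
scenario = np.column_stack([np.ones_like(x1_grid), x1_grid,
                            np.full_like(x1_grid, 0.8)])

# Draw 1000 plausible coefficient vectors and compute predictions for each.
rng = np.random.default_rng(0)
draws = rng.multivariate_normal(beta_hat, vcov, size=1000)   # (1000, 3)
preds = draws @ scenario.T                                   # (1000, 50)

lower, point, upper = np.percentile(preds, [2.5, 50, 97.5], axis=0)
print(point[:5], lower[:5], upper[:5])   # ready for a ribbon/prediction plot
```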
  • asked a question related to Prediction
Question
1 answer
Hi!
I have recently come across POD in fluid flow. I have seen that it is used to analyze coherent structures in fluid flow. I also came across the use of POD modes, the energy content within modes and the selection of high-energy modes.
I think my understanding is still very shaky. I would be grateful to know what the uses of POD in fluid flow are, and whether it can be used to predict the flow at future times.
Relevant answer
Answer
Please read the following papers/books. They are very helpful.
For an informal introduction :
  1. Narasimha R., - Kosambi and Proper Orthogonal Decomposition. Resonance 16, 574–581 (2011)
  2. Lindsay I. Smith A tutorial on Principal Component Analysis(2002)
For some rigour :
  1. Kosambi, D. D. Statistics in function space Journal of the Indian Mathematical Society, 7 . pp. 76-88. ISSN 0019-5839(1943)
  2. Holmes P., Lumley J.L., Berkooz G., Rowley C.W. - Turbulence, Coherent Structures, Dynamical Systems and Symmetry -Cambridge University Press (2012)
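To make the idea concrete, here is a minimal sketch (with a hypothetical snapshot file) of computing POD modes via an SVD of the mean-subtracted snapshot matrix, ranking them by energy content, and reconstructing the flow from only the most energetic modes. Using POD for prediction of the flow at future times is a further step (for example, Galerkin projection onto the modes or fitting a model to the mode coefficients) and is not shown here.

```python
import numpy as np

# Hypothetical snapshots: each column is the flow field at one time instant,
# flattened to a vector (shape: n_points x n_snapshots).
snapshots = np.load("velocity_snapshots.npy")      # hypothetical file

mean_flow = snapshots.mean(axis=1, keepdims=True)
fluct = snapshots - mean_flow                      # fluctuating part

# POD via the (thin) SVD: columns of U are the spatial modes,
# squared singular values give each mode's energy content.
U, s, Vt = np.linalg.svd(fluct, full_matrices=False)
energy = s**2 / np.sum(s**2)
print("cumulative energy of first 5 modes:", energy[:5].cumsum())

# Low-order reconstruction using only the r most energetic modes.
r = 5
reconstruction = mean_flow + U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]
print("reconstruction shape:", reconstruction.shape)
```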
  • asked a question related to Prediction
Question
7 answers
Can we affirm that whenever one has a prediction algorithm, one can also obtain a correspondingly good compression algorithm for data one already has, and vice versa?
Relevant answer
Answer
There is some correlation between compression and prediction: prediction is a tool of compression. Assume you have data containing redundancy; you can predict the redundancy from the context of the signal and remove it by simply subtracting the predicted signal from the real signal.
The difference will be the compressed signal.
Prediction is a powerful concept for reducing the redundancy in signals and consequently compressing them.
Prediction is used intensively in video codecs and other signal codecs.
Best wishes
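A minimal sketch of the point above, assuming a hypothetical 8-bit signal and the simplest possible predictor (the previous sample): the residuals left after subtracting the prediction are concentrated near zero, so a generic entropy coder compresses them much better than the raw samples, and the decoder recovers the signal exactly by reversing the prediction.

```python
import numpy as np
import zlib

# Hypothetical smooth 8-bit signal (e.g. audio or a sensor trace) with noise.
rng = np.random.default_rng(0)
t = np.arange(10_000)
signal = (128 + 100 * np.sin(2 * np.pi * t / 500)
          + rng.normal(0, 2, t.size)).astype(np.uint8)

# Predictor: each sample is predicted by the previous one; only the residual
# (the prediction error, in wrap-around uint8 arithmetic) is stored/transmitted.
residual = signal.copy()
residual[1:] = signal[1:] - signal[:-1]

raw_size = len(zlib.compress(signal.tobytes()))
res_size = len(zlib.compress(residual.tobytes()))
print(f"compressed raw: {raw_size} bytes, compressed residuals: {res_size} bytes")

# Decoding reverses the prediction: a running (modulo-256) sum of the residuals
# restores the original signal exactly.
restored = np.cumsum(residual, dtype=np.uint8)
assert np.array_equal(restored, signal)
```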
  • asked a question related to Prediction
Question
6 answers
Hi folks,
I ran a pharmacological neuroimaging cross-over RCT of single dose citalopram in subjects with and without autism. I used [1H]MRS to quantify glutamate+glutamine (Glx) and GABA.
In both groups, there is a significant correlation between baseline Glx and citalopram-induced %change*. Those with the highest Glx on placebo responded with a decrease in Glx, and those with the lowest Glx on placebo responded with a Glx increase.
I'm wondering if this is just a simple case of regression to the mean, or if it's evidence of the normalising effect of citalopram.
It's noteworthy that baseline Glx correlates with severity of ASD symptoms and the scans were done in a random order, separated by at least a week.
Interested to hear your thoughts.
J
%change = ((glutamate under drug condition - glutamate under placebo) / glutamate under placebo) × 100.
Relevant answer
Answer
follow
  • asked a question related to Prediction
Question
9 answers
Are there long-term risk management systems being developed for the adverse climate changes, related to the global warming process, that are predicted over the next several decades?
If so, which institutions develop these types of risk management systems for predicted climate change?
Is there scientific research in this area?
Are there published scientific studies that would confirm the need to develop long-term risk management systems for the unfavorable climate changes, related to the global warming process, predicted over the next several decades?
Please reply
I invite you to the discussion
Thank you very much
Best wishes
Relevant answer
Answer
Satellite Observation and Climate Model Simulation...Zou, C. Z. (2018). Satellite Observation and Climate Model Simulation of Global Warming Process. AGUFM, 2018, C53B-07.
  • asked a question related to Prediction
Question
3 answers
I have a database that I have converted to a knowledge graph; this database contains missing values.
Now I want to predict these values using the knowledge graph.
Please help.
Relevant answer
Answer
Not exactly sure what you would like to do; however, maybe knowledge inference is what you are looking for? At least you can use knowledge inference in order to fill in missing values. Have a look, for instance, at these papers:
  • asked a question related to Prediction
Question
13 answers
In your opinion, how will the applications of the technology for analyzing large information collections in Big Data database systems develop in the future?
In which areas of industry, science, research, information services, etc. do you think applications of the technology for analyzing large collections of information in Big Data database systems will develop in the future?
Please reply
I invite you to the discussion
I described these issues in my publications below:
I invite you to discussion and cooperation.
Best wishes