
Prediction - Science topic

Explore the latest questions and answers in Prediction, and find Prediction experts.
Questions related to Prediction
  • asked a question related to Prediction
Question
4 answers
I have been studying a particular set of issues in methodology, and have been looking to see how various texts have addressed this. I have a number of sampling books, but only a few published since 2010, with the latest being Yves Tillé, Sampling and Estimation from Finite Populations, 2020, Wiley.
In my early days of survey sampling, William Cochran's Sampling Techniques, 3rd ed, 1977, Wiley, was popular. I would like to know which books are most popularly used today to teach survey sampling (sampling from finite populations).
I posted almost exactly the same message as above to the American Statistical Association's ASA Connect and received a few recommendations, notably Sampling: Design and Analysis,  Sharon Lohr, whose 3rd ed, 2022, is published by CRC Press.  Also, of note was Sampling Theory and Practice, Wu and Thompson, 2020, Springer.
Any other recommendations would also be appreciated. 
Thank you  -  Jim Knaub
Relevant answer
Answer
Here are some recommended ones:
1. "Sampling Techniques" by William G. Cochran. This classic book covers a wide range of sampling methods with practical examples. It's comprehensive and delves into both theory and application, making it valuable for students and professionals.
2. "Survey Sampling" by Leslie Kish. This is another foundational text, known for its detailed treatment of survey sampling design and estimation methods. Kish's book is especially useful for those interested in practical survey applications.
3. "Model Assisted Survey Sampling" by Carl-Erik Särndal, Bengt Swensson, and Jan Wretman. This book introduces model-assisted methods for survey sampling, which blend traditional design-based methods with model-based techniques. It's ideal for more advanced readers interested in complex survey designs.
4. "Sampling of Populations: Methods and Applications" by Paul S. Levy and Stanley Lemeshow. This text is widely used in academia and provides thorough explanations of different sampling methods with a focus on real-world applications. It also includes case studies and practical exercises, making it helpful for hands-on learners.
5. "Introduction to Survey Sampling" by Graham Kalton. This introductory book offers a concise and accessible overview of survey sampling methods. It's well-suited for beginners who need a straightforward introduction to key concepts.
6. "Designing Surveys: A Guide to Decisions and Procedures" by Johnny Blair, Ronald F. Czaja, and Edward A. Blair. This book focuses on the practical aspects of designing and conducting surveys, with particular emphasis on decision-making and procedural choices in the survey process.
  • asked a question related to Prediction
Question
2 answers
I encountered an unusual observation while constructing a nomogram using the rms package with the Cox proportional hazards model. Specifically, when Karnofsky Performance Status (KPS) is used as the sole predictor, the nomogram points for KPS decrease from high to low. However, when KPS is combined with other variables in a multivariable model, the points for KPS increase from low to high. Additionally, I've noticed that the total points vary from low to high across all variables, while the 1-year survival probability shifts from high to low.
Could anyone help clarify why this directional shift in points occurs? Are there known factors, such as interactions, scaling differences, or confounding effects, that might explain this pattern?
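For concreteness, here is a minimal rms sketch of the comparison being described (the data are simulated stand-ins, and time, status, kps, age, and stage are placeholder names, not the actual variables); in an rms nomogram the direction of a predictor's points axis follows the sign of its coefficient, so checking whether the KPS coefficient changes sign between the univariable and multivariable fits is one way to diagnose the flip:
library(rms)              # rms also attaches survival (for Surv)
set.seed(1)               # simulated stand-in data; replace with your own data frame
n      <- 300
age    <- rnorm(n, 62, 10)
kps    <- round(pmin(100, pmax(40, 90 - 0.5 * (age - 62) + rnorm(n, 0, 12))), -1)
stage  <- rbinom(n, 1, plogis(-0.05 * (kps - 70)))
time   <- rexp(n, rate = 0.1 * exp(0.03 * (age - 62) - 0.02 * (kps - 70) + 0.8 * stage))
status <- rbinom(n, 1, 0.8)
d      <- data.frame(time, status, kps, age, stage)
dd <- datadist(d); options(datadist = "dd")
f_uni <- cph(Surv(time, status) ~ kps, data = d, x = TRUE, y = TRUE)
f_mul <- cph(Surv(time, status) ~ kps + age + stage, data = d, x = TRUE, y = TRUE)
coef(f_uni)["kps"]        # sign of the KPS effect when modeled alone
coef(f_mul)["kps"]        # if the sign flips after adjustment, the points direction flips too
plot(nomogram(f_uni))
plot(nomogram(f_mul))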
Relevant answer
Answer
Thank you
  • asked a question related to Prediction
Question
2 answers
According to event segmentation theory, high prediction errors let us perceive event boundaries. Pattern separation is a mechanism for distinguishing similar memories/events from each other, while pattern completion is a mechanism that aids in retrieving complete memories/events from partial cues.
Is it fair to assume that a high prediction error causes pattern separation at event boundaries, while a low prediction error causes pattern completion within events?
Any feedback would be amazing
Relevant answer
Answer
Hello Franciose,
Thank you for taking the time to answer my question. Indeed, the reasoning behind my question is to elucidate more refined questions in the hope that I can find a starting point to get to the main question. I am interested in understanding prediction error and the mechanisms of pattern separation and completion in the hippocampus. I am not familiar with the place-cell literature, but it is thought that pattern separation happens in the dentate gyrus (DG) while pattern completion happens in CA3, so perhaps place cells in CA3.
Thank you once again
  • asked a question related to Prediction
Question
1 answer
1)
Preprint Nuance
2)
Preprint Nuance 2
Relevant answer
Answer
Yes, some theories can be considered too robust to risk betting against due to their extensive empirical support, explanatory power, and predictive accuracy. Here’s why and how certain theories become so resilient:
Characteristics of Robust Theories:
  1. Empirical Evidence: Robust theories are typically supported by a wealth of empirical evidence from multiple studies across different contexts. This evidence consistently validates the predictions and hypotheses derived from the theory. Example: The theory of evolution by natural selection is supported by extensive evidence from paleontology, genetics, molecular biology, and observational studies in ecology.
  2. Explanatory Power: These theories provide comprehensive explanations for a wide range of phenomena within their domain. They integrate disparate observations into a coherent framework that enhances understanding and insight. Example: The theory of relativity (both special and general) explains diverse physical phenomena such as the behavior of light, gravity, and the structure of the universe.
  3. Predictive Accuracy: Robust theories have predictive power: they accurately forecast future observations and outcomes based on their principles and laws. This predictive capability enhances their credibility and utility. Example: Quantum mechanics accurately predicts the behavior of subatomic particles and has enabled technological advancements such as semiconductor devices and quantum computing.
  4. Consensus Among Experts: There is broad consensus among experts and researchers in the field regarding the validity and reliability of robust theories. This consensus reflects rigorous testing, peer review, and validation processes. Example: The germ theory of disease, which posits that microorganisms are the cause of many infectious diseases, is widely accepted in medical science due to overwhelming evidence and consensus.
Reasons They Are Difficult to Bet Against:
  1. High Confidence Level: The accumulation of evidence and the robustness of testing over time instill a high level of confidence in these theories. They have withstood scrutiny, challenges, and attempts at falsification. Example: Climate change theory, supported by extensive climate data, modeling, and interdisciplinary research, is robust against skepticism due to its consistent findings across different scientific disciplines.
  2. Utility and Applications: Robust theories often underpin practical applications and technological innovations. Their reliability and predictive accuracy make them indispensable for advancing knowledge and driving progress in various fields. Example: Newton's laws of motion and gravitation are fundamental to engineering, astronomy, and space exploration, providing the basis for designing spacecraft trajectories and satellite orbits.
  3. Continual Testing and Refinement: Despite their robustness, theories are continually tested, refined, and sometimes modified in response to new evidence or anomalies. This dynamic process ensures theories remain relevant and accurate. Example: Darwinian evolution has evolved with new insights from genetics, molecular biology, and ecology, enriching our understanding of evolutionary mechanisms over time.
Conclusion:
In conclusion, robust theories are grounded in substantial empirical evidence, possess strong explanatory power, demonstrate predictive accuracy, and enjoy widespread consensus among experts. These qualities make them highly reliable and difficult to bet against because they have repeatedly demonstrated their ability to withstand scrutiny and provide reliable frameworks for understanding the natural world. However, the scientific process encourages ongoing evaluation and refinement, ensuring theories remain dynamic and responsive to new discoveries and challenges.
  • asked a question related to Prediction
Question
2 answers
Modern physics because afterlife prediction is new. More specifically, exact and concrete quantum mechanics.
Relevant answer
Answer
This is a suggestion I've read in a book by Irina Radunskaja. All souls require some bits to be distinguishable. The exact number of bits depends on the underlying religion (do animals have a soul?). According to Landauer's principle, a certain amount of energy is needed to store these bits. One should expect that, when a person dies, this energy is released as a photon, which could be measured with a photodetector, proving the persistence of the soul. In the book, a new kind of science was proposed: quantum theology.
Regards,
Joachim
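For scale, a back-of-the-envelope R sketch of the Landauer bound mentioned above (body temperature, and a single photon carrying the energy of exactly one bit, are assumptions made purely for illustration):
kB <- 1.380649e-23      # Boltzmann constant, J/K
h  <- 6.62607015e-34    # Planck constant, J s
c0 <- 299792458         # speed of light, m/s
T  <- 310               # approximate body temperature, K
E_bit  <- kB * T * log(2)     # minimum energy to erase/store one bit, ~3e-21 J
lambda <- h * c0 / E_bit      # wavelength of a photon carrying that energy, ~70 micrometres
c(E_bit_J = E_bit, wavelength_m = lambda)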
  • asked a question related to Prediction
Question
1 answer
Relevant answer
Answer
Your project presents a comprehensive exploration of the concept of reincarnation, blending insights from various disciplines including philosophy, hard sciences, engineering, and softer sciences. The structured approach, delineated in the table of contents, facilitates a thorough examination of the topic, guiding readers through different layers of analysis.
However, it's crucial to ensure that each section contributes cohesively to the overarching argument. While the incorporation of differential equations adds an intriguing dimension to the discourse, it's essential to elucidate its relevance and application within the context of reincarnation.
Moreover, attributing the belief in reincarnation solely to white supremacy is a bold assertion that warrants meticulous substantiation. Providing concrete evidence and nuanced reasoning to support this claim will enhance the credibility of your argument and foster deeper engagement with your thesis.
Furthermore, the inclusion of suggestions for fostering social justice through specific metaphysical frameworks, such as a Universalist Christian Heaven, adds depth to the discussion. Nonetheless, ensuring clarity and feasibility in implementing these suggestions will be paramount for their effectiveness.
Overall, your project exhibits a commendable interdisciplinary approach and ambitious scope. By refining your argumentation, providing robust evidence, and ensuring clarity in your proposals, you can elevate the discourse and foster meaningful dialogue on the improbable belief in reincarnation and its societal implications.
  • asked a question related to Prediction
Question
2 answers
Of course I sometimes doubt the afterlife is eternal salvation for all, so, I live and deduce what it might be...
Relevant answer
Answer
Jesus can give man eternal life and redemption, according to the Scripture.
  • asked a question related to Prediction
Question
3 answers
Predicting possible earthquakes on this planet in advance is a long-pending challenge; with such predictions, our systems could take appropriate corrective action and reduce the loss of human life.
Research is continuously in progress all over the globe, and global society wishes to know the status of this important issue.
Relevant answer
Answer
To
All ResearchGate Members who are interacting on this Question:
#🌎🌎🌍🌍🌏🌏#
Never before have we had the following:
* A highly informed global society due to information technology,
* Importance given to knowledge sharing,
* The interdependency of all,
* Importance given to cooperation at all levels.
Please remember “Mother Earth” has given enough to satisfy everyone's need, but not everyone's greed.
Let us believe, we have one life hence contribute towards humanity and develop a new concept “One Earth and One Family”.
Wish you all a happy and prosperous New Year.
🙏🙏
  • asked a question related to Prediction
Question
7 answers
What are the possibilities of applying AI-based tools, including ChatGPT and other AI applications in the field of predictive analytics in the context of forecasting economic processes, trends, phenomena?
The ongoing technological advances in ICT and Industry 4.0/5.0, including Big Data Analytics, Data Science, cloud computing, generative artificial intelligence, Internet of Things, multi-criteria simulation models, digital twins, Blockchain, etc., make it possible to carry out advanced data processing on increasingly large volumes of data and information. The aforementioned technologies contribute to the improvement of analytical processes concerning the operation of business entities, including, among others, in the field of Business Intelligence, economic analysis as well as in the field of predictive analytics in the context of forecasting processes, trends, economic phenomena. In connection with the dynamic development of generative artificial intelligence technology over the past few quarters and the simultaneous successive increase in the computing power of constantly improved microprocessors, the possibilities of improving predictive analytics in the context of forecasting economic processes may also grow.
In view of the above, I address the following question to the esteemed community of scientists and researchers:
What are the possibilities of applying AI-based tools, including ChatGPT and other AI applications for predictive analytics in the context of forecasting economic processes, trends, phenomena?
What are the possibilities of applying AI-based tools in the field of predictive analytics in the context of forecasting economic processes?
And what is your opinion on this topic?
What is your opinion on this issue?
Please answer,
I invite everyone to join the discussion,
Thank you very much,
Best regards,
Dariusz Prokopowicz
The above text is entirely my own work written by me on the basis of my research.
In writing this text I did not use other sources or automatic text generation systems.
Copyright by Dariusz Prokopowicz
Relevant answer
Answer
Artificial Intelligence (AI) has revolutionized numerous industries, and its potential in the field of predictive analytics for forecasting economic processes is immense. AI-based tools, including ChatGPT and other AI applications, have the capability to transform the way we predict economic trends, phenomena, and processes.
One of the possibilities of applying AI-based tools in predictive analytics is their ability to analyze vast amounts of data quickly and efficiently. Traditional methods often struggle with handling large datasets, leading to delayed insights and inaccurate predictions. However, AI algorithms can process massive amounts of information within seconds, enabling economists to make more informed decisions based on real-time data.
Furthermore, AI-based tools can identify patterns and correlations that are not easily recognizable by humans. By analyzing historical economic data alongside various external factors such as social media sentiment or global events, these tools can uncover hidden relationships that contribute to accurate forecasts. This level of analysis provides invaluable insights for policymakers, businesses, and investors alike.
Another possibility lies in the ability of AI-based tools to continuously learn and adapt. As they process more data over time, these algorithms become smarter and more accurate in predicting economic trends. This iterative learning process ensures that forecasts remain up-to-date and relevant even in rapidly changing economic landscapes.
Moreover, implementing AI-based predictive analytics can significantly reduce human bias in forecasting economic processes. Human judgment is often influenced by personal beliefs or emotions which can lead to biased predictions. However, AI algorithms are driven purely by data-driven analysis without any subjective biases.
In conclusion, the possibilities of applying AI-based tools for predictive analytics in forecasting economic processes are vast. These technologies offer unparalleled speed in processing large datasets while uncovering hidden patterns that humans may overlook. Additionally, their continuous learning capabilities ensure accurate predictions even amidst dynamic environments. By embracing these advancements in technology assertively today, we can unlock a future where our understanding of economics is enhanced through precise forecasting techniques powered by artificial intelligence.
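As one concrete, minimal illustration of the pattern-based forecasting described above, here is an R sketch (simulated series; lm is used only as a stand-in for any ML regressor) that builds lagged features of an economic indicator and evaluates a one-step-ahead forecast:
set.seed(1)
n <- 120                                          # ten years of simulated monthly data
gdp_growth <- arima.sim(list(ar = 0.6), n = n) + 2
lags <- embed(as.numeric(gdp_growth), 4)          # columns: y_t, y_{t-1}, y_{t-2}, y_{t-3}
d    <- data.frame(y = lags[, 1], l1 = lags[, 2], l2 = lags[, 3], l3 = lags[, 4])
train <- d[1:100, ]
test  <- d[101:nrow(d), ]
fit  <- lm(y ~ l1 + l2 + l3, data = train)        # any ML regressor could be swapped in here
rmse <- sqrt(mean((predict(fit, test) - test$y)^2))
rmse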
  • asked a question related to Prediction
Question
6 answers
Can the application of artificial intelligence and Big Data Analytics technologies help improve system energy security management processes and enhance this security?
Probably yes, provided that new green technologies and the development of emission-free clean energy are a priority in the energy policy shaped by the government. The efficient application of artificial intelligence and Big Data Analytics can help improve systemic energy security management processes and increase this security. However, it is crucial to combine the functionality of these technologies effectively and to apply them to managing the risk of energy emergencies, analyzing the determinants shaping the development of energy production, analyzing the factors shaping the level of energy security, and forecasting future energy production in the context of expected changes in energy demand and in the output achievable from specific types of energy sources.
In view of the above, I address the following question to the esteemed community of scientists and researchers:
Can the application of artificial intelligence and Big Data Analytics technologies help improve the processes of systemic energy security management and enhance this security?
Can artificial intelligence and Big Data Analytics help improve systemic energy security management processes?
And what is your opinion on this topic?
What is your opinion on this issue?
Please answer,
I invite everyone to join the discussion,
Thank you very much,
Best wishes,
Dariusz Prokopowicz
The above text is entirely my own work written by me on the basis of my research.
In writing this text I did not use other sources or automatic text generation systems.
Copyright by Dariusz Prokopowicz
Relevant answer
Answer
Big data can be used to improve the energy system and ensure energy security by securing operations through load forecasting, fault detection and diagnosis, and voltage dip estimations.
Regards,
Shafagat
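As a minimal illustration of one of the uses named above (fault detection / voltage-dip flagging), here is an R sketch on a simulated signal; the 230 V level, the window length, and the 10 V threshold are arbitrary assumptions for illustration only:
set.seed(1)
v <- 230 + rnorm(1000, sd = 1)            # nominal 230 V signal with measurement noise
v[c(300, 301, 700)] <- c(200, 205, 190)   # injected voltage dips
roll_med <- stats::runmed(v, k = 51)      # robust rolling baseline
dev      <- abs(v - roll_med)             # deviation from the local baseline
flag     <- dev > 10                      # flag deviations larger than 10 V
which(flag)                               # indices of suspected dips/faults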
  • asked a question related to Prediction
Question
6 answers
Is it possible to build a highly effective forecasting system for future financial and economic crises based on artificial intelligence technology in combination with Data Science analytics, Big Data Analytics, Business Intelligence and/or other Industry 4.0 technologies?
Is it possible to build a highly effective, multi-faceted, intelligent forecasting system for future financial and economic crises based on artificial intelligence technology in combination with Data Science analytics, Big Data Analytics, Business Intelligence and/or other Industry 4.0 technologies as part of a forecasting system for complex, multi-faceted economic processes in such a way as to reduce the scale of the impact of the paradox of a self-fulfilling prediction and to increase the scale of the paradox of not allowing a predicted crisis to occur due to pre-emptive anti-crisis measures applied?
What do you think about the involvement of artificial intelligence in combination with Data Science, Big Data Analytics, Business Intelligence and/or other Industry 4.0 technologies for the development of sophisticated, complex predictive models for estimating current and forward-looking levels of systemic financial, economic risks, debt of the state's public finance system, systemic credit risks of commercially operating financial institutions and economic entities, forecasting trends in economic developments and predicting future financial and economic crises?
Research and development work is already underway to teach artificial intelligence to 'think', i.e. to approximate the conscious thought process realised in the human brain. Awareness of one's own existence, the ability to think abstractly and critically, and the ability to separate knowledge acquired in the learning process from its processing in abstract, conscious thought are still abilities attributed exclusively to humans. However, as part of technological progress and improvements in artificial intelligence, attempts are being made to create "thinking" computers or androids, and in the future there may be attempts to create an artificial consciousness, a digital creation that functions in a way similar to human consciousness.
At the same time, as next generations of artificial intelligence are developed and taught to perform work requiring creativity, systems are being built to process the ever-increasing amount of data and information stored on Big Data Analytics platform servers and taken, for example, from selected websites. In this way, it may become possible to create "thinking" computers which, with online access to the Internet and real-time processing of the data they download, develop predictive models and specific forecasts of future processes and phenomena, based on models composed of algorithms obtained from earlier machine learning.
When such technological solutions become possible, the question arises of how to take into account, in these intelligent, multi-faceted forecasting models, the well-known paradoxes concerning forecasted phenomena that are to appear only in the future and are not certain to appear at all. Two particular paradoxes stand out: the paradox of the self-fulfilling prophecy, and the paradox of not allowing a predicted crisis to occur because pre-emptive anti-crisis measures were applied. If both were taken into account in the forecasting models being built, their effects could be correlated asymmetrically and inversely proportionally.
In view of the above, once artificial intelligence has been suitably improved, taught to "think" and to process huge amounts of data and information in real time in a multi-criteria, creative manner, it may be possible to build a highly effective, multi-faceted, intelligent system for forecasting future financial and economic crises, a system for forecasting complex, multi-faceted economic processes designed to reduce the impact of the self-fulfilling-prophecy paradox and to increase the scale of the paradox of a predicted crisis being averted by pre-emptive anti-crisis measures. Multi-criteria processing of large data sets, conducted with artificial intelligence, Data Science, Big Data Analytics, Business Intelligence and/or other Industry 4.0 technologies that allow increasingly automated operation on large sets of data and information, would then increase the possibility of developing advanced, complex forecasting models for estimating current and future levels of systemic financial and economic risks, the indebtedness of the state's public finance system, and the systemic credit risks of commercially operating financial institutions and economic entities, for forecasting economic trends, and for predicting future financial and economic crises.
In view of the above, I address the following questions to the esteemed community of scientists and researchers:
Is it possible to build a highly effective, multi-faceted, intelligent forecasting system for future financial and economic crises based on artificial intelligence technology in combination with Data Science, Big Data Analytics, Business Intelligence and/or other Industry 4.0 technologies in a forecasting system for complex, multi-faceted economic processes in such a way as to reduce the scale of the impact of the paradox of the self-fulfilling prophecy and to increase the scale of the paradox of not allowing a forecasted crisis to occur due to pre-emptive anti-crisis measures applied?
What do you think about the involvement of artificial intelligence in combination with Data Science, Big Data Analytics, Business Intelligence and/or other Industry 4.0 technologies to develop advanced, complex predictive models for estimating current and forward-looking levels of systemic financial risks, economic risks, debt of the state's public finance system, systemic credit risks of commercially operating financial institutions and economic entities, forecasting trends in economic developments and predicting future financial and economic crises?
What do you think about this topic?
What is your opinion on this subject?
Please respond,
I invite you all to discuss,
Thank you very much,
Warm regards,
Dariusz Prokopowicz
Relevant answer
Answer
In my opinion, in order to settle the question of whether a highly effective forecasting system for future financial and economic crises can be built on artificial intelligence combined with Data Science analytics, Big Data Analytics, Business Intelligence and/or other Industry 4.0/5.0 technologies, it is first necessary to define precisely what is to be forecast: the specific risk factors that in the past were the sources of particular economic, financial and other crises and that may be such sources again in the future. But would such a forecasting system, built on a combination of Big Data Analytics and artificial intelligence, be able to forecast unusual events that generate new types of risk, the so-called "black swans"? Could it, for example, foresee another hard-to-predict, unusual event leading to something similar to the 2008 global financial crisis or the 2020 pandemic, or to something completely new that has not yet appeared?
What is your opinion on this issue?
Please answer,
I invite everyone to join the discussion,
Thank you very much,
Warm regards,
Dariusz Prokopowicz
  • asked a question related to Prediction
Question
3 answers
Hello everyone,
I have a question regarding the use of the ANN tool in Matlab. I'm wondering if there is a specific model or final predictive equation that is generated when utilizing this tool. Any insights would be greatly appreciated. Thank you.
Relevant answer
Answer
When using the ANN tool in MATLAB, a specific model or final predictive equation isn't generated. The trained neural network operates as a complex, non-linear black-box model, making predictions based on learned connections and parameters. It doesn't have a simple, explicit equation like in traditional statistical models. Instead, it captures intricate patterns in the data to provide accurate predictions for new inputs.
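To see the same point in code, here is a minimal R sketch (using the neuralnet package rather than MATLAB, purely for illustration): the trained weights fully determine an explicit, if unwieldy, prediction formula of nested weighted sums and activations, which can be reproduced by hand.
library(neuralnet)
set.seed(1)
df <- data.frame(x1 = runif(100), x2 = runif(100))
df$y <- 0.3 * df$x1 + 0.7 * df$x2 + rnorm(100, sd = 0.01)
nn <- neuralnet(y ~ x1 + x2, data = df, hidden = 2, linear.output = TRUE)
W <- nn$weights[[1]]                     # the "model" is nothing more than these weight matrices
sigmoid <- function(z) 1 / (1 + exp(-z))
x_new  <- c(1, 0.5, 0.2)                 # 1 = bias term, then x1 = 0.5, x2 = 0.2
hidden <- sigmoid(x_new %*% W[[1]])      # hidden-layer activations
y_hat  <- c(1, hidden) %*% W[[2]]        # linear output layer
y_hat
compute(nn, data.frame(x1 = 0.5, x2 = 0.2))$net.result   # matches the hand calculation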
  • asked a question related to Prediction
Question
4 answers
Performance prediction is required to optimally deploy workloads and inputs to a particular machine/accelerator in computing systems. Different predictors (e.g. AI predictors) come with different trade-offs, such as complexity, accuracy, and overheads. Which ones are the best?
Relevant answer
Answer
Performance predictors serve the captivating role of crystal balls in the realm of human endeavors, aiming to unlock the mysteries of future accomplishments. Their purpose is to peer into the enigmatic fog of uncertainty and offer glimpses of potential outcomes, providing guidance and informed decision-making. These predictors, resembling intrepid explorers of probability, draw upon a plethora of data, statistical models, and machine learning algorithms, all in pursuit of unveiling the secrets of success. While the notion of "best" remains elusive due to the ever-evolving nature of predictive analytics, the most esteemed predictors harmonize precision, versatility, and adaptability. These superlative predictors, analogous to virtuoso symphonies of foresight, dance in tandem with the idiosyncrasies of the domain, capturing intricate patterns, subtle nuances, and contextual dynamics to bestow valuable insights and empower us with the ability to chart courses towards triumph.
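To ground the trade-offs named in the question (complexity, accuracy, overheads) in something concrete, here is a minimal R sketch on synthetic workload data; randomForest is used only as an example of a more complex predictor, not a recommendation:
set.seed(1)
n <- 2000
d <- data.frame(cores = sample(1:64, n, TRUE),
                mem_gb = runif(n, 1, 256),
                input_size = rlnorm(n, 8, 1))
d$runtime <- 50 + 0.8 * d$input_size / d$cores + 0.02 * d$mem_gb + rnorm(n, sd = 5)
idx   <- sample(n, 0.8 * n)
train <- d[idx, ]; test <- d[-idx, ]
library(randomForest)
t_lm <- system.time(m_lm <- lm(runtime ~ ., data = train))["elapsed"]
t_rf <- system.time(m_rf <- randomForest(runtime ~ ., data = train, ntree = 300))["elapsed"]
rmse <- function(m) sqrt(mean((predict(m, test) - test$runtime)^2))
# accuracy versus fitting overhead: the more complex model may win on RMSE but costs more time
data.frame(model = c("lm", "randomForest"),
           rmse = c(rmse(m_lm), rmse(m_rf)),
           fit_seconds = c(t_lm, t_rf))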
  • asked a question related to Prediction
Question
6 answers
In 2007 I did an Internet search for others using cutoff sampling, and found a number of examples, noted at the first link below. However, it was not clear that many used regressor data to estimate model-based variance. Even if a cutoff sample has nearly complete 'coverage' for a given attribute, it is best to estimate the remainder and have some measure of accuracy. Coverage could change. (Some definitions are found at the second link.)
Please provide any examples of work in this area that may be of interest to researchers. 
Relevant answer
Answer
I would like to restart this question.
I have noted a few papers on cutoff or quasi-cutoff sampling other than the many I have written, but in general, I do not think those others have had much application. Further, it may be common to ignore the part of the finite population which is not covered, and to only consider the coverage, but I do not see that as satisfactory, so I would like to concentrate on those doing inference. I found one such paper by Guadarrama, Molina, and Tillé which I will mention later below.
Following is a tutorial I wrote on quasi-cutoff (multiple-item survey) sampling with ratio modeling for inference, which can be highly useful for repeated official establishment surveys:
"Application of Efficient Sampling with Prediction for Skewed Data," JSM 2022: 
This is what I did for the US Energy Information Administration (EIA) where I led application of this methodology to various establishment surveys which still produce perhaps tens of thousands of aggregate inferences or more each year from monthly and/or weekly quasi-cutoff sample surveys. This also helped in data editing where data collected in the wrong units or provided to the EIA from the wrong files often showed early in the data processing. Various members of the energy data user community have eagerly consumed this information and analyzed it for many years. (You might find the addenda nonfiction short stories to be amusing.)
There is a section in the above paper on an article by Guadarrama, Molina, and Tillé(2020) in Survey Methodology, "Small area estimation methods under cut-off sampling," which might be of interest, where they found that regression modeling appears to perform better than calibration, looking at small domains, for cutoff sampling. Their article, which I recommend in general, is referenced and linked in my paper.
There are researchers looking into inference from nonprobability sampling cases which are not as well-behaved as what I did for the EIA, where multiple covariates may be needed for pseudo-weights, or for modeling, or both. (See Valliant, R.(2019)*.) But when many covariates are needed for modeling, I think the chances of a good result are greatly diminished. (For multiple regression, from an article I wrote, one might not see heteroscedasticity that should theoretically appear, which I attribute to the difficulty in forming a good predicted-y 'formula'. For pseudo-inclusion probabilities, if many covariates are needed, I suspect it may be hard to do this well either, but perhaps that may be more hopeful. However, in Brewer, K.R.W.(2013)**, he noted an early case where failure using what appears to be an early version of that helped convince people that probability sampling was a must.)
At any rate, there is research on inference from nonprobability sampling which would generally be far less accurate than what I led development for at the EIA.
So, the US Energy Information Administration makes a great deal of use of quasi-cutoff sampling with prediction, and I believe other agencies could make good use of this too, but in all my many years of experience and study/exploration, I have not seen much evidence of such applications elsewhere. If you do, please respond to this discussion.
Thank you - Jim Knaub
..........
*Valliant, R.(2019), "Comparing Alternatives for Estimation from Nonprobability Samples," Journal of Survey Statistics and Methodology, Volume 8, Issue 2, April 2020, Pages 231–263, preprint at 
**Brewer, K.R.W.(2013), "Three controversies in the history of survey sampling," Survey Methodology, Dec 2013 -  Ken Brewer - Waksberg Award article: 
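To make the ratio-model prediction concrete for readers new to it, here is a minimal R sketch (simulated establishment data, not EIA data) of the model-based classical ratio estimator applied to a cutoff sample, with a prediction variance under the working model y = beta*x + e, Var(e) proportional to x:
set.seed(42)
N <- 200
x <- rlnorm(N, meanlog = 3, sdlog = 1)            # predictor known for all units (e.g., a previous census)
y <- 1.8 * x + rnorm(N, sd = 2 * sqrt(x))         # current values, heteroscedastic errors
s  <- x >= quantile(x, 0.70)                      # cutoff sample: roughly the largest 30% of units
b  <- sum(y[s]) / sum(x[s])                       # classical ratio estimator slope
T_hat <- sum(y[s]) + b * sum(x[!s])               # observed part plus predicted remainder
sigma2 <- sum((y[s] - b * x[s])^2 / x[s]) / (sum(s) - 1)
V_hat  <- sigma2 * (sum(x[!s])^2 / sum(x[s]) + sum(x[!s]))   # prediction variance of T_hat - T under the working model
c(T_hat = T_hat, true_total = sum(y), rse_pct = 100 * sqrt(V_hat) / T_hat)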
  • asked a question related to Prediction
Question
8 answers
Can artificial intelligence already predict our consumer behaviour and in a short while will it be able to predict which shop we will go to and what we will buy tomorrow?
With the help of artificial intelligence, how can systems for monitoring citizens' consumer behaviour based on GPS geolocalisation and information contained in smartphones be improved?
The lockdowns and national quarantines introduced during the coronavirus (Covid-19) pandemic caused a strong decline in the sales and turnover generated in traditionally, physically operating shops and service establishments. The lockdowns imposed on selected service industries and on traditionally operated trade also accelerated e-commerce, i.e. the sale of products and services via the Internet. When the coronavirus pandemic was no longer interpreted in terms of high health and economic risk, a significant proportion of traditionally operated shops and physical service establishments returned to traditional business, customer service and sales. At the same time, new ICT and Industry 4.0 solutions, including artificial intelligence technologies, are being implemented in the information systems that support the sale of product and service offerings, including tools for activating potential consumers, getting customers interested in new offerings, and encouraging customers to visit stationary shops and service establishments. In this area, startups have developed rapidly over the past few years which, using anonymous mobile user identifiers and the precise location and usage data available from various applications installed on smartphones, can determine exactly where a smartphone user is at any given moment and diagnose whether he or she happens to be making a purchase in a specific stationary shop, or walking past an establishment providing specific services and perhaps considering using them. When a technology start-up holds data on a specific Internet user drawn from a number of different applications and, on the basis of this data collected on Big Data Analytics platforms, has built an information-rich profile of the interests and purchasing preferences of a kind of digital avatar corresponding to that user, then, combined with analysis of current customer behaviour and GPS-based geolocation, it can make real-time predictions about the subsequent behaviour and/or purchasing decisions of individual potential customers of specific product or service offerings. Some technology start-ups conducting this kind of analytics, based on large sets of customer data, geolocation, the use of specific apps and social media on the smartphone, and knowledge of the psychology of consumer behaviour, are able first to locate consumers precisely in real time relative to specific shops and service establishments, and then to display, on advertising banners appearing in specific smartphone applications, information about the current offer, including a price or other promotion for a specific product available in the shop where the Internet user and potential customer is currently located.
Thanks to this type of technological solutions, more and more often an Internet user available on a smartphone in a situation when he/she is in the vicinity, next to specific stands, shop shelves, specific shops in shopping centres, and is thinking about buying a specific product, then at that moment he/she receives information on the smartphone, an advertisement appears with information on a price or other promotion concerning that particular product or a similar, highly substitutable product. At the aforementioned point in time when the customer is in a specific shop or part of a shop, online advertisements are displayed on his or her smartphone, e.g. on social media, the Google ecosystem, third-party web browsers or other applications that the potential customer has installed on his or her smartphone.
When such technological solutions are complemented by artificial intelligence analysing the consumer behaviour of individual customers of different product and service offers, it is possible to create intelligent analytical systems capable of predicting who will visit a specific shop, when they will do so and what they plan to buy in that shop. Statistically, a citizen has several applications installed in his or her smartphone, which provide the technology-based analytical companies with data about their current location. Therefore, thanks to the use of artificial intelligence, it may not be long before Internet users receive messages, see online advertisements displayed on their smartphones showing the products and services they are about to buy or think about tomorrow. Perhaps the artificial intelligence involved in this kind of analytics is already capable of predicting our consumer behaviour in real time and will soon be able to predict which shop we will go to and what we will buy tomorrow.
In view of the above, I would like to address the following question to the esteemed community of scientists and researchers:
With the help of artificial intelligence, how can monitoring systems for citizens' consumer behaviour based on GPS geolocation and information contained in smartphones be improved?
Can artificial intelligence already predict our consumer behaviour and in a few moments will it be able to predict which shop we will go to and what we will buy tomorrow?
Can artificial intelligence already predict our consumer behaviour?
What do you think about this topic?
What is your opinion on this subject?
Please answer,
I invite you all to discuss,
The above text is entirely my own work written by me on the basis of my research.
I have not used other sources or automatic text generation systems such as ChatGPT in writing this text.
Copyright by Dariusz Prokopowicz
Thank you very much,
Best regards,
Dariusz Prokopowicz
  • asked a question related to Prediction
Question
3 answers
A number of people have asked on ResearchGate about acceptable response rates and others have asked about using nonprobability sampling, perhaps without knowing that these issues are highly related.  Some ask how many more observations should be requested over the sample size they think they need, implicitly assuming that every observation is at random, with no selection bias, one case easily substituting for another.   
This is also related to two different ways of 'approaching' inference: (1) the probability-of-selection-based/design-based approach, and (2) the model-based/prediction-based approach, where "prediction" means estimation for a random variable, not forecasting. 
Many may not have heard much about the model-based approach.  For that, I suggest the following reference:
Royall(1992), "The model based (prediction) approach to finite population sampling theory." (A reference list is found below, at the end.) 
Most people may have heard of random sampling, and especially simple random sampling where selection probabilities are all the same, but many may not be familiar with the fact that all estimation and accuracy assessments would then be based on the probabilities of selection being known and consistently applied.  You can't take just any sample and treat it as if it were a probability sample.  Nonresponse is therefore more than a problem of replacing missing data with some other data without attention to "representativeness."  Missing data may be replaced by imputation, or by weighting or reweighting the sample data to completely account for the population, but results may be degraded too much if this is not applied with caution.  Imputation may be accomplished various ways, such as trying to match characteristics of importance between the nonrespondent and a new respondent (a method which I believe has been used by the US Bureau of the Census), or, my favorite, by regression, a method that easily lends itself to variance estimation, though variance in probability sampling is technically different.  Weighting can be adjusted by grouping or regrouping members of the population, or just recalculation with a changed number, but grouping needs to be done carefully. 
Recently work has been done which uses covariates for either modeling or for forming pseudo-weights for quasi-random sampling, to deal with nonprobability sampling.  For reference, see Elliott and Valliant(2017), "Inference for Nonprobability Samples," and Valliant(2019), "Comparing Alternatives for Estimation from Nonprobability Samples."  
Thus, methods used for handling nonresponse, and methods used to deal with nonprobability samples are basically the same.  Missing data are either imputed, possibly using regression, which is basically also the model-based approach to sampling, working to use an appropriate model for each situation, with TSE (total survey error) in mind, or weighting is done, which attempts to cover the population with appropriate representation, which is mostly a design-based approach. 
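To make the regression-imputation route concrete, here is a minimal R sketch with simulated data (not a production survey workflow): respondents' y is modelled on a covariate known for the whole frame, and nonrespondents' values are filled in from the model before estimating the mean.
set.seed(7)
N <- 500
x <- rgamma(N, shape = 2, rate = 0.1)             # covariate known for the whole frame
y <- 10 + 3 * x + rnorm(N, sd = 8)                # survey variable of interest
resp <- rbinom(N, 1, plogis(-1 + 0.02 * x)) == 1  # response propensity related to x -> selection bias
fit   <- lm(y ~ x, data = data.frame(x, y)[resp, ])              # model fitted on respondents only
y_imp <- ifelse(resp, y, predict(fit, newdata = data.frame(x = x)))  # impute nonrespondents
c(naive_mean_respondents = mean(y[resp]),         # biased because response depends on x
  regression_imputed_mean = mean(y_imp),
  true_mean = mean(y))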
If I am using it properly, the proverb "Everything old is new again," seems to fit here if you note that in Brewer(2014), "Three controversies in the history of survey sampling," Ken Brewer showed that we have been all these routes before, leading him to have believed in a combined approach.  If Ken were alive and active today, I suspect that he might see things going a little differently than he may have hoped in that the probability-of-selection-based aspect is not maintaining as much traction as I think he would have liked.  This, even though he first introduced 'modern' survey statistics to the model-based approach in a paper in 1963.  Today it appears that there are many cases where probability sampling may not be practical/feasible.  On the bright side, I have to say that I do not find it a particularly strong argument that your sample would give you the 'right' answer if you did it infinitely many times when you are doing it once, assuming no measurement error of any kind, and no bias of any kind, so relative standard error estimates there are of great interest, just as relative standard error estimates are important when using a prediction-based approach, and the estimated variance is the estimated variance of the prediction error associated with a predicted total, with model misspecification as a concern.  In a probability sample, if you miss an important stratum of the population when doing say a simple random sample because you don't know the population well, you could greatly over- or underestimate a mean or total.  If you have predictor data on the population, you will know the population better.  (Thus, some combine the two approaches: see Brewer(2002) and Särndal, Swensson, and Wretman(1992).) 
..........         
So, does anyone have other thoughts on this and/or examples to share for this discussion: Comparison of Nonresponse in Probability Sampling with Nonprobability Sampling?    
..........         
Thank you.
References:
Brewer, K.R.W.(2002), Combined Survey Sampling Inference: Weighing Basu's Elephants, Arnold: London and Oxford University Press
Brewer, K.R.W.(2014), "Three controversies in the history of survey sampling," Survey Methodology, Dec 2013 -  Ken Brewer -   Waksberg Award: 
Elliott, M.R., and Valliant, R.(2017), "Inference for Nonprobability Samples," Statistical Science, 32(2):249-264,
Royall, R.M.(1992), "The model based (prediction) approach to finite population sampling theory," Institute of Mathematical Statistics Lecture Notes - Monograph Series, Volume 17, pp. 225-240.   Information is found at
The paper is available under Project Euclid, open access: 
Särndal, C.-E., Swensson, B., and Wretman, J.(1992), Model Assisted Survey Sampling, Springer-Verlang
Valliant, R.(2019), "Comparing Alternatives for Estimation from Nonprobability Samples," Journal of Survey Statistics and Methodology, Volume 8, Issue 2, April 2020, Pages 231–263, preprint at 
Relevant answer
Answer
This is a very interesting perspective, James R Knaub, and one that you could well share on Frank Harrell's Datamethods discussion forum: https://discourse.datamethods.org
Other than that, I'm going to have a look at those references over a largeish pot of coffee before I say anything stupid (stupid plus references allows you to cover your retreat better!)
r
  • asked a question related to Prediction
Question
2 answers
I am trying to use machine learning algorithms to predict whether a pipe has broken or not and I also want to predict the time to failure of a particular pipe. So, I need a dataset that contains the pipe installation year, the date of recorded failure for failed pipes and also some other parameters such as pipe length, operating pressure, type of material and pipe diameter among others.
  • asked a question related to Prediction
Question
20 answers
In my country, more than a dozen years ago, there were real winters with snow and frost after the autumn, whereas during the last few years winter has looked like autumn, with no snow and above-zero temperatures. I think that the greenhouse effect, i.e. the warming of the Earth's climate, has already begun. This is also confirmed by the numerous climatic cataclysms and weather anomalies which, in the current year 2018, have appeared in many places on the Earth. In some parts of the Earth huge forest areas are on fire, such as in Scandinavia, California in the USA, Australia, the Iberian Peninsula, Africa, etc. In addition, there have been weather anomalies, e.g. snow and floods in October and November in the south of Europe.
There have also been tornadoes in many places on Earth, and so on.
Perhaps these problems will get worse. It is necessary to improve security systems and anti-crisis services, and to improve the prediction of these anomalies and climatic cataclysms, so that people can shelter from or cope with an imminent cataclysm. One of the technologies that can help in more precise forecasting of these cataclysms is the processing of large collections of historical and current information on this subject, using cloud computing technology and Big Data database systems.
Therefore, I am asking you: Will new data processing technologies in Big Data database systems allow for accurate prediction of climate disasters?
Please answer and comment. I invite you to join the discussion.
Relevant answer
Answer
Despite a small amount of uncertainty, scientists find climate models of the 21st century to be pretty accurate because they are based on well-founded physical principles of earth system processes. This basis solidifies the confidence of the scientific community that human emissions are changing the climate, which will impact the entire planet.
  • asked a question related to Prediction
Question
4 answers
At the US Energy Information Administration (EIA), for various establishment surveys, Official Statistics have been generated using model-based ratio estimation, particularly the model-based classical ratio estimator.  Other uses of ratios have been considered at the EIA and elsewhere as well.  Please see
At the bottom of page 19 there it says "... on page 104 of Brewer(2002) [Ken Brewer's book on combining design-based and model-based inferences, published under Arnold], he states that 'The classical ratio estimator … is a very simple case of a cosmetically calibrated estimator.'" 
Here I would like to hear of any and all uses made of design-based or model-based ratio or regression estimation, including calibration, for any sample surveys, but especially establishment surveys used for official statistics. 
Examples of the use of design-based methods, model-based methods, and model-assisted design-based methods are all invited. (How much actual use is the GREG getting, for example?)  This is just to see what applications are being made.  It may be a good repository of such information for future reference.
Thank you.  -  Cheers. 
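For readers less familiar with the terminology in this question, here is a minimal R sketch (simulated data, using the survey package) of GREG-type estimation via calibration of design weights to a known population total; this is an illustration only, not an official-statistics application:
library(survey)
set.seed(1)
N <- 1000
x <- rgamma(N, shape = 2, rate = 0.1)             # auxiliary variable known for the population
y <- 3 * x + rnorm(N, sd = 5)                     # survey variable
pop <- data.frame(x, y)
n  <- 100
samp <- pop[sample(N, n), ]
samp$pw <- N / n                                  # equal design weights (SRS)
des <- svydesign(ids = ~1, weights = ~pw, data = samp)
svytotal(~y, des)                                 # Horvitz-Thompson estimate of the total of y
cal <- calibrate(des, ~x, c(N, sum(x)))           # calibrate weights to the known totals of 1 and x
svytotal(~y, cal)                                 # GREG-type (calibrated) estimate, usually less variable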
Relevant answer
Answer
In Canada they have a Monthly Miller’s Survey, and an Annual Miller’s Survey.  This would be a potential application, if used as I describe in a paper linked below. As in the case of a survey at the US Energy Information Administration for electric generation, fuel consumption and stocks for electric power plants, they collect data from the largest establishments monthly, and from the smallest ones just annually.  After the end of the year, for a given data item, say volume milled for a type of wheat, they could add the twelve monthly values for each given establishment, and with the annual data collected, there is then an annual census.  To predict totals each month, the previous annual census could be used for predictor data, and the new monthly data would be used for quasi-cutoff sample data, for each data item, and with a ratio model, one may predict totals each month for each data item, along with estimated relative standard errors.  Various techniques might apply, such as borrowing strength for small area predictions, adjustment of the coefficient of heteroscedasticity, and multiple regression when production shifts from say, one type of grain to another, as noted in the paper. 
Here are the mill surveys: 
Canadian Mill surveys: 
Monthly Miller’s Survey: 
Annual Miller’s Survey: 
This survey information is found on page 25 of the paper below, as of this date.  There will likely be some revisions to this paper.  This was presented as a poster paper at the 2022 Joint Statistical Meetings (JSM), on August 10, 2022, in Washington DC, USA.  Below are the poster and paper URLs. 
Poster:
The paper is found at
.........................     
If you can think of any other applications, or potential applications, please respond. 
Thank you. 
  • asked a question related to Prediction
Question
3 answers
I am working on landslide hazard and risk zonation. I trained models on several landslide conditioning factors in Python/R and SPSS, and I have calculated the ROC/AUC and confusion matrix of the model. I would like to know how I can generate the final landslide prediction maps from those trained and evaluated machine learning (ML) models.
Relevant answer
Answer
Sounds like you have trained multiple ML models and obtained multiple prediction results. You can get predictions from each and pool them together, combining them using blending/voting/stacking methods.
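As a concrete follow-up on producing the map itself, here is a minimal R sketch (terra and randomForest are choices made only for illustration; the rasters and training table are simulated stand-ins for your conditioning factors and landslide inventory) of turning one trained model into a susceptibility map by predicting over the factor rasters; the same pattern applies to an ensemble once its predictions are pooled:
library(terra)
library(randomForest)
set.seed(1)
# stand-in conditioning-factor rasters (in practice: your slope, rainfall, etc. layers)
covs <- rast(nrows = 100, ncols = 100, nlyrs = 3)
values(covs) <- runif(ncell(covs) * nlyr(covs))
names(covs) <- c("slope", "rainfall", "dist_road")
# stand-in training table; in practice this comes from your landslide inventory points
train <- data.frame(slope = runif(200), rainfall = runif(200), dist_road = runif(200))
train$landslide <- factor(rbinom(200, 1, plogis(3 * train$slope - 2)))
fit <- randomForest(landslide ~ slope + rainfall + dist_road, data = train, ntree = 500)
# predict class probabilities for every cell; raster layer names must match the training columns
prob    <- predict(covs, fit, type = "prob")
suscept <- prob[[2]]                       # probability of the landslide (= 1) class
writeRaster(suscept, "landslide_susceptibility.tif", overwrite = TRUE)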
  • asked a question related to Prediction
Question
4 answers
I use a conditional logit model with income, leisure time and interaction terms of the two variables with other variables (describing individual's characteristics) as independent variables.
After running the regression, I use the predict command to obtain probabilities for each individual and category. These probabilities are then multiplied with the median working hours of the respective categories to compute expected working hours.
The next step is to increase wage by 1%, which increases the variable income by 1% and thus also affects all interaction terms which include the variable income.
After running the modified regression, again I use the predict command and should obtain slightly different probabilities. My problem is now that the probabilities are exactly the same, so that there would be no change in expected working hours, which indicates that something went wrong.
On the attached images with extracts of the two regression outputs, one can see that the regression coefficients of the affected variables are indeed very, very similar, and that both the value of the R² and the values of the log likelihood iterations are exactly the same. To my mind these observations should explain why the probabilities are very similar, but I am wondering why they are exactly the same and what I possibly did wrong. I am replicating a paper where they did the same and where they were able to compute different expected working hours for the different scenarios.
Relevant answer
Answer
Either something went wrong, or you effectively performed the same test. Did you use the same version of the software as the original study?
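One mechanical explanation worth checking, illustrated with a plain binomial logit in R (the same algebra carries over to the conditional logit): if the model is re-estimated after rescaling income, the income coefficient simply absorbs the scaling and the fitted probabilities, likelihood and R² are unchanged; the counterfactual requires predicting from the original estimates with the modified data.
set.seed(1)
n <- 500
income  <- rlnorm(n, 10, 0.5)
leisure <- runif(n, 20, 80)
y <- rbinom(n, 1, plogis(-2 + 0.0001 * income + 0.01 * leisure))
d <- data.frame(y, income, leisure)
fit0 <- glm(y ~ income + leisure, family = binomial, data = d)
# re-estimating after a 1% income increase: the coefficient rescales, the fit is identical
d1   <- transform(d, income = income * 1.01)
fit1 <- glm(y ~ income + leisure, family = binomial, data = d1)
max(abs(fitted(fit0) - fitted(fit1)))          # ~0: same probabilities, same log likelihood
# counterfactual: keep the original estimates, change only the data
p_cf <- predict(fit0, newdata = d1, type = "response")
summary(p_cf - fitted(fit0))                   # now the predicted probabilities shift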
  • asked a question related to Prediction
Question
5 answers
I am just starting to try out the Google Colab version of AlphaFold2 for protein 3D structure prediction via this link:
Pretty much a newbie, so still trying to figure out how best to interpret the results and put them into proper words for a report/presentation. Also, is there a way to download the predicted 3D structure that is displayed?
Thanks in advance.
  • asked a question related to Prediction
Question
1 answer
Like protein metal predictor or simulation programs
Relevant answer
Are you looking for software that predicts the binding mode/site between metals and proteins? If that's what you mean, molecular docking is what you are looking for. Here is an article that can help you get started. Hope this helps.
  • asked a question related to Prediction
Question
6 answers
I have recently been working on using machine learning for yield prediction; however, while exploring which inputs would be better at predicting yield, I was puzzled to find only three papers that use historical yields as an input to predict yields for the new year. From the test results, this does improve prediction accuracy substantially. But does this count as data leakage? If not, what is the rationale for doing so, and what are the limitations? (It seems that the three papers are from the same team.)
Relevant answer
Answer
Pinery Lee, machine learning algorithms' predictions will assist farmers in deciding which crop to cultivate in order to maximize production, by taking into account aspects such as temperature, rainfall, acreage, and so on.
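On the data-leakage concern in the question, here is a minimal R sketch (simulated data) of a leakage-safe setup: the historical-yield feature is a lag built only from earlier years, and the test set is strictly later in time than the training set.
set.seed(1)
d <- expand.grid(field = 1:50, year = 2010:2023)
d <- d[order(d$field, d$year), ]
d$rain  <- rnorm(nrow(d), 500, 80)
d$temp  <- rnorm(nrow(d), 20, 2)
d$yield <- 2 + 0.004 * d$rain + 0.1 * d$temp + rnorm(nrow(d), sd = 0.3)
# last year's observed yield as a predictor, lagged within each field (no future information)
d$yield_lag1 <- ave(d$yield, d$field, FUN = function(y) c(NA, head(y, -1)))
d <- na.omit(d)
train <- subset(d, year <= 2021)   # fit only on past years
test  <- subset(d, year >= 2022)   # evaluate on unseen, later years
fit  <- lm(yield ~ rain + temp + yield_lag1, data = train)
rmse <- sqrt(mean((predict(fit, test) - test$yield)^2))
rmse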
  • asked a question related to Prediction
Question
8 answers
I am trying to tweak my machine learning model optimizer, and I would love to test it in the healthcare domain, especially for rare illnesses.
Thus, does anyone know of any de-identified electronic health records for epilepsy, Parkinson's, or other rare-disease patients (perhaps those who are treated with warfarin)?
Please guide me on how to get these datasets.
I have already spoken with many research authors, but have had no responses yet.
Relevant answer
Answer
That is quite painful.
The thing is, you understand their privacy concerns, and you have your procedures with which you can assure your compliance.
But it is still not easy to get those patient records.
I wrote this post after searching all the suggested resources, though I really thank you for your care in responding.
  • asked a question related to Prediction
Question
2 answers
Hi, I want to predict the post-translational modification phosphorylation. I found lots of websites like Phosida and PhosphoSitePlus. I am just curious whether there is any Python code for phosphorylation prediction. If you have one, could you share the GitHub link?
Relevant answer
Answer
Shaban Ahmad thank you
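If a Python starting point (rather than a web server) is what you are after, a common pattern is to featurise a sequence window around each candidate S/T/Y residue and train a classifier on a labelled phosphosite set. A toy sketch; the window size, encoding and classifier are arbitrary choices, and you would still need labelled training data (e.g. exported from one of the databases mentioned above):
import numpy as np
from sklearn.ensemble import RandomForestClassifier

AA = "ACDEFGHIKLMNPQRSTVWY"

def site_windows(seq, half=7):
    """One-hot encode a +/-7 residue window around every S/T/Y (candidate phosphosite)."""
    feats, positions = [], []
    padded = "X" * half + seq + "X" * half
    for i, res in enumerate(seq):
        if res in "STY":
            window = padded[i:i + 2 * half + 1]
            onehot = [1 if window[j] == a else 0 for j in range(len(window)) for a in AA]
            feats.append(onehot)
            positions.append(i + 1)
    return np.array(feats), positions

X_new, pos = site_windows("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ")
print(pos, X_new.shape)
# with labelled windows X, y you would then fit e.g.
# clf = RandomForestClassifier().fit(X, y) and score X_new with clf.predict_proba(X_new)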
  • asked a question related to Prediction
Question
3 answers
I have predicted the solubility of a compound using a web server; the value was 0.00126 mol/L. I want to know whether this value means the compound is soluble in water, and whether it would be better to compare it with other compounds on the market.
Thanks
Relevant answer
Answer
We have to measure solubility under in-vitro conditions that mimic the physiological environment, since solubility is vital for bioavailability; plain water solubility is mainly useful for data generation.
I mean a mildly alkaline or acidic medium.
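For orientation, the predicted value can be converted to mg/mL (this needs the molecular weight, assumed below to be 400 g/mol) and to logS, the scale most solubility predictors report; both make comparison with other compounds easier:
import math

solubility_mol_per_l = 0.00126           # value from the web server
mol_weight = 400.0                        # g/mol -- assumed; replace with the compound's MW

mg_per_ml = solubility_mol_per_l * mol_weight     # mol/L * g/mol = g/L = mg/mL
log_s = math.log10(solubility_mol_per_l)          # logS

print(f"{mg_per_ml:.2f} mg/mL, logS = {log_s:.2f}")
# ~0.50 mg/mL for a 400 g/mol compound would fall in the USP "very slightly soluble"
# band, while logS of about -2.9 is often regarded as acceptable aqueous solubility
# on the logS scale (values below roughly -4 are usually flagged as poorly soluble).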
  • asked a question related to Prediction
Question
11 answers
I have been doing research on different issues in the Finance and Accounting discipline for about 5 years. It has become difficult for me to find topics that could lead to projects, a series of research articles, and working papers over the next 5-10 years. There are a few journals with up-to-date research articles in line with current and future research demand. I am therefore looking for journal(s) that can guide me in designing research projects that can contribute over the next 5-10 years.
Relevant answer
Answer
You don't need to look for any journals.
All you need to do is narrow your search to topics listed in "special issues" and "calls for papers". Top publishers (e.g. Elsevier, Wiley, T&F, Emerald) often advertise calls for papers and special issues of journals. The topics in these calls can give you a hint about current and future research trends. I think this is standard practice in academia.
I hope this advice helps.
  • asked a question related to Prediction
Question
1 answer
Dear collegues.
If anybody works with neural networks, I would like to ask you to check my loop for the test sample.
I have 4 sequences (monthly data, 22 values in each, with the goal of predicting prov), and I would like to construct a forecast for each next month using a training sample of 5 months.
That means I need to shift the window by one month each time, keeping 5 elements:
train<-1:5, train<-2:6, train<-3:7, ..., train<-17:21. So I should get 17 results in the output.
The loop and the full code (training on 5 months and forecasting the next one) are:
require(neuralnet)
prov <- c(25,22,47,70,59,49,29,40,49,2,6,50,84,33,25,67,89,3,4,7,8,2)
temp <- c(22,23,23,23,25,29,20,27,22,23,23,23,25,29,20,27,20,30,35,50,52,20)
soil <- c(676,589,536,499,429,368,370,387,400,423,676,589,536,499,429,368,370,387,400,423,600,605)
rain <- c(7,8,2,8,6,5,4,9,7,8,2,8,6,5,4,9,5,6,9,2,3,4)
mydata <- data.frame(prov, temp, soil, rain)
# min-max normalisation; keep the per-column ranges so forecasts can be de-normalised later
normalize <- function(x) (x - min(x)) / (max(x) - min(x))
minvec <- sapply(mydata, min)
maxvec <- sapply(mydata, max)
maxmindf <- as.data.frame(lapply(mydata, normalize))
window <- 5                              # training sample size (5 months)
d <- nrow(maxmindf)
results <- data.frame(month = (window + 1):d, actual = NA, prediction = NA)
for (i in 1:(d - window)) {              # i = 1 trains on months 1:5 and predicts month 6, etc.
  trainset <- maxmindf[i:(i + window - 1), ]
  testset  <- maxmindf[i + window, , drop = FALSE]
  nn <- neuralnet(prov ~ temp + soil + rain, data = trainset,
                  hidden = c(3, 2), linear.output = FALSE,
                  threshold = 0.01, stepmax = 1e6)
  nn.results <- compute(nn, testset[, c("temp", "soil", "rain")])
  results$actual[i]     <- testset$prov
  results$prediction[i] <- nn.results$net.result
}
# de-normalise the target column back to the original scale
denormalize <- function(x, minval, maxval) x * (maxval - minval) + minval
results$actual     <- denormalize(results$actual, minvec["prov"], maxvec["prov"])
results$prediction <- denormalize(results$prediction, minvec["prov"], maxvec["prov"])
results                                  # 17 rows: one forecast per month 6 to 22
Could you please check whether the indexing of trainset and testset inside the loop is correct, and whether this is the right way to display all the predictions with a shift of one month and a training sample of 5?
I am very grateful for your answers
  • asked a question related to Prediction
Question
25 answers
I would like to know whether there is a direct relationship between quantum computer technology and artificial intelligence. Can you provide your explanation with examples for more understanding?
Relevant answer
Answer
Yes, definitely.
Quantum computing and artificial intelligence are directly related to each other, much as quantum mechanics is related to physics.
The development of quantum computing would definitely help make machines more intelligent (artificial intelligence).
Quantum computing has many connotations, but one of them is computation so fast that it can hardly be measured in time.
Recognising a person among millions of people instantly, without conscious deliberation, is a kind of "quantum" intelligence in humans that could one day be implemented in machines.
  • asked a question related to Prediction
Question
4 answers
Dear collegues.
I have 400 monthly observations and I need to construct a forecast for each next month using a training sample of 50.
That means I need to shift the window by one month each time, keeping 50 elements:
train<-1:50, train<-2:51, train<-3:52, ..., train<-351:400.
Could you please tell me which function I can use for this automatic calculation?
Perhaps a for() loop?
I am very grateful for your answers
Relevant answer
Answer
embed(data, 50) - each row of the returned matrix is a window of 50 consecutive values (most recent first), so it builds all 351 training windows in one call; alternatively, a simple loop such as for (i in 1:351) train <- data[i:(i + 49)] does the same thing.
  • asked a question related to Prediction
Question
4 answers
I want to predict water in my project. I need to know which of them have more advantages.
Relevant answer
Answer
A Bayesian network is a graphical model: it consists of a collection of random variables represented as nodes in a directed graph, with the graph's edges representing the variables' interdependence.
In principle, a Dynamic Bayesian Network (DBN) functions like a Bayesian Network (BN): given a directed network (the structure), you can learn conditional probability tables (the parameters) from a dataset.
The primary distinction is that a DBN reflects a time-dependent phenomenon: whereas a conventional BN may have a node for variable "A" influencing variable "B", a DBN may have variable "A" at time t=1 influencing variable "A" at time t=2.
  • asked a question related to Prediction
Question
8 answers
There are different empirical equations and techniques, such as fuzzy logic, ANN, etc., for predicting blast-induced ground vibration. In addition to these, is there any software for predicting blast-induced ground vibration?
Relevant answer
Answer
  • asked a question related to Prediction
Question
5 answers
I have the following dataset:
SQ - SEX    - Weight - letter - duration - trail - quantity
1  - male   - 15KG   - abc    - Year 1   - 2     - quantity 1
   -        -        -        - Year 2   - 3     - quantity 2
2  - female - 17KG   - cde    - Year X   - 4     - quantityx
   -        - 16KG   -        - Year Y   - 6     - quantityy
   -        -        -        - Year Z   - 3     - quantityz
... etc.
I want to build a prediction model for quantity using classic machine learning models (not deep learning ones such as LSTM or RNN), e.g. linear regression or SVM, such that:
given individual n at a certain duration (duration A), what will the quantity be?
n - male - 25KG - xlm - 34 - A - ?
What is the best way to treat and pre-process the duration, trail and quantity features before fitting, so that their correlation with the target quantity is preserved?
Relevant answer
Answer
Aggregation with a rolling window may help you rearrange your column values accordingly.
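As a rough sketch of the kind of pre-processing involved, one option is to keep the data in long format (one row per individual-duration combination), map the duration labels to ordered numbers, one-hot encode the nominal columns and scale the numeric ones; the column names and toy values below are made up:
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.svm import SVR

# toy long-format data: one row per (individual, duration)
df = pd.DataFrame({
    "sex": ["male", "male", "female", "female", "female"],
    "weight": [15, 15, 17, 16, 16],
    "letter": ["abc", "abc", "cde", "cde", "cde"],
    "duration": [1, 2, 1, 2, 3],         # map 'Year 1', 'Year X', ... to ordered numbers first
    "trail": [2, 3, 4, 6, 3],
    "quantity": [10.0, 12.5, 7.0, 9.5, 8.0],
})

categorical = ["sex", "letter"]
numeric = ["weight", "duration", "trail"]

pre = ColumnTransformer([
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
    ("num", StandardScaler(), numeric),
])
model = Pipeline([("pre", pre), ("svr", SVR(C=10.0))])
model.fit(df[categorical + numeric], df["quantity"])
print(model.predict(df[categorical + numeric])[:3])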
  • asked a question related to Prediction
Question
9 answers
I am trying to predict peak demand using machine learning techniques. Current articles treat this as a time-series prediction problem and use a 7-day lag to predict peak demand. The ML model I am trying to apply uses new features for this prediction, and I applied it without the previous week's lag values. I was challenged as to why I did not use lag values, as is usual for time-series prediction.
The objective of my project was to evaluate whether adding new features would improve the daily peak demand prediction and assess the effects of the new features. If I use new features to predict daily demand, should I also consider the previous seven days' lags as a new feature? Is it correct to combine several COVID-19 related features with the lag demand for peak demand prediction for an unstable situation like COVID-19?
Ps:
1- The model I used for prediction is LightGradient Boosting.
2- Data trained and tested during COVID-19 situation (2020 & 2021)
3- The weekly trends of my target value in 2020 and 2021 are as below figures.
Relevant answer
Answer
My pleasure, Negin Zarbakhsh.
Choose a high number of lags and fit a penalized model (e.g. using LASSO, ridge or elastic-net regularization). The penalization should reduce the influence of irrelevant lags, allowing the selection to be done more effectively; also experiment with various lag combinations.
Another option is the Fisher score, one of the most popular supervised feature-selection approaches: it ranks the variables in decreasing order of their Fisher score, and the variables can then be chosen accordingly.
Kind Regards
Qamar Ul Islam
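On the original question, lag features and COVID-era features are not mutually exclusive: the seven lags can simply sit alongside the new features in the same LightGBM model, and their added value can be judged by comparing forward-in-time validation scores with and without them. A minimal sketch using a synthetic daily series and a made-up "stringency_index" column as stand-ins:
import numpy as np
import pandas as pd
import lightgbm as lgb

dates = pd.date_range("2020-01-01", "2021-12-31", freq="D")
rng = np.random.default_rng(2)
df = pd.DataFrame({
    "date": dates,
    "peak_demand": 100 + 10 * np.sin(2 * np.pi * dates.dayofyear / 365) + rng.normal(0, 3, len(dates)),
    "stringency_index": rng.uniform(0, 100, len(dates)),   # hypothetical COVID-related feature
})

for lag in range(1, 8):                                    # previous seven days as extra features
    df[f"peak_lag_{lag}"] = df["peak_demand"].shift(lag)
df = df.dropna()

features = [f"peak_lag_{lag}" for lag in range(1, 8)] + ["stringency_index"]
train = df[df["date"] < "2021-07-01"]                      # forward-in-time split, no shuffling
test = df[df["date"] >= "2021-07-01"]

model = lgb.LGBMRegressor(n_estimators=400, learning_rate=0.05)
model.fit(train[features], train["peak_demand"])
pred = model.predict(test[features])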
  • asked a question related to Prediction
Question
3 answers
I am using QGIS version 2.8.3 with the MOLUSCE plugin to obtain the land-use prediction map, but I got an error while creating a change map in the Area Change tab.
The link with details is attached.
Relevant answer
  • asked a question related to Prediction
Question
1 answer
Dated: 10-June-2020.
Perhaps!
It may be, because this year the radiation and greenhouse-gas interaction feedback processes on different timescales (one of the main factors in monsoon dynamics), which usually make monsoon predictability erratic, are not expected to add much uncertainty to the prediction system, owing to the substantial reduction in greenhouse-gas emissions. This implies a possible advantage for predictive models this season. Recall that a model's ability to predict the SW monsoon is higher when initial conditions from February, March and April are used (this year these were the main lockdown months worldwide, when the atmosphere was not loaded with emissions) than when conditions from months closer to the monsoon are used. On the other hand, this year can also serve as a test bed for models that show near-accurate long-range forecasting skill with early-month initial conditions.
Overall, it may also turn out that NATURE can be predicted correctly if it is not disturbed, BUT if we keep disturbing it, prediction may not be that easy or precise.
If so, then credit for accurate prediction of the monsoon system should be higher this year, I think. This may also be attributed to nature's undisturbed state this year, apart from the well-resolved and improved interannual and climate-system predictability aspects of the modelling systems.
Nature is in its NATURAL swing. Enjoy it and try to be safe! But we should also be ready for monsoon predictability in the years to come, when emissions will again be dumped into the Earth system and will certainly obstruct prediction. Consistency in prediction accuracy should be addressed responsibly.
What’s your take on that!
Relevant answer
Answer
I think yes.
  • asked a question related to Prediction
Question
9 answers
Let us consider sales records ("factors") like this:
Gender | Age | Street | Item 1 | Count 1 | Item 2 | Count 2 | ... | Item N | Count N | Total Price (Label)
Male | 22 | S1 | Milk | 2 | Bread | 5 | ... | - | - | 10 $
Female | 10 | S2 | Cofee | 1 | - | - | ... | - | - | 1 $
....
We want to predict the total price of a factor based on the buyer's demographic information (such as gender, age, job) and also the purchased items and their counts. Note that we assume we do not know each item's price, and prices change over time (so we will also have a date in our dataset).
The main question is how to use this dataset, which contains transactional data (items) whose combination order is not important. For example, someone who buys item1 and item2 is equivalent to someone who buys item2 and item1, so the item columns should not depend on the order of their values.
The dataset therefore contains both multivariate and transactional data. My question is: how can we predict the label more accurately?
Relevant answer
Answer
Hi Dr Behzad Soleimani Neysiani . I agree with Dr Qamar Ul Islam .
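One common way to make the item columns order-invariant is to replace the Item/Count column pairs with a bag-of-items representation (one column per distinct item holding its count), so that {item1, item2} and {item2, item1} produce identical rows; the demographic columns and the date can then be appended as ordinary features. A sketch with made-up data:
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

# toy transactions: one row per purchased item within a factor (invoice)
lines = pd.DataFrame({
    "factor_id": [1, 1, 2, 3, 3, 3],
    "item": ["milk", "bread", "coffee", "milk", "coffee", "bread"],
    "count": [2, 5, 1, 1, 2, 3],
})
header = pd.DataFrame({
    "factor_id": [1, 2, 3],
    "gender": ["male", "female", "male"],
    "age": [22, 10, 35],
    "month": [1, 1, 2],               # captures price changes over time
    "total_price": [10.0, 1.0, 9.5],
})

# bag-of-items: order of purchase no longer matters
basket = lines.pivot_table(index="factor_id", columns="item", values="count",
                           aggfunc="sum", fill_value=0)
data = header.merge(basket, on="factor_id")
X = pd.get_dummies(data.drop(columns=["factor_id", "total_price"]), columns=["gender"])

model = GradientBoostingRegressor().fit(X, data["total_price"])
print(model.predict(X))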
  • asked a question related to Prediction
Question
13 answers
Hi,
I am currently looking for a dataset in which I can get historical weather data (like temperature, precipitation, wind speed) for every day in every city from 2005 to today.
The data will be used for a prediction project.
Where can I find these kinds of data, or anything related?
Thank you very much.
P.S. To clarify: I have a table with two columns, "date" and "city", and I want to fill a third column (or however many it takes) with the weather information for that date+city combination. Many websites provide weather information, but since my dataset is quite large, I need a way to automate the process: either a dataset or a crawler-friendly website with enough information.
Relevant answer
Answer
ECMWF ERA5 dataset
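If ERA5 is an option, the retrieval can be automated with the cdsapi package (it requires a free CDS account and a configured API key; the variable names and request keys below follow the usual CDS pattern but should be checked against the CDS catalogue). City-level values can then be extracted from the resulting NetCDF file (e.g. with xarray) by selecting the grid point nearest to each city's coordinates.
import cdsapi

c = cdsapi.Client()    # reads credentials from ~/.cdsapirc
c.retrieve(
    "reanalysis-era5-single-levels",
    {
        "product_type": "reanalysis",
        "variable": ["2m_temperature", "total_precipitation", "10m_u_component_of_wind"],
        "year": "2015",
        "month": "07",
        "day": [f"{d:02d}" for d in range(1, 32)],
        "time": [f"{h:02d}:00" for h in range(24)],
        "area": [52.7, 13.1, 52.3, 13.7],   # N, W, S, E bounding box around one city
        "format": "netcdf",
    },
    "era5_city_2015_07.nc",
)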
  • asked a question related to Prediction
Question
7 answers
I am doing my MS thesis, titled "Time series crop yield estimation using satellite images". Below are my aim and objectives, but my supervisor said the objectives are not correct. I don't know what I should change. Can anyone help me rewrite my objectives?
Aim: The aim of this study is to develop a model for wheat yield prediction using satellite imagery before the harvest time.
Objectives:
1.It is mandatory for the planners of any regime to have an accurate and precise estimate of a crop to cope with the shortage crises of the crop, as Pakistan faced a very serious crisis of wheat’s shortage in 2007
2.An accurate estimate of a crop gives a significant relief to the country’s exchequer in terms of saving foreign exchange
3.The main purpose of this research is, therefore, the scientific construction of a model employing all the information available via remote sensing in order to get a good and trustworthy estimate of wheat crops.
Relevant answer
Answer
You should start with a problem statement - come up with the main research question. Then you will have to break it into pieces as 3-4 research questions, which when answered would answer the main problem/research question. The research questions can be converted into objectives. All the best
  • asked a question related to Prediction
Question
4 answers
What is the best way to classify a tabular dataset using MATLAB?
How can text data be predicted or categorised using a Convolutional Neural Network? Also, how can deep learning be used for the classification of text data in a tabular dataset (for example, numerical or textual data)?
Can we use regression or classification for prediction? Which is the better approach?
Relevant answer
Answer
The Python framework below is a repository of different deep-learning-based text classification methods and might be helpful.
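If a deep network is not strictly required, a classical text-classification baseline is often a useful first step before trying CNNs; a minimal sketch with TF-IDF features and logistic regression on toy data (labels and texts are made up):
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

texts = ["invoice is overdue", "pay the invoice", "team meeting at noon", "schedule a meeting"]
labels = ["finance", "finance", "calendar", "calendar"]

clf = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),   # word and bigram features
    ("lr", LogisticRegression(max_iter=1000)),
])
clf.fit(texts, labels)
print(clf.predict(["meeting about the overdue invoice"]))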
  • asked a question related to Prediction
Question
21 answers
I have 27 features and I'm trying to predict continuous values. When I calculated the VIFs (Variance Inflation Factors), only 8 features were below 10 and the remaining features range from 10 to 250, so I am facing a multicollinearity issue.
My work is guided by two aims:
1- ML models should be used to predict the values using regression algorithms.
2- To determine the importance of the features (i.e. interpreting the ML models).
A variety of machine learning algorithms have been applied, including Ridge, Lasso, Elastic Net, Random Forest Regressor, Gradient Boosting Regressor, and Multiple Linear Regression.
Random Forest Regressor and Gradient Boosting Regressor show the best performance (lowest RMSE) while using only 10 of the 27 features, based on the feature importance results.
As I understand it, if I face multicollinearity issues I can address them with regularized regression models like LASSO. When I applied Lasso, the evaluation result was not as good as for the Random Forest and Gradient Boosting Regressors; however, none of the coefficients became zero when I examined the feature importance.
Moreover, I want to analyse which feature is affecting my target value and I do not want to omit my features.
I was wondering if anyone could help me determine which of these algorithms would be good to use and why?
Relevant answer
Answer
Dear Negin,
one of the first things that I would do is to first analyse the data prior to modelling: is the collinearity of the features due to some causal process that you can hypothesise on? If so, is there a set of root variables that you can identify? Factor analysis in this case might be a good thing to look into.
Another thing to try before modelling might be clustering and selecting a subset of representatives from the clusters and input those into an algorithm for selection, separating the features into groups of collinear features and taking one of each according to some criteria. Another thing might be to use PCA on the clustered features to combine and reduce the number of features.
The choice of algorithm depends on how you model your data. If you have a linear relationship, the usual penalised least-squares approaches could be fine, but if the Random Forest performs better on a similar number of features and without overfitting (a common issue with this ensemble approach), the relationship might not be linear, so it gets a better score using the trees. For interpreting the results, TreeInterpreter in Python or LIME could perhaps give you a better understanding of what the random forest is doing.
I hope this is helpful and lots of luck!
Asier
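A sketch of the clustering idea from the answer, in Python: group features by 1 - |Spearman correlation|, keep one representative per group, and then judge importance with permutation importance (the toy data and the 0.3 distance cut-off are arbitrary; importances are ideally computed on a held-out set):
import numpy as np
import pandas as pd
from scipy.cluster import hierarchy
from scipy.spatial.distance import squareform
from scipy.stats import spearmanr
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

# toy stand-in for the 27 features: the last 3 columns nearly duplicate f0-f2
rng = np.random.default_rng(3)
base = rng.normal(size=(500, 6))
X = pd.DataFrame(np.hstack([base, base[:, :3] + 0.05 * rng.normal(size=(500, 3))]),
                 columns=[f"f{i}" for i in range(9)])
y = base[:, 0] + 0.5 * base[:, 3] + rng.normal(0, 0.1, 500)

# cluster collinear features on 1 - |Spearman correlation|
corr = spearmanr(X).correlation
dist = 1 - np.abs((corr + corr.T) / 2)
np.fill_diagonal(dist, 0)
clusters = hierarchy.fcluster(hierarchy.ward(squareform(dist, checks=False)),
                              t=0.3, criterion="distance")

keep = [np.flatnonzero(clusters == c)[0] for c in np.unique(clusters)]  # one per cluster
Xr = X.iloc[:, keep]

model = RandomForestRegressor(n_estimators=300, random_state=0).fit(Xr, y)
imp = permutation_importance(model, Xr, y, n_repeats=20, random_state=0)
print(dict(zip(Xr.columns, imp.importances_mean.round(3))))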
  • asked a question related to Prediction
Question
3 answers
I calculated the Shapley values (using the R xgboost package, based on gradient-boosted regression) for several big actors in the cocoa market and received results I cannot explain: the Shapley values seem to increase, as a general trend, for all of them. The same thing happened when I calculated them for other sectors.
Does it make sense? If it does, what stands behind these results?
If not, what could be my mistake?
Thanks a lot for any help!
Relevant answer
Answer
Hi Rotem,
Along with the literature Alexander suggested, have a look at https://christophm.github.io/interpretable-ml-book/shapley.html. I hope it will be worth reading and will help you interpret your results.
  • asked a question related to Prediction
Question
15 answers
Please suggest if any specific software is used.
Relevant answer
Answer
I am on holiday, away from the lab computer, so I cannot check the options in Origin, but anyone with experience in using it can help. Sorry for being unable to help at this time.
  • asked a question related to Prediction
Question
4 answers
In my current project, I want to answer if various cognition items (ratio, 30+ of them, may get reduced based on a separate factor analysis) predict moral outrage - in other words, do increases in item 1 (then item 2, item 3, etc) predict increases in outrage in a significant way. Normally, this would be a simple regression. But then I complicated my design, and I'm having a hard time wrapping my head around my potential analyses and whether it will actually answer my stated question, or if I'm over-thinking things.
Currently, I'm considering a set-up where participants will see a random selection of 3 vignettes (out of 5 options) and answer the cognition items and moral outrage about each. This complicates matters because 1) there is now a repeated-measures component that may (or may not?) need to be accounted for, and 2) I'm not sure how my analyses would work if the vignette selection is random (thus, all vignettes will show up the same number of times, but in different combinations for different people). I am anticipating that different vignettes will not be equal in their level of the DV (which is on purpose; I want to see if these patterns are general, not just at very high or very low levels of outrage).
When originally designing this, I had wanted to average the 3 vignette scores together for each subject, treating them as single, averaged item values to use in a multiple regression. But I've been advised by a couple people that this isn't an option, because the variance between the vignettes needs to be accounted for (and the vignettes can't be shown to be equivalent, and thus can't be collapsed down in analysis).
One potential analysis to combat this is a nested, vignette-within-individual multilevel design, where I see if the pattern of cognition items to outrage is consistent between vignettes (level 1) and across subjects (level 2), to account for/examine any vignette-by-cognition/MO pattern interactions. And this makes sense, as MLMs can be used to compare patterns, rather than single scores.
But I can't wrap my head around what part of this set-up/the output I would look at to actually answer my question: generally, which, if any, of these cognition items predicts outrage (regardless of vignette, or across many scenarios)? And can this approach work when the vignettes combinations differ between subjects?
Or is this the incorrect analysis approach and another, simpler one would be more fitting? For example, is the averaging approach workable in another format? What if all vignettes were done by all subjects (more arduous on the subjects, but possible if the strength of the analysis/results would be compromised/overly-complicated)?
Confirmation that my current analysis approach will indeed work, help with what part of the output would answer my actual RQ, or suggestions for an alternative approach, would be appreciated.
Relevant answer
Answer
You have many answers there.
Do not complicate your research design too much: add complexity only to address a specific issue.
The analysis should be made at different levels, as if it were a split-plot design.
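A sketch of how such a model can be specified in Python/statsmodels, with made-up column names and toy data (fully crossed random effects are easier to express in lme4, but the idea is the same): the fixed-effect coefficients of the cognition items are the part of the output that answers the research question, while subject and vignette variability are absorbed by the random effects, so unequal vignette combinations across participants are handled naturally.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# toy long-format data: 3 vignettes per subject
rng = np.random.default_rng(4)
n_sub, n_vig = 120, 3
long_df = pd.DataFrame({
    "subject": np.repeat(np.arange(n_sub), n_vig),
    "vignette": rng.integers(1, 6, n_sub * n_vig),   # a random 3 of 5 vignettes
    "cog1": rng.normal(size=n_sub * n_vig),
    "cog2": rng.normal(size=n_sub * n_vig),
})
long_df["outrage"] = (0.6 * long_df["cog1"] + 0.2 * long_df["vignette"]
                      + rng.normal(0, 1, n_sub * n_vig))

# random intercept per subject plus a variance component for vignette differences
md = smf.mixedlm("outrage ~ cog1 + cog2", long_df, groups="subject",
                 re_formula="~1", vc_formula={"vignette": "0 + C(vignette)"})
result = md.fit()
print(result.summary())   # the cog1/cog2 rows are the fixed effects that answer the RQ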
  • asked a question related to Prediction
Question
3 answers
Are you interested in the application of complex systems to the global history of humankind? I'm working on such a project, and I'm interested in discussions with like-minded people.
I published several articles on that in "The Complex Systems" journal (thecomplexsystems.com). A short overview of my work is in my blog (vtorvich.com) and the description of my book "Subsurface History of Humanity: Direction of History" on Amazon.
Relevant answer
Answer
All classifications are up to people.
Everybody will tell you that the history of humanity started around 5,150 years ago. Of course, not the same wording would be used. The phrase would be like this. The history of mankind is recorded history. In other words, our history began only when humanity invented writing.
The reason is simple and very convincing. The magic word is convenience. It is much easier to work with historical facts and artifacts if you have written records about them; it is hard to work with only archaeological or similar data. History with existing written records is a comfort zone for everybody. Choosing a starting point for the history of humankind before the first writing would throw researchers and the public out of this comfort zone.
When did the Agricultural Revolution happen? Well, it began many thousands of years before writing was invented. If the history of mankind began in 3150 BC, then that revolution is thrown from the history of humanity into prehistory.
I consider humankind's history to be the one that started in 42000 BC. Why exactly this date?
You could read it in my book - https://www.amazon.com/dp/B08WZCVDTD.
  • asked a question related to Prediction
Question
25 answers
If artificial intelligence is implemented for online mobile banking, could this banking segment stop employing human capital altogether?
Please reply
Best wishes
Relevant answer
Answer
Dariusz Prokopowicz In my experience, bank employees are needed less and less: banking applications, mobility, online services and even financial and credit analyses are increasingly performed using artificial intelligence.
  • asked a question related to Prediction
Question
12 answers
After 30 years, much will change. 30 years is a long period for the continuation of the current fourth technological revolution, known as Industry 4.0.
The current technological revolution known as Industry 4.0 is motivated by the development of the following factors:
Big Data database technologies, cloud computing, machine learning, Internet of Things, artificial intelligence, Business Intelligence and other advanced data mining technologies.
On the basis of the new technological solutions developed in recent years, innovatively organized analyses of large information sets stored in Big Data database systems and in cloud computing are developing dynamically for applications in areas such as machine learning, the Internet of Things, artificial intelligence and Business Intelligence.
The development of information processing technology in the era of the current technological revolution defined by Industry 4.0 is determined by the application of new information technologies in the field of e-commerce and e-marketing.
Added to this are additional areas of application of advanced technologies for the analysis of large data sets, such as Medical Intelligence, Life Science, Green Energy, etc. Processing and multi-criteria analysis of large data sets in Big Data database systems is carried out according to the 4V concept, i.e. Volume (the amount of data), Value (the large values of specific parameters of the analyzed information), Velocity (the high speed at which new information appears) and Variety (the high variety of information).
The advanced information processing and analysis technologies mentioned above are used more and more often for marketing purposes of various business entities that advertise their offer on the Internet or analyze the needs in this area reported by other entities, including companies, corporations, financial and public institutions. More and more commercial business entities and financial institutions conduct marketing activities on the Internet, including on social media portals.
More and more companies, banks and other entities need to conduct multi-criteria analyzes on large data sets downloaded from the Internet describing the markets on which they operate, as well as contractors and clients with whom they cooperate. On the other hand, there are already specialized technology companies that offer this type of analytical services, develop customized reports that are the result of multicriteria analyzes of large data sets obtained from various websites and from entries and comments on social media portals.
Do you agree with my opinion on this matter?
In view of the above, I am asking you the following question:
What are the known futurological visions of technology development until around 2050?
Please reply
I invite you to the discussion
Best wishes
Relevant answer
Answer
On future technology development: the advent of Biotechnology courses a few decades ago appeared to provide a better alternative for young students' career options. The applications of Biotechnology are vast, as it caters to agriculture, animal husbandry, fishery, health, pharmaceuticals, etc. Samal, K. C., Mohanty, A., Patnaik, L., & Sahoo, J. P. (2021). Career Options and Future Prospects in Biotechnology. Biotica Research Today, 3(3), 135-138.
Robotics, space tech, AI, BC, BD...Wedler, A., Schuster, M. J., Müller, M. G., Vodermayer, B., Meyer, L., Giubilato, R., ... & Reill, J. (2021). German Aerospace Center's advanced robotic technology for future lunar scientific missions. Philosophical Transactions of the Royal Society A, 379(2188), 20190574.
Rusakova, E. P., & Inshakova, A. O. (2021). INDUSTRIAL AND MANUFACTURING ENGINEERING IN DIGITAL LEGAL PROCEEDINGS IN THE ASIA-PACIFIC REGION: A NEW LEVEL OF QUALITY BASED ON DATA, BLOCKCHAIN AND AI. International Journal for Quality Research, 15(1).
  • asked a question related to Prediction
Question
8 answers
My main goal is to use neural networks to forecast sunspot numbers. Choosing between an ANN and an RNN seems simple enough, but which is best for a complete beginner to learn and use? If there is a GitHub repository for similar space science topics based on neural networks, please link me to it. I'd be extremely appreciative.
Relevant answer
Answer
Ashok Silwal Are you performing the right kind of multi-step forecasting? If each value is closely related to its neighbours, then choose an RNN, which makes it possible to model dynamic time-series systems; for accuracy, you may want to create a separate model for each output.
Good luck
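As a starting point for a beginner, here is a minimal univariate LSTM on a sliding window; a synthetic cyclic series stands in for the real monthly sunspot numbers, which would be loaded from the SILSO archive in practice:
import numpy as np
import tensorflow as tf

# synthetic stand-in for the monthly sunspot series (~11-year cycle)
t = np.arange(1200)
series = 80 + 60 * np.sin(2 * np.pi * t / 132) + np.random.default_rng(5).normal(0, 10, t.size)

lookback = 24
X = np.array([series[i:i + lookback] for i in range(series.size - lookback)])[..., np.newaxis]
y = series[lookback:]
split = int(0.8 * len(X))

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=(lookback, 1)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X[:split], y[:split], epochs=10, verbose=0,
          validation_data=(X[split:], y[split:]))
print(model.predict(X[-1:]))         # forecast of the next month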
  • asked a question related to Prediction
Question
8 answers
Will the development of computerized business analytics of large collections of economic information collected in Big Data database systems improve the forecasting of future economic processes?
Please reply
I invite you to the discussion
Thank you very much
Dear Colleagues and Friends from RG
The key aspects and determinants of applications of data processing technologies in Big Data database systems are described in the following publications:
I invite you to discussion and cooperation.
Best wishes
Relevant answer
Answer
More than 2 years have passed since I asked the above question. During these two years, has there been significant progress in Big Data Analytics towards using it to forecast complex climate, natural, social and economic processes?
Regards,
Dariusz Prokopowicz
  • asked a question related to Prediction
Question
4 answers
There has been a debate on the topic "Why the sunspot number needs re-examination?". What is the reason behind this controversial topic? Which model is currently the best model to predict the Sunspot Number in Solar Cycle 25?
Relevant answer
Answer
Here is a discussion of a revised SSN:
  • asked a question related to Prediction
Question
3 answers
There is a lot of research on AI-based air pollution forecasting, but very few have put up a reasonable explanation in this regard.
I want to know what the reasons for the performance drop might be.
Is it a problem of data length, or some other issue?
Relevant answer
It's very simple. All forecasting methods are based on the search for patterns in the retrospective data and on the assumption (hypothesis) that these patterns will be valid in the future for the forecast period. In other words, it is assumed that the training sample is representative for a certain period in the future. This period is called the period of ergodicity. But this is an incorrect assumption. Sometimes the patterns in the modeled domain change. The period of ergodicity is violated. New patterns are being formed, although the old ones may remain. Therefore, the point of violation of ergodicity is called the bifurcation point. It is necessary to predict not only based on the patterns of the past period, but also to predict the risks of violating these patterns. I did it back in 1994: http://lc.kubagro.ru/aidos/aidos02/7.4.htm (see Figure 7.2).
  • asked a question related to Prediction
Question
6 answers
Good day scholars, I have read a lot of articles on LSTM time-series forecasting capabilities.
However, I want to know if LSTM can be used for multi-output time-series forecasting. For example, I have x, y, z variables with 1000 time steps, and I want to use LSTM to forecast all the variables (x, y, z) at future time steps. Any recommendation or suggestion will be highly appreciated.
Thanks
Relevant answer
Answer
@Ahmed Rezaei, thanks for the contribution. Could you please explain what you mean by using attention?
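On the original question: multi-output forecasting with an LSTM works by giving the final Dense layer one unit per variable, so x, y and z are forecast jointly. A minimal sketch, with synthetic random walks standing in for the three series:
import numpy as np
import tensorflow as tf

# stand-in for 1000 time steps of the three variables x, y, z
rng = np.random.default_rng(6)
data = np.cumsum(rng.normal(size=(1000, 3)), axis=0).astype("float32")

lookback = 30
X = np.array([data[i:i + lookback] for i in range(len(data) - lookback)])
Y = data[lookback:]                      # next-step values of x, y and z together

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(64, input_shape=(lookback, 3)),
    tf.keras.layers.Dense(3),            # one output unit per variable
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, Y, epochs=10, validation_split=0.2, verbose=0)
print(model.predict(X[-1:]))             # joint one-step-ahead forecast of x, y, z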
  • asked a question related to Prediction
Question
8 answers
Stock market prediction is a vibrant and exciting topic around the globe, both because of the money-making potential of accurate predictions and because of the academic recognition associated with it.
But what is the feasibility of having a FUTURES* (derivative) prediction mechanism in place?
There is a wide-ranging literature on stocks, indices and options. Why are there so few articles on futures market prediction? If it is possible, please share your insights.
Futures are constrained by their expiration, but if they can be predicted, then to the best of my limited knowledge the chance of earning a handsome return is wide open.
Relevant answer
As per my predictions, the Nifty will be around 15,000 in the next year and the share market will go as high as 52,000 in 2021. ... The market will climb as 2021 proceeds, and the share market will be at its peak in the last two months of 2021.
  • asked a question related to Prediction
Question
20 answers
Apparently, on the financial markets and in macroeconomic determinants of the economic situation in particular sectors and entire economies of developed countries, there are symptoms that suggest a high probability of economic slowdown from 2020 in individual countries and, consequently, in the entire global economy.
Therefore, I am asking you: Do you know the forecasts of the global economic development that would suggest a high probability of deceleration (or possibly acceleration) of economic growth from 2020 in individual countries and, consequently, in the entire global economy?
What are the symptoms of potential changes in the financial markets and / or the scope of macroeconomic determinants of the economic situation in particular sectors and entire economies?
If you know the results of prognostic research in this area, please send links to websites or scientific publications in which this type of prognostic issues are taken.
I wish you the best in New Year 2019.
Best wishes
Relevant answer
Answer
9 May MMXXI
Please read attached article, THE ART OF GREED...
Cordially...
ASJ
  • asked a question related to Prediction
Question
5 answers
Hi everyone, lately by using all the nice tools and benefits that Artificial Intelligence can offer as a technology, I am searching for possible applications in the Automotive Business.
Below are some simple examples of basic scenarios that I have already identified, and I would like to enhance them or discover new ones that could help the automotive industry take proper business actions:
1. Based on historical CRM Opportunities taking into account the Lead Source (TV, WEB, Phone), Customer Genre, Customer Age, Customer Geographical Area, Customer Follow-Up-Times and Model Of Interest, Model Price, predict the possibility to convert this opportunity into an Invoice.
2. Based on historical Service Document Turnover (Service Quote -> Service Schedule -> Service Order -> Service Invoice), predict the possibility of a new open (un-invoiced) Document.
3. Based on historical Vehicle Document Turnover (Vehicle Quote -> Vehicle Order -> Vehicle Invoice), predict the possibility of a new open (un-invoiced) Document.
4. Based on historical clocking of the time technicians spent fixing vehicles, taking into account Model Code, Vehicle Mileage, Job Qualification Code, Parts Number and Labor Number, predict the expected workshop load for the upcoming schedule based on open Service Schedules.
What do you think?
Relevant answer
Answer
Dear Stavros Koureas, one direction is utilizing the rapidly expanding availability of data: companies are, for example, drastically increasing their marketing efficiency through programmatic advertising with AI at its core.
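For scenario 1 above, a minimal sketch of an opportunity-to-invoice conversion model; the toy data and column names are made up, and a real model would use the full CRM export (lead source, genre, age, area, follow-up counts, model of interest and price):
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

df = pd.DataFrame({
    "lead_source": ["TV", "WEB", "Phone", "WEB", "TV", "Phone"],
    "customer_age": [34, 45, 29, 51, 38, 42],
    "model_price": [22000, 35000, 18000, 41000, 27000, 30000],
    "follow_ups": [1, 3, 0, 4, 2, 2],
    "invoiced": [0, 1, 0, 1, 1, 0],     # label: did the opportunity become an invoice?
})

pre = ColumnTransformer([("cat", OneHotEncoder(handle_unknown="ignore"), ["lead_source"])],
                        remainder="passthrough")
clf = Pipeline([("pre", pre), ("gb", GradientBoostingClassifier())])
clf.fit(df.drop(columns="invoiced"), df["invoiced"])
print(clf.predict_proba(df.drop(columns="invoiced"))[:, 1])   # conversion probabilities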
  • asked a question related to Prediction
Question
17 answers
I am trying to predict the occurrence of individual aquatic plants (48 species) with Random Forest (RF) models, using six explanatory variables. The datasets are highly unbalanced: at minimum about 2.5% of observations are presences, but this can go up to 25% (of 2000 observations). Not surprisingly, the accuracy (~70%) and Cohen's kappa (~0.2) are not very satisfactory. Moreover, the True Negative (TN) rate is high (~80%) while the True Positive (TP) rate is low (~15%). I have tried multiple things, from changing the cut-off to 40-45%, which works to some extent (still not satisfactory). Additionally, I subsampled my dataset (also down-sampling), built an RF model with 50 trees, repeated this 20 times and combined these 20 RF models into one RF model (somewhat circular reasoning, as this is what down-sampling does), but this results in similar performance. Changing mtry, the node size (85-100% of the smallest class) or the maximum number of observations ending in a terminal node (0-15% of the smallest class) also does not improve the performance. The latter two do "smooth" the patterns, but without improving performance or the distinction between TN and TP. The best option seems to be setting the cut-off to 45%, the node size to 90% and the maximum observations to 10%.
First, my guess is that the low performance is of course due to the unbalanced dataset, where the pattern of absences is simply captured better than that of the presences. However, I cannot resolve this with the data I currently have (am I sure that I cannot? not really), which would mean I need more data (which I want anyhow). Second, true negatives are easier to predict in general. For example, fish need water: if there is no water, the model predicts no fish (easy). However, if there is water the model predicts fish, yet the presence of water does not necessarily mean there are fish. For aquatic plants, if flow velocity is > 0.5 m/s, vascular plant species are often absent and mosses are present; yet if flow velocity is < 0.5 m/s, this does not mean vascular plants are present or mosses are absent. Third, the predictor variables may not be suitable, and in general the species seem to be distributed widely along the gradients of these predictors (you do not need an ML model to tell you this if you look at the boxplots). Moreover, correlations between predictors are also present (while not an issue for prediction, this is an issue for inference); for some species this is more apparent than for others, and some species occur everywhere along these gradients. Although this idea seems to float around, relatively few articles actually discuss it (excluding articles addressing the high misclassification rates of Trophic Macrophyte Indices in general):
Even using different model types does not really work (SVM, KNN, GLM [binomial]). Naive Bayes seems to work, but the prior ends up extremely low for some species, so the model hardly ever predicts presence. However I turn or twist (organise) the data, I cannot obtain a satisfactory prediction. Are there any statistics or machine learning experts who have tips or tricks to improve model performance, besides increasing the size of the dataset?
P.S. Perhaps I should start a contest on Kaggle.
Relevant answer
Answer
Methods to Boost the Accuracy of a Model
Add more data. Having more data is always a good idea. ...
Treat missing and Outlier values. ...
Feature Engineering. ...
Feature Selection. ...
Multiple algorithms. ...
Algorithm Tuning. ...
Ensemble methods.
Regards,
Shafagat
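Beyond those general tips, two things that often help with presence/absence data this unbalanced are class weighting and evaluating/thresholding on the precision-recall curve rather than on accuracy or kappa. A sketch on synthetic data with roughly 5% presences and six predictors (standing in for one species):
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import average_precision_score, precision_recall_curve

X, y = make_classification(n_samples=2000, n_features=6, n_informative=4,
                           weights=[0.95], random_state=0)   # ~5% presences

clf = RandomForestClassifier(n_estimators=500, class_weight="balanced_subsample",
                             random_state=0)
proba = cross_val_predict(clf, X, y, cv=5, method="predict_proba")[:, 1]

print("PR-AUC:", round(average_precision_score(y, proba), 3))   # more informative than accuracy

# choose the probability cut-off that maximises F1 rather than using 0.5
prec, rec, thr = precision_recall_curve(y, proba)
f1 = 2 * prec * rec / (prec + rec + 1e-9)
print("best threshold:", thr[np.argmax(f1[:-1])])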
  • asked a question related to Prediction
Question
11 answers
Hello all,
I am a new user of Python and Machine learning!
Hereunder, I am trying to explain my data and model and then ask my question.
I have a couple of independent variables: ambient temperature, solar intensity, PCM melting temperature (PCM is a kind of material that we glue to the back of the PV panel in this experiment) and wind speed. My only dependent variable is the power output of a photovoltaic panel.
All independent variables change during a day (from 8 AM to 5 PM) as "Time" passes. For instance, ambient temperature increases from 8 AM to 3 PM and gradually drops from 3 PM to 5 PM.
My question is: can I consider Time (which is defined in an hour -- e.g. 8,9,....,13,14,....,17) as another independent variable to use machine learning techniques (in Python) like linear regression, Bayesian linear regression and SVM in order to predict the behaviour of the system?
I think because time here shows its effects on temperatures and solar intensity directly, I can disregard "time" as another independent variable.
I am quite confused here. Any help and suggestion would be much appreciated.
Thanks a lot.
Relevant answer
Answer
Dear Mohammad Rezvanpour
I found the page below very useful if you work in MATLAB; it is easy to understand.
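On the original question: hour of day can be kept as a predictor if it is encoded cyclically, and comparing cross-validated scores with and without it shows directly whether time adds information beyond temperature and irradiance. A Python sketch on synthetic data with made-up column names:
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# toy hourly measurements from 8:00 to 17:00 over 30 days
rng = np.random.default_rng(7)
hours = np.tile(np.arange(8, 18), 30)
df = pd.DataFrame({
    "hour": hours,
    "ambient_temp": 20 + 8 * np.sin(np.pi * (hours - 8) / 10) + rng.normal(0, 1, hours.size),
    "irradiance": 900 * np.sin(np.pi * (hours - 7) / 12).clip(0) + rng.normal(0, 30, hours.size),
    "wind_speed": rng.uniform(0, 6, hours.size),
})
df["power"] = 0.15 * df["irradiance"] - 0.5 * df["ambient_temp"] + rng.normal(0, 5, hours.size)

# encode the hour cyclically instead of as a raw number
df["hour_sin"] = np.sin(2 * np.pi * df["hour"] / 24)
df["hour_cos"] = np.cos(2 * np.pi * df["hour"] / 24)

without_time = ["ambient_temp", "irradiance", "wind_speed"]
with_time = without_time + ["hour_sin", "hour_cos"]
for cols in (without_time, with_time):
    print(cols, cross_val_score(LinearRegression(), df[cols], df["power"], cv=5).mean().round(3))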
  • asked a question related to Prediction
Question
16 answers
Here is the situation: I am trying to predict the energy consumption (load) of households using artificial intelligence (machine learning) techniques.
Problem: data are only available for 40% of the households. Is it possible to predict the energy consumption of the remaining 60% of households based on the available data (features) of the 40%?
Relevant answer
Answer
I think it is possible, but I cannot guess the accuracy of the resulting forecasting model.
  • asked a question related to Prediction
Question
4 answers
Dear All!
Is there software with which I can predict NMR spectra of compounds in deuterated acetonitrile, acetone or methanol? In MestReNova I can only make predictions in chloroform, DMSO or water.
Thank you so much for your help!
Relevant answer
Answer
Mrs/Miss Haraźna,
Do you know the reliability of these so-called predictions of NMR spectra?
There is plenty of software for the prediction of mass spectra as well. However, comparative analysis with experiment shows a dramatic lack of agreement between theory and experiment.
Such software is useful mainly for educational purposes, because not all of the important and actually observed NMR and mass-spectrometric phenomena can be accounted for. Thus, the so-called predicted spectra serve mainly to illustrate the fundamental basic knowledge.
An additional comment: RG is a forum for the exchange of knowledge at a highly specialised professional level. Very frequently, participants are unable to distinguish between highly specialised technical information and less specialised information of a general or popular character; the latter is typical of the popular press. Since comments on RG are largely inaccessible to the mass reader or to communities as a whole, RG is not a forum for the distribution of knowledge at a general public level.
  • asked a question related to Prediction
Question
2 answers
Dear researchers,
Any recommendation for a FREE online web server or software for metabolomic approaches and dermal toxicity prediction?
It would be better if it came with guidance on how to interpret the results generated by the web server,
because I would like to generate a report and have to interpret the results myself.
Thank you.
Relevant answer
Answer
ADMET toxicity tools in general; a list of tools is available at the following link:
  • asked a question related to Prediction
Question
10 answers
It is about a 3-class classification problem in which the classes occur with almost equal probability in the test data, i.e. each occurs around 33% of the time. A trained model yields an accuracy of 45-48% on out-of-sample test data. Is this result significant in terms of prediction? Here accuracy is computed as the percentage of correctly identified classes over all cases. In similar problems modelled as 2-class classification, the maximum accuracy reported in the literature is around 69%, but in the present case the classes are "up", "down" and "no-change" instead of just "up" and "down".
Relevant answer
Answer
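One quick way to check whether 45-48% accuracy beats the 1/3 chance level of a balanced 3-class problem is an exact binomial test on the number of correct test-set predictions; the test-set size below is assumed, so substitute your own counts:
from scipy.stats import binomtest

n_test = 300                          # assumed size of the out-of-sample test set
n_correct = round(0.47 * n_test)      # 47% accuracy

# null hypothesis: the model guesses at the 1/3 chance level
result = binomtest(n_correct, n_test, p=1/3, alternative="greater")
print(result.pvalue)                  # a tiny p-value means 45-48% is well above chance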