Data Science - Science topic
Data science combines the power of computer science and its applications, modeling, statistics, engineering, economics and analytics.
Whereas a traditional data analyst may look only at data from a single source, such as a single set of measurement results, data scientists will most likely explore and examine data from multiple disparate sources.
According to IBM, "the data scientist will sift through all incoming data with the goal of discovering a previously hidden insight, which in turn can provide a competitive advantage or address a pressing business problem. A data scientist does not simply collect and report on data, but also looks at it from many angles, determines what it means, then recommends ways to apply the data."
Data Science has grown in importance with Big Data and will be used to extract value from the Cloud for businesses across domains.
Questions related to Data Science
I'm curious to understand how possible it is to bridge the gap between Data Science and Spatial Ecotoxicology in developing countries. A dearth of information and a lack of data are major setbacks in developing countries. It will be interesting to understand fully the intricacies involved and the way forward. Thanks.
If I have multiple scenarios with multiple variables changing, and I want to conduct a full factorial analysis, how do I graphically show the results?
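One common starting point is an interaction-style plot: the mean response for each factor combination, with one line per level of a second factor; for more factors, small multiples (facet grids) of such panels are typical. A minimal pandas/matplotlib sketch, where the factor names "A", "B" and the response "y" are hypothetical placeholders:
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical full-factorial results: two two-level factors, 3 replicates.
df = pd.DataFrame({
    "A": ["low", "low", "high", "high"] * 3,
    "B": ["x", "y", "x", "y"] * 3,
    "y": [4.1, 5.0, 6.2, 3.9, 4.3, 5.2, 6.0, 4.1, 4.0, 4.8, 6.4, 3.8],
})

# Mean response per factor combination; one line per level of B.
means = df.groupby(["A", "B"])["y"].mean().unstack("B")
means.plot(marker="o")
plt.ylabel("mean response")
plt.title("Interaction plot: A x B")
plt.show()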
Hello,
I ran a 2x2 mixed-design ANOVA.
I checked the assumptions beforehand and removed one or two participants as outliers before the analysis. After excluding the outliers, I still obtained significant results with large effect sizes (partial eta squared), so I preferred to report the results without the outliers. I also have between-group differences, which I will report. But I have a problem with the group interaction and some within-subject results.
Group interaction: I have groups of nLOW = 10 and nHIGH = 11. The results showed
F(1,19) = 4.18, p = .056, partial eta squared = 0.17, and also F(1,19) = 3.72, p = .069, partial eta squared = 0.16.
Within-subject:
F(1,19) = 4.18, p = .055, partial eta squared = 0.18, and I have four more within-subject results like these.
My Question:
i) I wanted to report the nonsignificant results, because I read that p is a probability, while the effect size shows us how much variance the effect/interaction accounts for. I also read about choosing an effect size based on sample size: it is said that if you have a small sample (maybe like mine for a 2x2 design: nLOW = 10, nHIGH = 11, nTOTAL = 21) you must use omega squared. And if I use omega squared for my results, all the large effect sizes are gone; they decrease to medium (around 0.50).
After I read Dr. Lakens's review (10.3389/fpsyg.2013.00863), I thought the answer could be generalized eta squared, because I noticed that after computing omega squared and generalized eta squared they look similar (generalized eta squared = 0.047, omega squared = 0.037). But if these are my effects, I thought there is no need to report the nonsignificant results, as I mentioned above, because I have also lost the effect sizes, right? It is the same for some within-subject results: if I compute the effect size as omega squared or generalized eta squared, almost all of the large partial eta squared values decrease to medium or low.
What should I do about this? Should I report both partial eta squared and generalized eta squared, which are very different for my data (partial eta squared = 0.241; generalized eta squared = 0.047; omega squared = 0.037)?
Thank you in advance for your time.
Berkant Erman
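For readers wanting to reproduce such comparisons, the basic between-subjects formulas can be computed directly from an ANOVA table's sums of squares. Note this is only a minimal sketch with made-up numbers: for mixed designs, the omega-squared and generalized eta-squared formulas are more involved (see Olejnik & Algina, 2003, discussed in Lakens's review), and dedicated software is safer there.
# Sketch: basic effect sizes from ANOVA sums of squares (textbook
# between-subjects formulas; all input values below are hypothetical).
def partial_eta_squared(ss_effect, ss_error):
    return ss_effect / (ss_effect + ss_error)

def omega_squared(ss_effect, df_effect, ms_error, ss_total):
    # omega^2 = (SS_effect - df_effect * MS_error) / (SS_total + MS_error)
    return (ss_effect - df_effect * ms_error) / (ss_total + ms_error)

print(partial_eta_squared(ss_effect=12.0, ss_error=48.0))    # 0.2
print(omega_squared(ss_effect=12.0, df_effect=1,
                    ms_error=2.5, ss_total=110.0))           # ~0.084
As the printed pair shows, omega squared is always smaller than partial eta squared because it corrects for the positive bias of eta-family estimates in small samples.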
I really need to advance in Python for data science and need the best books. Can anybody here recommend a list of books? Cost does not matter; intellectual cultivation pays for itself.
1. We generally expect a p-value < 0.05. However, it is quite possible to see a value greater than 0.14 for partial eta squared (as an effect size) in a two-way analysis of variance (two-way mixed, etc.) even when p > 0.05.
In addition, we can also use other effect sizes such as eta squared, omega squared, etc. Here is my question about it:
- - - How often do you encounter such a situation [p > 0.05 and ES > 0.14 (large)], and what importance do you place on such results in the discussion section of a paper?
2. Due to the complexity of studies in sports sciences (professional athletes, muscle biopsies, etc.), important articles are still published with only a few individuals/participants/subjects (sometimes in Q1-Q2 journals). But some journals (also Q1-Q2) insist on asking for the effect size of the study as well.
- - - What do you think about the effect size of studies in sports sciences?
Your thoughts will be very valuable to me. Thank you in advance.
Do you think R and Python should be taught together in a master's level social data science course, or should the course pick one and focus on it? Let's assume students have some background with Excel and have minimal training in R or Python.
I'm looking for datasets containing coherent sets of tweets related to Covid-19 (for example, collected within a certain time period according to certain keywords or hashtags), labeled according to whether they contain fake or real news, or whether they contain pro-vax or anti-vax information. Ideally, the dataset would also contain a column with the textual content of each tweet, a column with the date, and columns showing 1) the username/ID of the author; 2) the usernames/IDs of the people who retweeted the tweet.
Do you know any dataset with these features?
I'm a computer engineering student specializing in software engineering and data science,
and soon I have to write the proposal for my master's thesis. I would appreciate it a lot if experts in the field would lend me their knowledge and insight to help me choose a problem to work on, or point me to where to search and read in order to detect and work on a problem for my thesis.
Thanks a lot in advance.
Hi team, I am working on a problem where I need to classify products into categories like Bad, Good, Better and Best. I am looking to use data related to product price, customer reviews, product features, etc.
Please share your ideas or any reference material related to this.
I have data available on the product, its price, features, reviews, and website visits, and I have all the data science tools and techniques needed to make use of it.
Thank you!
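As one hedged starting point (the file name and all column names below are placeholders, not the asker's actual data), text reviews and numeric features can be combined in a single scikit-learn pipeline. Since Bad < Good < Better < Best is an ordered scale, an ordinal-regression approach could also be worth comparing against this plain 4-class classifier.
# Sketch: 4-class product quality classifier mixing text and numbers.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

df = pd.read_csv("products.csv")           # hypothetical file
X = df[["price", "n_visits", "review_text"]]
y = df["quality_label"]                    # Bad / Good / Better / Best

pre = ColumnTransformer([
    # TfidfVectorizer takes a single column name (a string, not a list).
    ("text", TfidfVectorizer(max_features=5000), "review_text"),
    ("num", "passthrough", ["price", "n_visits"]),
])
clf = Pipeline([("pre", pre), ("rf", RandomForestClassifier(random_state=0))])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf.fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))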
I am looking for upcoming conferences on Data Science, Data Analytics and Information Systems. Any recommendations?
Modern politics is characterized by many aspects that were not associated with traditional politics. Big data is one of them: data mining is being done by political parties as they seek help from data scientists to find patterns that identify voter behavior. The question is: in what ways is big data being used by modern political parties and leaders?
I have data from 400 individual renal stone patients; the data covers parameters from blood biopsies and 24-hour urine analyses. I have run a random forest using R; however, the result showed no statistical significance. What can I do with this data? Is there another statistical tool for analyzing it? Thank you so much.
PS: I have also run a logistic regression; there was no correlation.
What are the latest research trends in Data Science ?
Dear Researchers,
Greetings for the day !
My name is Shard and I completed a B.S. in Marine Engineering from Birla Institute of Technology, Pilani, Rajasthan, in 2006, and then an M.S. in Information Technology (Application Development) from Sunderland University, United Kingdom, in 2010. I also completed a second master's degree, an MSc in Mathematics, from Shoolini University in 2021.
I'm a Research Scholar in the Yogananda School of AI, Computer and Data Sciences, currently pursuing a Ph.D. in Mathematics at Shoolini University, Solan, Himachal Pradesh, India. I have also been working as an Assistant Professor at Shoolini University since 2014.
My areas of interest in research include technology adoption using mathematical models and statistical tools like SPSS, as well as waste management and energy.
I'm actively looking for research collaborations with national/international collaborators, to be a contributor to the scientific community.
Warmly,
Shard
+91-8219639808
I have been studying data analysis software and am currently trying to expand my knowledge of R programming. I am interested in projects related to R programming, and I can provide free help to anyone who needs assistance with data analysis, specifically R programming and financial data analysis. Thanks
I have two equations, and in each equation I have a coefficient of interest.
Let's say:
eq1: Y = bX + 2cX² + 4
eq2: Y = aX² + 2dX + 12
Given that the values of a and b change over time, I am aiming to record the values of a in a list A and the values of b in another list B.
From their behaviour I want to draw conclusions about the strength of these coefficients.
But I am a bit confused about how to draw such conclusions and what the most representative way is to monitor how the behaviour of a and b changes over time.
Or is it better to monitor the increase or decrease of a coefficient by summing the differences of the recorded values over time?
I have more coefficients to be monitored, and they may or may not carry value; my aim is to build a meaningful classification that can categorise coefficients as useful or not.
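One simple, hedged way to monitor such coefficients is to treat each recorded list as a time series and summarize its level, variability and trend; the "useful" rule at the end is only an illustrative assumption, and the lists A and B below are hypothetical recorded values of a and b.
# Sketch: summarizing recorded coefficient values over time.
import numpy as np
import pandas as pd

A = [0.8, 0.9, 1.1, 1.0, 1.3, 1.4]
B = [0.2, 0.1, 0.15, 0.05, 0.1, 0.08]
coeffs = pd.DataFrame({"a": A, "b": B})

for name, series in coeffs.items():
    slope = np.polyfit(np.arange(len(series)), series, deg=1)[0]  # linear trend
    print(name,
          "mean=%.3f" % series.mean(),
          "std=%.3f" % series.std(),
          "trend=%.3f per step" % slope,
          "total change=%.3f" % (series.iloc[-1] - series.iloc[0]))

# One possible (assumed) rule: call a coefficient "useful" when its mean
# magnitude clearly exceeds its variability, e.g. |mean| > 2 * std.
useful = {n: abs(s.mean()) > 2 * s.std() for n, s in coeffs.items()}
print(useful)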
- What is the best metric for model selection?
- Is accuracy derived by cross-validation a good metric?
- Does the model selected on the basis of these metrics surely lead to better results? (See the sketch below.)
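A minimal comparison sketch: cross-validation is used to estimate several metrics for several candidate models. Keep in mind that cross-validation estimates generalization to data distributed like the training data; it does not guarantee the selected model will be better on differently distributed future data, and on imbalanced classes accuracy alone can mislead. The built-in dataset here is just a stand-in.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

models = {
    "logreg": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "rf": RandomForestClassifier(random_state=0),
}
for name, model in models.items():
    for metric in ("accuracy", "f1_macro"):
        scores = cross_val_score(model, X, y, cv=5, scoring=metric)
        print(name, metric, "%.3f +/- %.3f" % (scores.mean(), scores.std()))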
Dear Researchers,
My team and I are working on machine learning, data science, association rule mining and Simpson's Paradox.
I am looking for potential collaborators in these areas. If you are interested in collaborating or discussing our work, you can email me at rahul.gla@gmail.com
Thank you.
Best Wishes,
I'm using Mutual Information (MI) for Multivariate Time Series data as a feature selection method.
MI is nonnegative (MI >= 0), where 0 indicates that the two variables are strictly independent, and anything above 0 means the variables share a useful amount of information.
After computing the MI between 8 features and the target variable I got the following values:
[0.19, 0.23, 0.34, 0.19, 0.19, 0.12, 0.21, 0.071]
and when computing MI between some of the Input features, and another input feature to search for redundancy, I got the following values:
[4.0056916, 1.58167938, 1.20578024, 1.0273157, 0.93991675, 0.9711158]
The values are not between -1 and 1, or between 0 and 1, so it is not easy to compare them and draw conclusions.
My question is:
Is there a way to set a threshold value (t, for example) such that if MI >= t the variables share a good amount of information, and if MI < t they do not? How do I know whether the MI is high enough?
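One practical route to comparable, bounded scores is to normalize MI by the variables' entropies, mapping it into [0, 1]; scikit-learn implements this for discrete labels, so continuous series can be binned first. A sketch under stated assumptions (the bin count, and any threshold t you pick afterwards, are tuning choices, not fixed rules):
import numpy as np
import pandas as pd
from sklearn.metrics import normalized_mutual_info_score

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
z = x + 0.5 * rng.normal(size=1000)   # correlated with x

# Discretize into quantile bins, then compute NMI, which lies in [0, 1].
xb = pd.qcut(x, q=10, labels=False)
zb = pd.qcut(z, q=10, labels=False)
print(normalized_mutual_info_score(xb, zb))  # closer to 1 = more shared info

# With a bounded score, a single threshold t (e.g. t = 0.3) becomes
# meaningful across feature pairs; the right value is domain-dependent.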
Hey guys, I'm working on a new project where I need to transfer Facebook ads campaign data into Tableau or Microsoft Power BI for visualization, and the job should run automatically daily, weekly or monthly. I'm planning to use Python to build a data pipeline for this. Do you have any suggestions, resources I can read, or similar projects I can take inspiration from? Thank you.
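As a sketch of the extract step only (the endpoint path, API version, field names and the placeholders ACCESS_TOKEN / AD_ACCOUNT_ID below are assumptions to verify against the current Meta Marketing API documentation), a daily job could pull insights and write a CSV that Tableau or Power BI refreshes from; scheduling can then be handled by cron, Task Scheduler or Airflow.
import csv
import requests

ACCESS_TOKEN = "YOUR_TOKEN"        # placeholder
AD_ACCOUNT_ID = "act_123456789"    # placeholder

url = f"https://graph.facebook.com/v19.0/{AD_ACCOUNT_ID}/insights"
params = {
    "fields": "campaign_name,impressions,clicks,spend",
    "date_preset": "yesterday",
    "access_token": ACCESS_TOKEN,
}
rows = requests.get(url, params=params, timeout=30).json().get("data", [])

with open("ads_daily.csv", "w", newline="") as f:
    writer = csv.DictWriter(
        f,
        fieldnames=["campaign_name", "impressions", "clicks", "spend"],
        extrasaction="ignore",  # the API may return extra keys per row
    )
    writer.writeheader()
    writer.writerows(rows)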
Data science is a growing field of technology in present context. There have been notable applications of data science in electronic engineering, nanotechnology, mechanical engineering and artificial intelligence. What kind of future scopes available for data science at civil engineering aspects in the field of structural analysis, structural design, geotechnical engineering, hydrological engineering, environmental engineering and sustainable engineering?
In recent years, data science has emerged as a promising interdisciplinary subject and has helped understand and analyze actual phenomena with data in multiple areas. The availability and interpretation of large data sets, a vital tool for many businesses and companies, has changed business models and led to the creation of new data-driven businesses.
In agriculture, including crop improvement programs, both short- and long-term experiments are conducted, and big data is generated. However, deep data mining, meaningful interpretation, and deeper extraction of knowledge and learning from these data sets are often missing. Is the application of data science also vital in agriculture, including crop improvement, for understanding and analyzing the actual phenomena and extracting deeper knowledge?
Scientific articles published in esteemed newspapers hold much information that can be of great value, often including biodiversity documentation, sporadic incidents, demographic data and more, yet they are seldom cited and eventually consigned to oblivion.
Do you think that these data, at least those that are relevant and authentic, should be cited, and that a proper archive and citation protocol should be constituted?
I am looking to start my Master's dissertation in Data Science, with a five-month time frame. What interesting topics would you suggest, ones for which I could get a dataset from an open source and analyse it?
How can the shipping and logistics industry be improved using data science and statistical analysis methods?
I am looking for a journal where I can publish my research paper. As a beginner, it's very difficult for me to publish in a Q1 or Q2 journal. That's why I am looking for a Scopus-indexed journal where I can more readily publish my paper. My research direction, for the time being, is Data Science, Machine Learning, or Deep Learning.
1. What are high-quality research monographs (or books) on artificial intelligence, data science, or big data analytics? I hope to get 3 recommendations for each.
2. What are the important technologies or techniques developed in the past 10 years (up to 2021)? I hope to learn about 5 of them. Please do not mention deep learning.
Thank you
2022-1-12
How can a scientist learn about data science and modelling applicable to solving a wide variety of problems? If possible, please share contacts of places where one could do that, even as a visiting researcher.
Dear community, I need some sources for a data science or machine learning project related to analyzing Google Analytics and Facebook business data. Your help is appreciated.
Actually, I am new to research and don't know where to start; I don't even know the problem statement yet because I don't know the field well. I want to study some of the world's largest empires, like the Mongol, Ottoman and British empires, and measure the extent of their cultural impact (food, religion, dress, norms, etc.) on different countries, such as Afghanistan, Pakistan, China, or any country in Asia or Europe. This is social science research, but I want to frame it in data science terms. Can anyone tell me where to start, what I will have to do, and where I can get a dataset? I have read some papers but learned little about the models involved, so please give me a brief answer if you can.
A ground truth map is essential for supervised classification of a hyperspectral image cube. However, data resources are limited; up till now, almost all the papers I've read use Indian Pines, Washington D.C. Mall, Salinas, or Pavia University. It seems all we have are these four or five data cubes. With the development of state-of-the-art algorithms for hyperspectral image classification, the data should be updated too. I hope somebody can provide me with new data with ground truth maps, or recommend a website where I can download some.
If I have experience but no certificate in machine learning, and have both experience and a certificate in medicine (urology), and I want to publish a paper involving an interaction between these two fields in either urology or CS journals: should I add a coauthor who is certified in data science or CS? If not, would it be necessary to prove my competency in ML by some other means?
Which free software for text analytics tasks (such as text mining and sentiment analysis) can be recommended for social science researchers and students who do not have much background in programming or data science?
I understand that R could be the obvious answer for many, given its capabilities; I am specifically looking to shortlist 4 to 5 GUI/point-and-click options which can be recommended to early researchers and postgraduate students in the social sciences, especially Psychology.
I have experimented with KNIME and Orange, but won't certify them as 'friendly enough'. This could be because I did not spend enough time on them, though.
I have recently come across many online forums recommending Kaggle for people wanting to get into machine learning or data science. However, I would like some scholarly opinions on this website: https://www.kaggle.com/
Currently, artificial intelligence/machine learning (AI/ML) and other technologies (mathematics and statistics) have been introduced into various engineering fields for data science research. We all know that in the past few decades, civil/mining and other fields have accumulated a large amount of data in the laboratory or on site, but because these data are directly or indirectly related to projects, enterprises, or state secrets, they are highly sensitive and private. In other words, this is a matter of data sensitivity and security. So, with the rapid development of AI/ML, how do we make the trade-off between the scientific aims of data science and data security?
Dear Colleagues,
If you are a researcher who is studying, or has already published on, Industry 4.0 or digital transformation, what is the hottest issue in this field for you?
Your answers will guide us in linking the perceptions of experts with bibliometric analysis results.
Thanks in advance for your contribution.
The main goal behind this question is to extract best practices on how to set up a Data Science / AI project.
Would anyone be generous enough to share the set of tools, or even a boilerplate, that may help or guide anyone working in the field and applying data science and AI to any domain?
I have a dataset X filled with 4x4 matrices that I am generating through a complex model. However, the values of this dataset are slightly off due to some unknown factor that is being taken into consideration by the software that I am using for the generation. I know this because I have another dataset y filled with 4x4 matrices that contain the actual values. I decided to build a ML model that can predict the right output by using X and y to train the model.
At first, I decided to use the sequential neural network API from Keras to create the model.
import pandas as pd
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from tensorflow import keras
from tensorflow.keras import layers

# X and y are pandas DataFrames holding the flattened matrices.
## Normalize with MinMaxScaler ##
# Note: fitting the scalers on the full data before the split leaks
# test-set information; fitting on X_train only would be safer.
x_scaler = preprocessing.MinMaxScaler()
X = pd.DataFrame(x_scaler.fit_transform(X), columns=X.columns)
# A separate scaler for y keeps x_scaler intact, so predictions can later
# be mapped back to the original scale with y_scaler.inverse_transform().
y_scaler = preprocessing.MinMaxScaler()
y = pd.DataFrame(y_scaler.fit_transform(y), columns=y.columns)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, shuffle=True)

model = keras.Sequential()
model.add(layers.Dense(12, input_dim=12, activation='linear', name='layer_1'))
model.add(layers.Dense(100, activation='linear', name='layer_2'))
model.add(layers.Dense(150, activation='linear', name='layer_3'))
model.add(layers.Dense(100, activation='linear', name='layer_4'))
model.add(layers.Dense(12, activation='linear', name='output_layer'))
# With purely linear activations, the stack collapses to one linear map;
# a nonlinearity (e.g. activation='relu') in the hidden layers adds capacity.
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X_train, y_train, epochs=5, shuffle=True, verbose=2)
In this case, I used neural networks because that is really all I have ever worked with, but I have a feeling that might not have been the correct option. What do you think?
Hi,
I am looking for some article/book/clip about performing and interpreting Network Analysis using R in psychology studies. I would be grateful if you could help me in this matter.
Thank you
The accuracy of my model (an MLP) decreased when I removed duplicated features (two columns) from the dataset. Should I remove them or keep them? The accuracy was supposed to improve; any idea why this happened?
Thank you.
- If I wanted to build up a career in data science, which one should I focus on, statistics concepts or technical skills in Python or R?
- Is it true that SQL is the most important skill for a data scientist?
- Suggest some papers which might be helpful for research in data science.
Thank you!
For example, AI can automate processes in the initial interpretation of images and shift the clinical workflow of radiographic detection and management decisions. But what are the clinical challenges for its application?
#AI #cancer #clinical #oncology #biomedical
Dear community, in order to analyze a competitor's website and its features, I want to extract some data from the website. It's the first time I've heard about web scraping; can you guide me, please, on how to do it using Python for competitive intelligence?
Thank you.
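As a starting point, a minimal scraping loop with requests and BeautifulSoup might look like the following; the URL and CSS selector are placeholders, and the target site's robots.txt and terms of service should be checked before scraping.
import requests
from bs4 import BeautifulSoup

url = "https://example.com/products"        # placeholder URL
html = requests.get(url, timeout=30).text
soup = BeautifulSoup(html, "html.parser")

# Extract text from elements matching a (hypothetical) CSS selector.
for node in soup.select("div.product-name"):
    print(node.get_text(strip=True))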
I almost asked this as a "technical question" but I think it's more of a discussion topic. Let me describe where I get lost in this discussion, and what I'm seeing in practice. For context, I'm a quantitative social scientist, not "real statistician" nor a data scientist per se. I know how to run and interpret model results, and a little statistical theory, but I wouldn't call myself anything more than an applied statistician. So take that into account as you read. The differences between "prediction-based" modeling goals and "inference-based" modeling goals are just starting to crystalize for me, and I think my background is more from the "inference school" (though I wouldn't have thought to call it that until recently). By that I mean, I'm used to doing theoretically-derived regression models that include terms that can be substantively interpreted. We're interested in the regression coefficient or odds ratio more than the overall fit of the model. We want the results to make sense with respect to theory and hypotheses, and provide insight into the data generating (i.e., social/psychological/operational) process. Maybe this is a false dichotomy for some folks, but it's one I've seen in data science intro materials.
The scenario: This has happened to me a few times in the last few years. We're planning a regression analysis project and a younger, sharper, data-science-trained statistician or researcher suggests that we set aside 20% (or some fraction like that) of the full sample as test sample, develop the model on our test sample, and then validate the model on the remaining 80% (validation).
Why I don't get this (or at least struggle with it): My first struggle point is a conceptual/theoretical one. If you use a random subset of your data, shouldn't you get the same results on that data as you would with the whole data (in expectation) for the same reason you would with a random sample from anything? By that I mean, you'd have larger variances and some "significant" results won't be significant due to sample size of course, but shouldn't any "point estimates" (e.g., regression coefficients) be the same since it's a random subset? In other words, shouldn't we see all the same relationships between variables (ignoring significance)? If the modeling is using significance as input to model steps (e.g., decision trees), that could certainly lead to a different final model. But if you're just running a basic regression, why would anyone do this?
There are also some times when a test sample just isn't practical (i.e., a data set of 200 cases). And sometimes it's impractical because there just isn't time to do it. Let's set those aside for the discussion.
Despite my struggles, there are some scenarios where the "test sample" approach makes sense to me. On a recent project we were developing relatively complex models, including machine learning models, and our goal was best prediction across methods. We wanted to choose which model predicted the outcome best. So we used the "test and validate" approach. But I've never used it on a theory/problem-driven study where we're interested in testing hypotheses and interpreting effect sizes (even when I've had tens of thousands of cases in my data file). It always just seems like a step that gets in the way. FWIW, I've been discussing this technique in terms of data science, but I first learned about it when learning factor analysis and latent variable models. The commonality is how "model-heavy" these methods are relative to other kinds of statistical analysis.
So...am I missing something? Being naive? Just old-fashioned?
If I could phrase this as a question, it's "Why should I use test and validation samples in my regression analyses? And please answer with more than 'you might get different results on the two samples' " :)
Thanks! Looking forward to your insights and perspective. Open to enlightenment! :)
Dear community, I have a column in my dataset containing IP addresses. I'm trying to do some exploratory data analysis on my dataframe, but I don't know how to explore or describe the IP address column. Should I delete it?
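Rather than deleting the column outright, one option is to derive coarser features from it. A small sketch, where the column name "ip" is a placeholder and the /24-prefix trick assumes IPv4 addresses:
import ipaddress
import pandas as pd

df = pd.DataFrame({"ip": ["192.168.0.1", "8.8.8.8", "10.0.0.7", "8.8.4.4"]})

addrs = df["ip"].map(ipaddress.ip_address)
df["is_private"] = addrs.map(lambda a: a.is_private)      # internal traffic?
df["subnet_24"] = df["ip"].str.rsplit(".", n=1).str[0]    # /24 prefix (IPv4)

print(df["is_private"].value_counts())
print(df["subnet_24"].value_counts().head())   # heavy-hitter subnets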
Actually, I am doing research in data science, where I want to explore new classifiers based on attention mechanisms and transformers for classification. I have read a lot of material on the subject. Can anybody please suggest some recent research papers on transformer models for classification?
All,
I need your suggestions/ideas on M.Tech dissertation topics in Data Science in NMS/EMS, preferably in the telecommunications domain. Thanks in advance.
While reading a research paper, I found that to detect abnormality in a signal the authors were using a sliding window, and in each window they divided the mean^2 by the variance. After searching on the internet I found a term called the Fano factor, but it was the inverse of that. Could anyone please give an intuitive idea behind the quantity mean^2/variance?
Please, if you are a data scientist or have first-hand experience, write about your own experience here; please do not reply by sending links to other pages.
Nowadays, the majority of data scientists dealing with soil, water, environment, or climate subjects tend to use one of two programming languages: R and Python. Personally, I am aware that each has its own advantages and disadvantages. I want you to share your ideas on the following questions:
- When dealing with a data science problem, which one do you feel more comfortable with?
- In your work, which one has given you fewer errors?
- Generally speaking, which language is more efficient?
- Which language provides better and newer updates for its libraries and packages?
- If you were to teach one of them to your future students, which would be your preference?
- Which one will become extinct or less important in the future?
Sincerely
Data Science has evolved as a new branch of knowledge. Some people define it as applied statistics, whereas others define it as an umbrella under which several existing interrelated branches have been grouped together. Being a recent topic, we may find edited volumes and, obviously, several papers. But do we have standard fundamental books on this topic? This is essential because, when we frame a syllabus, it is mandatory to provide some books as texts which cover most of the contents of the syllabus.
I would like single-source textbooks, or multiple textbooks, covering the entire branch of knowledge.
How do we calculate data processing time, response time, and cost in the CloudAnalyst simulator?
I have protein data consisting of Degree, Closeness, Betweenness, and Eigenvector centrality values in an Excel sheet. Now my assignment is to make a prediction, with an accuracy value, using a machine learning approach. I'm not an expert in programming; I have downloaded Anaconda and R and don't know which will be best.
- Can anybody please guide me on where to get the algorithms, or provide me with some materials for doing this?
- Can anybody please tell me how to define training data and test data from my whole data set?
- What type of classifier will be best for getting high accuracy?
- Step-by-step instructions would be easiest to understand.
A big, big thank you in advance.
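As a rough step-by-step sketch under stated assumptions (the Excel file name and the target column "label" are placeholders for whatever class is being predicted), a scikit-learn workflow in Python could look like this:
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# 1. Load the sheet (reading .xlsx files requires the openpyxl package).
df = pd.read_excel("protein_metrics.xlsx")

# 2. Separate the centrality features from the class to predict.
X = df[["Degree", "Closeness", "Betweenness", "Eigenvector"]]
y = df["label"]

# 3. Hold out part of the data for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

# 4. Train a classifier and report accuracy on the held-out test set.
clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, clf.predict(X_test)))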
How valuable is it to master Excel VBA in 2021, when most fields are dominated by data science and Python?
Thinking about online and MOOC-type certificates for R programming and data analysis, are there any that are recognized and respected by potential graduate schools and employers?
I guess if people can recommend those for SAS or Python, that would be useful as well.
I am learning ML and data science for crunching financial market data for my trading. I want to build a terminal that takes live data from the NSE and performs certain tasks (calculations, graph representations, running ML models on the data), but I don't know whether this is possible with ML and Python alone, or whether I have to go down the whole software development road. Please help me figure out what I need to learn and how to do this.
Respectfully
Hi,
I saw that some public administration study programs include the following. It would be interesting to learn what the course is about.
I would like to dive into the research domain of explainable AI. What are some of the recent trending methodologies in this domain? What can be a good start to dive into this field?
Some people in Academia criticize those who use neural networks for using a tool without knowing what is happening inside. While I agree (with academics like Cynthia Rudin) that we should benchmark these models against white box models and statistical methods, I think this opinion against black boxes has its origin in conflating data science with pure sciences.
In my opinion data science is not a science (for example in Popperian or Kuhnian sense).
Although data science is highly mathematical, it shares methodologies with the humanities and social sciences, where a declaration of limitations is usually applied rather than attempts to reach "objective" results as in the pure sciences.
A neural network is an instrument, like a radio, for capturing a pattern of particular complexity from a sample of data. The main engineering task at the application level is to engineer the sample (dataset); the model is a mere tool.
I think a well-formulated philosophy of deep learning is needed.
I would like to know how CNNs differ from ANNs and deep learning in general; I want to know the actual concept behind each of them.
I am looking for appropriate books or platforms to learn data science using python from a basic level. I would be happy to get some genuine responses. Thanks.
Hello everyone
One of the obstacles for me in an ML project is the data cleaning phase. I am fairly good at model implementation, but I have major issues with data preprocessing. Could someone give me some advice?
Thanks
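As general advice, much of preprocessing reduces to a repeatable checklist. A hedged pandas sketch follows; the file names and all thresholds are placeholders, and which steps apply depends entirely on the dataset:
import pandas as pd

df = pd.read_csv("raw.csv")                        # placeholder file

df = df.drop_duplicates()                          # exact duplicate rows
df = df.dropna(axis=1, thresh=int(0.5 * len(df)))  # drop mostly-empty columns
num = df.select_dtypes("number").columns
df[num] = df[num].fillna(df[num].median())         # impute numeric gaps

# Clip extreme numeric outliers to the 1st/99th percentiles.
low, high = df[num].quantile(0.01), df[num].quantile(0.99)
df[num] = df[num].clip(low, high, axis=1)

df = pd.get_dummies(df, drop_first=True)           # encode categoricals
df.to_csv("clean.csv", index=False)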
For example, can I import two different datasets, one for training the model and the other for testing it? In the classifier, can I use X_train, y_train from the training dataset and X_test, y_test from the testing dataset? Is it legitimate to do this?
Does it affect model accuracy or not?
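It is legitimate provided both files share the same schema and any preprocessing is fitted on the training file only, so no information leaks from the test file; accuracy measured this way is usually a more honest estimate of generalization, not a trick to inflate it. A minimal sketch with placeholder file and column names:
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

train = pd.read_csv("train.csv")   # placeholder files with identical columns
test = pd.read_csv("test.csv")

X_train, y_train = train.drop(columns="target"), train["target"]
X_test, y_test = test.drop(columns="target"), test["target"]

scaler = StandardScaler().fit(X_train)      # fit on training data only
clf = LogisticRegression(max_iter=1000).fit(scaler.transform(X_train), y_train)
print("accuracy on the separate test set:",
      clf.score(scaler.transform(X_test), y_test))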
Thinking that someone will hand you a plate with the problem to be solved, and that you will play around with some model and be paid as a scientist (YouTube is full of such tutorial videos), is not realistic; that kind of task will be done by "Artificial Intelligence" itself, automatically. Being a scientist means formulating a new problem that is not yet even seen as a problem, analyzing it, convincing others of its potential benefit, and then showing how it can be solved.
What could be a research topic for the application of data science in the finance sector?
Hi All
I'm looking for some papers on self-driving cars that discuss current capabilities and work: how much data processing these cars do, and what steps and techniques are applied to improve them.
One of the common problems in data science is gathering data from various sources in a somewhat cleaned (semi-structured) format and combining metrics from various sources for higher-level analysis. Looking at other people's efforts, especially other questions on this site, it appears that many people in this field are doing somewhat repetitive work. For example, analyzing tweets, Facebook posts, Wikipedia articles, etc. is part of a lot of big data problems.
Some of these data sets are accessible through public APIs provided by the source site, but usually some valuable information or metrics are missing from these APIs, and everyone has to do the same analyses again and again. For example, although clustering users may depend on the use case and selection of features, having a base clustering of Twitter/Facebook users would be useful in many big data applications, yet it is neither provided by the APIs nor available publicly in independent data sets.
Is there any index or publicly available data set hosting site containing valuable data sets that can be reused in solving other big data problems? I mean something like GitHub (or a group of sites/public datasets or at least a comprehensive listing) for the data science. If not, what are the reasons for not having such a platform for data science? The commercial value of data, need to frequently update data sets, ...? Can we not have an open-source model for sharing data sets devised for data scientists?
How can network data science address the social problem of intersectional inequality?
Kindly give your valuable suggestions and help me find some literature on it.
I need a dataset with world or country macroeconomic indicators and/or global indexes, yearly, to build a machine learning predictive model. Who can help? Thanks a lot.
Hello everyone
Based on my studies, I have found that it is not possible for all researchers to reproduce the same conditions for producing samples. Therefore, each researcher, reporting their working conditions and effective parameters, tries to produce a sample and perform a series of experiments to extract response data based on their experimental design. The issue with such reports arises when one intends to study and then compare the results: because several parameters differ at once between studies, comparisons are not possible. My question is: is there a general way to normalize response data based on multiple independent parameters?
I'm working on flood prediction using machine learning for my master's degree in data science. I would like to know the basic process for finding the data for this, and what models I can use to accomplish this project.
It would also be great to know some of the gaps in the flood prediction models developed so far by other researchers.
I am trying to build an ML model for oil price prediction. What important factors do you recommend using to predict big price jumps (volatility), such as those in 2008 and 2011-2012? Thanks a lot.
How can machine learning be helpful for public health and social science? Please mention some prospective research areas.
I want to develop a curriculum that includes a consortium of subjects like AI, Data Analytics, Machine Learning, Evolutionary Optimization techniques and the like. Data Science is one among them.
Predicting human behavior is challenging. When using machine learning algorithms in human behavior research, the NRMSE will be higher than in areas where measurement is more accurate (NRMSE < 0.1), but what is the acceptable range?
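Since the acceptable range depends on how the RMSE is normalized, it helps to state the normalizer explicitly when comparing thresholds across studies. A small sketch showing three common choices (the example numbers are made up):
import numpy as np

def nrmse(y_true, y_pred, norm="range"):
    # RMSE normalized by the range, mean or std of the observations;
    # a threshold such as 0.1 is only comparable under the same choice.
    rmse = np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))
    denom = {"range": np.ptp(y_true),
             "mean": np.mean(y_true),
             "std": np.std(y_true)}[norm]
    return rmse / denom

y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.8, 5.4, 2.9, 6.6]
print(nrmse(y_true, y_pred))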
Some models, like NIMA, evaluate photos aesthetically, but I believe these evaluations are based more on the basics of photography, such as whether photos are in focus and so on. I need to analyze more sophisticated photos.