Science topic
Big Data - Science topic
In information technology, big data is a loosely-defined term used to describe data sets so large and complex that they become awkward to work with using on-hand database management tools.
Questions related to Big Data
If I have multiple scenarios with multiple variables changing, and I want to conduct a full factorial analysis, how do I graphically show the results?
Hello, respected researchers,
I will start my master's in September 2022, and I am really confused about my major topic. There are two options for my future research work: Big Data or Computer Vision. I don't know which one I should take as a beginner. Please help me decide so that I can start my studies and choose a research direction. I look forward to hearing from you.
Thank you.
Mahfil
then the time column is showing as numbers rather than dates or time steps. Please guide me on how I can convert those numbers into time steps or dates. For example, when I check the time column, 01-04-1959 08:00 AM is displayed as 519344.
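A minimal sketch in R, assuming these numbers are hours elapsed since 1900-01-01 00:00 (519344 hours is 21639 days plus 8 hours, which lands exactly on 01-04-1959 08:00); the data frame and column names below are placeholders:
R:
df <- data.frame(time_hours = c(519344, 519345, 519346), value = c(1.2, 1.5, 1.1))  # placeholder data
origin <- as.POSIXct("1900-01-01 00:00:00", tz = "UTC")
df$timestamp <- origin + df$time_hours * 3600   # hours converted to seconds, added to the origin
print(df$timestamp)   # the first element should print as "1959-04-01 08:00:00 UTC"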
Edge computing is a research hotspot, but I cannot find any open data set for edge computing. Does anybody know of any big data set available in the literature for edge computing?
I'm working on an update to our previous global geochemical database. At the moment, it contains a little over one million geochemical analyses. It contains some basic geochronology data, crystallization dates for igneous rocks and depositional dates for sedimentary rocks. The database differs from GEOROC and EarthChem, in that it includes some interpretive metadata and estimates of geophysical properties derived from the bulk chemistry. I'd like to expand these capabilities going forward.
What would you like to see added or improved?
Here's a link to the previous version:
Big Data, personalization of e-learning, learner behavior, machine learning.
Electrochemical impedance spectroscopy (EIS) has attracted more and more attention in recent years. However, due to limited experimental conditions, data in this area are very scarce. I would appreciate it if anyone could share papers or available data about EIS.
Dear users.
I need to construct plots in Excel and R. I have 2000 data points, and the values are large numbers (millions, because this is profit).
Could you please tell me how to do this in Excel and R so that the years 2000-2005 appear on the horizontal axis?
Excel: =SERIES(;'63'!$A$1:$A$204;'63'!$D$1:$D$346;1)
R:
data <- read.table(file = "2.txt", header = TRUE)  # read the delimited text file
head(data)                                         # inspect the first rows
plot(data, type = "l", col = "red")                # line plot of the columns in data
This doesn't work correctly.
Thank you very much
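A minimal sketch of one way this might look in R, assuming the file has a year column and a profit column; the column names and simulated values below are placeholders for the actual data:
R:
data <- data.frame(year = 2000:2005,
                   profit = c(1.2e6, 1.8e6, 2.1e6, 1.9e6, 2.6e6, 3.0e6))   # placeholder values
plot(data$year, data$profit / 1e6, type = "l", col = "red",
     xlab = "Year", ylab = "Profit (millions)", xaxt = "n")   # suppress the default x-axis
axis(1, at = data$year)   # draw one tick per year, 2000-2005, on the horizontal axis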
My aim is to use six classifiers to test various ML tools and generate a model for each of them from the raw data (i.e., apply big data analytic tools to the data set).
All the conclusions of personalized medicine rely on AI applications applied to enormous masses of biomedical information. Molecular data play a crucial role in obtaining metabolic models to be used for patient analysis. Data relating to proteins and their functions almost always concern protein forms that have undergone PTMs (post-translational modifications). We are speaking about 100,000 PTMs or so for about 20,000-25,000 protein-coding genes. These numbers point to an estimate of around 6 million protein species in humans, that is, the human proteome. Obviously, they are not all present at the same time but perform their functions in different spatiotemporal contexts.
PTMs change a protein's structure and its chemical-physical characteristics, and make new functions with specific molecular partners possible. The response of the modified protein to the environment also changes, because we are dealing with a new molecular form with new properties, in a nutshell, with a new molecule. From the number and types of potential PTM sites on a protein, it is possible to calculate how many molecular forms a single protein can produce. For example, 4 phosphorylation sites on a protein are enough to yield 15 distinct combinations, i.e. 15 different molecular forms. In the cell, each molecular form is generated by the specific spatio-temporal context in which it occurs, because only in that cellular context can it exist with its specific functional role. So, when we want to analyze a molecular form experimentally, we should simulate as far as possible the metabolic context in which we think that function takes place, or, for in vivo studies, we should extract and purify the protein from the tissue. Without context, we obtain inappropriate results on the molecular form, because it is not identifiable in space and time. The context should therefore be explicitly reported in papers. Unfortunately, this information is very rarely given. What commonly happens is that these data without spatio-temporal context flow into the databases and are used for network analysis, where we find them all collapsed onto the native protein. This generates static metabolic models, and most of the analyses are therefore flawed, with the possibility that the models used for personalized medicine may be wrong, with possible harm to patients. Another problem then arises: how do we eliminate these errors from biomedical Big Data systems? A fundamental rule of Big Data systems is that, in order to have reliable results, the data must be characterized by a high index of veracity. Today, this is not the case.
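A minimal sketch of the combinatorics mentioned above, assuming the PTM sites are independent binary (occupied/unoccupied) positions; the enumeration is illustrative only:
R:
n_sites <- 4
n_modified_forms <- 2^n_sites - 1   # 15 distinct modified forms (plus the unmodified protein)
print(n_modified_forms)
# enumerate the 15 non-empty combinations of occupied phosphorylation sites
combos <- unlist(lapply(1:n_sites, function(k) combn(n_sites, k, simplify = FALSE)),
                 recursive = FALSE)
length(combos)   # 15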
What do supporters of personalized medicine think about this?
We are working on a collaboration project between Babeș-Bolyai University in Cluj-Napoca and KPMG Romania, aiming to observe how the accounting profession is being transformed by technological advancements.
Technologies under assessment are: Cloud Computing, Robotic Process Automation (RPA), Big Data and Data Analytics, Machine Learning (ML) and Artificial Intelligence (AI).
The questionnaire takes around 5 minutes and is anonymous. Your input would benefit us greatly.
Thank you for your time!
How can data collected using IoT (Big Data) be connected to a neural network? Example: a patient's blood pressure is measured for 24 hours using a portable Holter monitor. The data are transmitted via a sensor to a server as Big Data. How can these data be transferred to the neural network?
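A minimal sketch, assuming the hourly readings pulled from the server are already available in R; the simulated readings, labels and the small nnet model below are illustrative assumptions only, not a clinical recommendation:
R:
library(nnet)
set.seed(1)
x <- matrix(rnorm(100 * 24, mean = 120, sd = 10), nrow = 100, ncol = 24)   # 100 patients x 24 hourly readings (simulated)
colnames(x) <- paste0("h", 1:24)
y <- factor(sample(c("normal", "hypertensive"), 100, replace = TRUE))      # placeholder labels
d <- data.frame(y, x)
fit <- nnet(y ~ ., data = d, size = 5, maxit = 200)   # small feed-forward neural network
pred <- predict(fit, d, type = "class")               # predicted class per patient
table(pred, y)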
Modern politics is characterized by many aspects that were not associated with traditional politics, and big data is one of them. Data mining is being done by political parties as they seek help from data scientists to identify patterns in voter behavior. The question is: in what various ways is big data being used by modern political parties and leaders?
Over the last few months, I have come across several posts on social media where scientists, researchers and even universities are flaunting their ranking according to the AD Scientific Index https://www.adscientificindex.com/.
When I clicked on the website, I was surprised to discover that they charge a fee (~24-30 USD) to add an individual researcher's information.
So I started wondering whether this is another 'predatory' ranking scam.
What's your opinion in this regard?
Hello everyone,
Could you please recommend an alternative to IDC for obtaining records on the global datasphere for free?
Thank you for your attention and valuable support.
Regards,
Cecilia-Irene Loeza-Mejía
Hello,
I'm a master's degree student and I am struggling to find a good thesis topic. I would really appreciate it if you could help me.
As you know, biosystems engineering is a major in which I can work on both the mechanical engineering side and the electrical/computer engineering side. Personally, I am interested in precision agriculture (electrical/computer side) and have academic experience implementing computer vision models (generally deep learning), analyzing and modeling big data (generally machine learning), and deploying IoT applications.
Thank you for your time.
Dear All,
I would like to ask whether it is possible to obtain data from databases or websites about sexual behavior in different countries of Europe or the world. Thank you!
Best regards
Stefan
I am currently conducting a study on the effects of adopting IoT and big data technologies in a manufacturing facility. I'm trying to get hold of data concerning the change in plant capacity, maintenance costs, and OEE. I am aware that there are previous case studies on the matter, but I am trying to quantify the change using real data. Does anybody know where I can find production data from a manufacturing facility that I can use?
I have 2 separate data sets that I need to combine, time point 1 and time point 2.
Not all participants in time point 1 are in time point 2 (i.e., attrition, etc.), so I need to know how to match participants and keep the duplicates; too many instructional videos only tell you how to remove duplicates.
Next, I need the time points to be stacked on top of each other, meaning time 1 above time 2, not next to each other. Too many videos show how to join data sets side by side (e.g., dplyr left join, right join); it is hard to find ones that teach you how to stack them. I want the data structure to be suitable for longitudinal analysis, or at least some form of repeated measures, where joining data sets to the left or right may not work.
Please help, either in R or Excel!
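A minimal sketch in R, assuming each time point has an id column and the same measures; the data frames and column names (t1, t2, id, score) are placeholders:
R:
library(dplyr)
t1 <- data.frame(id = c(1, 2, 3), score = c(10, 12, 9))   # time point 1
t2 <- data.frame(id = c(1, 3), score = c(11, 8))          # time point 2 (participant 2 dropped out)
long <- bind_rows(mutate(t1, time = 1),
                  mutate(t2, time = 2)) %>%
  arrange(id, time)   # time 1 rows sit above time 2 rows for each participant
long   # long format suitable for repeated-measures / longitudinal analysis
Matching across waves is handled by the shared id column; participants missing from time point 2 simply have no time-2 row.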
Hello everyone. I have a question about obtaining data from the Internet.
In my research I will analyze comments from websites and social media platforms, and I am searching for applications, technologies or other tools to download comments from the Internet to my computer.
Do you know any tools/apps to download comments for free?
There are around 10,000 comments, and copying and pasting them one by one would take a lot of time. I want to obtain the data quickly.
Do you have any suggestions for me?
Thank you so much for your help.
Regards, Nejc
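A minimal sketch using the rvest package, assuming the comments sit in plain HTML; the URL and the ".comment" CSS selector are placeholders you would replace after inspecting the page, and many social media platforms require their official APIs or export tools instead of scraping:
R:
library(rvest)
page <- read_html("https://example.com/article-with-comments")   # placeholder URL
comments <- page %>%
  html_elements(".comment") %>%   # placeholder CSS selector for the comment blocks
  html_text2()                    # extract and tidy the text
writeLines(comments, "comments.txt")   # save one comment per line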
Good day,
I am looking into the applications of, and risks associated with, big data in South Africa, specifically in terms of groundwater resource management.
I am looking into transboundary resource management, but I am also curious about its impact on, for example, detecting illegal abstraction, resource depletion resulting from it, and contamination.
Please share your thoughts or relevant articles.
Regards,
Cindy
The current technological revolution, known as Industry 4.0, is driven by the development of the following advanced information-processing technologies:
Big Data database technologies, cloud computing, machine learning, the Internet of Things, artificial intelligence, Business Intelligence and other advanced data mining technologies.
Which of these technologies are, or will in the future be, applied in the education process?
Please reply
Best wishes
Can anyone suggest ensembling methods for the outputs of pre-trained models? Suppose there is a dataset containing cats and dogs, and three pre-trained models are applied, i.e., VGG16, VGG19, and ResNet50. How would you apply ensembling techniques such as bagging, boosting, or voting?
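A minimal sketch of soft voting (probability averaging), one common way to combine fixed pre-trained models; the probability matrices below are simulated placeholders standing in for the class probabilities that VGG16, VGG19 and ResNet50 would output for the same test images:
R:
set.seed(1)
classes <- c("cat", "dog")
n <- 5                                                    # number of test images
rand_probs <- function(n) { p <- runif(n); cbind(cat = p, dog = 1 - p) }
p_vgg16  <- rand_probs(n)                                 # placeholder for VGG16 probabilities
p_vgg19  <- rand_probs(n)                                 # placeholder for VGG19 probabilities
p_resnet <- rand_probs(n)                                 # placeholder for ResNet50 probabilities
p_ensemble <- (p_vgg16 + p_vgg19 + p_resnet) / 3          # soft voting: average the probabilities
pred <- classes[max.col(p_ensemble)]                      # class with the highest averaged probability
print(pred)
Hard (majority) voting would instead take the most frequent per-model predicted label; weighted averaging or stacking (training a meta-classifier on the three outputs) are common refinements.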
Good day, all. I just updated my R to 4.1.2 and then started having issues with read_tsv() and read_csv().
It seems readr 2.1.2 is not compatible with my R 4.1.2 setup: when I used read.csv (which is not a readr function), I was able to import the .csv file, but read_tsv() and read_csv(), which are both readr functions, kept giving me this error:
Error in app$vspace (new_style$`'margin-top' %||% 0) : attempt to apply non-function
Can anyone help with this, or suggest another function for importing a .tsv file apart from read_tsv() in the readr package?
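Updating readr and its dependencies (for example cli and vroom) may resolve the error; in the meantime, a minimal sketch of alternative readers, assuming the goal is simply to import a tab-separated file ("data.tsv" is a placeholder file name):
R:
df1 <- read.delim("data.tsv", sep = "\t", header = TRUE)   # base R, no extra packages
library(data.table)
df2 <- fread("data.tsv")                                   # data.table auto-detects the tab separator
library(vroom)
df3 <- vroom("data.tsv", delim = "\t")                     # vroom, the backend used by readr 2.x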
In my upcoming research on Big Data architecture, I'd like to make use of data from some of the best conferences I've attended (practice-led conferences, not academic ones).
What is the most rigorous methodology for capturing data from a video in academia?
Hi, I'm currently doing my master's at the University of Bradford, studying "Applied Artificial Intelligence and Data Analytics", and I am looking for capstone dissertation topics related to AI and data analytics. If anyone has a few suggestions for research topics, I would love to hear them.
Hi All,
What do you find is the best storage solution for big data, such as NGS sequences, that allows easy transfer back and forth to the university's HPC server?
Do you have your own local server in the group? Loads of hard drives? Google Cloud? Commercial clouds?
Thanks in advance!
Hey everyone, I'm working on a new project where I need to transfer Facebook ads campaign data for visualization in Tableau or Microsoft Power BI, and this job should run automatically daily, weekly, or monthly. I'm planning to use Python to build a data pipeline for this. Do you have any suggestions, resources I can read, or similar projects I can get inspired by? Thank you.
In CSE, what is the recent research on energy management systems using big data?
Big data is characterized by the 5 Vs. I am planning to focus on velocity (IoT sensor data), so could anyone suggest recent research on energy management systems with big data and IoT?
Research topic > master's degree > big data > medicine/health
Commercial banks are increasingly worried about competition from fintechs, including online technology companies that expand the range of financial and pre-financial services. Commercial banks are more and more actively using online banking technologies, building Business Intelligence data processing platforms, extending Big Data database systems, developing integrated risk management systems and conducting advertising campaigns on social media websites. In view of the above, large commercial banks have the opportunity to conduct sentiment analysis on data collected in Big Data database systems in order to analyze the expectations and opinions of Internet users regarding, for example, financial services. Information obtained from the Internet and processed in this manner can be used for more precise risk analysis, credit risk management, planning subsequent advertising campaigns, modifying the financial services offer in line with the changing expectations of Internet users, and searching for clients on social media portals. In this way, interdisciplinary analytical processes are also developed at commercial banks, for which information from social media portals is the source of data.
Do commercial banks have a chance to win in this matter in competition with the fintech technology companies operating on the Internet?
Besides, what is the effectiveness of online advertising campaigns run by commercial banks?
Please reply with your answers and comments.
I invite you to the discussion.
Does anybody know a solver for a large-scale sparse QP that works on the GPU?
Or, more generally, can a GPU speed up solvers for sparse QPs?
Phylogenetic analysis seems beyond my capacity right now; however, there is so much information in my dataset of seed traits, along with climate and phenology data. Any suggestion will be appreciated.
1. What are high-quality research monographs (or books) on artificial intelligence, data science or big data analytics? I hope to get three recommendations for each.
2. What are the important technologies or techniques developed in the past 10 years, up to 2021? I hope to learn about five of them. Please do not mention deep learning.
Thank you
2022-1-12
How can a scientist learn about data science and modelling applicable to solving a wide variety of problems? If possible, please share contacts of places where one could do that, even as a visiting researcher.
Dear community, I need some sources for a data science or machine learning project related to analyzing Google Analytics and Facebook business data. Your help is appreciated.
As is known, artificial intelligence, machine learning and deep learning methods are used today to produce meaningful information from big data. So how can the MCDM (multi-criteria decision-making) approach be integrated into such a structure? Is it sound to use MCDM with big data?
Software Experts: Ever wanted to write a book? Here's an opportunity close to it that you may not want to miss. Please see
for more details.
Hello everyone,
I am looking for links to scientific journals with dataset repositories.
Thank you for your attention and valuable support.
Regards,
Cecilia-Irene Loeza-Mejía
Big data is a new trend in the technology field; it has many applications in education, especially in analyzing students' performance when the teacher uses an LMS.
My question is about how we can make big data beneficial for mathematics education.
AI and Big Data have recently seen widespread application in virtually every field. With the economy's increasing digitization, it is expected that massive amounts of data will be generated at every node. I wonder whether primary-data-based research in consumer behavior, economics, agricultural economics, and related fields will become obsolete in the future as more sophisticated models aided by AI and Big Data provide a more accurate picture of various phenomena. Please share your thoughts on what the role of researchers in applied economics, business, marketing, etc. will be (not including those in the field of computer science).
I have devoted time to the pricing of "dataset" asset access, a form of licensing; you can read more here [1].
I have also addressed an approach to structuring data in conjunction with a structural graph of relationships (like the maritime and other routes between two harbours), where the notion of a set is leveraged for its flexibility (you aggregate what you want in a set; homogeneity is not required, unlike with vectors), and the matrix structure is used for showing relations between nodes i and j [2].
I am now exploring specific application fields, such as recommendations based on collaborative filtering from observed user behaviours towards items, and how to generalise Fuzzy Cognitive Maps (FCM), which overlay thin information on graphs, as matrices of sets do with thicker possibilities.
Another area I am exploring with the tool framework [2] is how to handle cases where information is heterogeneously available over time t and space x (for instance, lots of information in data set D(t1,x1) relative to (t1,x1), but maybe less at (t2,x2), (t1,x2) or (t2,x1)).
This can be expressed by denoting the data gathered/observed at time t and position x as D(t,x), whereas the complete data which would ideally describe the details of what is happening at (t,x) might be C(t,x), which contains the set D(t,x) but may be larger than it.
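A toy illustration in R (not the authors' implementation) of a matrix whose entries are sets: entry [i, j] holds the observed data set D(t_i, x_j), and empty cells mark (t, x) pairs with little or no data; the observation labels are placeholders:
R:
M <- matrix(vector("list", 4), nrow = 2, ncol = 2,
            dimnames = list(c("t1", "t2"), c("x1", "x2")))
M[["t1", "x1"]] <- c("obs_a", "obs_b", "obs_c")   # rich data at (t1, x1)
M[["t2", "x2"]] <- c("obs_d")                     # sparse data at (t2, x2)
lengths(M)   # how much information is available in each (t, x) cell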
Have you encountered cases similar to the ones mentioned above? Can you give details, maybe references?
Ref:
[1]
Conference Paper Towards Data Market Places: Nature of Data, Exchange Mechani...
[2]
Preprint Matrices of Sets and L-Sets
I have 4 varieties and 30 treatments with 3 levels; which type of graph is suitable to express the results?
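A minimal sketch in ggplot2, assuming the results are in a long-format data frame with variety, treatment, level and response columns; the simulated values are placeholders, and with 30 treatments a faceted grouped bar (or point-and-error-bar) chart is one workable option:
R:
library(ggplot2)
set.seed(1)
d <- expand.grid(variety = paste("Variety", 1:4),
                 treatment = paste0("T", 1:30),
                 level = factor(1:3))
d$response <- rnorm(nrow(d), mean = 50, sd = 10)   # placeholder measurements
ggplot(d, aes(x = treatment, y = response, fill = level)) +
  geom_col(position = "dodge") +                   # grouped bars per treatment and level
  facet_wrap(~ variety) +                          # one panel per variety
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, size = 6))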
What is the recent research on sustainable development using big data?
What will be the future applications of analytics of large data sets conducted in the computing cloud on computerized Business Intelligence analytical platforms in Big Data database systems in enterprise logistics management?
Analytics conducted on computerized Business Intelligence platforms is one of the key advanced information technologies of the fourth technological revolution, known as Industry 4.0. This revolution is driven by the development of the following advanced information-processing technologies: Big Data database technologies, cloud computing, machine learning, the Internet of Things, artificial intelligence, Business Intelligence and other advanced data mining technologies.
The analytics conducted on computerized Business Intelligence platforms currently supports business management processes, including logistics management.
In my opinion, the use of analytics of large data sets conducted in the computing cloud on computerized Business Intelligence analytical platforms in Big Data database systems in enterprise logistics management, including supply logistics, production logistics, provision of services and distribution of manufactured products and services, is currently growing.
Analytics conducted on large data sets in the computing cloud on computerized Business Intelligence platforms in Big Data database systems makes it particularly easy to identify opportunities for and threats to business development, and allows for the quick generation of analytical reports on selected issues concerning the economic and financial situation of a business entity. The generated reports can thus be helpful in enterprise logistics management processes, including supply logistics, production logistics, the provision of services and the distribution of manufactured products and services.
Do you agree with my opinion on this matter?
In view of the above, I am asking you the following question:
What will be the future applications of analytics of large data sets conducted in the computing cloud on computerized Business Intelligence analytical platforms in Big Data database systems in enterprise logistics management?
Please reply
I invite you to the discussion
The issues of the use of information contained in Big Data database systems for the purposes of conducting Business Intelligence analyses are described in the following publications:
I invite you to discussion and cooperation.
Best wishes
My essay is an attempt to answer the following question: « Is the data economy, then, destined to benefit only a few elite firms? » Apparently, that has been the issue so far. What tools are available to avoid this false target? In my essay on stochastic models, in particular the section « Handling human social technical dimension; in particular man-system interface including positioning technology at man services », you may find guidelines to produce these tools and make BIG DATA exploitable by the large majority of users:
1. The engine should trace "player" behaviour, evaluate its capabilities and quickly meet its needs.
2. The immersion generated by simulation enables training and experimentation with behaviour strategies, in particular learning "by doing".
3. The engine should use the following resources:
3.1. Tools to be customized by trainers.
3.2. Applied standards.
3.3. Discovery of new learning approaches through the results obtained, whether these approaches are positive or negative, in the sense of improving the technological performance of assembled prototypes.
4. How may SPDF (Standard Process Description Format) produce a universal engine to run the stochastic model?
4.1. SPDF consists of two parts:
4.1.1. A message structured-data part (including semantics), and
4.1.2. A process description part (with a higher level of semantics).
4.2. Two key outputs of the SPDF research will be a process description specification and a framework for the extraction of semantics from legacy systems.
4.2.3. Note that:
a) The more semantic rules we have, the better unpredictable events are controlled.
b) The knowledge required to elaborate semantic rules for unpredictable events requires many runs of the stochastic model.
c) Convergence shall not be reached until more qualitative semantic rules are obtained.
d) Dynamically performing a given scenario is the goal of the proposed messaging system.
The current technological revolution, known as Industry 4.0, is driven by the development of the following advanced information-processing technologies: Big Data database technologies, cloud computing, machine learning, the Internet of Things, artificial intelligence, Business Intelligence and other advanced data mining technologies.
In connection with the above, I would like to ask you:
Which information technologies of the current Industry 4.0 technological revolution support the enterprise management process to the greatest extent?
Please reply
Best wishes
Big Data Analytics (Undergraduate or Postgraduate Course)
It usually has Hadoop and Spark as its main outline, with Hive and HBase as examples of a distributed data warehouse and a distributed database.
Is anyone teaching this course with VM-, Docker- or Kubernetes-based labs for Hadoop and Spark, covering configurations from a single-node cluster to a multi-node cluster, and including example labs such as a distributed/parallel Word Count job run on multiple nodes?
Please share any resources / labs / tutorials. Thanks
#BigData #dataanalytics #BigDataAnalytics #Hadoop #Spark #kubernetes #Docker #virtualmachines #Hbase #Hive
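A minimal sketch of the classic Word Count example via sparklyr, assuming a local Spark installation and a plain-text input file ("input.txt" is a placeholder); the same job can be pointed at a multi-node cluster by changing the master argument:
R:
library(sparklyr)
library(dplyr)
sc <- spark_connect(master = "local")              # use the cluster URL instead of "local" on multi-node setups
lines <- spark_read_text(sc, name = "lines", path = "input.txt")   # one row per line, column "line"
word_counts <- lines %>%
  transmute(word = explode(split(line, " "))) %>%  # split each line into words, one word per row
  count(word, sort = TRUE)
collect(word_counts)                               # bring the aggregated counts back into R
spark_disconnect(sc)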
The goal of predictive analysis is to develop predictions for the development of complex, multifaceted processes in various fields of science, industry, the economy and other spheres of human activity. In addition, predictive analysis may refer to objectively occurring processes such as natural phenomena, climate change, and geological or cosmic processes.
Predictive analysis should be based on an analytical methodology that takes into account the most modern prognostic models available and the large amount of data necessary to perform the most accurate analysis. In this way, the result of the predictive analysis will be least exposed to the risk of analytical error, i.e. an incorrectly designed forecast.
Predictive analysis can be improved by using modern computerized information technologies, including cloud computing on large data sets stored in Big Data database systems. Business Intelligence analytics and other innovative information technologies typical of the current fourth technological revolution, known as Industry 4.0, can also be used in predictive analysis.
The current technological revolution known as Industry 4.0 is driven by the development of the following technologies:
Big Data database technologies, cloud computing, machine learning, the Internet of Things, artificial intelligence, Business Intelligence and other advanced data mining technologies. On the basis of these new technological solutions, innovatively organized analyses of large information sets stored in Big Data database systems and in the computing cloud have been developing dynamically in recent years for applications in areas such as machine learning, the Internet of Things, artificial intelligence and Business Intelligence.
To the abovementioned application examples one can add predictive analyses in other fields of application of advanced technologies for the analysis of large data sets, such as Medical Intelligence, Life Science, Green Energy, etc. Processing and multi-criteria analysis of large data sets in Big Data database systems is carried out according to the 4V concept, i.e. Volume (a large amount of data), Value (the value that can be extracted from the analyzed information), Velocity (the high speed at which new information arrives) and Variety (the high variety of information).
The advanced information processing and analysis technologies mentioned above are used more and more often to conduct predictive analyses concerning, for example, the marketing activities of various business entities that advertise their offer on the Internet, or to analyze the needs in this area reported by other entities, including companies, corporations, and financial and public institutions. More and more commercial business entities and financial institutions conduct marketing activities on the Internet, including on social media portals.
More and more public institutions and business entities, including companies, banks and other organizations, need to conduct multi-criteria analyses on large data sets downloaded from the Internet that describe the markets on which they operate, as well as the contractors and clients with whom they cooperate.
On the other hand, there are already specialized technology companies that offer this type of analytical service, including predictive analysis services, and that develop custom reports resulting from multi-criteria analyses of large data sets obtained from various websites and from entries and comments on social media portals, based on sentiment analyses of the content of Internet users' comments.
Do you agree with my opinion on this matter?
In view of the above, I am asking you the following question:
How can you improve the process of predictive analysis?
Please reply
I invite you to discussion and scientific cooperation
Dear Colleagues and Friends from RG
The key aspects and determinants of the applications of modern computerized information technologies for data processing in Big Data and Business Intelligence database systems for the purpose of conducting predictive analyses are described in the following publications:
I invite you to discussion and cooperation.
Best wishes
How to obtain currently necessary information from Big Data database systems for the needs of specific scientific research and necessary to carry out economic, business and other analyzes?
Of course, the right data is important for scientific research. However, in the present era of digitization of various categories of information and of creating libraries, databases and constantly expanding large data sets stored in database systems, data warehouses and Big Data database systems, it is important to develop techniques and tools for filtering these large data sets, so that, out of terabytes of data, only the information currently needed is extracted: for the purposes of scientific research in a given field of knowledge, for obtaining answers to a given research question, and for business needs, e.g. after connecting these databases to Business Intelligence analytical platforms. I described these issues in my scientific publications presented below.
Do you agree with my opinion on this matter?
In view of the above, I am asking you the following question:
How to obtain currently necessary information from Big Data database systems for the needs of specific scientific research and necessary to carry out economic, business and other analyzes?
Please reply
I invite you to the discussion
Thank you very much
Dear Colleagues and Friends from RG
The issues of the use of information contained in Big Data database systems for the purposes of conducting Business Intelligence analyses are described in the following publications:
I invite you to discussion and cooperation.
Best wishes
Sometimes advanced technologies like CT scans and MRI are available, but the competency to read the results is lacking. Using big data technology, is it possible to construct a "translation machine" or "scanner" that converts CT or MRI images into diagnostic statements that can be understood by ordinary people?
Hi, I am currently starting to work on my degree and am having trouble coming up with a good research topic. If possible, I would like to include IoT, AI and Big Data. Any suggestions?
Thank you very much.
Dear Sir/Madam,
I trust you are keeping safe.
My name is Martin, and I am a master's student at Nottingham Business School.
I need your help with a questionnaire survey related to the usage of big data analytics in the investment management (or asset management) industry.
Please find more details in the Participant Information Sheet (PIS) at link (1) below, and if you agree to participate, please find the survey at link (2) below.
(1) PIS link :
(2) Online questionnaire survey: https://nbsntu.eu.qualtrics.com/jfe/form/SV_8dLcy5PUSGJeDDT
Thank you in advance !
Regards,
Martin
Hey!
I tried out KNIME to save some time, because some of my evaluation processes involve a lot of copy-pasting of data into the right format.
I created a workflow that now helps me. What would be great is if the results could be written into the original Excel file as a new worksheet. Can anyone explain how to do this? Is it possible at all? Right now an Excel Writer node creates a new file with the results.
Also, I usually have several files, and I have already found that if they are in the same folder I can execute the same routine on all of them. But then the batch cannot be processed, because 1) I only have one output file, and 2) some data sets are different (too good, i.e. without error-message rows) and give the error that the respective rows are missing.
So if you have an idea of how I can make the respective changes, please help.
I have attached a file showing what my original data looks like ("Results"), with the results I got with KNIME copied into the next worksheet (sorting, removing duplicates and rows I don't need). I also tried to export the workflow so you can have a look. I usually execute the writer at the bottom once I have checked at the machine that the duplicates are OK.
What are the important topics in the field of data analysis in Big Data database systems?
What kind of scientific research dominates the field of data analysis in Big Data database systems?
Please reply. I invite you to the discussion
Dear Colleagues and Friends from RG
The issues of the use of information contained in Big Data database systems for the purposes of conducting Business Intelligence analyses are described in the following publications:
I invite you to discussion and cooperation.
Best wishes
What are the best research topics for doing a PhD in Big Data?
Below are some issues related to Big Data database technologies that can be developed scientifically:
- Application of data processing technology in Big Data database systems for modern education 4.0,
- Improvement of the forecasting of natural, climatic, economic, financial, social and other phenomena based on the analysis of large data sets,
- Analysis of the sentiment and opinions of citizens and Internet users regarding companies' brand recognition, customer reviews of specific services and products, views on various topics and citizens' worldviews, based on the analysis of large collections of information downloaded from various websites and from comments on social media portals,
- Analysis of the information and marketing services of commercially operating companies that carry out specific analyses of sentiment and of the opinions of citizens and Internet users regarding brand recognition, customer reviews of specific services and products, etc., on behalf of other companies that purchase specific analytical reports,
- Analysis of the possibilities of cooperation, synergy, correlation, conducting interdisciplinary research, connecting Big Data database systems with other information technologies typical for the development of the current fourth technological revolution called Industry 4.0, which include technologies such as: cloud computing, machine learning, Internet of Things, Artificial Intelligence, etc.
In what other areas are the technologies of processing and analysis of information in Big Data database systems used?
Please answer
Best wishes
Dear Colleagues and Friends from RG
The issues of the use of information contained in Big Data database systems for the purposes of conducting Business Intelligence analyses are described in the following publications:
I invite you to discussion and cooperation.
Best wishes
Hello fellow researchers,
I am currently dealing with very large data sets of SNPs (more than 2 million) to investigate whether GWAS-significant SNPs are more frequently located within certain genomic regions than non-significant SNPs. I have a 2x2 table stating the absolute number of SNPs in the significant vs. non-significant group that are either located within this specific region or not. Now I obviously need to check my results for statistical significance, which I initially did with the chi-square test. But because the numbers are so high, every comparison comes out (putatively) statistically significant. I know that some publications simply state Cramér's V as an additional indicator, but I would rather have an alternative to use (if it exists). So do any of you know good alternative tests or methods to deal with these high counts without this large-sample-size bias? How do you normally deal with these huge sample sizes?
I would be grateful for any tip or advice.
Thank you!
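A minimal sketch, assuming a 2x2 table like the one described (the counts are made-up placeholders); effect sizes such as the odds ratio and Cramér's V describe the strength of the association and, unlike the chi-square p-value, do not grow with the sample size:
R:
tab <- matrix(c(1200, 800, 900000, 1100000), nrow = 2,
              dimnames = list(snp = c("significant", "non_significant"),
                              region = c("in_region", "outside_region")))
chi <- chisq.test(tab)                        # p-value will be tiny at this sample size
n <- sum(tab)
cramers_v <- sqrt(unname(chi$statistic) / n)  # effect size on a 0-1 scale for a 2x2 table
odds_ratio <- (tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1])
c(p_value = chi$p.value, cramers_v = cramers_v, odds_ratio = odds_ratio)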
Some studies say that the random forest method could be the best, but I'd like to get more opinions, since many people seem to be using different methods. It would be nice if someone could also provide resources for carrying out the methods (tutorials, R code, etc.).
Hello Everyone, I want to find out if there is a way to do (remote) scientific collaboration in the field of Big Data Analytics, Machine Learning/Deep Learning. The goal is only to learn and to enhance my publication list and my portfolio of projects.
I have a PhD in Big Data Analytics. I have worked a lot on Big Data/Deep Learning, but I am available to work on any field.
Thanks in advance,
Hi. Since missing value imputation is determined by nearby data, should we separate control and treatment groups and perform MVI separately for each? Context: This is for mass spectrometry data.
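A minimal sketch of group-wise imputation, assuming a long-format data frame with group, protein and intensity columns (placeholder names and values); here missing intensities are replaced with half the group-wise minimum, one simple strategy sometimes used for left-censored mass spectrometry data, with imputation done separately within each group as the question suggests:
R:
library(dplyr)
d <- data.frame(group = rep(c("control", "treatment"), each = 4),
                protein = rep(c("P1", "P2"), times = 4),
                intensity = c(10.2, 8.1, NA, 8.4, 15.0, NA, 14.2, 13.8))   # placeholder data
imputed <- d %>%
  group_by(group, protein) %>%
  mutate(intensity = ifelse(is.na(intensity),
                            min(intensity, na.rm = TRUE) / 2,   # half-minimum within the group
                            intensity)) %>%
  ungroup()
imputed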
Do any of you use sentiment analysis in research conducted on data downloaded from the Internet and analyzed in Big Data database systems?
If so, please let me know for which issues and in which research topics you use sentiment analysis.
Is sentiment analysis helpful in forecasting economic and financial processes?
Please reply
Best wishes