Data Processing - Science topic

Questions related to Data Processing
  • asked a question related to Data Processing
Question
2 answers
To strengthen my next research project, I would like to use advanced graphs and diagrams for geochemical data processing.
Relevant answer
Answer
A neat compilation of geochemical plotting programs is given at:
Next, perhaps you can try https://www.diagrams.net/, which is free, high-quality diagramming software for flowcharts and diagrams that also allows online collaboration.
  • asked a question related to Data Processing
Question
1 answer
,,
Relevant answer
Answer
Dear Doctor
Go To
A Review Study of Apache Spark in Big Data Processing
V Srinivas Jonnalagadda, P Srikanth, Krishnamachari Thumati, Sri Hari Nallamala
International Journal of Computer Science Trends and Technology (IJCST) – Volume 4 Issue 3, May - Jun 2016
"Apache Spark is a powerful open source processing engine built around speed, ease of use, and sophisticated analytics. Since its release, Apache Spark has seen rapid adoption by enterprises across a wide range of industries. Internet powerhouses such as Yahoo, Baidu, Airbnb, eBay and Tencent, have eagerly deployed Spark at massive scale, collectively processing multiple petabytes of data on clusters of over 8,000 nodes. It has quickly become the largest open source community in big data, with over 1000 contributors from 250+ organizations. Spark provides a simple way to parallelize these applications across clusters, and hides the complexity of distributed systems programming, network communication, and fault tolerance. The system gives them enough control to monitor, inspect, and tune applications while allowing them to implement common tasks quickly. The modular nature of the API (based on passing distributed collections of objects) makes it easy to factor work into reusable libraries and test it locally."
  • asked a question related to Data Processing
Question
17 answers
Hi everyone
I'm facing a real problem when trying to export results from ImageJ (Fiji) to Excel for later processing.
The problem is that I have to manually swap the dots (.) and commas (,), even after changing the separator settings in Excel (from , to .), so that decimal values are not read as thousands. For example, 1,302 (one point three zero two) is read as 1302 (one thousand three hundred and two) when I transfer the data to Excel...
Lately I found a nice plugin (Localized copy...) that changes the number format locally in ImageJ so that the output can be used easily in Excel.
Unfortunately, this plugin has some bugs: it copies only one line of my large dataset, and only once (so I have to close and reopen the image).
Has anyone faced this problem? Can anyone suggest another solution?
Thanks in advance
Problem finally solved... I got the new version of 'Localized copy' plugin from the owner Mr Wolfgang Gross (not sure if I have the permission to upload it here).
Relevant answer
Answer
Jonas Petersen cool! some answers after years XD
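For the decimal-separator problem described in the question above, one possible workaround is to convert the exported table before opening it in Excel. The following is only a minimal sketch: it assumes the ImageJ results were saved as tab-separated text with decimal commas and no thousands separators, and the sample table and the function names `to_float` and `convert_table` are hypothetical.

```python
import csv
import io


def to_float(text):
    """Parse a number written with a decimal comma, e.g. '1,302' -> 1.302."""
    return float(text.strip().replace(",", "."))


def convert_table(raw, delimiter="\t"):
    """Return the table with decimal commas replaced by dots in numeric cells."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    for row in csv.reader(io.StringIO(raw), delimiter=delimiter):
        new_row = []
        for cell in row:
            try:
                new_row.append(repr(to_float(cell)))
            except ValueError:
                new_row.append(cell)  # keep headers and text labels unchanged
        writer.writerow(new_row)
    return out.getvalue()


# Invented sample that mimics a small results table (assumption, not real data).
sample = "Label\tArea\tMean\ncell_1\t1,302\t87,5\ncell_2\t0,998\t102,25\n"
print(convert_table(sample))
```

The converted text can then be saved to a file and imported into Excel with the dot set as the decimal separator. If the export does contain thousands separators, this simple replacement would not be safe.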
  • asked a question related to Data Processing
Question
3 answers
Good day everyone.
I have been doing some GRACE data processing in GEE, but from what I can tell - only the first mission's data (dating from 2002/04 to 2017/01) is accessible through the library for import.
Any recommendations how I can access more recent data from the GRACE-FO mission for analysis in GEE?
Any feedback is greatly appreciated.
Best wishes.
CV
Relevant answer
Answer
Cindy Viviers, have you been able to incorporate GRACE-FO data in GEE? I am working on a similar problem and need assistance.
  • asked a question related to Data Processing
Question
1 answer
As a professional urban planner, I want to specialize in data processing and programming skills. I would therefore like to learn from experienced experts which programming language best suits urban planning and urban analysis.
Relevant answer
Answer
A few recommendations:
1. Python: Python is a versatile programming language that is widely used for data analysis and processing, as well as for building web applications and scripts. It has a large and active user community, which means there are many libraries and tools available for various tasks. Python is a good choice for urban planners who need to work with large datasets, perform spatial analysis, or build data-driven applications.
2. R: R is a programming language and software environment for statistical computing and graphics. It is particularly well-suited for data analysis, visualization, and machine learning tasks. R has a large and active community of users and developers, and many packages are available for spatial analysis and visualization.
3. SQL: SQL (Structured Query Language) is a standard language for managing and manipulating data stored in relational databases. It is commonly used by urban planners for managing large datasets and performing queries to extract specific information.
4. GIS software: Geographic Information System (GIS) software is specialized software designed for working with spatial data, such as maps, satellite imagery, and other geospatial data. Popular GIS software includes ArcGIS, QGIS, and Google Earth. These tools can be used in combination with programming languages like Python and R to perform spatial analysis and visualization.
Good luck
credit AI tools
  • asked a question related to Data Processing
Question
7 answers
I have a two-factor experiment. We investigated the effect of a drug on the level of erythrocytes after surgery. One factor is the presence of the operation and the other is the presence of the drug; the model also includes the interaction of these factors. I have difficulty interpreting the results. We have one time point. Example: we obtained an effect of the operation factor (P < 0.01) on the level of erythrocytes after the operation and an effect of the interaction of factors, operation * drug (P < 0.05). But I did not get an effect of the drug factor. Can I conclude that the drug affects the level of red blood cells after surgery?
Relevant answer
Answer
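Interpreting data processed with two-way ANOVA can be difficult. However, there are some things you can do to make it easier. Start with the interaction between the two factors: a significant operation * drug interaction means that the effect of the drug depends on whether the operation was performed, so the drug can matter even when its main effect is not significant. Then look at the main effects of each factor, keeping in mind that each main effect is an average over the levels of the other factor.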
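To make the idea of an interaction concrete for the 2 x 2 design in the question above, the sketch below compares cell means. It is descriptive only (no significance test), and all numbers, group labels, and function names are invented for illustration.

```python
from statistics import mean

# Hypothetical erythrocyte counts (10^12 cells/L); the values are invented.
data = {
    ("no_operation", "no_drug"): [4.9, 5.1, 5.0],
    ("no_operation", "drug"): [5.0, 4.9, 5.1],
    ("operation", "no_drug"): [3.9, 4.1, 4.0],
    ("operation", "drug"): [4.5, 4.7, 4.6],
}

cell_means = {cell: mean(values) for cell, values in data.items()}


def drug_effect(operation_level):
    """Simple effect of the drug at one level of the operation factor."""
    return (cell_means[(operation_level, "drug")]
            - cell_means[(operation_level, "no_drug")])


effect_without_op = drug_effect("no_operation")
effect_with_op = drug_effect("operation")

# Interaction contrast: how much the drug effect changes with the operation.
interaction_contrast = effect_with_op - effect_without_op
# Main effect of the drug: the drug effect averaged over both operation levels.
main_effect_drug = (effect_with_op + effect_without_op) / 2

print(f"drug effect without operation: {effect_without_op:.2f}")
print(f"drug effect with operation:    {effect_with_op:.2f}")
print(f"interaction contrast:          {interaction_contrast:.2f}")
print(f"main effect of drug:           {main_effect_drug:.2f}")
```

In this invented data the drug changes the count only after the operation, so the averaged main effect is diluted while the interaction is large. With real data, the simple effects would still need a formal follow-up test.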
  • asked a question related to Data Processing
Question
3 answers
Dear professors and fellow students,
I am an undergraduate student and I want to design an experimental graduation project with one of my friends using EEG. However, we have no experience with data acquisition or processing. Can you give me some recommendations from your experience?
Relevant answer
Answer
There are many data processing methods for EEG data, including mathematical signal analysis, machine learning techniques, and pre-processing methods such as Electrooculogram (EOG) artifact correction and filtering. Some specific techniques commonly used in EEG data analysis include t-tests, ANOVAs, and non-parametric procedures. Additionally, simultaneous acquisition of EEG with functional magnetic resonance imaging (fMRI) has shown promise in EEG-informed fMRI analysis. Overall, the specific data processing methods used for EEG data analysis will depend on the research question and the available resources.
  • asked a question related to Data Processing
Question
3 answers
If you had the opportunity, what artificial intelligence would you design and create to be helpful in the research, analytical, editorial, other work you do in conducting your scientific research and/or describing its results?
In your opinion, how would it be possible to improve the processes of conducted research and analytical work, processing of the results of conducted research through the use of artificial intelligence in combination with certain technologies typical of the current fourth technological revolution, technologies categorised as Industry 4.0, including analytics conducted on large sets of data and information, on Big Data Analytics platforms?
The development of artificial intelligence technologies has accelerated in recent years. New applications of specific information systems, various ICT information technology solutions combined with technologies typical of the current fourth technological revolution, technologies categorised as Industry 4.0, including machine learning, deep learning, artificial intelligence and analytics performed on large data and information sets, on Big Data Analytics platforms, are emerging. Particularly in the field of ongoing research work, where large sets of both qualitative information and large sets of quantitative data are produced, the aforementioned technologies are particularly useful in facilitating analytics, processing, elaboration of research results and their preparation for presentation at scientific conferences and in scientific publications. In the analytics of large quantitative data sets, analytical platforms built using integrated information systems, computers characterised by high performance computing power, equipped with servers, high-capacity memory disks, on which Big Data Analytics platforms are built, are used. On the other hand, artificial intelligence technology can also be useful for aggregating, multi-criteria processing and elaboration of large sets of qualitative information. In addition to this, certain IT applications, including statistical and business intelligence applications, are also useful for processing the results of studies carried out, presenting them in scientific publications, statistically processing large data sets, generating descriptions and drawing graphs based on them. As part of the digital representation of researched, complex, multi-faceted processes, digital twin technology can be useful. Within the framework of improving online data transfer, remote communication conducted between researchers and scientists, for example, Blockchain technology and new cyber security solutions may be helpful.
Probably many researchers and scientists would like to have state-of-the-art ICT information technologies and Industry 4.0, including Big Data Analytics, artificial intelligence, deep learning, digital twins, Business Intelligence, Blockchain, etc. Many researchers would probably like to improve the processes of the research and analytical work carried out, the processing of the results of the research carried out, through the use of artificial intelligence in combination with certain technologies typical of the current fourth technological revolution, technologies categorised as Industry 4.0, including the use of artificial intelligence and analytics carried out on large sets of data and information, on Big Data Analytics platforms.
The construction of modern laboratories, research and development centres in schools, colleges, universities, equipped with the above-mentioned new ICT information technologies and Industry 4.0 is therefore probably an important factor for the development of scientific and research and development activities of a particular scientific institution. However, it is usually limited by the financial resources that schools, colleges, universities are able to allocate for these purposes. However, should these financial resources appear, the questions formulated above would probably be valid. In such a situation, as part of a systemic approach to the issue, the construction of modern laboratories, research and development centres in schools, colleges and universities, equipped with the above-mentioned new information technologies, ICT and Industry 4.0, would also be determined by determining the priority directions of research work, the specific nature of the research carried out in relation to the directions of the teaching process, the mission adopted by the scientific institution in the context of its research, scientific work, the achievement of specific social objectives, etc.
In view of the above, I would like to address the following questions to the esteemed community of scientists and researchers:
In your opinion, how would it be possible to improve the processes of conducted research and analytical work, processing of the results of conducted research through the use of artificial intelligence in combination with certain technologies typical of the current fourth technological revolution, technologies classified as Industry 4.0, including analytics conducted on large sets of data and information, on Big Data Analytics platforms?
If you had the opportunity, what artificial intelligence would you design and create to be helpful in the research, analytical, editorial, other work you carry out as part of your scientific research and/or describing its results?
What artificial intelligence would you design and create to be helpful in the research, analytical, data processing, editorial, other work you are doing?
What do you think about this topic?
What is your opinion on this subject?
Please respond,
I invite you all to discuss,
Counting on your opinions, on getting to know your personal opinion, on an honest approach to the discussion in scientific issues and not the ready-made answers generated in ChatGPT, I deliberately used the phrase "in your opinion" in the question.
The above text is entirely my own work written by me on the basis of my research.
I have not used other sources or automatic text generation systems such as ChatGPT in writing this text.
Copyright by Dariusz Prokopowicz
Thank you very much,
Best wishes,
Dariusz Prokopowicz
Relevant answer
Answer
None: I don't need it.
  • asked a question related to Data Processing
Question
7 answers
..
Relevant answer
Answer
Data Processing and Data Mining are both essential components of the data analysis process, but they have distinct purposes and methods. Here's a breakdown of the key differences between the two:
Data Processing: Data processing refers to the manipulation and transformation of raw data into a more meaningful and organized format. It involves various operations that cleanse, validate, integrate, and format data to make it suitable for further analysis. The primary goal of data processing is to ensure data quality, consistency, and reliability. It typically includes tasks such as data cleaning, data transformation, data aggregation, and data summarization. Data processing focuses on preparing data for efficient storage, retrieval, and analysis.
Data Mining: Data mining, on the other hand, is a specific technique or process within data analysis that involves discovering patterns, relationships, and insights from a large volume of data. It employs statistical and mathematical algorithms, machine learning techniques, and data visualization tools to extract knowledge and actionable information from the data. Data mining aims to uncover hidden patterns, trends, correlations, or anomalies that are not readily apparent. It can be used to solve specific business problems, predict future outcomes, identify market trends, or support decision-making processes.
In summary, data processing is the broader concept that encompasses the overall handling and preparation of data, ensuring its quality and consistency. Data mining, on the other hand, is a focused analysis technique that aims to extract valuable insights and knowledge from processed data by applying various statistical and machine-learning algorithms.
  • asked a question related to Data Processing
Question
1 answer
I want to access the future simulations of temperature and precipitation in the CMIP6 dataset (under the ssp126, ssp245, ssp370, and ssp585 scenarios). I only need the data for the Chinese region. The goal is to calculate the annual average rainfall and temperature for each province according to the boundaries of the Chinese provinces. How do I implement this?
Relevant answer
Answer
If you can understand Chinese, you can read one of my blogs: https://zhuanlan.zhihu.com/p/424218019. In short, it uses CDO to clip the .nc files with .shp files.
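As an illustration of what clipping a gridded field to a province and averaging it involves, here is a minimal pure-Python sketch. It is not the CDO workflow from the blog: it assumes a regular longitude/latitude grid already loaded into nested lists, a province boundary given as a simple polygon of (lon, lat) vertices, and monthly fields starting in January. All function names and the toy data are hypothetical, and the annual mean weights every month equally.

```python
import math


def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def regional_mean(lons, lats, field, polygon):
    """Area-weighted mean of field[j][i] over cells whose centre is inside polygon."""
    total = 0.0
    weight_sum = 0.0
    for j, lat in enumerate(lats):
        w = math.cos(math.radians(lat))  # cell area shrinks with latitude
        for i, lon in enumerate(lons):
            if point_in_polygon(lon, lat, polygon):
                total += w * field[j][i]
                weight_sum += w
    if weight_sum == 0.0:
        raise ValueError("no grid cell centre falls inside the polygon")
    return total / weight_sum


def annual_means(monthly_fields, lons, lats, polygon):
    """Average consecutive groups of 12 monthly regional means into yearly values."""
    monthly = [regional_mean(lons, lats, f, polygon) for f in monthly_fields]
    return [sum(monthly[k:k + 12]) / 12 for k in range(0, len(monthly) - 11, 12)]


# Toy example: a 3 x 2 grid, a rectangular "province", one year of monthly fields.
lons = [100.0, 101.0, 102.0]
lats = [30.0, 31.0]
province = [(99.5, 29.5), (101.5, 29.5), (101.5, 31.5), (99.5, 31.5)]
fields = [[[float(month)] * len(lons) for _ in lats] for month in range(1, 13)]
print(annual_means(fields, lons, lats, province))
```

For real CMIP6 files, reading the NetCDF data and the shapefile would need dedicated libraries or CDO itself. The sketch only shows the masking and weighting logic that such tools apply.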
  • asked a question related to Data Processing
Question
5 answers
If the answer is YES, must the use of artificial intelligence for statistical data processing be stated in the methods?
Relevant answer
Answer
The research on and application of artificial intelligence (AI) has triggered a comprehensive scientific, economic, social and political discussion. Here we argue that statistics, as an interdisciplinary scientific field, plays a substantial role both for the theoretical and practical understanding of AI and for its future development. Statistics might even be considered a core element of AI.
Regards,
Shafagat
  • asked a question related to Data Processing
Question
2 answers
I want to transfer healthcare data, which includes demographic and multimedia data such as medical imaging, from Asia and the US to Europe and vice versa. The healthcare data will be collected from various hospitals and clinical centers in India, the US, and EU countries. My questions are:
Q1. I know that I can transfer demographic data in anonymized or pseudonymized form, but how can I generate anonymized or pseudonymized codes for multimedia data that contain patient identity?
Q2. How can I use healthcare data in European countries when consent was obtained in other parts of the world? Patients consent under the legal regulations of their own countries, such as the US and India, whereas EU countries have data protection laws such as GDPR and the US has health data protection laws such as HIPAA.
I am looking for solutions to these two questions.
Relevant answer
Answer
Peter Donald Griffiths Thank you for your answer. But my real concern is how we can use this data in practice, because many legal requirements must be fulfilled to use it in the EU. We already have the patients' consent from data collection, but it is insufficient for data transfer.
  • asked a question related to Data Processing
Question
3 answers
Consider the natural case when we have an imbalanced dataset, maybe with missing values and containing categorical features. We need to impute missing values, do some sort of balancing and also encode categorical variables. In train-test split case we need to put aside a test set and never perform any learning on it (for example not use sklearn's fit or fit_transform methods, use only transform method). Generally, what is the appropriate order of steps in case of K-fold cross-validation to avoid estimators learning from test fold during the process?
Relevant answer
Answer
All operations such as preprocessing, augmentation, hyperparameter tuning, and so on should be performed on every split independently, fitted on the training part of that split only, to avoid data leakage and adaptive overfitting.
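The order implied by the answer above can be sketched as: split first, then inside each fold fit the imputer on the training part, balance the training part only, fit the model, and finally apply the already-fitted imputer to the test fold. The code below is a minimal pure-Python illustration with one numeric feature, mean imputation, simple duplication as the balancing step, and a toy nearest-centroid model. All names and data are hypothetical, and categorical encoding is omitted (it would be fitted on the training part in the same way).

```python
def k_fold_indices(n, k):
    """Split indices 0..n-1 into k interleaved folds (assumes rows are already shuffled)."""
    return [list(range(i, n, k)) for i in range(k)]


def fit_imputer(values):
    """Learn the fill value from the observed training values only."""
    observed = [v for v in values if v is not None]
    return sum(observed) / len(observed)


def impute(values, fill):
    """Apply a fill value learned elsewhere; nothing is learned here."""
    return [fill if v is None else v for v in values]


def oversample(xs, ys):
    """Duplicate rows of smaller classes until every class matches the largest one."""
    by_class = {}
    for x, y in zip(xs, ys):
        by_class.setdefault(y, []).append(x)
    target = max(len(items) for items in by_class.values())
    out_x, out_y = [], []
    for label, items in by_class.items():
        out_x += [items[i % len(items)] for i in range(target)]
        out_y += [label] * target
    return out_x, out_y


def fit_centroids(xs, ys):
    """Toy model: one mean per class."""
    centres = {}
    for label in set(ys):
        members = [x for x, y in zip(xs, ys) if y == label]
        centres[label] = sum(members) / len(members)
    return centres


def predict(centres, x):
    return min(centres, key=lambda label: abs(x - centres[label]))


def cross_validate(xs, ys, k=3):
    scores = []
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        fill = fit_imputer([xs[i] for i in train_idx])       # fit on training part
        x_train = impute([xs[i] for i in train_idx], fill)
        x_train, y_train = oversample(x_train, [ys[i] for i in train_idx])
        centres = fit_centroids(x_train, y_train)
        x_test = impute([xs[i] for i in test_idx], fill)      # transform only
        hits = [predict(centres, x) == ys[i] for x, i in zip(x_test, test_idx)]
        scores.append(sum(hits) / len(hits))
    return scores


# Invented imbalanced data with missing values (None).
xs = [5.0, 1.0, None, 1.2, 5.2, 0.8, 1.1, None, 4.8, 0.9, 1.0, None]
ys = [1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1]
print(cross_validate(xs, ys))
```

The test fold is never resampled and never contributes to the fill value or the class centres. With scikit-learn, the same ordering is usually obtained by wrapping the steps in a pipeline that is refitted inside each fold.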
  • asked a question related to Data Processing
Question
4 answers
Main ideas in the processing theory of learning
** Be clear, not clever, because the bad kind of clever leads to bad results.
** Be specific, because specificity leads to clarity.
Relevant answer
Answer
The still-dominant learning paradigm is the industrial assembly line; digitization currently adds only the finishing touch to this economic stage.
It cannot be ruled out that prudent data transformation management, with respect to the human learning process, will alter this state in the coming decades and lead to open or multi-loop learning.
Such an approach would start in elementary school, after the basic cultural techniques (ABC, 123, technical skills) are mastered; that is, further schooling would continue with project learning and project groups to solve real-world problems in different fields.
  • asked a question related to Data Processing
Question
1 answer
Hello, I am processing 300 kV cryo-EM data in RELION for a small membrane protein (~100 kDa) that has only a transmembrane domain. However, during 3D classification, I found that the protein model did not stack well: it appeared to consist of multiple slices. The model also looks scattered rather than compact. Has anyone met this problem before, or does anyone have suggestions for the data processing?
Relevant answer
Answer
If your 2D classification results are good, for example if the transmembrane helices are visible, you should read recent structural biology papers on small membrane proteins.
  • asked a question related to Data Processing
Question
3 answers
Can Big Data Analytics technology be helpful in forecasting complex multi-faceted climate, natural, social, economic, pandemic, etc. processes?
Industry 4.0 technologies, including Big Data Analytics technology, are used in multi-criteria processing, analyzing large data sets. The technological advances taking place in the field of ICT information technology make it possible to apply analytics carried out on large sets of data on various aspects of the activities of companies, enterprises and institutions operating in different sectors and branches of the economy.
Before the development of ICT information technologies, IT tools, personal computers, etc. in the second half of the 20th century as part of the 3rd technological revolution, computerized, partially automated processing of large data sets was very difficult or impossible. As a result, building multi-criteria, multi-article, big data and information models of complex structures, simulation models, forecasting models was limited or impossible. However, the technological advances made in the current fourth technological revolution and the development of Industry 4.0 technology have changed a lot in this regard. More and more companies and enterprises are building computerized systems that allow the creation of multi-criteria simulation models within the framework of so-called digital twins, which can present, for example, computerized models that present the operation of economic processes, production processes, which are counterparts of the real processes taking place in the enterprise. An additional advantage of this type of solution is the ability to create simulations and study the changes of processes fictitiously realized in the model after the application of certain impact factors and/or activation, materialization of certain categories of risks. When large sets of historical quantitative data presenting changes in specific factors over time are added to the built multi-criteria simulation models within the framework of digital twins, it is possible to create complex multi-criteria forecasting models presenting potential scenarios for the development of specific processes in the future. Complex multi-criteria processes for which such forecasting models based on computerized digital twins can be built include climatic, natural, social, economic, pandemic, etc. processes, which can be analyzed as the environment of operating specific companies, enterprises and institutions.
In view of the above, I address the following question to the esteemed community of researchers and scientists:
In forecasting complex multi-faceted climate, natural, social, economic, pandemic, etc. processes, can Big Data Analytics technology be helpful?
What is your opinion on this issue?
Please answer,
I invite everyone to join the discussion,
Thank you very much,
Best wishes,
Dariusz Prokopowicz
Relevant answer
Answer
Dear Dariusz
The simple answer is YES!
The problem, however, is that the analytics are imperfect...
It is necessary to understand how human intuition (expert intuition) works... I am convinced that understanding the mechanisms of intuition holds great potential for improving analytics and forecasting...
My own research shows that nature has found genius ways to deal with radical uncertainty with limited resources...
Yurii
  • asked a question related to Data Processing
Question
2 answers
I am a beginner in this field. I want to learn basic audio deep learning for classifying audio. If you have articles or tutorial videos, please send me the link. Thank you very much.
Relevant answer
Answer
Hi Tim Albiges
Wow. This is very helpful. Thank you so much.
  • asked a question related to Data Processing
Question
3 answers
Hello there,
After some research on Brillouin sensors, I could not understand the data processing that follows acquisition of the optical signal.
How is the data processed in conventional BOTDA and BOTDR after the backscattered light is detected on the photodetector?
Is the analysis done in the time domain or the frequency domain?
I would very much appreciate your help.
Thanks in advance,
Best Regards.
Relevant answer
Answer
Dear Volkan Türker,
You may want to take a look at the following info:
_____
_____
Photon-counting Brillouin optical time-domain reflectometry based on up-conversion detector and fiber Fabry-Perot scanning interferometer
A direct-detection Brillouin optical time-domain reflectometry (BOTDR) is proposed and demonstrated by using an up-conversion single-photon detector and a fiber Fabry-Perot scanning interferometer (FFP-SI). Taking advantage of high signal-to-noise ratio of the detector and high spectrum resolution of the FFP-SI, the Brillouin spectrum along a polarization maintaining fiber (PMF) is recorded on a multiscaler with a small data size directly. In contrast with conventional BOTDR adopting coherent detection, photon-counting BOTDR is simpler in structure and easier in data processing. In the demonstration experiment, characteristic parameters of the Brillouin spectrum including its power, spectral width and frequency center are analyzed simultaneously along a 10 km PMF at different temperature and strain conditions. © 2014 Optical Society of America
_____
_____
Recent Advances in Brillouin Optical Time Domain Reflectometry
In the past two decades Brillouin-based sensors have emerged as a newly-developed optical fiber sensing technology for distributed temperature and strain measurements. Among these, the Brillouin optical time domain reflectometer (BOTDR) has attracted more and more research attention, because of its exclusive advantages, including single-end access, simple system architecture, easy implementation and widespread field applications. It is realized mainly by injecting optical pulses into the fiber and detecting the Brillouin frequency shift (BFS), which is linearly related to the change of ambient temperature and axial strain of the sensing fiber. In this paper, the authors provide a review of new progress on performance improvement and applications of BOTDR in the last decade. Firstly, the recent advances in improving the performance of BOTDRs are summarized, such as spatial resolution, signal-to-noise ratio and measurement accuracy, measurement speed, cross sensitivity and other properties. Moreover, novel-type optical fibers bring new characteristics to optic fiber sensors, hence we introduce the different Brillouin sensing features of special fibers, mainly covering the plastic optical fiber, photonic crystal fiber, few-mode fiber and other special fibers. Additionally, we present a brief overview of BOTDR application scenarios in many industrial fields and intelligent perception, including structural health monitoring of large-range infrastructure, geological disaster prewarning and other applications. To conclude, we discuss several challenges and prospects in the future development of BOTDRs.
_____
_____
Time and Frequency Localized Pulse Shape for Resolution Enhancement in STFT-BOTDR
Short-Time Fourier Transform-Brillouin Optical Time-Domain Reflectometry (STFT-BOTDR) implements STFT over the full frequency spectrum to measure the distributed temperature and strain along the optic fiber, providing new research advances in dynamic distributed sensing. The spatial and frequency resolution of the dynamic sensing are limited by the Signal to Noise Ratio (SNR) and the Time-Frequency (T-F) localization of the input pulse shape. T-F localization is fundamentally important for the communication system, which suppresses interchannel interference (ICI) and intersymbol interference (ISI) to improve the transmission quality in multicarrier modulation (MCM). This paper demonstrates that the T-F localized input pulse shape can enhance the SNR and the spatial and frequency resolution in STFT-BOTDR. Simulation and experiments of T-F localized different pulses shapes are conducted to compare the limitation of the system resolution. The result indicates that rectangular pulse should be selected to optimize the spatial resolution and Lorentzian pulse could be chosen to optimize the frequency resolution, while Gaussian shape pulse can be used in general applications for its balanced performance in both spatial and frequency resolution. Meanwhile, T-F localization is proved to be useful in the pulse shape selection for system resolution optimization.
_____
_____
BOTDR MEASUREMENT TECHNIQUES AND BRILLOUIN BACKSCATTER CHARACTERISTICS OF CORNING SINGLE-MODE OPTICAL FIBERS
_____
_____
Domain Reflectometry
Similarly, FDR data can be acquired by using a TDR to measure the reflected wave over the large bandwidth and then using Fourier transform to convert from time to frequency domains.
From: Materials Ageing and Degradation in Light Water Reactors, 2013
_____
_____
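On the time-domain versus frequency-domain question above: a common description of the processing chain is that both are used. The time axis of each recorded trace gives the position along the fiber, and repeating the measurement over a sweep of frequencies gives a Brillouin spectrum at every position, whose peak is the Brillouin frequency shift (BFS). The sketch below illustrates that idea under stated assumptions: a uniform frequency step, a group index of 1.468 (a typical value for silica fiber, assumed here), and three-point parabolic interpolation used as a simple stand-in for the spectral curve fitting (for example Lorentzian fitting) often applied in practice. All function names and the synthetic data are hypothetical.

```python
def position_m(t_seconds, n_group=1.468):
    """Fibre position from round-trip time: z = c * t / (2 * n)."""
    c = 299_792_458.0  # speed of light in vacuum, m/s
    return c * t_seconds / (2.0 * n_group)


def peak_frequency(freqs, gains):
    """Estimate the spectral peak by three-point parabolic interpolation."""
    k = max(range(len(gains)), key=gains.__getitem__)
    if k == 0 or k == len(gains) - 1:
        return freqs[k]  # peak at the edge of the sweep: no interpolation
    y0, y1, y2 = gains[k - 1], gains[k], gains[k + 1]
    denom = y0 - 2.0 * y1 + y2
    if denom == 0.0:
        return freqs[k]
    shift = 0.5 * (y0 - y2) / denom  # in units of the frequency step
    return freqs[k] + shift * (freqs[k + 1] - freqs[k])


def bfs_profile(freqs, traces):
    """traces[i][j] is the signal at frequency i and time sample j.
    Returns the estimated BFS for every time sample (fibre position)."""
    n_samples = len(traces[0])
    return [peak_frequency(freqs, [row[j] for row in traces])
            for j in range(n_samples)]


def lorentzian(f, f0, fwhm=30.0):
    """Synthetic Brillouin gain spectrum used only to generate test data."""
    return 1.0 / (1.0 + ((f - f0) / (fwhm / 2.0)) ** 2)


# Synthetic sweep: 10800 to 10900 MHz in 2 MHz steps, two fibre positions.
freqs = [10800.0 + 2.0 * i for i in range(51)]
traces = [[lorentzian(f, 10850.7), lorentzian(f, 10861.3)] for f in freqs]
print(bfs_profile(freqs, traces))
print(position_m(1e-6))
```

The estimated BFS at each position can then be converted to temperature or strain changes using calibration coefficients for the specific fiber, since the BFS varies linearly with both, as the review abstract above notes.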
  • asked a question related to Data Processing
Question
5 answers
Dear scholars,
I have a database with 100 geolocated samples in a given area, each sample contains 38 chemical elements that were quantified.
Some of these samples contain values Below the Detection Level of the instrument (BDL). Clearly, when 100% of the samples are BDL there is not much to do, but what can be done when, for example, only 20% are BDL? What do we do with them, and with what value do we replace a BDL sample?
Some papers show that a BDL sample can be replaced by a fraction of the detection level (the instrument's minimum detection level for that chemical element), such as 0.25 times the detection level, while others use 0.5 times the detection level... What would you do in each case, and is there any literature you would recommend? If it matters, I am mostly interested in Copper and Arsenic.
Regards
Relevant answer
Answer
What fraction of BDL values is acceptable?
Why are you making the measurements? The why determines what is acceptable.
If you are concerned about an upper limit, then BDLs are of no concern.
If you are concerned about a lower limit, it will depend upon the nature of your concern.
There is no recommendation. There is no rule of thumb.
You decide from the criteria associated with WHY if you have enough information.
Too many BDLs might mean you need a different technique, but it always returns to WHY.
If, say, a customer wants an answer at each location, use the actual result and note the uncertainty. The result is usually meaningless, because of the high uncertainty.
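For readers who do decide to substitute BDL values, as the question above describes, a quick sensitivity check shows how much the choice of substitution factor moves a summary statistic. This is only a sketch: the concentrations, the detection limit, and the function names are invented, and the factors 0.5 and 1/sqrt(2) are two commonly used conventions rather than a recommendation.

```python
import math


def substitute_bdl(values, detection_limit, factor=0.5):
    """Replace censored entries (None) with factor * detection_limit."""
    return [factor * detection_limit if v is None else v for v in values]


def censored_fraction(values):
    """Share of entries reported as below the detection level."""
    return sum(v is None for v in values) / len(values)


# Hypothetical copper concentrations in ppm; None marks a BDL result.
copper_ppm = [12.0, None, 8.5, None, 30.1]
dl = 5.0  # hypothetical detection limit in ppm

half = substitute_bdl(copper_ppm, dl, 0.5)
root2 = substitute_bdl(copper_ppm, dl, 1 / math.sqrt(2))

print(f"censored fraction: {censored_fraction(copper_ppm):.0%}")
print(f"mean with DL/2:       {sum(half) / len(half):.2f}")
print(f"mean with DL/sqrt(2): {sum(root2) / len(root2):.2f}")
```

If the two means differ by more than the study can tolerate, that is a sign that simple substitution is not adequate for the purpose, which ties back to the "why" emphasized in the answer above.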
  • asked a question related to Data Processing
Question
3 answers
Dear scholars,
I have a database with 100 geolocated samples in a given area, each sample contains 38 chemical elements that were quantified.
Some of these samples contain values Below the Detection Level of the instrument (BDL). Clearly, when 100% of the samples are BDL there is not much to do, but what can be done when, for example, only 20% are BDL? What do we do with them, and with what value do we replace a BDL sample?
Some papers show that a BDL sample can be replaced by a fraction of the detection level (the instrument's minimum detection level for that chemical element), such as 0.25 times the detection level, while others use 0.5 times the detection level... What would you do in each case, and is there any literature you would recommend? If it matters, I am mostly interested in Copper and Arsenic.
Regards
Relevant answer
Answer
You don't give a research question so I would simply report exactly what you said in full detail. You may want also to give information on the detection limits of the methods you used. Just let Prof Kan know detection limits are determined by Nature's own chemistry and not by personal choice. Best wishes David Booth
  • asked a question related to Data Processing
Question
10 answers
I would like to calculate the strain rates from horizontal velocities obtained from data processing of permanent GPS stations.
Relevant answer
Answer
Try the Strain Calculator of Cronin (2012). It is both Excel- and MATLAB-based.
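The underlying calculation for a triangle of three stations can be sketched as follows. This is a minimal illustration, not the cited calculator: it assumes local Cartesian coordinates (east, north) in metres, velocities in metres per year, and a uniform velocity gradient across the triangle. All station values are hypothetical.

```python
def strain_rate_triangle(positions, velocities):
    """2-D strain-rate tensor for three stations with a uniform velocity gradient.

    positions  : [(east, north), ...] in metres, three stations
    velocities : [(v_east, v_north), ...] in metres/year, same order
    Returns (exx, eyy, exy) in 1/year.
    """
    (x0, y0), (x1, y1), (x2, y2) = positions
    (u0, v0), (u1, v1), (u2, v2) = velocities

    # Baselines and velocity differences relative to station 0.
    a, c = x1 - x0, y1 - y0
    b, d = x2 - x0, y2 - y0
    p, r = u1 - u0, v1 - v0
    q, s = u2 - u0, v2 - v0

    det = a * d - b * c
    if det == 0:
        raise ValueError("stations are collinear; the gradient is undetermined")

    # Solve  dv = G . dx  for the velocity-gradient components (Cramer's rule).
    du_dx = (p * d - q * c) / det
    du_dy = (q * a - p * b) / det
    dv_dx = (r * d - s * c) / det
    dv_dy = (s * a - r * b) / det

    # The strain rate is the symmetric part of the velocity gradient.
    return du_dx, dv_dy, 0.5 * (du_dy + dv_dx)

# Hypothetical stations 1 km apart.
pos = [(0.0, 0.0), (1000.0, 0.0), (0.0, 1000.0)]
vel = [(0.010, 0.020), (0.011, 0.020), (0.012, 0.019)]
exx, eyy, exy = strain_rate_triangle(pos, vel)
```

With more than three stations the same equations become an overdetermined system, which is usually solved by least squares.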
  • asked a question related to Data Processing
Question
2 answers
ATEX is a very simple EBSD data processing software. I want to use this software to further process XRD data, such as texture, etc., but I don't know where to start and what steps are required?
Relevant answer
Answer
Thanks for the reply, but I also want to know if Rigaku's data needs special processing, because when I import it, it always prompts me that some parameters are incorrect. @Benoit Beausir
  • asked a question related to Data Processing
Question
9 answers
I have a problem. I obtained FTIR data for my thin films, but it looks like only part of the spectrum has a valid imaginary part. Is that possible?
I am using a 26th-order Savitzky-Golay filter to smooth the data, but I am pretty sure that I did not corrupt my data.
Relevant answer
Answer
Emma Hill thank you for help and interest, problem was partially resolved (still in progress).
  • asked a question related to Data Processing
Question
2 answers
Is there any free online software to process fluxomics raw data files?
1. To calculate 12C and 13C ratio
2. Pathway mapping
Relevant answer
Answer
OptFlux3 is open source and "might" be modified to meet your needs.
  • asked a question related to Data Processing
Question
3 answers
If I have multiple scenarios with multiple variables changing, and I want to conduct a full factorial analysis, how do I graphically show the results?
Relevant answer
Answer
The best way to display graphs for variables whose values change is to use the well-known program MATLAB.
Use the plot function inside a loop that iterates over all the values of each variable. For more information, see the Matlab book link and you will find the answer.
Good luck.
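One common way to summarise a full factorial before plotting is a main-effects table: the mean response at each level of each factor, which can then be drawn as one panel per factor. The sketch below uses Python instead of MATLAB and only the standard library. The factor names, levels, and the `response` function are hypothetical placeholders for the real scenarios.

```python
from collections import defaultdict
from itertools import product

# Hypothetical factors and levels for the scenarios.
factors = {"temperature": [20, 40], "flow": [1, 2, 3], "catalyst": ["A", "B"]}

def response(run):
    # Placeholder for the real model or measurement of one scenario.
    bonus = 3.0 if run["catalyst"] == "B" else 0.0
    return 0.5 * run["temperature"] + 2.0 * run["flow"] + bonus

names = list(factors)
# Full factorial design: every combination of every level.
runs = [dict(zip(names, combo)) for combo in product(*factors.values())]
results = [response(run) for run in runs]

def main_effects(runs, results, names):
    """Mean response at each level of each factor."""
    groups = {name: defaultdict(list) for name in names}
    for run, y in zip(runs, results):
        for name in names:
            groups[name][run[name]].append(y)
    return {
        name: {level: sum(ys) / len(ys) for level, ys in by_level.items()}
        for name, by_level in groups.items()
    }

effects = main_effects(runs, results, names)
for name, by_level in effects.items():
    print(name, by_level)
```

Interaction plots follow the same pattern, grouping by pairs of factors instead of single factors.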
  • asked a question related to Data Processing
Question
7 answers
I am currently attending a PhD program in Innovative Urban Leadership, and I am interested in how church leadership has responded to the pandemic of the past two years. I want to learn how most leaders responded to the situation created by the COVID-19 pandemic. What innovative leadership techniques have church pastors employed, and how were the principles and values of innovative leadership applied to address the dire situation of churchgoers? What models have been employed to address the holistic needs of church members? How did church leaders overcome their own vulnerability in this dire situation, given that they were at the forefront of fighting the pandemic? What have most institutions learned, and how can we reuse those lessons when similar incidents happen in the future?
Relevant answer
Answer
SPSS and SEM could be useful for quantitative research. In contrast, a qualitative approach could employ NVivo software.
  • asked a question related to Data Processing
Question
4 answers
I'm using XPS these days to analyze a set of nanomaterials that I am synthesizing. I'm getting familiar with the Avantage data processing tool that is installed on the computer attached to the XPS instrument. For learning the processing tool at my pace, I wanted to know if I can install it on my personal computer. Suggestions will be highly appreciated.
Relevant answer
Answer
For proper deconvolution of XPS peaks, use Origin software. It is free for a 7-day trial.
  • asked a question related to Data Processing
Question
3 answers
Hi all,
I am currently trying to quantify the levels of ergosterol in soil samples for a small part of my PhD work. I have completed the necessary extractions and have run them via HPLC. I am a complete novice at using HPLC and cannot seem to find any literature on how to convert the data into a usable format.
When extracting, I spiked some of my samples to quantify extraction efficiency. I also created a series of standards for the creation of calibration curves and ran a middle standard as a drift every 10 samples.
The data I have is as follows:
A chromatogram including run information (time, injection volume etc.), retention time, area (mAU*min), height (mAU), and relative area/ height (%) (see attached word document)
Any advice/ literature references would be extremely useful.
Thanks in advance,
Dan
Relevant answer
Answer
Hi Dan,
Have a look at the following earlier posts:
It contains the following article (see section external std):
As said by Casey and Farrukh, you can construct a calibration curve, area (mAU*min) vs. concentration, and use this to calculate the concentration of your sample (unknown).
Regards, Hendrik
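The external-standard calculation can be sketched as below: fit a straight line through the standards, invert it for the unknowns, and use the spiked samples to estimate extraction recovery. This is a minimal sketch assuming a linear detector response over the calibration range. All numbers, units, and function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def area_to_conc(area, slope, intercept):
    """Invert the calibration line to get a concentration from a peak area."""
    return (area - intercept) / slope

def recovery_percent(spiked_conc, unspiked_conc, spike_added):
    """Extraction efficiency estimated from a spiked sample."""
    return 100.0 * (spiked_conc - unspiked_conc) / spike_added

# Hypothetical ergosterol standards: concentration (ug/mL) vs. peak area (mAU*min).
std_conc = [1.0, 2.0, 5.0, 10.0, 20.0]
std_area = [2.1, 4.0, 10.2, 19.9, 40.1]
slope, intercept = fit_line(std_conc, std_area)

sample_conc = area_to_conc(12.4, slope, intercept)  # hypothetical sample peak area
print(f"sample concentration: {sample_conc:.2f} ug/mL")
```

The drift standard run every 10 samples can be passed through `area_to_conc` in the same way to check that its calculated concentration stays close to its nominal value across the sequence.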
  • asked a question related to Data Processing
Question
3 answers
I am trying to understand how multivariate data preprocessing works but there are some questions in my mind.
For example, I can do data smoothing, transformation (Box-Cox, differencing), and noise removal on univariate data (for any machine learning problem, not only time series forecasting). But what if one variable is noisy and another is not? Or one is smooth and another is not (so I would need a sliding-window average for one variable but not the other)? What should I do in that case?
Relevant answer
Answer
You are looking for a perfect answer that will cover any situation. Data analysis doesn't work like that. I would suggest you look at John Tukey's work on Exploratory Data Analysis. That's the best answer that I know of today. Best wishes David Booth
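Mechanically, nothing prevents each variable from getting its own preprocessing step. A minimal sketch, assuming the data are held as named columns and that exploratory plots have already shown which columns are noisy (column names and values are hypothetical):

```python
def moving_average(values, window):
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

# Hypothetical multivariate data set: one clean column, one noisy column.
data = {
    "temperature": [20.0, 20.5, 21.0, 21.5, 22.0],
    "vibration": [1.0, 5.0, 2.0, 6.0, 1.0],
}

# Per-column steps; columns without an entry are passed through unchanged.
steps = {"vibration": lambda col: moving_average(col, 3)}

processed = {
    name: steps.get(name, lambda col: list(col))(col)
    for name, col in data.items()
}
```

Keeping the per-column choices in one mapping also documents which variables were altered, which helps when the same steps must be reapplied to new data.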
  • asked a question related to Data Processing
Question
7 answers
I am interested in the usefulness of zero knowledge proof in verifying an algorithm (for bias, privacy, data processing, and general deployment process). Have you come across examples of it in regulatory compliance?
  • asked a question related to Data Processing
Question
8 answers
I am working on 2D pre-stack marine seismic sections and I need to perform Surface-Related Multiple Elimination (SRME) using REVEAL. Unfortunately, this is my first time dealing with seismic processing and REVEAL. I have finished the first five steps: importing a SEGY, survey QC, creating a project binning, sorting to CMP, and velocity analysis.
Now I am trying to apply the SRME flow. I started with SRMENearInsert, but every time I submit the flow an error appears. So I need someone with experience in seismic processing to help me solve this problem.
I attached some screenshots for a seismic section sorted to CMP, input and parameters for SRMENearInsert, and the error.
Relevant answer
Answer
We work on land seismic data processing, but unfortunately we do not work with Shearwater REVEAL. We work with CGG's GEOVATION and WesternGeco OMEGA.
I think you should check your first offset: you entered a zero value for the first offset, and I think this value is wrong.
Also read the error messages carefully.
With my best regards.
  • asked a question related to Data Processing
Question
4 answers
Hi, I'm trying to start protein NMR again. When I was doing protein NMR 10 years ago, I used NMRPipe for data processing. But I think NMRPipe is not user friendly. Please suggest any other software that is similar to NMRPipe. Free software with a user-friendly GUI that runs on Windows would be best.
Relevant answer
Answer
The NMRPipe software package consists of a series of standalone programs
  • asked a question related to Data Processing
Question
3 answers
Hi,
There is something wrong in one of my pages:
"Definition von individueller Datenverarbeitung (IDV)
January 2019
DOI: 10.1007/978-3-658-25696-8_1
In book: Sustainable Agriculture and Agribusiness in Iran"
It has nothing to do with Iran but it should be:
"Individuelle Datenverarbeitung in Zeiten von Banking 4.0: Regulatorische Anforderungen, Aktueller Stand, Umsetzung der Vorgaben "
Regards,
Holger
Relevant answer
Answer
OK, Thank you
  • asked a question related to Data Processing
Question
3 answers
I am looking for a book detailing the application of Convolutional neural networks (CNN) for the buildup of efficient computational frameworks for complex/Turbulent Flows modelling and data processing.
Relevant answer
Answer
The topic is rapidly evolving and, therefore, there is no text as yet. Thus, your best bet would be to refer to the latest papers from journals and conference publications. Also, I am afraid that the primary emphasis in this early literature may be on simpler, canonical flows as against complex turbulent flows. Here are a few papers that you might begin with. Hope they are helpful.
You may also keep in mind that more advanced frameworks beyond the CNNs are now being applied in the context of turbulent flows.
Broader reviews:
1. Duraisamy, K., Iaccarino, G. & Xiao, H. 2019 Turbulence modeling in the age of data. Annu. Rev. Fluid Mech. 51 (1), 357–377.
2. Duraisamy K. Perspectives on machine learning-augmented Reynolds-averaged and large eddy simulation models of turbulence. Phys Rev Fluids. 2021;6(5):050504.
Research articles:
3. Kim, J. and Lee, C. (2020). Prediction of turbulent heat transfer using convolutional neural networks. Journal of Fluid Mechanics, 882, A18. doi:10.1017/jfm.2019.814
4. Ocáriz Borde, H. S., Sondak, D., and Protopapas, P. (2021). Convolutional neural network models and interpretability for the anisotropic Reynolds stress tensor in turbulent one-dimensional flows, Journal of Turbulence, DOI: 10.1080/14685248.2021.1999459
5. Razizadeh, O. and Yakovenko, S. N. (2020). Implementation of Convolutional Neural Network to Enhance Turbulence Models for Channel Flows. 2020 Science and Artificial Intelligence conference. https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9303178
6. Fang, R., Sondak, D., Protopapas, P., and Succi, S. (2020) Neural network models for the anisotropic Reynolds stress tensor in turbulent channel flow, Journal of Turbulence, 21:9-10, 525-543, DOI: 10.1080/14685248.2019.1706742
7. Moghaddam, A. A. and Sadaghiyani, A. (2018) A deep learning framework for turbulence modeling using data assimilation and feature extraction. https://arxiv.org/pdf/1802.06106.pdf
8. Chang, C.-W., Fang, J., and Dinh, N. T. (2020) Reynolds-Averaged Turbulence Modeling Using Deep Learning with Local Flow Features: An Empirical Approach, Nuclear Science and Engineering, 194:8-9, 650-664, DOI: 10.1080/00295639.2020.1712928
9. J. Ling, A. Kurzawski, and J. Templeton, "Reynolds averaged turbulence modelling using deep neural networks with embedded invariance," J. Fluid Mech. 807, 155 (2016). https://doi.org/10.1017/jfm.2016.615
10. Beck, A., Flad, D. & Munz, C. 2019 Deep neural networks for data-driven LES closure models. J. Comput. Phys. 398, 108910.
11. T Nakamura, K Fukami, K Hasegawa, Y Nabae, and K Fukagata. (2021) Convolutional neural network and long short-term memory based reduced order surrogate for minimal turbulent channel flow. Physics of Fluids 33 (2), 025116.
  • asked a question related to Data Processing
Question
2 answers
Hey!
I tried out Knime to save some time because some of my evaluation processes include a lot of copy-pasting of data into the right format for me.
I created a workflow that has helped me so far. It would be great if the results could be integrated into the original Excel file as a new worksheet. Can anyone explain to me how to do this? Is it possible at all? Right now I have an Excel writer node create a new file with the results.
Also, I usually have several files, and I already found that if they are in the same folder I can execute the same routine on all files in the folder. But then they cannot be processed because 1) I only have one output file and 2) some data sets are different (too good, without error-message rows) and give the error that the respective rows are missing.
So if you have an idea how I can make the respective changes, please help.
Attached is a file showing what my original data looks like ("Results"), with the results I got with Knime copied into the next worksheet (sorting, removing duplicates and rows I don't need). I also tried to export the workflow so you can have a look. Most of the time I execute the writer at the bottom once I have seen at the machine that the duplicates are OK.
Relevant answer
Answer
Thank you for this collection. I'll have a look and see if one of them offers a solution.
Best,
Juliane
  • asked a question related to Data Processing
Question
8 answers
Hello everyone
I have fitted data in XPSpeak41, but when I try to export the spectrum (in .dat format), it only exports the raw spectrum and not the fitted curves. I tried plotting the .dat file in Origin, but it only shows the raw spectrum, as I said above. If anyone knows how to save the fitted curves from XPSpeak41, any help would be highly appreciated.
Relevant answer
Answer
How to plot XPS survey graph using Origin
  • asked a question related to Data Processing
Question
4 answers
Does it make sense to develop compression methods for large matrices used in chemometrics for multivariate calibration?
The main argument of opponents of this method is “increasing computational power and speed of computers for data processing and unlimited cloud data storage available” do not require compression since the compression slightly reduces the accuracy in multivariate calibration. (Cited from Personal communication).
Relevant answer
Answer
Another possibility is to decouple your main matrix problem (after preprocessing it) into some combination of matrices with known structures (e.g., Toeplitz, circulant, band diagonal) and then solve the subproblems according to your needs. Tensor structures and operations may be optimized for some structures in terms of hardware and networking.
As an engineer, I like to think that the problem and model tend to give some clues regarding simplifications.
The preprocessing stage depends on what you are investigating. Sometimes, outliers have more info than mainstream data.
Finally, the word "compression" has different meanings, levels and varieties. If you change your domain for a given tensor structure, you may get some constraints/compression.
  • asked a question related to Data Processing
Question
3 answers
Hello there!
I am a Masters student working at EPFL with Empatica E4. For the moment, as both the device and the data from it are very new to me, I am testing the solidity of the signals.
At first look, the EDA signal looks very noisy and filtering with Ledalab doesn't seem to solve the problem. However, as I said I don't have experience in processing EDA signals, so I would like to know if anyone can help me going through the basic and most common and efficacious processing steps. Any suggestion is highly appreciated!
Thank you in advance to anyone who's willing to help!
Relevant answer
Answer
I have not tested it yet, but this year a Python toolbox called pyEDA was released that can preprocess EDA data: https://github.com/HealthSciTech/pyEDA. The toolbox cannot be installed (via pip or locally) at the moment since there is no setup.py, but this will hopefully change in the future.
  • asked a question related to Data Processing
Question
6 answers
I want to know about the theory of upward continuation. Why do we use upward and downward continuation? Also, what is the relationship between filtering and upward/downward continuation?
Relevant answer
Answer
Just to add to the answers already provided, upward continuation enhances deep structures. So, if your interest is to investigate deep regional structures, then upward continuation should be your choice of enhancement technique.
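The link between continuation and filtering is easiest to see in the wavenumber domain. As a sketch, assume a potential field $U_0$ measured on a horizontal plane, a continuation height $h > 0$, and horizontal wavenumbers $k_x, k_y$ (symbols introduced here for illustration only):

```latex
\mathcal{F}[U_{\mathrm{up}}](k_x, k_y)
  = \mathcal{F}[U_0](k_x, k_y)\, e^{-h\sqrt{k_x^2 + k_y^2}},
\qquad
\mathcal{F}[U_{\mathrm{down}}](k_x, k_y)
  = \mathcal{F}[U_0](k_x, k_y)\, e^{+h\sqrt{k_x^2 + k_y^2}}
```

Upward continuation therefore behaves like a low-pass filter: short wavelengths, which mostly come from shallow sources, are attenuated exponentially, leaving the long-wavelength regional signal. Downward continuation applies the inverse factor, so it sharpens shallow features but also amplifies short-wavelength noise.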
  • asked a question related to Data Processing
Question
3 answers
The interactive wavelet plot that was once available on the Colorado webpage (C. Torrence and G. P. Compo, 1998) does not exist anymore. Are there any other trusted sites against which to compare our plot? And in what cases should we normalize our data by the standard deviation before performing a continuous wavelet transform (Morlet)? I have seen that it is not always necessary. A few researchers also transform the time series into a series of percentiles, believing that the transformed series reacts 'more linearly' to the original signal. So what should we actually do? I would appreciate an explanation focused mainly on data-processing techniques (standardization, normalization, or leaving the data as it is).
Relevant answer
Answer
Thank you Abbas Thajeel Rhaif Alsahlanee and Aparna Sathya Murthy for addressing the question. It was of great help to me. I figured it out through the documentation of statistical methods in Python.
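For reference, the two preprocessing options mentioned in the question can be sketched in Python as follows. This is only an illustration of the transforms themselves, not of the wavelet analysis; the function names and the sample series are hypothetical, and the population standard deviation is an assumed choice.

```python
import statistics

def standardize(series):
    """Subtract the mean and divide by the (population) standard deviation."""
    mean = statistics.fmean(series)
    sd = statistics.pstdev(series)
    if sd == 0:
        raise ValueError("constant series cannot be standardized")
    return [(x - mean) / sd for x in series]

def to_percentiles(series):
    """Replace each value by the percentage of values less than or equal to it."""
    n = len(series)
    return [100 * sum(1 for y in series if y <= x) / n for x in series]

# Hypothetical time series.
series = [10.0, 30.0, 20.0, 40.0]
z = standardize(series)        # zero mean, unit variance
pct = to_percentiles(series)   # rank-based transform, robust to outliers
```

Standardizing makes wavelet power comparable between series with different units, while the percentile transform discards the amplitude distribution and keeps only the ordering.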
  • asked a question related to Data Processing
Question
2 answers
I am currently working on a thesis on "analysis of public perception and acceptance of the COVID-19 vaccination process using the Structural Equation Modeling method". Six variables are used in the research: Behavioral Beliefs, Attitudes towards Vaccination, Perceived Norms, Motivation to Comply, Perceived Behavioral Control, and Intentions to Receive Vaccination.
However, these results seem to make no sense to me:
  1. attitudes towards vaccination have a significantly negative relationship with motivation to comply;
  2. attitudes towards vaccination have a significantly negative relationship with perceived norms;
  3. behavioral beliefs have a significantly negative relationship with attitudes towards vaccination.
I used this journal article (Bridging the gap: Using the theory of planned behavior to predict HPV vaccination intentions in men, 2013, Daniel Snipes) as a reference for the research.
Relevant answer
Answer
Dear Harry Gabe Parsaoran please have a look at the following potentially useful articles which might help you in your analysis:
The Protection motivation theory for predict intention of COVID-19 vaccination in Iran: A structural equation modeling approach
and
Influences on Attitudes Regarding Potential COVID-19 Vaccination in the United States
Both articles have been posted as public full texts on RG. Thus they can be freely downloaded as pdf files. I hope they are useful for you.
  • asked a question related to Data Processing
Question
4 answers
Hi All
I'm looking for papers on self-driving cars that discuss their current capabilities, how much data processing these cars do, and the steps and techniques applied to improve them.
Relevant answer
Answer
@Farzad possibly this could help your cause.
Yeshodara, N. S., Nagojappa, N. S., & Kishore, N. (2014, October). Cloud based self driving cars. In 2014 IEEE International Conference on Cloud Computing in Emerging Markets (CCEM) (pp. 1-7). IEEE.
  • asked a question related to Data Processing
Question
6 answers
Drones, being speedy equipment, are the first choice for detecting cracks on railway tracks.
I am aware of the method of image capturing and then post-processing of the data.
However, I am looking for a sensor apart from a normal RGB camera, that can be mounted on a drone and can detect the cracks.
Thank You for all the possible help.
Regards,
Garvit
Relevant answer
Answer
Md Anowar Hossain, I agree with your input.
Thank you
  • asked a question related to Data Processing
Question
1 answer
I recently submitted a paper for publication, in which I describe the species of zooplankton identified in water samples using metabarcoding, as well as the percent composition. The percent composition was estimated as the number of reads of a species over total reads. These results were compared for different markers. A reviewer made the comments below, but I do not understand exactly what they think I should have done.
Comment by reviewer:
"Are DNA metabarcoding data processing steps, i.e., standardization or rarefaction, data transformations, etc. performed? The methodology lacked this vital data analysis information which makes it difficult to comment on whether proper data processing was explored in the study to support a reliable interpretation of the results."
Relevant answer
Answer
Maybe you don't understand this because you did not perform any analysis on your data and worked on the raw number of reads to calculate your frequencies?
This is a possibility, but I guess you should specify it.
I do not really understand the comment either; maybe the reviewer wanted to refer to a data cleaning process (https://benjjneb.github.io/decontam/vignettes/decontam_intro.html).
What surprises me is that DNA metabarcoding does not provide quantitative data, and you can't relate the number of reads to the abundance of a species in the sampled community (because of PCR stochasticity). These analyses should only be done on presence/absence data, or at best on occurrence frequencies (the number of times a species is present over all samples), and should not provide quantification based on reads.
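The processing steps the reviewer names can be sketched as below: rarefaction to an even sequencing depth, conversion to relative read abundance, and reduction to presence/absence as recommended above. This is a minimal illustration assuming one read-count table per sample; the species counts, the depth, and the function names are hypothetical.

```python
import random

def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement from a {species: reads} table."""
    pool = [species for species, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the sample")
    picked = random.Random(seed).sample(pool, depth)
    return {species: picked.count(species) for species in counts}

def relative_abundance(counts):
    """Reads of each species divided by the total reads of the sample."""
    total = sum(counts.values())
    return {species: n / total for species, n in counts.items()}

def presence_absence(counts, min_reads=1):
    """1 if a species has at least `min_reads` reads, otherwise 0."""
    return {species: int(n >= min_reads) for species, n in counts.items()}

# Hypothetical read counts for one water sample.
sample = {"Acartia": 620, "Calanus": 300, "Oithona": 80}
even = rarefy(sample, depth=500)     # same depth would be used for every sample
shares = relative_abundance(even)
present = presence_absence(even)
```

Whichever of these steps are used, stating them explicitly in the methods (including the rarefaction depth and any read threshold) is what the reviewer's comment appears to be asking for.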
  • asked a question related to Data Processing
Question
3 answers
I'm in the process of doing a meta-analysis and have encountered some problems with the RCT data. One of my outcomes is muscle strength. In one study, I have three different measurements of muscle strength for the knee joint (isometric, concentric, eccentric). I wonder how to enter these data into the meta-analysis. If I enter them separately, I artificially increase the number (n). The best approach would probably be to combine them within this one study, because in the other studies included in the meta-analysis the authors give only one strength measurement.
Thank you all for any help.
Relevant answer
Answer
Yes, I recommend that you either combine the three outcomes in one measure, or choose the one outcome which most resembles the muscle strength outcome in the other studies. If you included all three measures, the present study would have (artificial) greater weight in the meta-analysis, and study samples would not be independent, since the study population is included three times.
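Combining the three outcomes into one composite effect can be sketched as below. The composite is the mean of the effect sizes, and its variance follows from the variance of a mean of correlated estimates. The correlation `r` between outcomes is rarely reported, so it is an explicit assumption here, and the effect sizes and variances are hypothetical.

```python
import math

def combine_outcomes(effects, variances, r):
    """Average several effect sizes from the same participants.

    effects, variances : per-outcome effect sizes and their sampling variances
    r                  : assumed correlation between any two outcomes
    Returns (composite effect, composite variance).
    """
    m = len(effects)
    mean_effect = sum(effects) / m
    var_sum = sum(variances)
    cov_sum = sum(
        r * math.sqrt(variances[j] * variances[k])
        for j in range(m)
        for k in range(m)
        if j != k
    )
    return mean_effect, (var_sum + cov_sum) / m ** 2

# Hypothetical standardized mean differences: isometric, concentric, eccentric.
effects = [0.20, 0.40, 0.60]
variances = [0.09, 0.09, 0.09]

composite, composite_var = combine_outcomes(effects, variances, r=0.7)
```

Setting `r = 1` gives the most conservative (largest) variance, and `r = 0` treats the outcomes as independent, which overstates precision; a sensitivity analysis over a few values of `r` is a common way to handle the uncertainty.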
  • asked a question related to Data Processing
Question
1 answer
Hello, everyone!
I would like to discuss with you the method you use for GC-MS data processing in R.
I tried to study this question and looked at a lot of literature, but unfortunately I did not find a reliable answer. Maybe there are people among us who use R for (untargeted) data processing.
At the moment, I have 88 profiles of various samples (in mzXML format) and my task is to find the differences between them. I loaded the data using the readMSData() method and then used the findChromPeaks() method to detect the peaks with CentWaveParam(). But I am unsure whether my CentWaveParam() settings are correct. Maybe someone can suggest a tuning method for these settings, or at least example settings that were used for GC-MS data (quadrupole).
Thanks for your help!
Relevant answer
Answer
Mr. Koluntaev,
The statistical software does not play any role in data-processing of mass spectrometric outcomes. Why?
Because all of the currently available standard data-processing procedures provide lower method performance than the real capability of mass spectrometry as an absolute method for quantitative and 3D structural determination. There is a conceptual problem with the currently available quantification methods: they describe the "intensity" and the peak position "m/z" of the MS peak of the fragment ion as average values over the whole measurement time. This lowers the reliability of the quantitative results of the analyses. In addition, this treatment of the MS variables loses information about analytes at very low concentrations, given that MS operates at attomol and fmol concentrations of the analytes. We have recently overcome this drawback of the standard quantification protocols by developing our own stochastic dynamic concept and model formulas for quantifying the experimental outcome "intensity" of mass spectrometric measurements per span of scan time. The correlative analysis between theory and experimental quantification of analytes in mixtures in solution at concentration levels of pg.(mL)-1 and ng.(L)-1 reported so far shows correlation coefficients |r| up to 1. Therefore, our method is an absolute (or exact) method in chemometric terms.
Please find more detail on our method in the reference sections of my answers to the following discussions [A,B].
[B]
  • asked a question related to Data Processing
Question
6 answers
Hello,
I need help with SNAP, specifically with the Sen2Cor plugin (v.2.8).
I need to do BOA correction for Sentinel-2 data from Level-1C to Level-2A.
I read that it is impossible to use Sen2Cor in the SNAP Graph Builder to batch process many scenes simultaneously.
You have to use Sen2Cor in the command-line interpreter (e.g. cmd in Windows). Only then can you make corrections for all scenes in the directory.
Sen2Cor works for single scenes on my computer with no problem,
but it doesn't work for many scenes, and here comes my problem.
I'm using "for /d %i in (*) do L2A_Process.bat %i"
I have the data in a simple path, i.e. C:\snap\S2A_MSIL1C_20170531T100031_N0205_R122_T33TTG_20170531T100536.SAFE
but I get a message that "Product metadata file cannot be read".
Please help: where could the problem be? Any suggestions?
I also have a question. Using Sen2Cor in SNAP, you can do the correction for all resolutions simultaneously by selecting the Resolution: ALL option.
From what I read, using Sen2Cor in CMD, I can only make corrections for one resolution at a time. It is suggested that you do 60 m first, then 20 m and 10 m. Is that true? Can't you do it for all of them at once?
Relevant answer
Answer
Dear Paweł Bronisław Dąbek using Sen2Cor by command line is basically the same as in SNAP.
The processor performs the atmospheric correction always to all eligible bands and preserves the original native resolution.
In addition please note that internally it uses all the bands to perform scene classification and atm correction so you will find always as output a product corrected. I also here attach an example of the relevant output if you run Sen2Cor without any further option (c.f. higher resolution bands are resampled).
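As an alternative to the cmd loop, the batch run can be scripted. The sketch below is an assumption-laden example rather than an official recipe: it assumes `L2A_Process.bat` is on the PATH, that every Level-1C product is an unzipped `.SAFE` folder directly inside one directory, and that the optional `--resolution` argument is only passed when a single resolution is wanted. The helper names are hypothetical.

```python
import subprocess
from pathlib import Path

def build_commands(root, executable="L2A_Process.bat", resolution=None):
    """Build one Sen2Cor command per Level-1C .SAFE folder found in `root`."""
    commands = []
    for product in sorted(Path(root).glob("S2*_MSIL1C_*.SAFE")):
        if not product.is_dir():
            continue  # skip zipped products or stray files
        command = [executable]
        if resolution is not None:
            command += ["--resolution", str(resolution)]
        command.append(str(product))
        commands.append(command)
    return commands

def run_all(root, **kwargs):
    """Run Sen2Cor on every product, stopping at the first failure."""
    for command in build_commands(root, **kwargs):
        print("running:", " ".join(command))
        subprocess.run(command, check=True)

# Example (not executed here): process everything under C:\snap
# run_all(r"C:\snap")
```

Passing the full path of each `.SAFE` folder, as done here, avoids depending on the current working directory, which is one thing worth ruling out when the metadata file cannot be read.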
  • asked a question related to Data Processing
Question
1 answer
I don't have any information about the acquisition of the data, except the year of acquisition!
Relevant answer
Answer
I can help you. Would you mind sending me part of the data? I could process it for your reference. yangbo4100064@163.com
  • asked a question related to Data Processing
Question
3 answers
Please see the attachment.
I want to find out the relationship among genes (n=5) that causes low to high mortality in test organisms.
How can I visualize the relationship among genes compared with the mortality rate of the test organism?
Relevant answer
Answer
can be seen and use one of the figures where the correlation has been visualized using R to evaluate the statistically mounted or demounted.
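One simple starting point, sketched here in Python rather than R, is to compute the correlation between each gene's expression and the mortality rate and rank the genes by its strength; the resulting values can then be shown as a bar chart or heat map. The gene names and all numbers below are hypothetical, and a linear (Pearson) relationship is an assumption.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical expression levels per treatment group and observed mortality (%).
expression = {
    "gene1": [1.0, 2.1, 2.9, 4.2, 5.1],
    "gene2": [5.0, 4.1, 3.2, 1.9, 1.1],
    "gene3": [2.0, 2.2, 1.9, 2.1, 2.0],
}
mortality = [10.0, 25.0, 40.0, 70.0, 90.0]

# Rank genes by the absolute strength of their association with mortality.
ranked = sorted(
    ((gene, pearson(levels, mortality)) for gene, levels in expression.items()),
    key=lambda item: abs(item[1]),
    reverse=True,
)
for gene, r in ranked:
    print(f"{gene}: r = {r:+.2f}")
```

With only five genes, a scatter plot of each gene against mortality is also feasible and shows whether the relationship is actually linear.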
  • asked a question related to Data Processing
Question
8 answers
What are the methodological differences in the processes of examining economic effectiveness or specific selected issues, aspects in the scope of analyzing the effectiveness of a given business activity in a situation of comparison of analyzes carried out for small enterprises and large business entities conducting diversified economic activities?
For small business entities in the SME sector that operate in one area of economic activity, the simplest solution is to select economic and financial indicators relevant to the needs, which capture specific aspects of efficiency, e.g. of fixed assets, current assets, other classified capital categories, or production factors. It is also possible to analyze and measure the effectiveness of specific processes in an enterprise, of specific measures and investment projects, of logistics processes, of employees' work, etc. Different economic or financial indicators are used for each of these types of effectiveness tests.
However, when analyzing complex, multi-factor, multifaceted processes that cover various spheres of activity of a large enterprise operating in various business areas, and that involve much larger financial resources, the economic efficiency analyses can use complex indicator models built from many interrelated economic, financial and other indicators.
A good solution in this situation is to use Business Intelligence technology with large data sets that describe the functioning of a specific large enterprise and are gathered in Big Data database systems. In addition, advanced data processing and analysis can be done using cloud computing technology. Access to the data, data updates and the commissioning of specific economic performance analyses can also be carried out from mobile devices, i.e. through the use of Internet of Things technology.
Do you agree with me on the above matter?
In the context of the above issues, I am asking you the following question:
What are the methodological differences in the processes of examining economic effectiveness or specific selected issues, aspects in the scope of analyzing the effectiveness of a given business activity in a situation of comparison of analyzes carried out for small enterprises and large business entities conducting diversified economic activities?
Please reply
I invite you to the discussion
Thank you very much
Best wishes
Relevant answer
Answer
...significant and inverse relationship between firm size and its efficiency based on DEA model: in fact the larger the company its efficiency decreases. Thus, according to confirming the inverse relationship between firm size and firm efficiency, it is recommended to investors and managers to consider the efficiency index and the desired output with respect to investments made according to the DEA models to achieve efficiency....
The traditional profit-based criteria have recognized defects the important of which is being manipulated by various accounting procedures and reliance on the limiting principles of conservation and retrospection. Thus, it is necessary to find some new parameters in order to sensibly study companies’ performance. In this regard, data envelopment analysis (DEA) is considered as a new way to do this. The main effect of this technique is that all previous variables for assessing performance are simultaneously or individually included. In such models, raw accounting data, financial ratios, economic variables, and nonfinancial data and factors can be used (Musavizadeh, 2010)...Razmi et al., 2014
  • asked a question related to Data Processing
Question
5 answers
With the availability of huge amount of remotely sensed data comes the issue of big data processing. I am wondering if there exists any associated new statistical image processing and/or information extraction algorithms that have been developed for processing big data? I have searched the net and not much is there. Also any suggested readings in the subject are highly appreciated as well. Thanks
Relevant answer
Answer
We have entered an era of big data. Our ability to acquire remote sensing data has been improved to an unprecedented level. For a large ground station (e.g., China Remote Sensing Satellite Ground Station (RSGS)), the volume of global data archive could be on the Exabyte level. https://www.mdpi.com/journal/remotesensing/special_issues/rs_bigdata
Regards,
Shafagat
  • asked a question related to Data Processing
Question
7 answers
Suppose somebody wants to mathematically model data, information, and knowledge. Data are the raw material that service delivery solutions process to produce information. Knowledge is acquired when experts in a particular field, such as computer science, psychology, mathematics, or statistics, handle such information. How can mathematical models be developed to describe the knowledge acquired by an individual, a population, or a community?
Relevant answer
Answer
@George Stoica, your papers are very helpful to me for taking surveys.
  • asked a question related to Data Processing
Question
15 answers
Considering the specifics of the increasingly common IT systems and computerized advanced data processing in Internet information systems, connected to the internet database systems, data processing in the cloud, the increasingly common use of the Internet of Things etc., the following question arises:
What do you think about the security of information processing in Big Data database systems?
Please reply
Best wishes
Relevant answer
Answer
The risk takes two forms. The first, which you have already mentioned, is security: a vital risk that needs to be addressed by collective effort on a war footing.
The second is the size of the data itself: how integration takes place among hardware, software, the latest internet service providers, the cloud, etc. across the globe is also a risk.
  • asked a question related to Data Processing
Question
17 answers
Greetings,
I am planning to work with open-source paleo-climate data for a thesis. So far the only source I know for this kind of data is: https://www.ncdc.noaa.gov/paleo-search/
Are there any other sources that provide a good amount of paleo-climate data, or is this currently the main available source?
That being said, could any paleo-scientist tell me what special considerations apply when dealing with such data, especially since they come from the past and are mostly climatic reconstructions or proxies? If you follow that link, you will see that most of the data are in .txt files, so which software or programming languages have you used, or do you plan to use, that would be helpful in this regard? Or would you process them like any other data (e.g. netCDF is very popular in climate studies, but unfortunately I don't know whether .nc files exist for paleo-climate data)?
Any advice or suggestions, in addition to my question would be deeply appreciated.
  • asked a question related to Data Processing
Question
6 answers
Where can I buy a GPU for COVID-19 image and data processing with deep learning?
X-ray images
CT-scan images
How many cores are needed?
Which configuration is best?
What is the optimum configuration within 10K USD?
Can it be bought online or in Saudi Arabia?
Is an assembled configuration available from a vendor?
Relevant answer
Answer
Extensive and Augmented COVID-19 X-Ray and CT Chest Images Dataset
This COVID-19 dataset consists of Non-COVID and COVID cases of both X-ray and CT images. The associated dataset is augmented with different augmentation techniques to generate about 17100 X-ray and CT images. The dataset contains two main folders, one for the X-ray images, which includes two separate sub-folders of 5500 Non-COVID images and 4044 COVID images. The other folder contains the CT images. It includes two separate sub-folders of 2628 Non-COVID images and 5427 COVID images.
Cite it in your research work:
El-Shafai, Walid; E. Abd El-Samie, Fathi (2020), “Extensive and Augmented COVID-19 X-Ray and CT Chest Images Dataset”, Mendeley Data, v2
For any questions, do not hesitate to contact me.
Regards
Walid El-Shafai
  • asked a question related to Data Processing
Question
1 answer
Does Stata allow for error tracking in data processing?
Relevant answer
Answer
If I understood you correctly, the set trace on command should work. To begin the check:
set trace on
(Then run your code; if anything goes wrong, you can trace it in this mode)
set trace off
  • asked a question related to Data Processing
Question
15 answers
Does the indirect effect (in a simple mediation model) of your mediator only consider the IV, or also take into account the control variables in your model?
I am trying to check the robustness of the indirect effect in my simple mediation model (with 6 control variables), with the Sobel/Aroian/Goodman Test. Do I need to include my controls to check the indirect effect or not?*
*I have tried both. The indirect effect is significant when I include my controls, but insignificant when I exclude them. In my initial methodology (using the bootstrapping method), the indirect effect is shown to be insignificant.
Relevant answer
Answer
I'd disagree slightly. There is no general reason to use SEM for a mediation model without latent variables. There are specific reasons why SEM approaches might be helpful by making the analysis a single step or by providing fit statistics or in fitting a wider range of models. However, a simple mediation analysis fit as a path model using multiple regression should be equivalent to the SEM formulation. There are also intermediate approaches like piecewise SEM models that can be more attractive than full SEM (if you don't have latent variables).
Generally most regression models including mediation models can be specified in equivalent ways using different modeling approaches.
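As a rough illustration of the path-model view above, here is a minimal sketch of a simple mediation fitted with two ordinary least-squares regressions that both include the control variables, followed by a first-order Sobel test of the indirect effect. The data are synthetic and the names (`ols`, `sobel_indirect`, the coefficient values) are assumptions made for the example, not anything taken from the question.

```python
import numpy as np

def ols(y, X):
    """OLS coefficients and standard errors; X must already contain an intercept column."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta, np.sqrt(np.diag(cov))

def sobel_indirect(x, m, y, controls):
    """Indirect effect a*b with controls in both regressions, plus Sobel SE and z."""
    ones = np.ones(len(x))
    # Path a: mediator regressed on the IV and the controls
    coef_a, se_a = ols(m, np.column_stack([ones, x, controls]))
    # Path b: outcome regressed on the mediator, the IV and the controls
    coef_b, se_b = ols(y, np.column_stack([ones, m, x, controls]))
    a, sa = coef_a[1], se_a[1]
    b, sb = coef_b[1], se_b[1]
    indirect = a * b
    se = np.sqrt(b**2 * sa**2 + a**2 * sb**2)  # first-order Sobel standard error
    return indirect, se, indirect / se

# Synthetic data (hypothetical effect sizes, two controls instead of six)
rng = np.random.default_rng(0)
n = 500
controls = rng.normal(size=(n, 2))
x = rng.normal(size=n) + 0.5 * controls[:, 0]
m = 0.6 * x + 0.3 * controls[:, 0] + rng.normal(size=n)
y = 0.5 * m + 0.2 * x + 0.3 * controls[:, 1] + rng.normal(size=n)

indirect, se, z = sobel_indirect(x, m, y, controls)
print(f"indirect effect = {indirect:.3f}, Sobel SE = {se:.3f}, z = {z:.2f}")
```

Because the a and b paths are estimated from regressions that contain the controls, the tested indirect effect is conditional on them; dropping the controls changes both regressions and therefore the test.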
  • asked a question related to Data Processing
Question
3 answers
I am a beginner in the field of GPU computing and big data processing. I want to perform research or an experiment that could integrate both technologies. Please give me some suggestions.
Relevant answer
Answer
Information-driven decision making based on real-time measurement strategies would be an interesting area. Using the GPU to solve (through matrices) problems related to scenarios, multiple states, their relationships with the monitored concept, etc. in the measurement process would also have a lot of applications.
  • asked a question related to Data Processing
Question
2 answers
Hi all, I am doing global profiling of a plant using an LC-MS (QTRAP_6500) method, with XCMS as my data analysis tool. For each m/z corresponding to a metabolite, the Rt varies from 1 to 30. How is this possible, and which Rt should I select for my metabolite?
Relevant answer
Answer
Hi Mrudula,
It is normal to get multiple peak groups for every m/z since you're performing an LCMS run on your data. There could be multiple metabolites that had the same mass/charge and those have been separated using the LC, hence you see them at different RTs. Usually, it is good practice to use known standards for important metabolites so that you can figure out the correct RT for your metabolites. Apart from that, what might be helpful is to look at the label patterns if your data is labeled. The patterns might be able to help you identify the metabolite in question. For example, if you get a M5 isotopologue for a metabolite that only has 4 carbons, that would give you a clue. Additionally, intensity levels can be used to identify low concentration and high concentration metabolites, as well as intensity patterns across defined cohorts.
If your objective is identification, I would suggest going for Data dependent MSMS so you can use the spectral information for identification of metabolites. There are a number of publicly available spectral libraries that can be used. I would suggest checking out El-MAVEN or Skyline if you go that route though. I'm not sure if XCMS Online supports that kind of data.
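The advice about using known standards can be sketched in a few lines: collect every peak whose m/z falls within a tolerance of the target, then keep the one whose retention time is closest to the RT measured for the standard. This is only an illustration with a made-up peak list; the function names, the 10 ppm tolerance and the 0.3 min RT window are assumptions, not XCMS settings.

```python
def candidates_for_mz(peaks, target_mz, ppm=10.0):
    """All peaks whose m/z lies within +/- ppm of target_mz."""
    tol = target_mz * ppm * 1e-6
    return [p for p in peaks if abs(p["mz"] - target_mz) <= tol]

def match_to_standard(peaks, target_mz, standard_rt, rt_window=0.3, ppm=10.0):
    """Pick the candidate closest in RT to the standard, or None if none is in the window."""
    in_window = [
        p for p in candidates_for_mz(peaks, target_mz, ppm)
        if abs(p["rt"] - standard_rt) <= rt_window
    ]
    if not in_window:
        return None
    return min(in_window, key=lambda p: abs(p["rt"] - standard_rt))

# Hypothetical peak table: same m/z seen at several retention times (minutes)
peaks = [
    {"mz": 180.0634, "rt": 2.4, "intensity": 1.2e5},
    {"mz": 180.0640, "rt": 5.1, "intensity": 8.7e5},
    {"mz": 180.0630, "rt": 17.9, "intensity": 3.0e4},
    {"mz": 181.0700, "rt": 5.1, "intensity": 9.0e4},
]

best = match_to_standard(peaks, target_mz=180.0634, standard_rt=5.0)
print(best)
```

Without a standard, the three candidates at 180.063 remain ambiguous, which is why the MS/MS route is suggested for identification.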
  • asked a question related to Data Processing
Question
2 answers
I'm working with stream data processing and I'm confused regarding how to define and prove the hypothesis for stream data processing? I need some sample examples.
Relevant answer
Answer
Thank You Ijaz Durrani.
  • asked a question related to Data Processing
Question
6 answers
Is the progressive increase in the digitization of education process instruments a feature of the current technological revolution known as Industry 4.0?
Measuring the impact of digital technology, including new online media on the process of learning and the effects of the education process can be based on a comparison of assessments at a specific time, a specific educational process supported by the use of new online media, social media portals and other technologies typical of the digital age. These technologies currently include mainly new technological solutions, streamlining improvements, innovations etc. regarding, among others, advanced data processing, including data obtained from the Internet, data processing in the cloud, Big Data database systems, using artificial intelligence, etc. These new technologies of advanced information processing co-create the current fourth technological revolution called Industry 4.0.
In view of the above, the current question is: Is the progressive increase in the digitization of education process instruments a feature of the current technological revolution known as Industry 4.0?
Please answer and comment. I invite you to the discussion.
Relevant answer
Answer
Dear Colleagues and Friends from RG,
In the context of the above discussion, the following question arises:
Do smartphones change the social behavior of children and young people?
On the basis of the above considerations and the conclusions from the discussion, I formulated the following thesis: the everyday use of smartphones by young people can significantly modify and shape their social behavior. However, whether the effects of this impact on social behavior and on methods of communication between people will be socially and psychophysically positive or rather negative depends on many factors, which include, among others, the following determinants:
- For which applications do young people use smartphones?
- How much time on a daily scale do children and young people spend using smartphones for various purposes?
- Do children and young people use smartphones mainly for educational purposes, as a tool for finding information related to knowledge learned at school, or is the smartphone a source of entertainment?
- Do children and young people use smartphones, among others, to contact their peers through social media portals such as Instagram, Facebook, Messenger, Snapchat, Pinterest and others?
- Are children and young people already a significant market for the sale of products from various companies advertised on social networks?
- Can excessive use of smartphones by children and young people negatively affect their psychophysical development?
The above discussion inspired me to formulate the following question:
Can the impact of everyday use of smartphones by young people significantly modify and shape the social behavior of young people?
Below I have described the key determinants confirming the formulated research thesis. To the above discussion I would like to add the following conclusion formulated as a summary of my previous considerations on this topic: The positive or negative impact of smartphones on the social behavior and psychophysical health of children and young people can be large.
In recent years, young people have used desktop computers less often and laptops and mobile devices, primarily smartphones, more often. Do these changes in the devices used for Internet access, and the increasing use of smartphones, which favor Internet communication (including social media portals) and the use of Internet information resources, lead young people to make significant use of the Internet as a knowledge resource for the education process? Research shows that smartphones are being equipped with new applications offering new types of information services. Therefore, smartphones are used less and less for telephoning, because the number of other functions and applications in the field of new Internet information services is increasing.
ICT, internet technologies, and advanced technologies for processing large data sets are increasingly used in modern education. These data sets are collected in Big Data database systems and processed on Business Intelligence analytical platforms, with access to the analytical system through smartphones and other devices, and the Internet of Things is increasingly used as well. Therefore, the possibility of using smartphones as a tool supporting education is also an important issue. ICT, new media internet technologies, and Industry 4.0 advanced data processing technologies are increasingly being implemented in educational processes, and they may also negatively impact pedagogical processes in schools. I also believe that implementing ICT, new media internet technologies (including social media portals), and Industry 4.0 advanced data processing technologies in educational processes can have a negative impact with respect to the classical theory of children's mental development. In young children and adolescents, the process of adopting new concepts (the thought process) in pre-school institutions extends from sensory perception (sight, hearing, touch), speech, and direct manipulation of real objects (teaching resources) to the creation of abstract concepts, so ICT, internet technologies, and Industry 4.0 technologies should be used under the full control of educators. A young child must first see an object and touch it by hand before being able to "see the object in his head" and learn. Devoting a significant amount of time to viewing various graphic and film spots and advertising on social media websites can therefore have negative effects on the child's psyche and psychosomatic development.
To answer the above questions, it is necessary to verify first of all the following question: Do children and young people use smartphones mainly for educational purposes as a tool for searching information related to knowledge learned at school, or rather a smartphone is a source of entertainment? Children and young people are increasingly using smartphones to, among others, contact their peers through social media portals such as Instagram, Facebook, Messenger, Snapchat, Pinterest and others. Many companies in the clothing, cosmetics, toy, perfumeries, etc. have already noticed that children and young people already constitute a significant market for the sale of products of various companies advertised on social networks. However, many studies show that excessive use of smartphones by children and young people can negatively affect their psychophysical development. Many potential negative aspects of the use of smartphones by children and young people have already been defined.
For example, too much time spent by children and young people using smartphones for entertainment, e.g. discussing with peers through social media portals and playing computer games, reduces the time spent on physical activity and can be an important factor increasing obesity and the deterioration of physical and mental health. In addition, children now watch a lot of movies and cartoons on social networks, including on smartphones. Spending a lot of time watching movies, cartoons, posts, comments and advertising banners on social networks viewed on a smartphone can have a negative impact on the child's psychophysical development, including deterioration of vision.
In recent years, there have been more and more cases of children and young people becoming addicted to the use of smartphones. The problem of youth addiction to smartphones is growing in many countries. A very negative effect is the rapidly growing number of road accidents caused by drivers of cars, motorbikes and bicycles using smartphones while driving, and more and more pedestrians are hit by cars after stepping onto the road while browsing messages on their smartphones. Therefore, some countries are introducing more restrictive legal regulations, above all a ban on entering the road while looking at a smartphone. In addition, special lighting systems are installed in the pavements to inform about changing lights at pedestrian crossings. In this respect, the educational role of parents and of teachers in schools is also crucial when considering the impact of long-term smartphone use on children's development.
Therefore, if children or young people use smartphones to learn, as a tool supporting the processes of education and communication with peers from school and friends, and if they use new internet media devices only from time to time, this may be assessed positively. However, if children or teenagers use smartphones for many hours a day, among other things browsing advertisements on social media portals and worthless memes and films, then it can have a destructive impact on their intellectual and psychological development. In this situation, the use of smartphones by children and youth should be limited and controlled by parents, guardians and teachers.
Therefore, the use of smartphones by children and young people should be subject to parental control. The need for this control results from the growing number of different information services available on smartphones, the growing number of children and adolescents dependent on the use of smartphones, the increase in the number of cases of neglecting school duties, the increase in the percentage of children diagnosed with vision defects caused by too long browsing of content posted on the Internet and read on a smartphone etc.
Therefore, information and communication technologies cannot replace every form and method of learning in the educational process of young children and young people. Of course, full implementation of ICT, new media internet technologies (including social media portals) and Industry 4.0 advanced data processing technologies in teaching processes in schools cannot be ruled out. This process is already underway. However, it is necessary to bring this process under the full control of educators, teachers and parents. Already, there is a lot of disturbing information from the media and from ongoing research on the effects of the use of new media internet technologies, including social media portals, by children and young people. Children and teenagers mainly browse social media on smartphones, and many of them spend too much time doing so. The result is a reduction in the time spent on physical and sport activities and on learning, a decline in book readership, and a growing scale of diagnosed vision defects in adolescents in recent years. In this way, many problems arise that can reduce the educational opportunities of children and adolescents. These problems should be solved systemically at all levels of the education system, i.e. from ministries of education to individual schools.
In addition, students and parents should be made aware of emerging threats through social campaigns in various media. Research shows that the process of implementing ICT information technologies, new media internet technologies, including social media portals and advanced data processing technologies, Industry 4.0 for teaching processes in schools has already begun. Of course, the use of ICT, Internet and Industry 4.0 technologies in education processes does not only generate negative aspects. Therefore, the central institutions of the education system should coordinate the development of these processes in such a way as to maximize the positive aspects of the implementation of ICT information technologies, new media internet technologies, including social media portals and advanced industry 4.0 data processing technology for teaching processes in schools. However, one should not forget about these negative aspects, about already diagnosed developing problems, which should be solved and educating teachers, students and parents about potential threats.
Activation for critical thinking of students is a particularly important determinant of effective education. Modern education instruments are important in this matter, thanks to which analytical techniques, brainstorming, debates, discussions, etc. used in the education process of pupils and students are developed. These techniques should also develop creativity, innovation and teamwork. In my opinion, activation of critical thinking of students and pupils, development of discussion skills in debates, development of creativity, innovation and teamwork of pupils and students correlates perfectly with the development of the concept of modern education 4.0. Currently, in the era of the technological revolution referred to as industry 4.0, new teaching concepts are emerging known as education 4.0.
On the other hand, dynamic development of social media portals on the Internet is currently underway. For young people using smartphones, social media portals are one of the main sources of information.
Probably the next stage in the development of social media portals will be the implementation of artificial intelligence into these portals and into search engines and creating applications such as interactive advisers on individual information websites. Social media portals are at some stages of education, in some education systems they are used to educate students on specific issues and according to the age of the students. But do they really help in the education process or are they just another teaching aid without a significant impact on the learning outcomes? Pupils and students use social media portals to exchange information useful for education. In addition, on Facebook, pupils and students create group profiles where they post joint didactic materials. In addition, they create survey forms for the needs of surveys, the results of which are used for written theses and final essays.
In connection with the above, another key question arises:
Should new online media be used in education processes? In my opinion, yes, new online media should be used in education processes. The issue of communication with the use of new online media is very important in the context of an effectively conducted education process. We currently communicate widely through various online media, including email. Some email inboxes that we use have anti-spam restrictions, which makes communication difficult. The development of communication through various internet media, also through social media portals, is an important issue in education. New media should be used effectively in the education process, but their technical specifications are not always fully suitable for the needs of communication development in the context of the implemented education process. However, whenever possible, new online media should be used in education, because young people use them widely and can be an excellent additional tool in the field of teaching instruments, e.g. for the effective search of necessary, current information.
In solutions regarding the use of smartphones as instruments used in didactics, the following should also be taken into account. During some lessons, for a certain period of time and in certain situations, such as didactic games or the presentation of specific processes and lesson topics, the teacher may allow the use of virtual reality and augmented reality devices. In addition, the teacher may also include other mobile devices such as laptops, tablets, smartphones, etc. in the education process. In specific situations, these devices would play the role of teaching instruments supporting the teaching processes conducted by the teacher. However, if the use of laptops, tablets, smartphones, smartwatches, etc. during school lessons does not serve as a teaching instrument and is not part of the educational process, the use of these devices during the lesson should be prohibited. Using laptops, tablets, smartphones, smartwatches and other mobile devices that allow browsing Internet resources during the lesson may interfere with students' active participation and may distract the teacher from conducting the lesson. On the other hand, during some lessons, for a certain period of time and in certain situations of searching for information on the Internet, the teacher may allow the use of these devices if they act as didactic instruments for finding the necessary information.
On the other hand, thanks to information technologies, new online media and Industry 4.0 technologies, the logistics of information and communication between institutions and enterprises is being improved. The improvement of communication between individuals, institutions and enterprises is achieved through the use of new online information and communication media in combination with the Internet of Things technology. The current development of mobile devices, mainly smartphones and their applications for international communication and information transfer makes it possible to improve information logistics. Therefore, the use of online new media technologies and advanced information processing technologies, i.e. industry 4.0 typical for the current technological revolution, can significantly increase the efficiency of communication processes between cooperating members of national and international working teams in institutions and enterprises.
In connection with the above, there are also many positive applications of smartphones, including those used to support the development of science. Smartphones facilitate communication between scientists and facilitate the search for scientific information that can be useful for conducting scientific research. Smartphones are increasingly replacing laptops and other computers. When scientists, teachers, pupils and students use smartphones to search for and read publications published e.g. on the Research Gate website, this is a perfect, positive example of using a smartphone as a tool supporting the processes of education and science development.
It is worth adding the following issue to these considerations. At present, smartphones dominate in communication and in the use of various information services. In addition, some Internet users also use tablets and smartwatches. A few years ago, Google glasses with Internet access were expected to be an innovative hit that would revolutionize the mobile Internet, but they were not adopted on a larger scale. Perhaps no significant demand has arisen for this type of device yet. Perhaps this will change in the future, when the next generations of compatible devices appear and 5G Internet becomes more widespread.
According to the above, in my opinion the everyday use of smartphones by young people can significantly modify and shape their social behavior. However, whether the effects of this impact on social behavior and on ways of communication between people will be socially and psychophysically positive or rather negative depends on many factors that I described above.
Do you agree with me on the above matter?
I am currently conducting research in the field of education, specifically on the use of technical devices and teaching facilities, and on the use by students of available IT devices, mobile devices and smartphones as an aid in the education process. Do any of you conduct research on similar issues or on other topics? If you work on similar topics, I invite you to cooperate.
In view of the above, in order to more fully identify the above issues, it is necessary to conduct research that will facilitate the formulation of answers to the following questions:
- What types of information services are currently used for smartphones?
- Is browsing social media on smartphones starting to dominate over other applications and internet information services?
- What kind of information services available for smartphones will develop in the future?
- Does using smartphones change sociological behavioral issues?
- Does the use of smartphones change the standards of human behavior?
- Does using smartphones change the standards of communication between people?
- Do children and young people use smartphones mainly for educational purposes, as a tool for finding information related to knowledge learned at school, or is the smartphone a source of entertainment?
- Do children and young people use smartphones, among others, to contact their peers through social media portals such as Instagram, Facebook, Messenger, Snapchat, Pinterest and others?
- Are children and young people already a significant market for the sale of products from various companies advertised on social networks?
- Can excessive use of smartphones by children and young people negatively affect their psychophysical development?
- What social problems are derived from the addiction of children and young people to the use of smartphones?
- Should laptops, tablets, smartphones, smartwatches etc. be used during school lessons?
- Does too much time spent by children and young people using smartphones for entertainment, e.g. discussing with peers through social media portals and playing computer games, reduce the time spent on physical activity, and can it be an important factor increasing obesity and the deterioration of physical and mental health?
- Should the use of smartphones by children and teenagers be limited? Should the use of smartphones by children and young people be fully controlled?
- What do you think are the potential other effects of using smartphones on children and young people?
- What Internet of Things devices will in the future take over most of the functions of current smartphones? Will it be e.g. smart glasses like Google smart glasses?
- In what direction will the evolution of mobile devices take place in the Internet of things in the future?
- What do you think about this topic?
- What is your opinion on this topic?
Please reply
I invite you to discussion
thank you very much
Best wishes
Dariusz Prokopowicz
  • asked a question related to Data Processing
Question
3 answers
I have a spectrum recovery algorithm based on OMP. I want to use it for wireless sensors to optimize their data processing and data transfer, as they generate loads of sensed parametric data.
My algorithm is in matlab and I need to reduce its execution time to make it fast enough to make my sensor nodes able to learn faster and adapt accordingly.
Relevant answer
Dear Charushila,
welcome,
There are two levels of optimization:
The first is algorithmic optimization.
At this level you construct the logical and mathematical solution of the problem so as to reduce the number of mathematical operations required to get the results, especially multiplications and divisions.
The second is code-level optimization: build the code so that redundant calculations are minimized.
If you have enough memory, you can trade memory for processing: accessing a stored value may take less time than recalculating a specific parameter.
You have to analyse the program to optimize it.
There are methods and tips published in the literature that you can apply to reduce the execution time.
You only need to search for the keywords: matlab code optimization
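To make the memory-versus-computation trade-off concrete, here is a small sketch written in Python with NumPy rather than MATLAB, purely for illustration. It assumes an OMP-style step in which each signal is correlated with normalized dictionary columns; the function names and array shapes are hypothetical. The first version recomputes the column norms for every signal, while the second stores the normalized dictionary once and reuses it in a single matrix product.

```python
import numpy as np

def correlations_naive(D, signals):
    """Recomputes the column norms of D for every signal (redundant work)."""
    out = []
    for s in signals:
        norms = np.linalg.norm(D, axis=0)  # same result on every iteration
        out.append((D.T @ s) / norms)
    return np.array(out)

def correlations_cached(D, signals):
    """Computes the norms once and keeps the normalized dictionary in memory."""
    norms = np.linalg.norm(D, axis=0)
    Dn_T = (D / norms).T                   # extra memory, reused for all signals
    return (Dn_T @ signals.T).T            # one matrix product replaces the loop

# Hypothetical sizes: 64-sample signals, 256 dictionary atoms, 100 signals
rng = np.random.default_rng(0)
D = rng.normal(size=(64, 256))
signals = rng.normal(size=(100, 64))

same = np.allclose(correlations_naive(D, signals), correlations_cached(D, signals))
print("results agree:", same)
```

The same idea carries over to MATLAB: hoist loop-invariant computations out of the loop, preallocate, and prefer one vectorized matrix operation over many small ones, then confirm the gain with a profiler.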
Best wishes
  • asked a question related to Data Processing
Question
2 answers
I use StatSoft Statistica 12 for data processing and am developing an algorithm for building confidence intervals. I need test arrays of pseudo-random numbers to check the efficiency of the algorithm.
Relevant answer
Answer
Thank You, David!
Now I've found this option in Maple 17 for an arbitrary distribution.
Kind regards
Viktor
  • asked a question related to Data Processing
Question
6 answers
Dear all,
I have a huge database of raw hourly meteorological data, and I am looking for tutorials on processing it: filling gaps, detecting outliers, etc., preferably using R (but I could also use Python or Julia).
Any suggestions?
Thanks in advance!
Relevant answer
Answer
Python provides high-level functionality to meet all your needs; see e.g.: https://github.com/simongeek/PandasDA
The Anaconda Python distribution makes everything easier (https://www.continuum.io/downloads).
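To make the two tasks from the question concrete, here is a minimal sketch using only the Python standard library. It assumes an evenly spaced hourly series with `None` marking missing readings; the function names `fill_gaps` and `flag_outliers` are made up for this example, and the modified z-score with a 3.5 threshold is just one common outlier rule, not the only choice. pandas offers higher-level equivalents such as `Series.interpolate`.

```python
from statistics import median


def fill_gaps(values):
    """Linearly interpolate interior runs of None; edge gaps stay None."""
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled


def flag_outliers(values, threshold=3.5):
    """Flag points whose modified z-score (median and MAD based) is too large."""
    present = [v for v in values if v is not None]
    centre = median(present)
    mad = median(abs(v - centre) for v in present)
    if mad == 0:
        return [False] * len(values)  # no spread, nothing can be flagged
    return [
        v is not None and abs(0.6745 * (v - centre) / mad) > threshold
        for v in values
    ]


# Usage: flag outliers first, blank them out, then interpolate over all gaps.
temps = [10.0, 10.5, None, None, 12.0, 55.0, 12.5, 13.0]
flags = flag_outliers(temps)
cleaned = fill_gaps([None if bad else v for v, bad in zip(temps, flags)])
```

Flagging before filling matters: interpolating first would let the spurious reading leak into the reconstructed values around it.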
Best
  • asked a question related to Data Processing
Question
3 answers
My recorded FACS data only includes FSC-A and SSC-A (and my color channels), but no FSC-H or SSC-H.
Normally I gate for singlets first with FSC-A vs FSC-H.
I cannot do this now.
Did I forget to record FSC-H and SSC-H?
Is there a way around this? Can I get the height values by data processing?
How bad would it be to skip the singlet gating step?
Thank you and best regards,
David
Relevant answer
Answer
If you have the FSC-A and FSC-Width data, you can recalculate FSC-H. FSC-A is only a calculated value, not directly measured.
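A minimal sketch of such a recalculation, under the assumption that the acquisition software stored width as area divided by height, multiplied by a constant scale factor. Both that relationship and the value of `scale` are assumptions to verify against the cytometer's documentation; the function name is hypothetical.

```python
def height_from_area_width(area, width, scale):
    """Estimate pulse height, assuming width = scale * area / height.

    `scale` is instrument-specific and must be taken from the cytometer's
    documentation; no default is provided on purpose.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    return area * scale / width
```

Applied per event to the FSC-A and FSC-W columns, this yields an estimated FSC-H column that can be used for the usual FSC-A vs FSC-H singlet gate.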
  • asked a question related to Data Processing
Question
3 answers
The relative standard curve method in qPCR is a standard-curve-based method for relative real-time PCR data processing. Can anyone explain the calculations of this method to me? I read something about pooling cDNA samples and using the pool as a calibrator. Is this right?
Relevant answer
Answer
Dear Omnia Adel Badr, this topic is quite broad. The best collection of relevant material is at https://www.gene-quantification.de/ and reading these articles and guides is very useful, although I understand that it is very laborious. In short...
1. A standard curve means serial dilutions of your target genes. One possible way is a serial dilution of cDNA/DNA (depending on your starting material and goals), for example 5-fold (you can also use 2-fold or 10-fold, but I prefer 5-fold). It is a calibration curve that can provide quantitative information about the target gene copy number in your samples (in units of your choice: cells, genome equivalents, ng/ul concentration, and others).